Long make calls lock table metadata #1170

Open
ethho opened this issue Aug 21, 2024 · 6 comments · May be fixed by #1171

@ethho
Contributor

ethho commented Aug 21, 2024

Bug Report

Description

A client locks table metadata for the entire duration of a make function call. When other clients attempt to drop or declare child tables, their calls block until the first client's make finishes. This approach scales poorly with the number of clients and the number of child tables.

Reproducibility

Include:

Proposed Solution

As an alternative to writing a Computed.make function, allow the user to write three functions:

  1. make_fetch for reading inputs
  2. make_compute, which is not run in a transaction, and is passed the return value of make_fetch
  3. make_insert, which inserts computed values using the same transaction semantics as make.

In pseudocode, the three functions would be used in the following routine:

if hasattr(table, "make"):
    return make()
else:
    assert hasattr(table, "make_fetch")
    assert hasattr(table, "make_compute")
    assert hasattr(table, "make_insert")
    input = make_fetch()
    conn.disconnect() # I assume this disconnect step is to ensure that make_compute cannot insert?
    result = make_compute(input)
    tx = conn.start_transaction()
    input2 = make_fetch()
    if hash(serialize(input2)) == hash(serialize(input)):
        result = make_insert(result)
        tx.commit()
        return result
    else:
        print("ERROR: inputs have changed")
        tx.abort()
        return None
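
For readers who want to try the control flow locally, here is a minimal runnable sketch of the routine above. It is not DataJoint code: `FakeConnection`, `Doubler`, `fingerprint`, and `run_three_part_make` are hypothetical names, the `key` argument on each method is an assumption, `pickle` stands in for the real serializer, and the `conn.disconnect()` step is left out.

```python
import pickle


def fingerprint(value):
    """Mirror hash(serialize(...)) from the pseudocode; pickle is a stand-in serializer."""
    return hash(pickle.dumps(value))


class FakeConnection:
    """Hypothetical stand-in for a database connection; it only records events."""

    def __init__(self):
        self.log = []

    def start_transaction(self):
        self.log.append("start")

    def commit(self):
        self.log.append("commit")

    def abort(self):
        self.log.append("abort")


class Doubler:
    """Toy table that doubles every upstream value stored under a key."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.rows = []

    def make_fetch(self, key):
        # Return a copy so later upstream changes do not alter earlier fetches.
        return list(self.upstream[key])

    def make_compute(self, key, inputs):
        return [2 * x for x in inputs]

    def make_insert(self, key, result):
        self.rows.append((key, result))


def run_three_part_make(table, conn, key):
    """Fetch, compute with no transaction open, then re-fetch and insert in one."""
    inputs = table.make_fetch(key)
    result = table.make_compute(key, inputs)  # long-running step, nothing is locked
    conn.start_transaction()
    try:
        if fingerprint(table.make_fetch(key)) != fingerprint(inputs):
            conn.abort()  # inputs changed while computing; discard the result
            return False
        table.make_insert(key, result)
    except Exception:
        conn.abort()
        raise
    conn.commit()
    return True
```

The transaction only spans the second fetch and the insert, so the time during which anything is held is independent of how long `make_compute` runs.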

Additional Research and Context

Related Issues


cc: @dimitri-yatsenko @ttngu207 @CBroz1 @samuelbray32 @peabody124

@ethho ethho added the bug label Aug 21, 2024
@ethho ethho added this to the Release 0.15.0 milestone Aug 21, 2024
@ethho ethho self-assigned this Aug 21, 2024
@dimitri-yatsenko
Member

This will be inside populate and will follow all the conventions of populate.

Yes, it looks correct. If we want to be fancy, we can prohibit insert calls in make_fetch, insert and fetch calls from make_compute, and fetch operators from make_insert.
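
One way the "fancy" restriction could look is a per-phase allow-list that raises when a disallowed operation is attempted. The sketch below is purely illustrative and assumes a hypothetical `GuardedTable` class with an `only(...)` context manager; nothing here is an existing DataJoint API.

```python
from contextlib import contextmanager


class GuardedTable:
    """Toy in-memory table whose fetch/insert calls can be restricted per phase."""

    def __init__(self):
        self.rows = []
        self._allowed = {"fetch", "insert"}

    @contextmanager
    def only(self, *ops):
        """Temporarily allow just the named operations (none if called with no args)."""
        previous = self._allowed
        self._allowed = set(ops)
        try:
            yield self
        finally:
            self._allowed = previous

    def _check(self, op):
        if op not in self._allowed:
            raise PermissionError(f"{op} is not allowed in this phase")

    def fetch(self):
        self._check("fetch")
        return list(self.rows)

    def insert(self, row):
        self._check("insert")
        self.rows.append(row)
```

Under this assumption, the fetch phase would run inside `with table.only("fetch"):`, the compute phase inside `with table.only():`, and the insert phase inside `with table.only("insert"):`.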

@dimitri-yatsenko
Member

dimitri-yatsenko commented Aug 21, 2024

@ethho, our blob serialization serializes most types of data into binary strings. You can use a hash of the serialized data to compare input to input2.
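
A small sketch of that comparison, with two assumptions stated up front: `json.dumps` is used only as a stand-in for the blob serializer, and `stable_digest` is a hypothetical helper name. It uses SHA-256 rather than Python's built-in `hash()`, because `hash()` on bytes and strings is salted per interpreter process by default, so its values are only comparable within one process.

```python
import hashlib
import json


def stable_digest(rows):
    """SHA-256 over a canonical serialization of the fetched rows."""
    # Stand-in serializer: JSON with sorted keys, so dict key order does not matter.
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


first = [{"subject": 1, "value": 0.5}]
reordered = [{"value": 0.5, "subject": 1}]
changed = [{"subject": 1, "value": 0.75}]

print(stable_digest(first) == stable_digest(reordered))  # True: same content
print(stable_digest(first) == stable_digest(changed))    # False: value differs
```

Within a single process, as in the proposed routine, the built-in `hash()` would also work; a digest like this matters mainly if the fingerprint is ever stored or compared across processes.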

@horsto
Contributor

horsto commented Aug 30, 2024

I am following this, and I see #1171. Could this issue be updated regularly once the change is implemented or in a testable state? Thanks for taking care of this!

@dimitri-yatsenko
Member

This is a high priority for multiple labs.

@horsto
Contributor

horsto commented Jan 16, 2025

This has not been merged / solved yet, right?

@CBroz1
Contributor

CBroz1 commented Jan 17, 2025

This has not been merged / solved yet, right?

Looks like the last commit was 5m ago

We gave users a 'check threads' tool to check for hold-ups and see whose process might be slowing things down
