Add the functionality of the Iceberg rewrite_manifests procedure (e.g. in OPTIMIZE)
#14821
Comments
OPTIMIZE should do this; I'm not yet convinced we need a separate procedure.
I'm not sure yet either. Updated the description.
Taken from the Iceberg Spark Procedures docs. Here is a relatively lengthy article about Iceberg which includes the reasoning behind using
In light of the above arguments, I'm inclined to say that this metadata-related functionality would need its own procedure, instead of squeezing it under
Thank you for looking into this! This is something I am interested in as well. In my particular use case, the write pattern I use for back-populating a table does not align with the read and update patterns. I think rewriting the manifests would improve read performance. I see that this was assigned to @ebyhr and am curious about the current state / roadmap for this feature.
Relates to: #9340
The Spark implementation is documented here.
When using the append operation, rewriting manifests is done automatically at a set size defined by commit.manifest.min-count-to-merge, defaulting to 100. However, if write latency is important, a user may want to skip the automatic compaction and run it async to the writers.

This may be done as a separate procedure, or as a part of the OPTIMIZE command.
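
To make the trade-off in the description concrete, here is a rough toy model in Python. It is not Iceberg's or Trino's actual code: the class ToyTable, its fields, and the rule "merge everything into one manifest once the count reaches the threshold" are simplifying assumptions for illustration only (the real merge logic also takes manifest sizes into account). The merge_on_append flag is hypothetical here; in Iceberg the closest equivalent is, as far as I know, the commit.manifest-merge.enabled table property.

```python
from dataclasses import dataclass, field
from typing import List

# Default of commit.manifest.min-count-to-merge, as stated in the description.
MIN_COUNT_TO_MERGE = 100


@dataclass
class ToyTable:
    """Toy model: each manifest is represented only by its data-file count."""

    merge_on_append: bool = True          # hypothetical switch for automatic merging
    min_count_to_merge: int = MIN_COUNT_TO_MERGE
    manifests: List[int] = field(default_factory=list)
    merges_on_write_path: int = 0         # merges the writers had to wait for

    def append(self, data_files: int = 1) -> None:
        # Every append commit adds one new manifest.
        self.manifests.append(data_files)
        # Automatic compaction happens inside the writer's commit.
        if self.merge_on_append and len(self.manifests) >= self.min_count_to_merge:
            self._merge()
            self.merges_on_write_path += 1

    def rewrite_manifests(self) -> None:
        # Explicit maintenance step, run asynchronously to the writers.
        if len(self.manifests) > 1:
            self._merge()

    def _merge(self) -> None:
        # Simplification: collapse all manifests into a single one.
        self.manifests = [sum(self.manifests)]


# Writers with automatic merging: some commits pay for compaction.
auto = ToyTable(merge_on_append=True)
for _ in range(250):
    auto.append()

# Writers without automatic merging: fast commits, many small manifests,
# until a separate maintenance job rewrites them.
deferred = ToyTable(merge_on_append=False)
for _ in range(250):
    deferred.append()
manifests_before_maintenance = len(deferred.manifests)
deferred.rewrite_manifests()

print(auto.merges_on_write_path, len(auto.manifests))
print(manifests_before_maintenance, len(deferred.manifests))
```

In this sketch the deferred table never merges on the write path, at the cost of readers seeing many small manifests until the maintenance step runs. That maintenance step is what a separate procedure or an extended OPTIMIZE would provide.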