Checkpoints not used by write_deltalake? #2555
Unanswered
VLomonovskis asked this question in Q&A
Hello.
I have several processes that use the same code and append data to the same Delta table. These processes run in parallel. I append data using write_deltalake and use the rust engine to merge the schema.
As the processes add data, performance degrades and each upload takes longer. As I understand it, this happens because the number of transaction log files keeps increasing. However, when I create a checkpoint (using delta_table.checkpoint()), performance does not improve, and it looks like write_deltalake still reads all the log files from before the checkpoint. Can this behaviour be changed?
I did see discussions about checkpoints, but they were about checkpoint creation. In my case, checkpoints are not used even when they have been created.
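To help narrow this down, here is a minimal sketch for checking what is actually on disk in the table's `_delta_log` directory. It uses only the Python standard library and relies on the log layout from the Delta protocol: commit files named with a 20-digit zero-padded version, and a `_last_checkpoint` JSON file with a `version` field. The function name `log_replay_summary` is hypothetical, and the sketch assumes a table on a local filesystem. It does not show what write_deltalake reads internally; it only reports whether a checkpoint pointer exists and how many commits were written after it. (As far as I know, the checkpoint call in the Python package is `DeltaTable.create_checkpoint()`.)

```python
import json
import re
from pathlib import Path

# Commit files in _delta_log are named <20-digit zero-padded version>.json.
COMMIT_RE = re.compile(r"^(\d{20})\.json$")


def log_replay_summary(table_path):
    """Report the last checkpoint version and the commits written after it.

    Hypothetical helper: inspects the files in _delta_log directly and does
    not call into the deltalake package.
    """
    log_dir = Path(table_path) / "_delta_log"

    # _last_checkpoint is a small JSON file pointing at the newest checkpoint.
    pointer = log_dir / "_last_checkpoint"
    checkpoint_version = None
    if pointer.exists():
        checkpoint_version = json.loads(pointer.read_text())["version"]

    # Collect the version number of every JSON commit file.
    versions = []
    for entry in log_dir.iterdir():
        match = COMMIT_RE.match(entry.name)
        if match:
            versions.append(int(match.group(1)))

    # Commits newer than the checkpoint are the ones a reader that honours
    # the checkpoint would still have to replay.
    if checkpoint_version is None:
        after = versions
    else:
        after = [v for v in versions if v > checkpoint_version]

    return {
        "checkpoint_version": checkpoint_version,
        "total_commits": len(versions),
        "commits_after_checkpoint": len(after),
    }
```

Calling `log_replay_summary(...)` with the table's root directory before and after creating a checkpoint should show `checkpoint_version` moving forward and `commits_after_checkpoint` dropping. If `checkpoint_version` stays `None`, the checkpoint was never written, which would be a different problem from the writer ignoring it.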