-
-
Notifications
You must be signed in to change notification settings - Fork 315
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CHECKPOINT;
to batch deletes for SQL Server
#702
Comments
Hello @RudeySH , I will ask my developer to look at it. I'm not sure yet if that's something we really want to add or not in our library. As for your other questions, if you enable a BatchSize, the Best Regards, Jon Sponsorship Performance Libraries Runtime Evaluation |
Using CHECKPOINT is helpful in scenarios where batch deletes (or batch updates) are deterministic and can easily be retried. Using checkpoints means giving up the atomicity ("either all occurs, or nothing occurs") of the batch operation, but you gain better performance in return. I'd love to be able to use checkpoints when I can easily retry the operation without fear of losing data or consistency. For example, I have a background job that runs daily and it deletes old records from a table. In the event the batch delete fails, I don't need it to rollback everything, I don't need it to leverage the transaction log to keep track of what I've been removing. I just want it to delete records as performant as possible. When the batch delete fails, that's OK, partially deleting old records is better then deleting no records in this scenario. When the batch delete doesn't fail, it was faster because of checkpoints keeping the transaction log small. Win-win! I understand this library is not catered to SQL Server specifically. Perhaps the proposed |
Hello @RudeySH , We checked this request and I don't think it should really belong to that feature. Otherwise, we will need to add it to every other method as well and that's currently impossible at this moment. I believe the simplest solution should be on your side to create a method that opens a connection and then create a command to enable your |
Checkpoints are a global side effect to the transaction log, so it's unclear it's a good idea. What happens if many connections issue checkpoints at the same time due to high load scenarios where lots of processes delete something? What happens if someone uses your repository method thinking it only does bulk delete, and they parallelize the bulk delete at a higher level of abstraction? etc Also, what happens if the checkpoint times out? If such an approach were to be considered, it should be a DbContextExtension method called simply |
I ran into an issue where a batch delete became extremely slow because I'm deleting millions of records. My transaction log grew very big, causing the slowdown. After reproducing the issue in SSMS, I realized that adding a CHECKPOINT statement in the WHILE loop improved performance tremendously.
I propose that a boolean option (e.g.
EnableCheckpoint
) should be added to batch deletes, making my suggestion opt-in. In some cases using CHECKPOINT is not desirable, because when using it and the batch delete fails, it will be partially committed already. I don't think the default behavior should be changed.Are batch updates implemented similarly to batch deletes? (Does this library generate a WHILE loop in the SQL for batch updates?) If so, I propose a similar option should be added to batch updates as well.
You could take it one step further and add an additional option to force the batch delete to run under Simple Recovery Mode. This apparently can make it faster compared to Full Recovery Mode. See this for more information about that: https://stackoverflow.com/a/21100285/1185136. I'm not sure this approach is safe though, I wonder what happens when multiple batch deletes are running simultaneously and both of them are changing the recovery mode back and forth...
The text was updated successfully, but these errors were encountered: