-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Retry on 503 #1408
Retry on 503 #1408
Conversation
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the dbt-bigquery contributing guide. |
1 similar comment
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the dbt-bigquery contributing guide. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a self-review for context. Also, given this whole PR is retrying transient errors, what is expected from a testing effort? We already have some unit tests that verify that we retry errors as expected, but they don't test them in the context of other calls. e.g. we don't verify that if we try to load a file into BQ and we get a 503, that we in fact retry. Would testing this be going too far?
dbt/adapters/bigquery/retry.py
Outdated
from google.api_core.future.polling import DEFAULT_POLLING | ||
from google.api_core.retry import Retry | ||
from google.cloud.bigquery.retry import DEFAULT_RETRY | ||
from google.cloud.exceptions import BadGateway, BadRequest, ServerError | ||
from google.cloud.bigquery.retry import DEFAULT_JOB_RETRY |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, we're retrying jobs, we should use this default instead of DEFAULT_RETRY
. We still use DEFAULT_RETRY
for the client factories where it makes more sense.
deadline=self._job_deadline, | ||
on_error=_create_reopen_on_error(connection), | ||
|
||
retry = DEFAULT_JOB_RETRY.with_delay(maximum=3.0).with_predicate( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We were always attaching a deadline here, which could be None
if the user did not set one. I believe this would cause the retry to go on for quite a while, depending on what job_retries
is set to. We should instead inherit the default from DEFAULT_JOB_RETRY
.
minimum=1.0
is the default.
The default for maximum
is 60
. Should we adjust this?
4887a89
to
6e27cbb
Compare
We should wait until #1431 is merged and rebase this on that. They address similar things and that PR covers a lot of what was initially implemented here. |
Marking this as a draft until #1431 is merged. |
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-1.9.latest 1.9.latest
# Navigate to the new working tree
cd .worktrees/backport-1.9.latest
# Create a new branch
git switch --create backport-1408-to-1.9.latest
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 a219818c5a38339568bfb4e561405cfe8f6732eb
# Push it to GitHub
git push --set-upstream origin backport-1408-to-1.9.latest
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-1.9.latest Then, create a pull request where the |
1 similar comment
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-1.9.latest 1.9.latest
# Navigate to the new working tree
cd .worktrees/backport-1.9.latest
# Create a new branch
git switch --create backport-1408-to-1.9.latest
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 a219818c5a38339568bfb4e561405cfe8f6732eb
# Push it to GitHub
git push --set-upstream origin backport-1408-to-1.9.latest
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-1.9.latest Then, create a pull request where the |
* add default retry on all client factories, which includes 502 and 503 errors * update retries to use defaults and ensure that a timeout or deadline is set (cherry picked from commit a219818)
* Retry on 503 (#1408) * add default retry on all client factories, which includes 502 and 503 errors * update retries to use defaults and ensure that a timeout or deadline is set (cherry picked from commit a219818) * remove hatch.toml --------- Co-authored-by: Mike Alfare <[email protected]>
resolves #682
Problem
We were missing some retry scenarios that BigQuery added over time. We also were not retrying all client factories.
Solution
Checklist