[ADAP-1051] [Feature] Drop the temporary table upon a successful incremental refresh #1036
Comments
From a user perspective, it would be awesome if dbt just dropped the temporary table like you mentioned. @mikealfare What do you think about the following acceptance criteria?
Proposed acceptance criteria
We can reuse these acceptance criteria for other types of materializations as well (not just incremental models). |
Questions about temp table vs. expiring permanent table vs. non-expiring permanent table
@mikealfare is there any way to create the intermediate table so that it is droppable? Ideally, we'd avoid any intermediate objects persisting beyond the materialization (even if they expire after X hours). Is there something about adding an expiration to a permanent table that makes it harder to drop? Or are permanent tables inherently hard to drop in BigQuery? Is using a real BigQuery
Current implementation
As best I can tell, we are not creating temp tables. Rather, it looks like the intermediate table is a permanent table with an expiration of 12 hours (like you mentioned). Side note: to make the code easier to read, it would be nice to replace references to "temp" or "temporary" with either "intermediate" or "expiring" (to the extent that either is accurate).
Full code trace
Options for acceptance criteria
These are a mix of both behavior and implementation detail. Option 1 includes the ideal state, in which all intermediate objects are dropped by the end of the materialization. Options 2 and 3 explore what we could do if Option 1 isn't possible or practical for some reason.
Option 1
Alternatively, use a (non-permanent) temporary table somehow.
Option 2
Option 3
Add detail to one of the GitHub issues that we tried dropping the "temp" table and it didn't work (for some unknown reason).
|
@dbeatty10 @mikealfare |
Thanks for this insight @vinit2107 🧠 It seems like there are a couple things dbt developers would like to avoid as it relates to these intermediate tables:
I think Option 1 outlined above would achieve both of these, and I think it is the thing we should pursue with full intent and effort unless it is "impossible" somehow. But we can definitely add your idea as another alternative. Here are the two options that we haven't numbered yet, in no particular order:
Option 4
Be able to define the dataset in which the intermediate table(s) get materialized.
Option 5
Make the expiration duration of the intermediate table configurable rather than hard-coded to 12 hours. To fit in with the naming convention of
This is the option originally outlined in #1036 |
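To make Options 4 and 5 concrete, here is a minimal Python sketch of how the dataset and expiration of the intermediate table might be resolved from a model's config. It is an illustration only, not dbt-bigquery's actual code: the config keys `intermediate_dataset` and `intermediate_expiration_hours`, the `IntermediateTableSettings` class, and the `resolve_intermediate_settings` function are all hypothetical names invented for this example. The only value taken from the thread is the current 12-hour default.

```python
from dataclasses import dataclass
from typing import Any, Mapping

# The 12-hour value is the hard-coded duration discussed in this issue.
DEFAULT_EXPIRATION_HOURS = 12


@dataclass(frozen=True)
class IntermediateTableSettings:
    """Hypothetical container for where the intermediate table lives and how long it lasts."""

    dataset: str
    expiration_hours: int


def resolve_intermediate_settings(
    model_config: Mapping[str, Any], model_dataset: str
) -> IntermediateTableSettings:
    """Resolve settings, falling back to the model's dataset and the 12-hour default.

    Both config keys below are assumptions made for illustration.
    """
    # Option 4: allow a separate dataset for intermediate tables.
    dataset = model_config.get("intermediate_dataset") or model_dataset

    # Option 5: allow the expiration to be configured instead of hard-coded.
    hours = model_config.get("intermediate_expiration_hours", DEFAULT_EXPIRATION_HOURS)
    if isinstance(hours, bool) or not isinstance(hours, int) or hours <= 0:
        raise ValueError(
            f"intermediate_expiration_hours must be a positive integer, got {hours!r}"
        )

    return IntermediateTableSettings(dataset=dataset, expiration_hours=hours)
```

With an empty config this sketch reproduces today's behavior (same dataset, 12 hours), so either option could be adopted without changing existing projects.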
Per @Fleid in an internal Slack conversation here, we don't have the bandwidth to tackle this. So labeling this as
Ideally, a PR that solves this would:
See here for an implementation idea. And see here for ideal acceptance criteria. |
Hi @dbeatty10, please let me know if you need me to update anything else. Thanks!! |
Is this your first time submitting a feature request?
Background context
The current behavior is for dbt-bigquery to create a "temporary" table that expires after 12 hours when performing an incremental model update. This duration is hard-coded here.
Describe the feature
Drop the temporary table upon a successful incremental refresh. There are other issues in this repo that speak to that approach. And using that approach would resolve this issue.
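As a rough illustration of the requested behavior, the control flow could look like the Python sketch below. It is not dbt-bigquery's implementation: `run_incremental_refresh`, `drop_statement`, and the `execute` callable are hypothetical names, and the build and merge SQL are passed in as opaque strings. The only SQL the sketch generates itself is BigQuery's `DROP TABLE IF EXISTS` statement.

```python
from typing import Callable


def drop_statement(relation: str) -> str:
    """Return BigQuery DDL that drops a table if it still exists."""
    return f"drop table if exists `{relation}`"


def run_incremental_refresh(
    execute: Callable[[str], None],
    intermediate: str,
    build_sql: str,
    merge_sql: str,
) -> None:
    """Build the intermediate table, merge it, and drop it only on success.

    `execute` is a stand-in for whatever runs SQL against BigQuery (an assumption).
    """
    # 1. Build the intermediate table (today it is created with a 12-hour expiration).
    execute(build_sql)
    # 2. Merge into the target. If this raises, step 3 is never reached,
    #    so the expiration remains the fallback cleanup.
    execute(merge_sql)
    # 3. After a successful merge, drop the intermediate table explicitly.
    execute(drop_statement(intermediate))


if __name__ == "__main__":
    # Record statements instead of running them; all names are placeholders.
    statements = []
    run_incremental_refresh(
        statements.append,
        "my-project.my_dataset.my_model__tmp",
        "create or replace table ... as (...)",
        "merge into ... using ...",
    )
    for statement in statements:
        print(statement)
```

Dropping only after the merge succeeds means a failed run leaves the intermediate table in place for debugging, with the existing expiration still cleaning it up later.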
Describe alternatives you've considered
An alternative approach would be to make the expiration duration configurable, particularly for incremental refreshes that occur more frequently than every 12 hours.
Who will this benefit?
This benefits folks who refresh incremental models more frequently than every 12 hours. It would also benefit folks who are conscious of incremental spend as a result of these tables.
Are you interested in contributing this feature?
No response
Anything else?
See also: #1032