-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ADAP-1027] [CT-3367] [Feature] BigQuery Foreign / Primary Key Support #1016
Comments
Thanks for reaching out @ckchow ! In dbt v1.5, we added the ability to define constraints like Wanna check it out and see if it meets the needs that you are intending? DetailsYou can read more here:
The general form of a primary key constraint looks like the following. models:
- name: <model_name>
# required
config:
contract:
enforced: true
# model-level constraints
constraints:
- type: primary_key
columns: [<first_column>, <second_column>, ...] |
It's great that this seems to be supported but the requirement to turn on full contract enforcement to enable adding constraints is a pain point, as that comes with a much heavier set of requirements. May be eased if/when partial contracts turn up.
|
Also, with constraints, there's no possibility to define multiple primary keys. create or replace table xyz
id NUMERIC not null primary key not enforced,
api_token STRING,
email STRING,
name STRING,
created_at DATETIME,
updated_at DATETIME,
password STRING,
valid_from DATETIME,
valid_to DATETIME,
is_latest BOOLEAN,
partition_date DATE And if you'd try to add PK on an additional column, for example, the |
@mitalauskas how are you defining it? I noticed that the output was different defining model level constraints vs column level constraints, does it work if you define your multiple PKs in the model level constraints rather than column level constraints on each of the PKs? |
@elyobo agreed that it is definitely is a heavier set of requirements and is a pain point if you just want to manage adding primary keys or foreign keys. But we're not planning on adding constraints separately at this time, so I'm going to close this as "not planned". @mitalauskas you can configure a compound primary key like this by specifying it at the model level (rather than the column level): models:
- name: my_model
config:
contract:
enforced: true
# model-level constraints
constraints:
- type: primary_key
columns: [pk1, pk2]
columns:
- name: pk1
data_type: int64
- name: pk2
data_type: string |
Thanks for the update @dbeatty10. For those wanting the constraints for other reasons (e.g. join optimisations on BigQuery), I haven't tried but assume it could be wired up in post hooks. |
Is this your first time submitting a feature request?
Describe the feature
BigQuery recently added the ability to define (unenforced!) PK/FK relationships between tables. https://cloud.google.com/blog/products/data-analytics/join-optimizations-with-bigquery-primary-and-foreign-keys
dbt should be able to manage the expression of these constraints on BigQuery models so that the BigQuery analyzer can take advantage of optimizations that these relations enable.
Describe alternatives you've considered
It's possible to manually insert these references manually, but they will be eliminated every time the warehouse update runs.
Who will this benefit?
This will significantly benefit most BigQuery users of dbt. Dbt by default generates view models, and the key constraints will allow the analyzer to perform optimizations on complex queries that are executed by joining multiple views that are themselves joins.
Are you interested in contributing this feature?
I could help, I'm not really versed in how the dbt codebase is organized however.
Anything else?
Bigquery documentation for key DDL: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#alter_table_add_primary_key_statement
The text was updated successfully, but these errors were encountered: