Allow Bigquery Emulator settings to be set #1017
Thanks for your pull request, and welcome to our community! We require contributors to sign our Contributor License Agreement and we don't seem to have your signature on file. Check out this article for more information on why we have a CLA. In order for us to review and merge your code, please submit the Individual Contributor License Agreement form attached above. If you have questions about the CLA, or if you believe you've received this message in error, please reach out through a comment on this PR. CLA has not been signed by users: @OTooleMichael
Does anything need to be done to retrigger now that the CLA is signed?
I need the same functionality,
Hey @orlevii - I saw that. Furthermore, the BigQuery emulator works happily with any creds; at the end of the day it just ignores them. I think its demo code suggests that approach because, in a vacuum where one must pick a credential type, it makes the most sense (DBT, though, has the other auth methods implemented). Beyond that, the emulator you mentioned wouldn't be the only one (and is not my only target), so if that is needed it could be a follow-up PR. Hopefully the team reviews the PR and I can add it or not according to their desires / whatever will get it merged quickest. :)
Hi, I would love to see this merged!
Hi, got a really similar use case, looking forward to getting this merged!
… should be passed on to BQ
@OTooleMichael Thanks for the PR! @MichelleArk and I tried taking this for a spin alongside goccy/bigquery-emulator. We found a few issues while using the two together:
I don't think (1) + (2) are hard blockers — we could manually create the schema/dataset, and we just avoided using seeds — but I do think (3) makes it impossible to use dbt with the emulator we tried. Questions:
@jtcohen6 and @MichelleArk, thank you both sincerely for taking the time to review the PR, and I apologise for the lingering issue with one of the unit tests. Rest assured, I'll address it promptly. In response to your queries:

Overall, I believe the PR aligns with our objectives and should be merged. It seems there might be some slight misinterpretation or oversight regarding its purpose. :) This PR essentially enables users to utilise an emulator; the specifics of the emulator's functionality aren't within DBT's purview. To draw an analogy, consider if BigQuery itself were to malfunction, for example by computing SUM() incorrectly: that would be BigQuery's bug to fix, not DBT's. As another example, which is currently possible, a DBT user employing Postgres could opt for an in-memory PG emulator by changing the host, irrespective of how complete its functionality is (it is often limited). If the emulator fails to replicate BigQuery's behaviour, it's an issue for the emulator's developers to address.

Moreover, there are additional use cases to consider. My primary motivation was facilitating an in-house emulator and proxy setup. Both of these are made achievable with minimal effort through this PR. In my view, the internals of non-DBT server elements don't fall under the direct responsibility of the DBT team.

I've developed a parser that validates SQL post-Jinja processing, effectively identifying references and syntax errors without direct access to BigQuery or data movement. This approach significantly expedites CI/CD processes, often obviating the need for a live connection. At the moment the Snowflake DBT connector's endpoint override feature is serving as a workaround: the CI profile is set to the Snowflake dialect and the endpoint is pointed at the emulator server; the emulator then does the extra work of translating the DBT queries back from the Snowflake dialect to BQ before starting its true validation work.
In a previous project involving Snowflake, I employed similar techniques using in-house SQLFluff rules for security and design linting post-Jinja processing. Additionally, various proxying use cases emerge, where a server intermediates requests to and from BigQuery, implementing checks for deprecation warnings, security, permissions, and monitoring. For instance:
In essence, there are numerous reasons to redirect queries to different endpoints, applicable across CI/CD, development, and production environments. Some are directly related to DBT, while others are broader system requirements. Simplifying wider CI setups by patching a single ENV variable (e.g., BQ_URL) for the entire system, including DBT, Airflow, etc., underscores the versatility and value of this PR.

I'm happy to hop on a call / go through more examples / code if needed.
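(Illustration of the single-variable idea in the comment above. This is a minimal sketch, not code from the PR: BQ_URL is the example name used in the comment, and the function name `resolve_bq_endpoint` is hypothetical. It returns None when the variable is unset, so a caller could fall back to the real BigQuery endpoint.)

```python
import os
from typing import Mapping, Optional


def resolve_bq_endpoint(
    env: Mapping[str, str] = os.environ, var: str = "BQ_URL"
) -> Optional[str]:
    """Return the endpoint override from the environment, or None if unset.

    Hypothetical helper: every tool in the CI system (dbt, Airflow, ...)
    could read the same variable, so one patch redirects all of them.
    """
    value = env.get(var, "").strip()
    # An empty or whitespace-only value is treated the same as "not set".
    return value or None
```

In a CI job the variable would point at the emulator or proxy; in production it would simply be left unset.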
@jtcohen6 @MichelleArk I'm disappointed that we have not seen this merged, or at least a proper reply to @OTooleMichael's well-written reply. There seems to be enough desire from the community to get this one through and, as pointed out already, there are numerous reasons why a proxy would make sense. For connectors that need the hostname specified we can already do this, so it's hard to fully appreciate the arguments against it.
Hello, our team needs this feature to perform dbt unit tests using a BigQuery emulator locally. We hope this feature will be released soon!
Hi, any update on this? When will it be available? We have a use case for using the emulator and are keen to get the updated adapter to allow this to happen.
Looking at the tumbleweeds from the dbt team's side here, I guess we can only hope. But it is getting a bit weird at this stage, I have to say.
Hey all - I just responded to @OTooleMichael in the dbt Community Slack yesterday:
@MartinSahlen @kyungsoochoi984 @nrushforth Have you been able to test this branch locally, in your own environments, along with a BigQuery emulator? Or do you have other concrete use cases for a BQ proxy today?
resolves #358
docs dbt-labs/docs.getdbt.com/#
Problem
This optionally allows api_endpoint to be set. This is the BigQuery-supported way of overriding the HTTP endpoint, similar to Snowflake. It is needed for connecting to an emulator or proxy, in a similar way to Snowflake. Issue 358 references this.
Solution
This simply adds a key to the config and sets the connection option if set.
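The shape of that change can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's actual diff: the class `BigQueryCredentialsSketch` and the function `build_client_kwargs` are hypothetical names, and the sketch only builds the keyword arguments that would be handed to `google.cloud.bigquery.Client`, which accepts `client_options` as either a dict or a `ClientOptions` object.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class BigQueryCredentialsSketch:
    """Hypothetical stand-in for the adapter's credentials/config object."""

    project: str
    # The new optional key; None means "use the client library's default endpoint".
    api_endpoint: Optional[str] = None


def build_client_kwargs(creds: BigQueryCredentialsSketch) -> Dict[str, Any]:
    """Build keyword arguments intended for google.cloud.bigquery.Client(...)."""
    kwargs: Dict[str, Any] = {"project": creds.project}
    if creds.api_endpoint:
        # Only set the option when the key is present, so existing
        # profiles without api_endpoint behave exactly as before.
        kwargs["client_options"] = {"api_endpoint": creds.api_endpoint}
    return kwargs
```

For example, `build_client_kwargs(BigQueryCredentialsSketch("my-project", "http://localhost:9050"))` would carry the override, where the URL is an illustrative local emulator address and not a value taken from this PR.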
Checklist