Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Doubling of escaped single quote for Snapshot generated SQL produces failed application #250

Open
dschoel-pr opened this issue Dec 18, 2024 · 4 comments

Comments

@dschoel-pr
Copy link

Running with dbt=1.8.8
Registered adapter: fabric=1.8.8

In a snapshot using dbt_utils.generate_surrogate_key, the snapshot execution fails with a SQL based error:
('42000', "[42000] [Microsoft][ODBC Driver 18 for SQL Server][SQL Server]Incorrect syntax near ''. (102) (SQLMoreResults)")

A debug level execution shows that the SQL produced to generate a staging temp view for the source data with double the amount of escaped single quotes:

EXEC('create view "silver"."mcp_InvoiceDetails__hist_snapshot_staging_temp_view" as

select 
  
    lower(convert(varchar(50), hashbytes(''''md5'''', ...
');

Some tracing shows that the following logic in the snapshot.sql macro, combined with downstream logic that packages the SQL into a string for EXEC are duplicating the attempt to add an escape single quote.

  -- Create a temporary view to manage if user SQl uses CTE
  {% set temp_snapshot_relation_sql = model['compiled_code'].replace("'", "''") %}
  {{ adapter.drop_relation(temp_snapshot_relation) }}

  {% call statement('create temp_snapshot_relation') -%}
    {{ get_create_view_as_sql(temp_snapshot_relation, temp_snapshot_relation_sql) }}
  {%- endcall %}

Tests show that removing the .replace on the top line allows the snapshot to function correctly.

Why is the replace in the logic? Can it be removed safely?

@prdpsvs
Copy link
Collaborator

prdpsvs commented Dec 21, 2024

@dschoel-pr , that's because the temp_snapshot_relation_sql is wrapped up in a view .

The view in Fabric DW is created using EXEC('create view ....'). A single quote needs to be replaced by two single quotes to execute the view correctly.

Here is an example, I tested with and there seems to be no problem.
`
EXEC('create view "dbo"."orders_snapshot__dbt_tmp_vw" as

select *,
    
CONVERT(VARCHAR(32), HashBytes(''MD5'', 
    coalesce(cast(order_id as varchar(8000)), '''')  + ''|'' + 

    coalesce(cast(order_date as varchar(8000)), '''') 
), 2)

as dbt_scd_id,
order_date as dbt_updated_at,
order_date as dbt_valid_from,
nullif(order_date, order_date) as dbt_valid_to
from (
select * from "dbt-local-tests"."dbo"."orders_snapshot_snapshot_staging_temp_view"
) sbq

;');
`

Can you share a repro with deps and commands you executed? If possible, share the logs file in the zip to validate your issue.
BTW, the dbt_utils is not fully supported. you may have to use t-sql utils.

@muhammadbokhari
Copy link

Hi @prdpsvs ,

I am facing same issue. I have got the following query in the ephemeral linked to snapshot

select {{ dbt_utils.generate_surrogate_key(['NaturalKey']) }} AS SilverBuyByPgKey, Pg, SYSDATETIME() SilverCreateDateTime, BronzeLoadDateTime from ( select CAST( Pg AS INT) AS Pg, __BronzePipelineRunDateTime AS BronzeLoadDateTime, Pg AS NaturalKey, ROW_NUMBER() OVER (PARTITION BY Pg ORDER BY __BronzePipelineRunDateTime DESC) AS RowNum from {{source ('stocks', 'table_stock')}} ) Main_Query WHERE RowNum = 1

The snapshot fails with the following error:

% dbt snapshot --select table_stock
08:02:52 Running with dbt=1.9.1
08:02:52 [WARNING]: Deprecated functionality

User config should be moved from the 'config' key in profiles.yml to the 'flags' key in dbt_project.yml.
08:02:53 Registered adapter: fabric=1.9.0
08:02:53 Unable to do partial parsing because a project dependency has been added
08:02:56 Found 21 models, 15 snapshots, 15 sources, 639 macros
08:02:56
08:02:56 Concurrency: 1 threads (target='fabric-dev')
08:02:56
08:02:58 1 of 1 START snapshot dbo.table_stock ............................ [RUN]
08:02:58 Unhandled error while executing
('42000', "[42000] [Microsoft][ODBC Driver 18 for SQL Server][SQL Server]Incorrect syntax near ''. (102) (SQLMoreResults)")
08:02:58 1 of 1 ERROR snapshotting dbo.table_stock ........................ [ERROR in 0.15s]
08:02:58
08:02:58 Finished running 1 snapshot in 0 hours 0 minutes and 2.55 seconds (2.55s).
08:02:58
08:02:58 Completed with 1 error, 0 partial successes, and 0 warnings:
08:02:58
08:02:58 ('42000', "[42000] [Microsoft][ODBC Driver 18 for SQL Server][SQL Server]Incorrect syntax near ''. (102) (SQLMoreResults)")
08:02:58
08:02:58 Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1

In dbt.log its generating the following code:

` EXEC('create view "dbo"."table_stock_snapshot_staging_temp_view" as with __dbt__cte__table_stock as (

select
lower(convert(varchar(50), hashbytes(''''md5'''', coalesce(convert(varchar(8000), coalesce(cast(NaturalKey as VARCHAR(8000)), ''''dbt_utils_surrogate_key_null'''')), '''''''')), 2))
AS SilverBuyByPgKey,
Pg,
SYSDATETIME() SilverCreateDateTime,
BronzeLoadDateTime
from
(
select
CAST( Pg AS INT) AS Pg,
__BronzePipelineRunDateTime AS BronzeLoadDateTime,
Pg AS NaturalKey,
ROW_NUMBER() OVER (PARTITION BY Pg ORDER BY __BronzePipelineRunDateTime DESC) AS RowNum
from "Bronze_dbt"."dbo"."table_stock"
) Main_Query
WHERE
RowNum = 1
) select * from __dbt__cte__ep_table_stock;');
`

When I run this using Fabric Warehouse, it gives me the following error:

Started executing on line 1
Incorrect syntax near ''.
Incorrect syntax near 'Main_Query'.

When I remove quotes from MD5 function, it works perfectly fine.

@dschoel-pr where you able to find any solution for it ? Any help would be appreciated.

@muhammadbokhari
Copy link

I have also used hash function which is part of dbt-core, same error, double quotes creating issues.

@muhammadbokhari
Copy link

muhammadbokhari commented Jan 16, 2025

@prdpsvs Another example of the double quotes issue in snapshots.

I created an incremental & snapshot with same code, snapshot view has errors, incremental doesn't.
Incremental code:

{{
    config(
        materialized='incremental',
        unique_key =['DCKey'],
        incremental_strategy='delete+insert',
        on_schema_change='fail',]
    )
}}

WITH
source_query AS
(
    SELECT
        CAST( DC.ID AS BIGINT) AS DCKey,
        DC.ID AS DCCode,
        DC.Name AS DCName,
        DC.Address + ISNULL(', ' + DC.Suburb,'') + ISNULL(', ' + DC.[State],'') + ISNULL(', ' + CAST(DC.Postcode AS VARCHAR(20)),'') AS PhysicalAddressFull
    FROM
        {{ ref('dc_stg')}} DC
    WHERE 
        LOWER(dc_name) like '% dc%' AND 
        dbt_valid_to is null

    
    {% if is_incremental() %}
        and LoadDateTime > ( select max(LoadDateTime) load_date from {{ this }}  )
    {% endif %}
    
)
select * from source_query

This generates the following view:

EXEC('create view "dim"."DimDC__dbt_tmp__dbt_tmp_vw" as 

WITH
source_query AS
(
    SELECT
        CAST( DC.ID AS BIGINT) AS DCKey,
        DC.ID AS DCCode,
        DC.Name AS DCName,
        DC.Address + ISNULL('', '' + DC.Suburb,'''') + ISNULL('', '' + DC.[State],'''') + ISNULL('', '' + CAST(DC.Postcode AS VARCHAR(20)),'''') AS PhysicalAddressFull,
        LoadDateTime
    FROM
        database.schema.DC_stg" DC
    WHERE 
        LOWER(Name) like ''% dc%'' AND 
        dbt_valid_to is null
     
        and LoadDateTime > ( select max(LoadDateTime) load_date from database."dim"."DimDC"  )
        
)
select * from source_query;');

Following is the code for snapshot:
yml file:

snapshots:
 - name: DimDC2
   relation: ref ('ep_DimDC2')
   config:
     unique_key: DCKey
     strategy: check
     check_cols: all

& referenced ephemeral:

{{ config(materialized='ephemeral')}}

WITH
source_query AS
(
    SELECT
        CAST( DC.ID AS BIGINT) AS DCKey,
        DC.ID AS DCCode,
        DC.Name AS DCName,
        DC.Address + ISNULL(', ' + DC.Suburb,'') + ISNULL(', ' + DC.[State],'') + ISNULL(', ' + CAST(DC.Postcode AS VARCHAR(20)),'') AS PhysicalAddressFull,
        LoadDateTime
    FROM
        {{ ref('dc_stg')}} DC
    WHERE 
        LOWER(Name) like '% dc%' AND 
        dbt_valid_to is null

    {% if is_incremental() %}
        and LoadDateTime > ( select max(LoadDateTime) load_date from {{ this }}  )
    {% endif %}
    
)
select * from source_query

This generates the following code to create view:

EXEC('create view schema."DimDC2_snapshot_staging_temp_view" as with __dbt__cte__ep_DimDC2 as (

WITH
source_query AS
(
    SELECT
        CAST( DC.ID AS BIGINT) AS DCKey,
        DC.ID AS DCCode,
        DC.Name AS DCName,
        DC.Address + ISNULL('''', '''' + DC.Suburb,'''''''') + ISNULL('''', '''' + DC.[State],'''''''') + ISNULL('''', '''' + CAST(DC.Postcode AS VARCHAR(20)),'''''''') AS PhysicalAddressFull,
        LoadDateTime
    FROM
        database.schema."dc_stg" DC
    WHERE 
        LOWER(Name) like ''''% dc%'''' AND 
        dbt_valid_to is null    
)
select * from source_query
) select * from __dbt__cte__ep_DimDC2;
');

As you can see above, snapshot code generation is creating so many quotes compared to incremental.

Can you please prioritise this issue as its a major blocker for us.

Regards,
Muhammad

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants