[Bug] Intermittent errors when unit testing versioned models #11139

Open
2 tasks done
CarrotPapa opened this issue Dec 12, 2024 · 4 comments
Labels
bug (Something isn't working), model_versions, unit tests (Issues related to built-in dbt unit testing functionality)

Comments

@CarrotPapa

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I'm encountering an intermittent parsing error when unit testing versioned models. The error goes away after removing the associated unit tests.

❯ poetry run dbt compile
05:50:12 Running with dbt=1.9.0
05:50:14 Registered adapter: bigquery=1.9.0
05:50:14 Unable to do partial parsing because saved manifest not found. Starting full parse.
05:50:17 Encountered an error:
Parsing Error
Unit test 'test_model' references a model that does not exist: model.my_project.model

Expected Behavior

When it works, the output looks like the following:

❯ poetry run dbt compile
05:53:33 Running with dbt=1.9.0
05:53:34 Registered adapter: bigquery=1.9.0
05:53:35 Unable to do partial parsing because saved manifest not found. Starting full parse.
05:53:39 Found 109 models, 2 operations, 34 data tests, 25 sources, 4 exposures, 1462 macros, 36 unit tests
05:53:39
05:53:39 Concurrency: 20 threads (target='default')

Steps To Reproduce

model.sql:

select * from {{ ref("table_a") }}

model_v2.sql:

select * from {{ ref("table_a") }}

schema.yml:

version: 2

models:
  - name: model
    latest_version: 1
    versions:
      - v: 2
        defined_in: model_v2
      - v: 1
        defined_in: model
        config:
          alias: model

test_model.yml:

unit_tests:
  - name: test_model
    model: model
    versions:
      include:
        - 2
    given:
      - input: ref('table_a')
        format: csv
        fixture: fixture_table_a
    expect:
      rows:
        - ...

Relevant log output

No response

Environment

- OS: macOS 14.7
- Python: 3.11.6
- dbt: 1.9.0

Which database adapter are you using with dbt?

bigquery

Additional Context

No response

CarrotPapa added the bug and triage labels Dec 12, 2024
dbeatty10 added the model_versions and unit tests labels Dec 12, 2024
@dbeatty10
Contributor

@CarrotPapa Thanks for such a nice write-up with detailed listing of your project files, commands, and output 🤩

Intermittent errors are tough to debug! I tried out an example nearly identical to the one you provided, but I wasn't able to trigger the error. See below for the files I used.

I suspect it is some kind of issue related to cached files in your target directory. If you either rename your target directory to stash it to the side or delete it entirely, does that error still crop up?

Project files

models/table_a.sql

select 1 as id

models/model.sql

select * from {{ ref("table_a") }}

models/model_v2.sql

select * from {{ ref("table_a") }}

models/schema.yml

version: 2

models:
  - name: model
    latest_version: 1
    versions:
      - v: 2
        defined_in: model_v2
      - v: 1
        defined_in: model
        config:
          alias: model

models/test_model.yml

unit_tests:
  - name: test_model
    model: model
    versions:
      include:
        - 2
    given:
      - input: ref('table_a')
        rows:
          - {id: 2}
    expect:
      rows:
        - {id: 2}

Ran this command:

dbt compile 

Got this output:

(dbt_1.9) $ dbt compile                     
01:36:03  Running with dbt=1.9.0
01:36:04  Registered adapter: duckdb=1.9.1
01:36:04  Found 3 models, 422 macros, 1 unit test
01:36:04  
01:36:04  Concurrency: 1 threads (target='duckdb')
01:36:04  
(dbt_1.9) $ 

@CarrotPapa
Author

Thanks @dbeatty10 for taking a look at the issue.
I forgot to mention that I actually deleted the target directory between each dbt compile.
I will try to set up a clean project and see if I'm able to reproduce the issue.

@CarrotPapa
Author

I have figured out how to reliably reproduce the intermittent errors. It seems the key is to place the unit tests in a separate directory (e.g. tests).

❯ rm -rf target; poetry run dbt compile
06:28:49  Running with dbt=1.9.0
06:28:50  Registered adapter: bigquery=1.9.0
06:28:50  Unable to do partial parsing because saved manifest not found. Starting full parse.
06:28:51  Found 3 models, 487 macros, 1 unit test
06:28:51  
06:28:51  Concurrency: 20 threads (target='default')
06:28:51  
❯ rm -rf target; poetry run dbt compile
06:29:00  Running with dbt=1.9.0
06:29:01  Registered adapter: bigquery=1.9.0
06:29:01  Unable to do partial parsing because saved manifest not found. Starting full parse.
06:29:02  Encountered an error:
Parsing Error
  Unit test 'test_model' references a model that does not exist: model.my_dbt_project.model
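
To put a number on how often the parse fails, the clean-compile loop above can be scripted. The following is a minimal Python sketch, not part of dbt: `count_failures` is a hypothetical helper, and it assumes the command is run from the project root and that a non-zero exit code means the compile failed.

```python
import shutil
import subprocess


def count_failures(command, runs, target_dir="target"):
    """Run `command` `runs` times from a clean state and count non-zero exits.

    `command` is a list such as ["poetry", "run", "dbt", "compile"].
    Treating any non-zero exit code as a failure is an assumption of this sketch.
    """
    failures = 0
    for _ in range(runs):
        # Delete the artifacts directory so every run starts with a full parse.
        shutil.rmtree(target_dir, ignore_errors=True)
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            failures += 1
    return failures
```

For example, `count_failures(["poetry", "run", "dbt", "compile"], runs=20)` would return how many of 20 clean runs exited with an error.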

@dbeatty10
Contributor

Thanks for this info @CarrotPapa !

After a bit of experimentation, I was able to observe that same parsing error. Read below for details.

Suspected root cause

I'm guessing that the underlying root cause is the order in which dbt is parsing the files (which is in turn affected by the order in which the filesystem is returning the files, which can be non-deterministic).
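
The suspected failure mode can be illustrated with a toy model. This is a sketch of the hypothesis only, not dbt's actual parser: `resolve_in_order`, `resolve_two_pass`, and the `(file name, kind, model name)` tuples are invented for illustration. A single pass that resolves unit tests as it encounters them depends on the listing order, whereas registering every model before resolving any unit test does not.

```python
def resolve_in_order(files):
    """Toy single-pass resolver: a unit test must find its model already registered."""
    known_models = set()
    errors = []
    for file_name, kind, model in files:
        if kind == "models":
            known_models.add(model)
        elif model not in known_models:
            # The model may be defined in a file that simply has not been read yet.
            errors.append(f"{file_name}: references a model that does not exist: {model}")
    return errors


def resolve_two_pass(files):
    """Order-independent variant: register all models first, then check unit tests."""
    known_models = {model for _, kind, model in files if kind == "models"}
    return [
        f"{file_name}: references a model that does not exist: {model}"
        for file_name, kind, model in files
        if kind == "unit_tests" and model not in known_models
    ]


# Hypothetical listing in which the unit test file happens to come first.
listing = [("_1234.yml", "unit_tests", "model"), ("_4567.yml", "models", "model")]
print(resolve_in_order(listing))                  # one error
print(resolve_in_order(list(reversed(listing))))  # no errors
print(resolve_two_pass(listing))                  # no errors in either order
```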

Reprex

Create these files

models/model.sql

select 1 as id

models/model_v2.sql

select 1 as id

_models.yml.template

models:
  - name: model
    latest_version: 1
    versions:
      - v: 2
        defined_in: model_v2
      - v: 1
        defined_in: model
        config:
          alias: model

_unit_tests.yml.template

unit_tests:
  - name: test_model
    model: model
    versions:
      include:
        - 2
    given: []
    expect:
      rows:
        - {id: 1}

Run these commands

Start with these commands to create some empty YAML files:

rm -rf models/schema_yml_files
mkdir -p models/schema_yml_files
touch models/schema_yml_files/_1234.yml
touch models/schema_yml_files/_4567.yml

Next, we'll put YAML content in each of the files -- one file with the unit test, and the other file with the model versions.

Then I'd expect one (and only one!) of the following dbt compile commands to work, and I'd expect the other to display the error you got.

cat _unit_tests.yml.template > models/schema_yml_files/_1234.yml
cat _models.yml.template > models/schema_yml_files/_4567.yml
dbt compile --no-partial-parse
cat _unit_tests.yml.template > models/schema_yml_files/_4567.yml
cat _models.yml.template > models/schema_yml_files/_1234.yml
dbt compile --no-partial-parse

For me, the first one gave the error and the second one worked. But for you the order might be different! Either way, it looks to me like we'd need to make a fix so that the dbt parsing is unaffected by the order of the listing from the filesystem.
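
The two orderings above can also be driven from a small script, which is handy when trying more file names. This is a sketch under assumptions: `arrange` and `compile_succeeds` are hypothetical helpers, the template files are assumed to sit in the project root, and `dbt` is assumed to be on the `PATH`.

```python
import shutil
import subprocess
from pathlib import Path


def arrange(unit_tests_file, models_file,
            yml_dir="models/schema_yml_files", template_dir="."):
    """Copy the two templates into `yml_dir` under the given file names."""
    yml_dir = Path(yml_dir)
    template_dir = Path(template_dir)
    yml_dir.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(template_dir / "_unit_tests.yml.template", yml_dir / unit_tests_file)
    shutil.copyfile(template_dir / "_models.yml.template", yml_dir / models_file)


def compile_succeeds():
    """Run a full parse and report whether dbt exited cleanly."""
    result = subprocess.run(["dbt", "compile", "--no-partial-parse"])
    return result.returncode == 0
```

Calling `arrange("_1234.yml", "_4567.yml")` followed by `compile_succeeds()`, and then `arrange("_4567.yml", "_1234.yml")` followed by `compile_succeeds()`, mirrors the two command sequences above.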

Workaround

The following workaround worked for me:

  • combine the contents of _models.yml and _unit_tests.yml into a single YAML file (rather than separate YAML files).

For example, using the project files above, run these commands:

rm -rf models/schema_yml_files
mkdir -p models/schema_yml_files
cat _models.yml.template _unit_tests.yml.template > models/schema_yml_files/_properties.yml
dbt compile --no-partial-parse

dbeatty10 removed the triage label Jan 10, 2025