[WIP][SPARK-50982][SQL] Support more SQL/DataFrame read path functionality in single-pass Analyzer #49658

vladimirg-db · 2025-01-24T19:53:39Z

What changes were proposed in this pull request?

Support more SQL/DataFrame read path functionality in single-pass Analyzer:

Most of name resolution
Views
CTEs
UNIONs
Global aggregates
Most of the functions
LCAs in Project
LIMIT
Subtree resolution in extensions
Expression ID assignment
Generic type coercion

Also, remove TracksResolvedNodes. It identifies nodes by comparing object addresses, which is unreliable in Catalyst because Catalyst reuses objects, so the same instance can appear in more than one place in a plan (e.g. literals or local/global limit expression trees).
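
As an illustration of why address-based tracking misfires once objects are shared, here is a minimal Python sketch. It is not Catalyst code: Literal, Limit and IdentityTracker are hypothetical stand-ins, and the tracker only approximates the general idea of remembering visited nodes by object identity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for a leaf expression node."""
    value: int


@dataclass(frozen=True)
class Limit:
    """Hypothetical stand-in for a plan node holding a limit expression."""
    kind: str  # "local" or "global"
    limit_expr: Literal


class IdentityTracker:
    """Remembers visited nodes by object identity (id), not by value."""

    def __init__(self):
        self._seen = {}

    def visit(self, node):
        """Return True on first sight of this object, False if seen before."""
        if id(node) in self._seen:
            return False
        # Keep a reference so the object stays alive and its id stays unique.
        self._seen[id(node)] = node
        return True


# One literal object is reused by two different plan nodes.
shared = Literal(10)
plan = [Limit("local", shared), Limit("global", shared)]

tracker = IdentityTracker()
results = [tracker.visit(node.limit_expr) for node in plan]
print(results)  # [True, False]
```

The second visit is reported as "already seen" even though the literal sits at a different, legitimate position in the plan. A check built on identity cannot tell intentional reuse apart from a node being resolved twice.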

Why are the changes needed?

To replace the fixed-point Analyzer in Spark with a single-pass one.
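
The difference between the two strategies can be sketched on a toy expression tree. This is a simplified Python model under stated assumptions, not Spark's implementation: the tuple-based node shapes, CATALOG, and both rules are hypothetical, and the rule order is chosen on purpose so that the fixed-point loop needs more than one iteration.

```python
CATALOG = {"a": "int", "b": "int"}  # hypothetical schema: column name -> type

# Node shapes (all hypothetical):
#   ("attr", name)              unresolved column reference
#   ("col", name, type)         resolved column
#   ("add", left, right)        untyped addition
#   ("add", left, right, type)  typed addition


def type_of(node):
    """Return the node's type, or None if it is not resolved yet."""
    if node[0] == "col":
        return node[2]
    if node[0] == "add" and len(node) == 4:
        return node[3]
    return None


def transform_up(node, rule):
    """Apply `rule` to every node, children first."""
    if node[0] == "add":
        children = (transform_up(node[1], rule), transform_up(node[2], rule))
        node = ("add",) + children + node[3:]
    return rule(node)


def resolve_attr(node):
    """Rule: replace an unresolved reference with a typed column."""
    if node[0] == "attr":
        return ("col", node[1], CATALOG[node[1]])
    return node


def type_add(node):
    """Rule: type an addition once both children have the same known type."""
    if node[0] == "add" and len(node) == 3:
        left, right = type_of(node[1]), type_of(node[2])
        if left is not None and left == right:
            return node + (left,)
    return node


def fixed_point(plan, rules, max_iterations=100):
    """Re-apply all rules until the plan stops changing; count the iterations."""
    for iteration in range(1, max_iterations + 1):
        new_plan = plan
        for rule in rules:
            new_plan = transform_up(new_plan, rule)
        if new_plan == plan:
            return plan, iteration
        plan = new_plan
    raise RuntimeError("plan did not converge")


def single_pass(node):
    """Resolve each node exactly once, bottom-up, in a single traversal."""
    if node[0] == "attr":
        return ("col", node[1], CATALOG[node[1]])
    if node[0] == "add":
        left, right = single_pass(node[1]), single_pass(node[2])
        if type_of(left) != type_of(right):
            raise TypeError("operand types differ")
        return ("add", left, right, type_of(left))
    return node


plan = ("add", ("attr", "a"), ("add", ("attr", "a"), ("attr", "b")))
fixed_result, iterations = fixed_point(plan, [type_add, resolve_attr])
single_result = single_pass(plan)
print(iterations)                     # 3
print(fixed_result == single_result)  # True
```

In this toy model the fixed-point loop walks the whole tree three times (resolve names, type the additions, confirm nothing changed), while the single-pass function visits each node once and produces the same result.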

Does this PR introduce any user-facing change?

No, the single-pass Analyzer is still disabled.

How was this patch tested?

New test suites
Dual-running both Analyzers and comparing the resulting logical plans with ANALYZER_DUAL_RUN_LEGACY_AND_SINGLE_PASS_RESOLVER.
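
The dual-run idea can be sketched as follows. This is a hypothetical Python illustration, not the Spark test harness: SCHEMA, the three toy analyzers, PlanMismatch and dual_run are invented for the example. How the real configuration reports a mismatch, and which plan it returns, is not stated here, so raising an error and returning the legacy result are assumptions.

```python
SCHEMA = {"id": "int", "name": "string"}  # hypothetical table schema


def legacy_analyzer(columns):
    """Toy 'legacy' analyzer: attach a type to every column name."""
    return [(column, SCHEMA[column]) for column in columns]


def single_pass_analyzer(columns):
    """Toy 'single-pass' analyzer: same contract, different code path."""
    resolved = []
    for column in columns:
        resolved.append((column, SCHEMA[column]))
    return resolved


def buggy_analyzer(columns):
    """Deliberately wrong analyzer, used to show a mismatch being caught."""
    return [(column, "unknown") for column in columns]


class PlanMismatch(Exception):
    """Raised when the two analyzers disagree (assumed behaviour)."""


def dual_run(plan, reference, candidate):
    """Run both analyzers on the same input and compare their output."""
    expected = reference(plan)
    actual = candidate(plan)
    if expected != actual:
        raise PlanMismatch(f"expected {expected!r}, got {actual!r}")
    return expected  # assumption: the reference result is the one returned


result = dual_run(["id", "name"], legacy_analyzer, single_pass_analyzer)
print(result)  # [('id', 'int'), ('name', 'string')]

try:
    dual_run(["id"], legacy_analyzer, buggy_analyzer)
    mismatch_detected = False
except PlanMismatch:
    mismatch_detected = True
print(mismatch_detected)  # True
```

Comparing against the existing analyzer turns every query that is already tested into a regression check for the new one, without hand-writing expected plans.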

Was this patch authored or co-authored using generative AI tooling?

Yes, Copilot.

vladimirg-db · 2025-01-24T20:16:02Z

Thanks @mihailotim-db and @mihailoale-db for working on this!

vladimirg-db · 2025-01-24T22:21:05Z

Thanks @gotocoding-DB for UNIONs!

github-actions bot added the SQL label Jan 24, 2025

vladimirg-db force-pushed the vladimirg-db/single-pass-analyzer/more-functionality branch 3 times, most recently from 2d496ab to 750807c Compare January 24, 2025 20:15

vladimirg-db changed the title ~~[WIP][SPARK-50982][SQL] Support more advanced SQL/DataFrame read path functionality in single-pass Analyzer~~ [WIP][SPARK-50982][SQL] Support more SQL/DataFrame read path functionality in single-pass Analyzer Jan 24, 2025

vladimirg-db force-pushed the vladimirg-db/single-pass-analyzer/more-functionality branch 3 times, most recently from 1e10931 to c035fe8 Compare January 24, 2025 22:20

vladimirg-db force-pushed the vladimirg-db/single-pass-analyzer/more-functionality branch from c035fe8 to df0453f Compare January 24, 2025 22:23

More single-pass analyzer functionality

b62523a

vladimirg-db force-pushed the vladimirg-db/single-pass-analyzer/more-functionality branch from df0453f to b62523a Compare January 24, 2025 22:36
