[WIP][SPARK-50982][SQL] Support more SQL/DataFrame read path functionality in single-pass Analyzer #49658

Open: 1 commit to merge into base: master

@vladimirg-db vladimirg-db commented Jan 24, 2025

What changes were proposed in this pull request?

Support more SQL/DataFrame read path functionality in single-pass Analyzer:

  • Most of name resolution
  • Views
  • CTEs
  • UNIONs
  • Global aggregates
  • Most of the functions
  • LCAs in Project
  • LIMIT
  • Subtree resolution in extensions
  • Expression ID assignment
  • Generic type coercion

Also, remove TracksResolvedNodes. It relies on comparing object addresses, which is not a reliable signal in Catalyst, because Catalyst reuses objects (e.g. literals or local/global limit expression trees).
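
To illustrate why address-based tracking can misfire when objects are reused, here is a minimal toy sketch in Python. It is not Catalyst code: `Literal`, `IdentityTracker`, and `mark_resolved` are hypothetical stand-ins, and the real TracksResolvedNodes may differ in detail. The sketch only shows that an identity-based "already seen" check flags a legitimately shared object on its second occurrence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """Toy expression node standing in for a Catalyst literal (hypothetical)."""
    value: int


class IdentityTracker:
    """Hypothetical tracker that remembers nodes by object identity."""

    def __init__(self):
        self._seen_ids = set()
        self._keep_alive = []  # hold references so id() values stay unique

    def mark_resolved(self, node) -> bool:
        """Record the node; return True if this exact object was seen before."""
        already_seen = id(node) in self._seen_ids
        self._seen_ids.add(id(node))
        self._keep_alive.append(node)
        return already_seen


def check_nodes(nodes):
    """Walk the nodes once and report which ones look 'already resolved'."""
    tracker = IdentityTracker()
    return [tracker.mark_resolved(node) for node in nodes]


# The same object reused in two positions (think local and global limit).
shared_limit = Literal(10)
reused = check_nodes([shared_limit, shared_limit])

# Two equal but distinct objects.
distinct = check_nodes([Literal(10), Literal(10)])

print(reused)    # [False, True]
print(distinct)  # [False, False]
```

The second occurrence of `shared_limit` is reported as already seen even though each position was visited exactly once, which is the kind of false positive the description refers to.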

Why are the changes needed?

To replace the fixed-point Analyzer in Spark with a single-pass one.
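
As a rough illustration of the difference between the two strategies, the toy sketch below contrasts a fixed-point loop (re-applying a rule until the plan stops changing) with a single post-order traversal. Everything here is an assumption made for illustration: `Node`, `resolve_rule`, `fixed_point`, and `single_pass` are hypothetical names, and the "resolution" is just a boolean flag, not how Spark's Analyzer actually works.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Toy logical plan node with at most one child (hypothetical)."""
    name: str
    child: Optional["Node"] = None
    resolved: bool = False


def resolve_rule(node: Node) -> Node:
    """One top-down rule application: a node becomes resolved only if its
    child was already resolved before this application started."""
    can_resolve = node.child is None or node.child.resolved
    new_child = resolve_rule(node.child) if node.child else None
    return replace(node, child=new_child, resolved=node.resolved or can_resolve)


def fixed_point(plan: Node, max_iterations: int = 100):
    """Re-apply the rule until the plan stops changing."""
    iterations = 0
    while iterations < max_iterations:
        new_plan = resolve_rule(plan)
        iterations += 1
        if new_plan == plan:
            break
        plan = new_plan
    return plan, iterations


def single_pass(node: Node) -> Node:
    """Post-order traversal: resolve the child first, then the node itself."""
    new_child = single_pass(node.child) if node.child else None
    assert new_child is None or new_child.resolved
    return replace(node, child=new_child, resolved=True)


plan = Node("Project", Node("Filter", Node("Relation")))
fixed_point_plan, iterations = fixed_point(plan)
single_pass_plan = single_pass(plan)

print(iterations)                            # 4
print(fixed_point_plan == single_pass_plan)  # True
```

In this toy, the fixed-point loop needs one iteration per level of the plan plus a final iteration to detect that nothing changed, while the post-order traversal visits each node once.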

Does this PR introduce any user-facing change?

No. The single-pass Analyzer is still disabled.

How was this patch tested?

  • New test suites
  • Dual-running both Analyzers and comparing their logical plans with ANALYZER_DUAL_RUN_LEGACY_AND_SINGLE_PASS_RESOLVER.
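
The dual-run idea can be sketched as follows. This is a toy in Python under stated assumptions, not the code behind ANALYZER_DUAL_RUN_LEGACY_AND_SINGLE_PASS_RESOLVER: `dual_run`, `PlanMismatchError`, and the three resolver functions are hypothetical, plans are plain tuples of column names, and returning the legacy result is an arbitrary choice made for the example.

```python
class PlanMismatchError(Exception):
    """Raised when the two resolvers disagree (hypothetical error type)."""


def dual_run(plan, legacy_resolve, single_pass_resolve):
    """Resolve the same input with both resolvers and compare the outputs."""
    legacy_result = legacy_resolve(plan)
    single_pass_result = single_pass_resolve(plan)
    if legacy_result != single_pass_result:
        raise PlanMismatchError(
            f"legacy={legacy_result!r} single_pass={single_pass_result!r}"
        )
    return legacy_result


# Toy "resolvers": normalize column names to lower case.
def legacy(plan):
    return tuple(name.lower() for name in plan)


def single_pass_ok(plan):
    return tuple(name.lower() for name in plan)


def single_pass_buggy(plan):
    # Deliberately drops the last column to simulate a regression.
    return tuple(name.lower() for name in plan[:-1])


result = dual_run(("ID", "Name"), legacy, single_pass_ok)

try:
    dual_run(("ID", "Name"), legacy, single_pass_buggy)
    mismatch_detected = False
except PlanMismatchError:
    mismatch_detected = True

print(result)             # ('id', 'name')
print(mismatch_detected)  # True
```

Comparing against the existing implementation on the same input gives a regression signal for every query in an existing test suite without writing expected plans by hand.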

Was this patch authored or co-authored using generative AI tooling?

Yes, Copilot.

@github-actions bot added the SQL label Jan 24, 2025
@vladimirg-db force-pushed the vladimirg-db/single-pass-analyzer/more-functionality branch 3 times, most recently from 2d496ab to 750807c, January 24, 2025 20:15
@vladimirg-db (Contributor Author) commented:

Thanks @mihailotim-db and @mihailoale-db for working on this!

@vladimirg-db changed the title from "[WIP][SPARK-50982][SQL] Support more advanced SQL/DataFrame read path functionality in single-pass Analyzer" to "[WIP][SPARK-50982][SQL] Support more SQL/DataFrame read path functionality in single-pass Analyzer" Jan 24, 2025
@vladimirg-db force-pushed the vladimirg-db/single-pass-analyzer/more-functionality branch 3 times, most recently from 1e10931 to c035fe8, January 24, 2025 22:20
@vladimirg-db (Contributor Author) commented:

Thanks @gotocoding-DB for UNIONs!

@vladimirg-db force-pushed the vladimirg-db/single-pass-analyzer/more-functionality branch from c035fe8 to df0453f, January 24, 2025 22:23
@vladimirg-db force-pushed the vladimirg-db/single-pass-analyzer/more-functionality branch from df0453f to b62523a, January 24, 2025 22:36