(FEATURE REQUEST) Parquet Metadata Parser #1833

Open
RahulDubey391 opened this issue Jan 20, 2025 · 0 comments

Hi, I would like to raise a PR that enhances the gsutil stat command so it can parse Parquet files and collect row and column statistics, which would allow initial profiling of the data.

At present, we have to create an external BigQuery table before any analysis is possible. Otherwise, the bq CLI or the Data Catalog service has to be used.

Wouldn't it be useful for this feature to be part of gsutil itself? Support for other file formats could be added later.

I already have a parser ready and am now working on integrating it with gsutil.
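
To illustrate why this is cheap to do against an object store, here is a minimal sketch of the first step any Parquet metadata reader has to take. It is not the parser mentioned above, and the function name `locate_footer` is hypothetical. It relies only on the Parquet file layout: a file ends with a 4-byte little-endian metadata length followed by the magic bytes `PAR1`, so two small range reads (the last 8 bytes, then the footer itself) are enough to reach the row and column statistics without downloading the data pages. Decoding the Thrift-encoded metadata is left out.

```python
import struct
from typing import Tuple

PARQUET_MAGIC = b"PAR1"
FOOTER_TAIL_SIZE = 8  # 4-byte little-endian metadata length + 4-byte magic


def locate_footer(tail: bytes, object_size: int) -> Tuple[int, int]:
    """Return (start_offset, length) of the Thrift-encoded file metadata.

    `tail` must be the last 8 bytes of the object and `object_size` its
    total size in bytes. Hypothetical helper, for illustration only.
    """
    if len(tail) != FOOTER_TAIL_SIZE:
        raise ValueError("expected exactly the last 8 bytes of the object")
    if tail[4:] != PARQUET_MAGIC:
        raise ValueError("not a Parquet file: trailing magic bytes missing")

    # The metadata length is stored immediately before the trailing magic.
    (metadata_length,) = struct.unpack("<I", tail[:4])

    # The metadata block sits directly in front of the 8-byte tail.
    start = object_size - FOOTER_TAIL_SIZE - metadata_length
    if start < len(PARQUET_MAGIC):
        raise ValueError("corrupt footer: metadata length exceeds file size")
    return start, metadata_length
```

The returned offset and length describe the byte range to fetch next. That range holds the file metadata, including the row count and per-column statistics that a stat-style summary would report.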

Let me know if I can raise a PR!
