[Bug]: Native parquet source does not work #2975

Open
kousun12 opened this issue Jan 6, 2025 · 9 comments
Labels
bug Something isn't working

Comments

@kousun12

kousun12 commented Jan 6, 2025

Describe the bug

I'm trying to use the native parquet/csv source as described in the docs. My source dir looks like:

sources/worldbank
├── connection.options.yaml
├── connection.yaml
└── economies.parquet

connection.yaml looks like:

# This file was automatically generated
type: csv
name: worldbank
# Advanced Options, you probably don't want to change this
buildOptions:
  batchSize: 1000000

and trying a very minimal index.md:

---
title: Evidence test
---

```sql economies
select * from worldbank.economies
```

ends up in an error:

Data Table
Catalog Error: Table with name economies does not exist!
Did you mean "states"?
LINE 2: SELECT * FROM (select * from worldbank.economies
                                     ^

My npm run sources output:

  [Processing] worldbank
  economies ⚠ No results returned.
-----
  Evaluated sources, saving manifest
  ✅ Done!

Steps to Reproduce

Create a very simple project with a single source containing a single .parquet file, then run a query that uses it.

Logs

System Info

Severity

blocking all usage of Evidence

Additional Information, or Workarounds

No response

kousun12 added the bug (Something isn't working) and to-review (Evidence team to review) labels Jan 6, 2025
@archiewood
Member

Are you able to supply or point to your economies.parquet file?

archiewood removed the to-review (Evidence team to review) label Jan 6, 2025
@kousun12
Author

kousun12 commented Jan 6, 2025

It doesn't seem to matter which parquet file I try. Here's another simple file that I tried:

https://huggingface.co/datasets/substrate-labs/owid/blob/main/data/owid_co2_data.parquet

@archiewood
Member

Did you install the parquet connector and add it to your evidence.config.yaml?
(Currently this is a third-party connector.)

npm install evidence-connector-parquet

We will be moving this to a first-party connector very soon.
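
For reference, registering the connector in evidence.config.yaml might look roughly like the fragment below. This is a sketch that assumes the layout of the standard Evidence template, where datasource plugins are listed under `plugins.datasources`; the `@evidence-dev/csv` entry is only there to show where the new line sits, so check it against your own file rather than copying it verbatim.

```yaml
# evidence.config.yaml (fragment; other plugin entries omitted)
# Assumption: datasource plugins live under plugins.datasources, as in the template.
plugins:
  datasources:
    # Example of an existing first-party connector entry:
    "@evidence-dev/csv": {}
    # Third-party parquet connector, registered by its npm package name:
    "evidence-connector-parquet": {}
```

The key is the npm package name, and the empty mapping means no plugin-level options are set.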

@kousun12
Author

kousun12 commented Jan 6, 2025

Still not working.

I started a fresh project using

npx degit evidence-dev/template frontend

Then I ran npm i evidence-connector-parquet and added "evidence-connector-parquet": { } to datasources in evidence.config.yaml.

Then I added a directory sources/simple with a connection.yaml and a foo.parquet:

# This file was automatically generated
name: simple
type: parquet
options: {}

npm run sources

> [email protected] sources
> evidence sources

✔ Loading plugins & sources
-----
  [Processing] mypar
[ ! ] Error connecting to datasource mypar: Invalid Input Error: Values were not provided for the following prepared statement parameters:  

@archiewood
Member

Can you add the parquet source via the UI at localhost:3000/settings? It looks like you are missing config.

@kousun12
Author

kousun12 commented Jan 7, 2025

Tried that as well: adding a new parquet source foo from /settings generates more or less the same YAML files, and I still end up with:

npm run sources

> [email protected] sources
> evidence sources

✔ Loading plugins & sources
-----
  [Processing] foo
[ ! ] Error connecting to datasource foo: Invalid Input Error: Values were not provided for the following prepared statement parameters:

@kousun12
Author

kousun12 commented Jan 7, 2025

What's interesting is that the first time I add a new parquet source from /settings (naming it e.g. bar) and click "Confirm Changes", I see a small error banner at the bottom saying: Failed to process bar.null

@archiewood
Member

archiewood commented Jan 7, 2025 via email

@kousun12
Author

kousun12 commented Jan 7, 2025

Are you able to repro? I've tried from a fresh project a few times now, with the latest version. Looking at the code, it seems identical to the csv plugin, which works fine for me.


2 participants