RML-Core test cases are too dependent on RML-IO #87

Open
DylanVanAssche opened this issue Feb 15, 2024 · 3 comments
Labels: documentation, help wanted, pending, proposal, test-cases

Comments

@DylanVanAssche
Collaborator

Problem

Engines implementing RML-Core should not have to deal with all the different Source descriptions (CSV, XML, JSON, RDB, SPARQL, etc.) to be RML-Core compliant. However, the current test-cases exist for each of these Source descriptions. Thus, if an engine covers RML-Core but does not support a certain format, its coverage drops drastically, even though it may have perfect RML-Core support.
Moreover, support for different sources is out of scope for RML-Core, as it is part of RML-IO.
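
To make the coverage concern concrete, here is a minimal sketch. The numbers are hypothetical (60 core behaviours, each shipped once per source format) and are not taken from the actual test suite; `coverage` is an illustrative helper, not part of any RML tooling.

```python
# Hypothetical counts, for illustration only: 60 core behaviours, each
# shipped as one test case per source format.
FORMATS = ["CSV", "JSON", "XML", "RDB", "SPARQL"]
CORE_BEHAVIOURS = 60


def coverage(supported_formats, formats=FORMATS, behaviours=CORE_BEHAVIOURS):
    """Fraction of test cases passed by an engine that implements every
    core behaviour but can only read the given source formats."""
    total = behaviours * len(formats)
    passed = behaviours * len(set(supported_formats) & set(formats))
    return passed / total


csv_only = coverage(["CSV"])
all_formats = coverage(FORMATS)
print(f"CSV-only engine: {csv_only:.0%}")
print(f"All formats:     {all_formats:.0%}")
```

Under these assumed numbers, an engine with complete RML-Core support that reads only CSV would pass one fifth of the test cases.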

Proposal

Drop all source-specific test-cases in RML-Core and add the different sources to RML-IO.
RML-IO currently focuses on RML Logical Target tests; Logical Source tests are missing because they are covered by RML-Core.
Keep the CSV variant of every test-case in RML-Core, because we cannot test anything without input data. CSV is the easiest to support (no iterator) and can be loaded easily into an RDB. For RDB support, loading the CSV and updating the Logical Source suffices.
Special features, such as datatype extraction from RDBs and possibly from other formats (e.g. integers and floats in JSON), could be added as specific test-cases in RML-IO.
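
As a rough sketch of the "load the CSV into an RDB" step, the snippet below uses Python's standard `csv` and `sqlite3` modules with an in-memory SQLite database. The CSV content, the table name `student`, and the helper `load_csv_into_table` are hypothetical; real test cases would use their own files and whichever database the engine targets.

```python
import csv
import io
import sqlite3

# Hypothetical test-case input; the real test cases ship their own CSV files.
CSV_DATA = "ID,Name\n10,Venus\n20,Mars\n"


def load_csv_into_table(conn, csv_text, table):
    """Create `table` from the CSV header and insert every row as text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', body)
    conn.commit()


conn = sqlite3.connect(":memory:")
load_csv_into_table(conn, CSV_DATA, "student")
loaded = conn.execute('SELECT "ID", "Name" FROM "student" ORDER BY "ID"').fetchall()
print(loaded)
```

Every column is created as `TEXT` here, so a table loaded this way carries no column types. That is consistent with treating datatype extraction from RDBs as a separate, dedicated set of test-cases.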

Discussion

Let's discuss this properly! This is not a blocker for the KGCW Challenge, as it does not involve a specification change, only a move and refactoring of the test-cases. Engines supporting both RML-Core and RML-IO should still have the same coverage as now.

DylanVanAssche added the documentation and help wanted labels on Feb 15, 2024
@chrdebru
Collaborator

chrdebru commented Feb 15, 2024

I mostly agree. That said, we should have some JSON or XML cases for multi-valued expression maps. That is part of core, right?
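
A small sketch of why CSV alone may not exercise multi-valued expression maps: a CSV cell gives one value per row, while a JSON field can hold an array. The records and the `evaluate_reference` helper are hypothetical and only mimic how a reference could yield several values per iteration; they are not how any particular engine works.

```python
import csv
import io
import json

# Hypothetical records describing the same person in both formats.
csv_record = next(csv.DictReader(io.StringIO("name,email\nAlice,a@example.org\n")))
json_record = json.loads(
    '{"name": "Alice", "email": ["a@example.org", "b@example.org"]}'
)


def evaluate_reference(record, key):
    """Return the values for `key` as a list, mimicking an expression map
    that may yield zero, one, or several values per iteration."""
    value = record.get(key)
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


csv_values = evaluate_reference(csv_record, "email")
json_values = evaluate_reference(json_record, "email")
print(len(csv_values), len(json_values))
```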

@bjdmeest
Member

Yes, I agree too, but we should also keep some JSON or XML cases, as @chrdebru proposed. I'm also thinking about, e.g., default datatypes when the data source has data types defined (e.g. JSON boolean). I'm guessing JSON alone is probably enough, just to keep the minimum as small as possible.
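
To illustrate the default-datatype point, the sketch below contrasts the same record parsed from JSON and from CSV. The mapping in `natural_datatype` is an assumption made for illustration (JSON boolean to `xsd:boolean`, integer to `xsd:integer`, other numbers to `xsd:double`); the authoritative table is the one in the RML-Core specification.

```python
import csv
import io
import json


def natural_datatype(value):
    """Assumed mapping from a parsed value to an XSD datatype (illustrative)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "xsd:boolean"
    if isinstance(value, int):
        return "xsd:integer"
    if isinstance(value, float):
        return "xsd:double"
    return "xsd:string"


# Hypothetical records carrying the same data in both formats.
json_record = json.loads('{"active": true, "age": 33, "score": 4.5}')
csv_record = next(csv.DictReader(io.StringIO("active,age,score\ntrue,33,4.5\n")))

json_types = {key: natural_datatype(value) for key, value in json_record.items()}
csv_types = {key: natural_datatype(value) for key, value in csv_record.items()}
print(json_types)
print(csv_types)
```

Since every CSV value arrives as a string, a CSV-only suite cannot distinguish an engine that honours source datatypes from one that ignores them, which is the case for keeping at least a few JSON test-cases.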

@DylanVanAssche
Collaborator Author

+1 to keep these, because the Core spec mentions data type extraction.
So keep the CSV test-cases and add/keep a few with CSV, XML, JSON, and RDB for data type extraction?

DylanVanAssche added the proposal and pending labels on Mar 4, 2024
pmaria added the test-cases label on Jun 25, 2024