Include the validation of templates in the SHACL shapes #79

Open
chrdebru opened this issue Jan 31, 2024 · 7 comments
Labels: proposal (issue has a proposal to be solved), shapes


@chrdebru
Collaborator

chrdebru commented Jan 31, 2024

I believe that checking the validity of templates should be included in the shapes. I'm not sure whether SPARQL's regular expressions allow for recursion, but it can be achieved by:

  • removing the escaped curly braces from the template
  • checking whether the resulting string matches ^[^\{\}]*(?:\{[^\{\}]+\}[^\{\}]*)*$ (balanced and not nested)

This can be achieved with a SPARQL constraint component.
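The two steps can be sketched outside of SHACL to try the regex on sample templates. This is only a minimal Python sketch, not the proposed SPARQL constraint: it reads the first step as stripping escaped characters (an assumption about the intent), and the names `ESCAPED`, `BALANCED`, and `is_valid_template` are hypothetical.

```python
import re

# Regex from the proposal: braces are balanced, not nested, and non-empty.
BALANCED = re.compile(r"^[^\{\}]*(?:\{[^\{\}]+\}[^\{\}]*)*$")

# Assumption: an escape sequence is a backslash followed by a backslash
# or a curly brace. Scanning left to right treats "\\{" as an escaped
# backslash followed by an unescaped "{".
ESCAPED = re.compile(r"\\[\\{}]")


def is_valid_template(template: str) -> bool:
    """Strip escape sequences, then check the remaining braces."""
    stripped = ESCAPED.sub("", template)
    return BALANCED.match(stripped) is not None


print(is_valid_template("http://example.com/{id}"))  # True
print(is_valid_template("{{id}}"))                   # False (nested)
print(is_valid_template("{id"))                      # False (unbalanced)
print(is_valid_template(r"\{not a reference\}"))     # True (no unescaped braces left)
```

The sketch only covers brace balance. It does not flag a stray backslash, and it accepts templates without any reference.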

dachafra added the enhancement (New feature or request) and shapes labels on Feb 1, 2024
@DylanVanAssche
Collaborator

SHACL can do regex. Once the listed regex is tested and test cases with valid and invalid templates are provided, we can add it.

dachafra added the proposal (issue has a proposal to be solved) label and removed the enhancement (New feature or request) label on Jul 3, 2024
@dachafra
Member

dachafra commented Jul 3, 2024

Tasks required:

  • Test the regex provided and ensure that the constraint is correct
  • Define test-cases with valid templates (I think the existing test cases already cover this, so it is not required)
  • Define a few test-cases with invalid templates.

@DylanVanAssche could you take care of this?

dachafra changed the title from "Validity of template" to "Include the validation of templates in the SHACL shapes" on Jul 3, 2024
@DylanVanAssche
Collaborator

> Define a few test-cases with invalid templates.

AFAIK, these are the 'rules':

A string template is a format string that can be used to build strings from multiple components. It can apply [reference expressions](https://kg-construct.github.io/rml-core/spec/docs/#dfn-reference-expression) by enclosing them in curly braces ({ and }). The following syntax rules apply to valid [string templates](https://kg-construct.github.io/rml-core/spec/docs/#dfn-string-template):

    1. Pairs of unescaped curly braces MUST enclose valid [reference expressions](https://kg-construct.github.io/rml-core/spec/docs/#dfn-reference-expression).
    2. Curly braces that do not enclose [reference expressions](https://kg-construct.github.io/rml-core/spec/docs/#dfn-reference-expression) MUST be escaped by a backslash character (\). This also applies to curly braces within [reference expressions](https://kg-construct.github.io/rml-core/spec/docs/#dfn-reference-expression).
    3. Backslash characters (\) MUST be escaped by preceding them with another backslash character, yielding (\\). This also applies to backslashes within [reference expressions](https://kg-construct.github.io/rml-core/spec/docs/#dfn-reference-expression).
    4. There SHOULD be at least one pair of unescaped curly braces.

from spec.

Anything missing?
If not, we need to have 4 invalid test cases, one for each rule.
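To make "one invalid test case per rule" concrete, here is a rough Python sketch that scans a template and reports which of the four rules it appears to break. It is illustrative only: the function name `check_template` and the message strings are hypothetical, and rule 1 is approximated by "the reference is not empty", since whether a reference expression is valid depends on the reference formulation.

```python
def check_template(template: str) -> list[str]:
    """Return a message per apparent rule violation (empty list if none)."""
    problems = []
    inside = False  # True while inside an unescaped {...} pair
    ref_len = 0     # characters seen in the current reference
    refs = 0        # number of closed references
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "\\":
            nxt = template[i + 1] if i + 1 < len(template) else ""
            if nxt in ("\\", "{", "}"):
                # Valid escape sequence: skip both characters.
                if inside:
                    ref_len += 1
                i += 2
                continue
            problems.append("rule 3: backslash is not escaped")
        elif ch == "{":
            if inside:
                problems.append("rule 2: unescaped '{' inside a reference")
            else:
                inside, ref_len = True, 0
        elif ch == "}":
            if not inside:
                problems.append("rule 2: unescaped '}' without matching '{'")
            else:
                if ref_len == 0:
                    problems.append("rule 1: empty reference")
                inside = False
                refs += 1
        elif inside:
            ref_len += 1
        i += 1
    if inside:
        problems.append("rule 2: unescaped '{' is never closed")
    if refs == 0:
        problems.append("rule 4: no reference (SHOULD, not MUST)")
    return problems


# One candidate invalid template per rule (Python strings, not Turtle).
for candidate in ["{}", "{a{b}}", "a\\b/{id}", "no-reference"]:
    print(repr(candidate), check_template(candidate))
```

Rule 4 is a SHOULD, so a test suite might treat that case as a warning rather than a failure.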

@dachafra
Member

dachafra commented Jul 4, 2024

I don't think so, please proceed :-)

@DylanVanAssche
Collaborator

Escaping is tricky, especially since we use the same escape character in reference formulations as Turtle. If we write in something other than Turtle, we only need to escape once; otherwise twice.

In non-Turtle formats, you can write \{ to escape it.
In Turtle, you must write \\{ to escape it.

However, in rule 2 it can be unclear whether the \\ escapes the { or denotes a literal \ token. Spoiler alert: you need to write \\\\ in Turtle for a literal \ in a reference formulation.
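The two escaping levels can be illustrated with a small Python sketch that turns a template string into a double-quoted Turtle literal. The helper `to_turtle_literal` is hypothetical and only handles backslashes and double quotes. It relies on the fact that Turtle string literals use the backslash as their own escape character, so every backslash in the template doubles.

```python
def to_turtle_literal(template: str) -> str:
    """Serialize a template string as a double-quoted Turtle literal.

    Only backslashes and double quotes are handled in this sketch;
    newlines and other control characters are out of scope.
    """
    escaped = template.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


# Template-level "\{" (an escaped brace) becomes "\\{" in Turtle.
print(to_turtle_literal(r"http://example.com/\{literal\}/{id}"))
# prints "http://example.com/\\{literal\\}/{id}"

# Template-level "\\" (an escaped backslash) becomes "\\\\" in Turtle.
print(to_turtle_literal(r"C:\\data\\{file}"))
# prints "C:\\\\data\\\\{file}"
```

RDF tooling applies this step automatically when serializing, so it matters mostly for mappings written by hand.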

@chrdebru
Collaborator Author

Is escaping really that tricky? You just need to load the file into a graph and indicate the format, no? Only then do we need to test the regex. That said, implementers can rely on ill-founded assumptions, so the test cases should include redundant variants in RDF/XML, N-Quads, and even JSON-LD.

@DylanVanAssche
Collaborator

> Is escaping really that tricky? You just need to load the file into a graph and indicate the format, no?

If you use tools, they take care of that. It is mostly tricky if you write mappings manually. This is purely a developer experience thing. Since Turtle and the template syntax use the same escape character (\), it becomes tricky when writing by hand.

pmaria pushed a commit that referenced this issue Jan 19, 2025
* test-cases: RMLTC0024*-CSV: add invalid template test cases

Contributes to #79

* shapes: tests: add expected failures for invalid templates