A "General Considerations" section #58

gregsdennis · 2021-03-09T06:39:22Z

JSON Schema 2020-12, Section 6, specifies a set of general considerations that I think we can "borrow" from. I'll copy it here for reference and add comments.

6. General Considerations

6.1. Range of JSON Values

An instance may be any valid JSON value as defined by JSON. JSON Schema imposes no restrictions on type: JSON Schema can describe any JSON value, including, for example, null.

JSON Schema uses the word "instance" to describe the JSON data that's being validated. We seem to have landed on the word "data." Regardless, I think the gist of this also holds true for JSON Path: input values can be of any JSON type (although only certain types may return results depending on the Path).

6.2. Programming Language Independence

JSON Schema is programming language agnostic, and supports the full range of values described in the data model. Be aware, however, that some languages and JSON parsers may not be able to represent in memory the full range of values describable by JSON.

This, I think, is of utmost importance. We definitely should not favor a specific language. Doing so would inhibit inclusivity and shun people who work in incompatible frameworks.

This statement also makes it clear that it's understood that some frameworks inherently have limitations that may prevent them from being able to implement the full expression of JSON Schema. Declaring this outright allows such frameworks to have partial "best effort" implementations and still be compliant with the specification.
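To make the parser limitation in 6.2 concrete, here is a minimal sketch in Python (chosen only for illustration; the sample document and variable names are made up). It shows a JSON text whose numbers the default parser cannot hold exactly, and how the standard `json` module's `parse_float` hook can preserve them.

```python
import json
import math
from decimal import Decimal

# Hypothetical input: both numbers are valid JSON, but neither fits an IEEE 754 double.
text = '{"big": 12345678901234567890.123456789, "huge": 1e400}'

# Default parsing maps non-integer numbers to float, so digits are dropped
# and the out-of-range value overflows to infinity.
lossy = json.loads(text)

# Asking the parser to build Decimal objects keeps the digits as written.
exact = json.loads(text, parse_float=Decimal)

print(math.isinf(lossy["huge"]))           # True
print(Decimal(lossy["big"]) == exact["big"])  # False: the float lost precision
print(exact["big"])                         # 12345678901234567890.123456789
```

An implementation stuck with the lossy representation could still be a reasonable "best effort" implementation in the sense described above.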

6.3. Mathematical Integers

Some programming languages and parsers use different internal representations for floating point numbers than they do for integers.

For consistency, integer JSON numbers SHOULD NOT be encoded with a fractional part.

I'm not sure whether this would apply to us, except maybe for array indices.
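For the array index case, a rough Python sketch of the issue follows. The helper name `as_index` is hypothetical, and whether a value like `1.0` should be accepted as an index at all is an open design question, not something the quoted text decides.

```python
import json

def as_index(value):
    """Return value as an int if it is a mathematical integer, else None."""
    # bool is a subclass of int in Python, so rule it out first.
    if isinstance(value, bool):
        return None
    if isinstance(value, int):
        return value
    # A float such as 1.0 is a mathematical integer; 1.5, inf and nan are not.
    if isinstance(value, float) and value.is_integer():
        return int(value)
    return None

# The same mathematical integer arrives as two different Python types.
print(type(json.loads("1")).__name__)    # int
print(type(json.loads("1.0")).__name__)  # float

print(as_index(json.loads("1.0")))  # 1
print(as_index(json.loads("1.5")))  # None
```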

6.4. Regular Expressions

Keywords MAY use regular expressions to express constraints, or constrain the instance value to be a regular expression. These regular expressions SHOULD be valid according to the regular expression dialect described in ECMA-262, section 21.2.1.

Regular expressions SHOULD be built with the "u" flag (or equivalent) to provide Unicode support, or processed in such a way which provides Unicode support as defined by ECMA-262.

Furthermore, given the high disparity in regular expression constructs support, schema authors SHOULD limit themselves to the following regular expression tokens:

individual Unicode characters, as defined by the JSON specification;

simple character classes ([abc]), range character classes ([a-z]);

complemented character classes ([^abc], [^a-z]);

simple quantifiers: "+" (one or more), "*" (zero or more), "?" (zero or one), and their lazy versions ("+?", "*?", "??");

range quantifiers: "{x}" (exactly x occurrences), "{x,y}" (at least x, at most y, occurrences), "{x,}" (x occurrences or more), and their lazy versions;

the beginning-of-input ("^") and end-of-input ("$") anchors;

simple grouping ("(...)") and alternation ("|").

Finally, implementations MUST NOT take regular expressions to be anchored, neither at the beginning nor at the end. This means, for instance, the pattern "es" matches "expression".
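The unanchored rule can be sketched with Python's `re` module. Note the assumption: Python's dialect is not the ECMA-262 dialect the quoted text names, but for a pattern this simple the two behave the same, and `re.search` versus `re.fullmatch` maps onto the unanchored versus anchored distinction.

```python
import re

pattern = re.compile("es")

# Unanchored: the pattern may match anywhere in the input.
unanchored = pattern.search("expression") is not None

# Implicitly anchored at both ends, which the quoted text forbids.
anchored = pattern.fullmatch("expression") is not None

# Anchoring is still available, but only when the pattern asks for it.
explicit = re.search("^es$", "expression") is not None

print(unanchored)  # True
print(anchored)    # False
print(explicit)    # False
```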

Not sure if we're planning on supporting regular expressions. It appears that some implementations do have some support, but at this point it's all extensions to the original syntax. Still, this is a good declaration of support.

It also ties in closely to section 6.2 regarding framework limitations as not all frameworks support the same flavor of regular expression syntax.

6.5. Extending JSON Schema

Additional schema keywords and schema vocabularies MAY be defined by any entity. Save for explicit agreement, schema authors SHALL NOT expect these additional keywords and vocabularies to be supported by implementations that do not explicitly document such support. Implementations SHOULD treat keywords they do not support as annotations, where the value of the keyword is the value of the annotation.

Implementations MAY provide the ability to register or load handlers for vocabularies that they do not support directly. The exact mechanism for registering and implementing such handlers is implementation-dependent.

This is good to have because, invariably, implementations will want to extend functionality beyond what's in the spec. It basically frees other implementations from having to provide the same extensions; they are required to support only what is stated in the spec.

It also mentions "vocabularies," which are a spec-defined mechanism by which implementations can extend functionality via new keywords in such a way that they can optionally be supported in other implementations. Furthermore, this mechanism allows the other implementations to refuse to process a schema that requires a given vocabulary if the implementation doesn't understand it. This bit I think is good for later when we eventually get to spec-defined extension mechanisms, but I don't expect that'll be in the first draft.
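As a rough illustration of the "unsupported keywords become annotations" rule and of handler registration in 6.5, here is a toy Python sketch. Everything in it is hypothetical (`HANDLERS`, `register`, `evaluate`, and the `x-units` and `x-even` keywords); it is not how any real JSON Schema implementation is structured, and it ignores vocabularies entirely.

```python
from typing import Any, Callable

# Hypothetical handler table: keyword name -> function(keyword value, instance) -> bool.
HANDLERS: dict[str, Callable[[Any, Any], bool]] = {
    "minimum": lambda limit, instance: instance >= limit,
    "maximum": lambda limit, instance: instance <= limit,
}

def register(keyword: str, handler: Callable[[Any, Any], bool]) -> None:
    """Let a caller add support for an extension keyword."""
    HANDLERS[keyword] = handler

def evaluate(schema: dict[str, Any], instance: Any) -> tuple[bool, dict[str, Any]]:
    """Apply known keywords; collect unknown ones as annotations."""
    valid = True
    annotations: dict[str, Any] = {}
    for keyword, value in schema.items():
        handler = HANDLERS.get(keyword)
        if handler is None:
            # Unsupported keyword: its value becomes the annotation value.
            annotations[keyword] = value
        elif not handler(value, instance):
            valid = False
    return valid, annotations

schema = {"minimum": 1, "x-units": "cm", "x-even": True}
print(evaluate(schema, 5))  # (True, {'x-units': 'cm', 'x-even': True})

# After registering a handler, "x-even" is enforced instead of annotated.
register("x-even", lambda flag, instance: (instance % 2 == 0) == flag)
print(evaluate(schema, 5))  # (False, {'x-units': 'cm'})
```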

That's it. Just some declarations that I think would be good to have. This is neither an exclusive nor "all or nothing" list. I think we should pick and choose as we see fit. If you think of something that's not in this list, let us know.

danielaparker · 2021-03-09T20:29:32Z

@gregsdennis wrote:

JSON Schema uses the word "instance" to describe the JSON data that's being validated. We seem to have landed on the word "data."

Goessner uses "root object", which I've always thought of as a JSON value (not restricted to be an object), and somewhat analogous to the JSON Schema "instance". In online JSONPath articles, "root" is often described as the "root object or array", which I understand, or "root member of a JSON structure", which I don't understand. The draft uses "root item" (once) and "root node" (five times), and talks about the "root node which is the input document." I'm not sure if it's trying to make a distinction between "root node" and the JSON value passed to a JSONPath evaluator, but from the quoted sentence, it doesn't sound like it.

I also note that it's unclear what the draft means by "node", which occurs 60 times in the draft. In section 3.2, it says "Each node holds a JSON value", but it doesn't say what else the node holds (a position or path to that point?) And then we have "root node which is the input document", which suggests the root node is a value. My own understanding of a node is a path/value pair, and I think the draft needs to be more clear about this term, and to distinguish between root and current nodes, and the corresponding root and current values.
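To show what the path/value reading of "node" would look like, here is a small Python sketch. The `Node` class and `walk` function are hypothetical names for illustration, and this reflects only the interpretation described above, not anything the draft defines.

```python
from dataclasses import dataclass
from typing import Any, Iterator

@dataclass(frozen=True)
class Node:
    path: tuple  # steps from the root: str for member names, int for array indices
    value: Any   # the JSON value found at that position

def walk(value: Any, path: tuple = ()) -> Iterator[Node]:
    """Yield a node for the value itself, then for every descendant."""
    yield Node(path, value)
    if isinstance(value, dict):
        for name, child in value.items():
            yield from walk(child, path + (name,))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from walk(child, path + (index,))

root_value = {"a": [10, 20]}
nodes = list(walk(root_value))

# The root node pairs the empty path with the root value.
print(nodes[0].path, nodes[0].value is root_value)  # () True
print([n.path for n in nodes])  # [(), ('a',), ('a', 0), ('a', 1)]
```

Under this reading the root node and the root value are distinct things: the first is the pair, the second is only its value component.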

I do think it would help to have a consistently used term to represent the thing that we pass to the evaluator. And avoid having text like "the JSON data item to which the query is applied to" embedded in a sentence.

Daniel

cabo · 2021-03-10T06:55:20Z

Not sure this is the right issue to discuss this, but the issues are currently all over the place.

Indeed, consistent terminology is needed. I wonder whether the dichotomy between item and node is a useful editorial distinction. They certainly mean the same thing, but the term item focuses on the whole subtree while the term node focuses on the root of the subtree. So “root item” is a bit weird (although perfectly meaningful), while root node emphasises that we are talking about a position in a common tree.

At the time JSONPath was written, JSON only allowed maps (“JSON objects”) and arrays as root items. That has since changed; any data item can serve as a root item. But at the time, talking about “root object” was almost sensible, because in JavaScript arrays are almost “objects”.

Regards, Carsten


gregsdennis · 2021-03-10T07:27:36Z

Not sure this is the right issue to discuss this, but the issues are currently all over the place.

To be sure, this issue was meant to cover the necessity of this sort of section more so than the specific declarations. If we agree that such a section is ideal or even required, I'm fine with that consensus for this issue and we can split out the specific topics to other issues.

gregsdennis · 2021-03-10T19:08:20Z

It looks like the terminology discussion is now happening over in #66. That's one topic split out.

cabo · 2022-01-17T23:12:41Z

Text on regular expressions is useful input to #70, which now references this.
OBE otherwise, I'd say.

cabo closed this as completed Jan 17, 2022
