What do we want our RFC to do? #78

Closed
timbray opened this issue Mar 22, 2021 · 20 comments

Comments

@timbray
Contributor

timbray commented Mar 22, 2021

(Co-chair hat off)

This is related to #63 ("Respect Implementations"). I think it will help us if we think about what we want the output of our work to be. It is a fact that there are lots of implementations, that they are incompatible in places, and that few of them are likely to change as a result of us publishing an RFC. Given that, what are reasonable goals for our work bearing our charter in mind?

First, provide a formal, readable, and helpful specification of JSONPath syntax as it exists. To the extent that there is "consensus" (a matter of judgment, not voting) we can use MUST here. If we encounter a case where there are broad divergences, with more than one option in widespread use, we make a judgment call about which to describe - it's OK to describe more than one option - and it's possible to use SHOULD. For example, there are a significant number of implementations that support initial "@" and so we should describe that syntax.

Second, provide interoperability guidance (one of my favorite parts of 8259). For example, for interoperability, users SHOULD avoid JSONPaths with initial "@". Another example: they SHOULD prefer bracket notation to dot notation so they don't have to worry about selectors with hyphens in them. I suspect there are lots of other opportunities to do this usefully.
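To illustrate the bracket-notation advice, here is a minimal sketch in Python. The helper `bracket_path` is hypothetical (it does not come from any JSONPath library), and the quoting convention it uses (single quotes, backslash escaping) is an assumption, not something this thread has settled.

```python
def bracket_path(*segments):
    """Build a JSONPath in bracket notation from member names and array indices.

    Hypothetical helper for illustration only. Assumes single-quoted names
    with backslash escaping, which is an assumed convention.
    """
    parts = []
    for seg in segments:
        if isinstance(seg, int) and not isinstance(seg, bool):
            # Array index: emitted without quotes.
            parts.append(f"[{seg}]")
        else:
            # Member name: escape backslashes first, then single quotes.
            name = str(seg).replace("\\", "\\\\").replace("'", "\\'")
            parts.append(f"['{name}']")
    return "$" + "".join(parts)


# A hyphenated name needs no special handling in bracket notation,
# whereas $.store.book-list may be parsed differently across implementations.
print(bracket_path("store", "book-list", 0))  # prints $['store']['book-list'][0]
```

Generating paths this way sidesteps the question of which characters are legal in a dot-notation name.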

Third, provide lots and lots of examples. For every fragment of syntax, offer examples. There are a few sections that lack this currently.

Fourth (maybe not possible) find a way to include (if only by reference) the existing work from Glyn and cburgmer which provides concrete data about expected inputs and outputs, and variations observed in the field.

@glyn
Collaborator

glyn commented Mar 23, 2021

When I was faced with using JSONPath for the first time, it required guesswork and experimentation because the implementation concerned was essentially undocumented. Guesswork and experimentation can lead to dependency on unintended or unstable behaviour. This was not satisfactory. Also, any bug fixing of the implementation in the absence of a specification would not necessarily lead to convergence. One benefit of an RFC would be to clearly specify (common) behaviour of JSONPath such that implementations could either migrate to supporting that behaviour or, failing that, could at least document which standard behaviours they support and how they deviate from standard behaviour. Given a RFC and supposing the implementation I was using was still undocumented, I would have performed experiments on the implementation to see which standard behaviours it supported. I could even have contributed the results to form partial documentation since the implementation was open source.

Similarly, when I had to implement JSONPath, I found it was necessary to come up with my own specification of syntax and semantics. @cburgmer's comparison project gave me a consensus to aim for, but outside that consensus, there was a lot of latitude and I had to make several policy decisions which, being relatively new to JSONPath, I wasn't certain about. If an RFC had existed, I would have tried to implement it, or at least a subset.

So, for me, it is more important that a RFC provide a clear and unambiguous specification of common behaviour of JSONPath rather than necessarily exploring features provided by a minority of implementations. I'm certainly not looking for innovation or new features to be added to JSONPath by a RFC because I want to encourage implementations to (slowly) migrate towards standard behaviour over time and I'd like to minimise the cost of developing new implementations. Also, from a usability point of view, I think "less is more", so it's important to keep the scope of the RFC fairly small to avoid producing a large, and potentially unreadable, document.

@bettio

bettio commented Mar 23, 2021

@glyn:

So, for me, it is more important that a RFC provide a clear and unambiguous specification of common behaviour of JSONPath rather than necessarily exploring features provided by a minority of implementations.

I will use regex (#70) again as an example:
Your priority is clearly (if unintentionally) against regexes, which are implemented by a minority (11 implementations). Moreover, expressions using them aren't even portable: on some notable implementations they give an unexpected (non-empty) result.
Also, I expect that the choice of regex flavor matters for compatibility, and it will be an even more painful choice. For example, you mentioned re2, but it is not widely available and I'm not even sure whether any implementation is actually using it.

@glyn:

Also, from a usability point of view, I think "less is more", so it's important to keep the scope of the RFC fairly small to avoid producing a large, and potentially unreadable, document.

So, for instance, your point of view is again against regex: according to #70 there is a lot of specification work (and a lot of text that needs to be written).

I think that regexes are useful, but do we wish to exclude them?

I don't think so. I think that we need a different attitude if we want to standardize something useful. I believe that a number of topics will run into the same kind of problems if we stick to this idea.

Furthermore,

@glyn:

So, for me, it is more important that a RFC provide a clear and unambiguous specification of common behaviour of JSONPath rather than necessarily exploring features provided by a minority of implementations. I'm certainly not looking for innovation or new features to be added to JSONPath by a RFC

Is your point of view compatible with our charter?

The draft charter was:

The WG will develop a standards-track JSONPath specification, with
the primary goal of capturing the common semantics of existing
implementations and, where there are differences, choosing
semantics with the goal of causing the least disruption among
JSONPath users.

I will quote Barry Leiba, who asked to change it:

I fear that this text appears to say that we primarily want to develop a
dumbed-down compromise mush, and I'm sure that's not what we really mean. The
primary goal is, surely, to develop a specification for JSONPath that is
technically sound, complete, and useful.

And the following has been proposed:

The WG will develop a standards-track JSONPath specification that
is technically sound and complete, based on the common semantics
and other aspects of existing implementations. Where there are
differences, the working group will analyze those differences and
make choices that rough consensus considers technically best, with
an aim toward minimizing disruption among the different JSONPath
implementations.

Honestly, I think that your point of view (which I quoted before) is not compatible with the aim of the charter.

I hope we succeed in writing and standardizing a feature-rich specification (e.g. one which may include regex and other features) that is valuable for users (again, users need features).

I'm pretty sure that if we standardize something interesting, users will prefer this to the legacy implementations.

Of course we need to stay coherent with our core values, that are described in Goessner's article, but this is not in contradiction with reasonable adjustments, polishing and additions (such as regex) to it.

Some disruptions will happen anyway (we need to make a choice about filters at a certain point, and we are going to define a DSL for them), but it will happen for a reason.

I also hope you understand that deferring difficult choices is not helpful; we need to discuss them as soon as possible instead. We don't need to write the spec for all of them right now, but we need to discuss and check consensus for several of them as soon as possible so we don't waste precious time.

Anyway, if your plan is just to map the status quo and write a document about a common interoperable subset, I think that might be useful for some users. However, it sounds quite sterile to me (and I'm not interested in spending my time on it), and I don't think there is any need for a standard, or a discussion of one, just to gather facts about existing implementations.

@timbray I acknowledge that you are a heavy JSONPath user and you don't want to see useless disruptions, my aim is to avoid them too, but we also need to acknowledge that some disruptions may happen (e.g. no one is going to replace $.foo with €.foo ;) ).

@glyn
Collaborator

glyn commented Mar 23, 2021

@bettio I was simply expressing my, perhaps idealistic, preference for what I want our RFC to do, so that the WG knows where I'm coming from. I wouldn't attempt to suggest this as a consensus in the WG. If the RFC does more than I'd like without becoming bloated and unreadable, that's a bonus. Now you know where I am coming from, hopefully the motivation for some of my comments in other issues will be clearer.

On regexes, I fully expect we'll need to come up with some kind of compromise rather than omit them altogether. (BTW my implementation supports RE2. ;-) )

@bettio

bettio commented Mar 25, 2021

@glyn:

On regexes, I fully expect we'll need to come up with some kind of compromise rather than omit them altogether.

Yes, we really should support them, however we need to stay inside of our decision process frame (I'm strongly against any kind of cherry picking or any double standard, so let's agree on a common ground and let's stick to it).

So, we can have regex if our common decision process allows us to accept: features implemented by a minority (just 11 implementations), possible disruptions (they will certainly happen with regex), features that are not part of the common subset (regex is outside that subset), or any semantics/flavor that doesn't already exist (or that is part of only a couple of existing implementations).

To be honest, I don't feel that at the moment we share the right set of values to allow regex, but we can fix that and we can allow regex and other features.
On the other hand, I feel that the charter already allows us to support features like these, so it is just up to us, and I feel that we are limiting ourselves without any good reason.

I'm on the side of regex, because we should build something "based on existing implementations", not "limited to existing implementations". And we should work towards something sound and complete, not a compromise mush.

@bettio

bettio commented Mar 26, 2021

Let's try to write a proposal that may hopefully make everyone happy:

The RFC will contain a JSONPath Core subset that is meant for interoperability across about 70%~90% of existing implementations (the number depends on the extent of the Core subset) and for transitional purposes.
The RFC will also contain a number of extensions and additions built around JSONPath Core that are required for a full JSONPath implementation. The full JSONPath set is not limited to existing implementations and will contain everything that is needed for a technically sound and complete specification (regex will be part of this set).

Jayway Java JSONPath will likely be a JSONPath Core implementation.

Do you think that this approach will be viable and able to reconcile the two different points of view?

@cabo
Member

cabo commented Mar 26, 2021

Obviously, restricting the standard (or even its core) to the places where all the holes in the Swiss cheese align is not going to lead to a successful specification. I'd rather state this as an objective, i.e., we don't have to do it when it is not possible to do it. Even limiting ourselves to existing practice is thinking too small, because there may be a good way forward that none of the individual implementations were forced to take because they could always cook their own soup.

In general, the discussion here is somewhat useful to get on the same page, but I don't expect we will have hard and fast rules at the end — we need to do the right thing instead.

@goessner
Collaborator

My primary guiding principle is Minimalism. As a bad example I would consider the bloated SVG web standard, purely designed by committee. Browser vendors were reluctant for a very long time before they finally started to implement it.

In contrast, the current JSONPath situation can be seen as beneficial, as there is already a lot of practical experience with numerous implementations from which the WG can learn.

Looking at some standards, we have

  • JSON RFC 8259 (16 pages)
  • JSON Pointer RFC 6901 (8 pages)
  • JSON Schema ID (19 pages)
  • XPath 1.0 (30 pages)
  • XPath 3.1 (186 pages)

Please don't get me wrong. I do not want to limit the page count to ... say 25, but I want to minimize the size of the specification ... and thus the hurdle for future implementors. It seems to work, when you look at the overall size of my proposal from 2007. So how can we cover only the essentials (examples included)?

  • use Pareto principle ... the 80/20 rule already quoted by Tim.
  • "If in doubt, leave it out" ... We don't need to decide for one side, if two implementations disagree in one aspect. It might not be essential.
  • If two implementations disagree in one essential aspect, the WG pragmatically decides for the best of both ... or for an even better third.

This is roughly in line with ...

The WG will develop a standards-track JSONPath specification that
is technically sound and complete, based on the common semantics
and other aspects of existing implementations ...

Our biggest challenge here seems to be defining "complete" ...

@bettio

bettio commented Mar 26, 2021

@goessner:

My primary guiding principle is Minimalism. As a bad example I would consider bloated SVG webstandard, purely designed by committee. Browser vendors were reluctant for a very long time before they finally started to implement it.

That would be a disaster: we can all agree that it must not happen.

However I don't have any evidence at the moment that we are running into that problem, conversely most of the proposals I saw so far are quite reasonable.

(Does anyone have any kind of proof that this is happening right now?)

I feel that we may just be bikeshedding (and wasting our time) over a situation that might never happen, so maybe we should take action against it only if it actually happens.

Some additional thoughts:

@goessner:

use Pareto principle ... the 80/20 rule already quoted by Tim.

If my users/customers complain about missing features 2 times out of 10, I cannot be happy at all (I would call that a disaster). If that kind of annoyance is caused by a compliant JSONPath, I will certainly be asked to extend it (regardless of the RFC or other implementations).
The result would be a jungle of custom additions that will bring us back to the original problem.

Also, RFC compliance can be appealing only if it comes with a reasonable number of features; otherwise people will continue to customize their implementations regardless of our work. If no one uses the work produced by this WG, it means that we are just wasting our time.

@goessner:

"If in doubt, leave it out" ... We don't need to decide for one side, if two implementions disagree in one aspect. It might not be essential.

I think that implementations are frequently focused on the use cases their developers have in mind. It means that some features that are not compelling for one important use case might be vital for another important use case.
So I don't agree here, otherwise we might end up rejecting features like regex (and I would regret that).

Anyway, I think that we should lean towards including features rather than excluding them.
Again, I have no evidence that we actually risk running into a bloated specification; in the worst case we can clean up superfluous features during the review phase.

@goessner:

If two implementations disagree in one essential aspect, the WG pragmatically decides for the best of both ... or for an even better third.

Sadly "best" is quite subjective, I would rather evaluate:

  • Internal consistency
  • Implementation-weighted consensus (e.g. my implementation is not as widely used as Jayway)
  • Practical implementation effort (is X hard or simple to implement?)

@goessner:

Our biggest challenge here seems to define "complete" ...

I think that a reasonable rough upper limit might be XPath 1.0.

Proposal 2: Let's just close this conversation and let's include all the features (that are supported by reasonable arguments).

Also, we should encourage people to open PRs for the features they want in, so that we don't take up more of the editors' time and we can involve more people in this WG.

@glyn
Collaborator

glyn commented Mar 26, 2021

It's helpful to look at the major features that are still to be added (in addition to the processing model):

  1. nested descendants (..) - seems uncontentious - just needs a PR.
  2. filter expressions (?()) - contains some areas where there is a lack of consensus, so probably best to subdivide this.
  3. "script" expressions (()) - I'd hope we could offer minimal support for these without too much contention.

Any I've missed?
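For readers unfamiliar with item 1, here is a minimal Python sketch of what a descendant selector such as `$..b` is generally understood to do. The function names are hypothetical, and the traversal order (parent before children, in document order) and the handling of the root node are assumptions; the discussion above does not specify them and implementations may differ.

```python
def descendants(node):
    """Yield node and every value nested inside it (assumed order: pre-order)."""
    yield node
    if isinstance(node, dict):
        for value in node.values():
            yield from descendants(value)
    elif isinstance(node, list):
        for value in node:
            yield from descendants(value)


def select_descendant_member(document, name):
    """Sketch of `$..name`: collect `name` from every object at any depth."""
    return [
        node[name]
        for node in descendants(document)
        if isinstance(node, dict) and name in node
    ]


doc = {"b": 0, "a": {"b": 1, "c": {"b": 2}}}
print(select_descendant_member(doc, "b"))  # prints [0, 1, 2]
```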

@cabo
Member

cabo commented Mar 26, 2021 via email

@goessner
Collaborator

Regarding "designed by committee", the WG is itself a committee, which is proper for a standardization body.

Yes, of course ... but in the initial SVG committee there was only a single implementor from "Macromedia" – later acquired by "Adobe" – badly influencing and bloating the emerging web standard.

The JSON Pointer RFC, which is less complicated than JSONPath, doesn't follow the 80/20 rule, its specification is unambiguous. All implementations will behave the same, and hence are interoperable.

If we skipped (or hadn't invented)

  • filter expressions
  • current node selector
  • ideas for regex's

we would be interoperable as well. And even if we strictly follow Greg's and your proposal for only basic math and boolean operators in #17, and also a very basic subset of regex rules, we can then tell everyone that we are now 100% unambiguous and complete.

@bettio suggested the possibility of a proposal that would "make everyone happy", but I think that may be somewhat ambitious. I would be satisfied with a proposal that made Stefan Goessner happy. It is after all his creation.

Thank you very much for that nice compliment :)

@gregsdennis
Collaborator

What constitutes "breaking" an implementation?

  • Adding something to the spec that an implementation doesn't support?
  • Not including something that an implementation already supports?
  • Defining the behavior of a supported syntax in a way different to the implementation?

I'd like to highlight the discussion at #88 on unions. My implementation already doesn't support multiple indices of any kind, so adding this concept at all "breaks" my implementation, according to the above scenarios.

Is this really a break though? The spec has required something that my implementation doesn't support. But I never claimed to be in compliance with the spec. I couldn't have claimed that because the spec didn't exist when I published the library. That means that the spec could completely overhaul the syntax, and my implementation wouldn't break. It still adheres to what it claims to. That someone published a document that says "this is what JSON Path is now" doesn't change what my implementation claims to support.

Consider my JSON Schema library. It claims to support drafts 6 through 2020-12 (we had a version scheme change a bit ago). When the next draft is published, probably draft 2021-something, my implementation will still be in compliance with what it claims. It doesn't claim to support the new draft, and the draft's mere existence doesn't break anything in my implementation.

Now, if I want to be able to claim compliance to this new JSON Path spec, I'll need to make updates, and I'm fine with that. It makes sense that I would have to.

I honestly don't think that we can create a specification that won't break anyone (without making it uselessly simplistic) or to which any single implementation will be able to claim compliance without having to update something.

Implementations should expect to have to change in order to comply with this new specification.

I think this is the line of thought that we need to take when we consider what a breaking change is. We can try to minimize that impact as much as possible, but we're not going to be able to eliminate it.

Therefore I think we need to put less weight on the idea of not "breaking" implementations because they're going to need updates, regardless of what we do.

@bettio

bettio commented Mar 28, 2021

What constitutes "breaking" an implementation?

I think that breaking, for us, means turning expressions that are currently valid for a certain implementation into invalid ones from the RFC's point of view (or changing their behavior). So the implementation will have to decide whether to comply with the new RFC or to invalidate already existing user expressions.

e.g. changing $.foo to £.foo is breaking; also accepting @.bar as an addition (while still accepting $.foo with no changes to it) is not breaking.
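The definition above can be sketched as a small check. This is an illustration only: `is_breaking` and its convention of using `None` for "rejected as invalid" are hypothetical and were not proposed in this thread.

```python
from typing import Optional, Sequence


def is_breaking(before: Optional[Sequence], after: Optional[Sequence]) -> bool:
    """Classify one expression under the definition sketched above.

    `before` is the result an existing implementation gives today and
    `after` is the result the RFC would require. `None` means the
    expression is rejected as invalid (an assumed convention).
    """
    if before is None:
        # The expression was never valid, so accepting it now is an addition.
        return False
    # A previously valid expression breaks if it becomes invalid
    # or if its result changes.
    return after is None or list(after) != list(before)


# $.foo stops being valid (e.g. the root symbol is changed): breaking.
print(is_breaking(before=["x"], after=None))   # prints True
# @.bar becomes valid as an addition: not breaking.
print(is_breaking(before=None, after=["y"]))   # prints False
```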

Avoiding code changes is not the point. They are going to happen; otherwise it means that our job here is mostly useless.
From my point of view, my users are more important than my code.

Adding something to the spec that an implementation doesn't support?

not breaking

Not including something that an implementation already supports?

breaking, but it may allow some kind of legacy additional features with a reasonable effort. It might not work in certain situations.

Defining the behavior of a supported syntax in a way different to the implementation?

breaking and requires user knowledge about the change.

Is this really a break though? The spec has required something that my implementation doesn't support. But I never claimed to be in compliance with the spec. I couldn't have claimed that because the spec didn't exist when I published the library. That means that the spec could completely overhaul the syntax, and my implementation wouldn't break. It still adheres to what it claims to. That someone published a document that says "this is what JSON Path is now" doesn't change what my implementation claims to support.

This is a good argument. Anyway, I honestly don't mind changing my implementation code; I mostly care about users' expressions.

Now, if I want to be able to claim compliance to this new JSON Path spec, I'll need to make updates, and I'm fine with that. It makes sense that I would have to.

I agree.

I honestly don't think that we can create a specification that won't break anyone (without making it uselessly simplistic) or to which any single implementation will be able to claim compliance without having to update something.

Indeed. 100% agree.

Implementations should expect to have to change in order to comply with this new specification.

Right.

Therefore I think we need to put less weight on the idea of not "breaking" implementations because they're going to need updates, regardless of what we do.

I completely agree.

@glyn
Collaborator

glyn commented Mar 28, 2021

I agree that breaking currently valid query expressions is something most implementations will want to avoid. I'm not convinced that should be an absolute constraint on the standard, since implementations always have the option of not supporting the standard. (The standard would still have some value for such implementations as a point of reference for documentation.) Of course, we shouldn't make such changes lightly, but let's not tie our hands prematurely either. Let's discuss specific cases on their merits, bearing in mind the general desire not to break currently valid query expressions.

@bettio

bettio commented Mar 28, 2021

#88 is another example of a proposal which:

  1. is not implemented by any existing implementation right now
  2. introduces an addition to the existing "common" syntax
  3. increases the specification surface (e.g. multiple expressions combined with an | operator)
  4. requires a non-negligible amount of work

I'm OK with all the previous points, and I would like to get involved in the #88 discussion, which is interesting.

However, before spending time on it: if anyone wants to stop it with an "it is incompatible with our WG charter" comment, I would like to ask them to let us know here, so we don't spend our precious time on a dead-end discussion.

Honestly I think that #88 is 100% compatible with our charter, and I think that it should be discussed.
I think it should be accepted or rejected according to technical arguments rather than arguments about the WG charter itself.

So if we are going to discuss #88 further (hence investing more time investigating it) I think that clearly means that there is consensus about its compatibility with the charter and that previous points are ok for most of us (therefore we are implicitly agreeing that this issue is resolved).

@glyn
Collaborator

glyn commented Mar 28, 2021

I would say the discussion in #88 is part of thrashing out the union feature and so is compatible with the WG charter.

@bettio

bettio commented Mar 28, 2021

@glyn: I would like to read your feedback on my last post.

Do you agree with my points 1-4? If you don't agree with any of them, can you tell us which one and why you don't agree?

@glyn
Collaborator

glyn commented Mar 28, 2021

@bettio Yes, I agree with those points.

@bettio

bettio commented Mar 28, 2021

@glyn thanks for your feedback.

I will take part in that discussion, as we agreed that those points 1-4 are compatible with our charter (and not required by it). As you know, it really matters to me not to spend time on a discussion that is going to be "cancelled" for "charter compatibility" reasons.

@cabo
Member

cabo commented Jan 17, 2022

This was a useful discussion issue but no actions emerged.

@cabo cabo closed this as completed Jan 17, 2022

6 participants