key missing vs key present with null #118

Closed
gregsdennis opened this issue Sep 6, 2021 · 25 comments

@gregsdennis
Collaborator

A new StackOverflow question highlights the need to define whether these two cases are the same or different:

  • a key missing
  • a key present but with null as a value

I would expect JavaScript to handle these cases differently, but .NET generally wouldn't (it seems Newtonsoft has some special handling for this).

Posed in a more direct, testable way, given the data

[
  {
      "id": 2
  }
]

would

$[?(@.foo==null)]

return any results?

(related to #64 and #47)
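Outside of JSONPath, most JSON object models can tell the two cases apart. The following is a minimal Python sketch of that distinction, assuming the document is parsed into lists and dicts. The helper name `classify_key` and the second array element are made up for illustration and are not part of any JSONPath implementation.

```python
import json

def classify_key(obj, key):
    """Report whether key is missing, present with null, or present with a value."""
    if key not in obj:
        return "missing"
    if obj[key] is None:  # JSON null parses to Python None
        return "null"
    return "value"

# The first element is the data from the question; the second is an
# illustrative addition with an explicit null.
doc = json.loads('[{"id": 2}, {"id": 3, "foo": null}]')
print([classify_key(item, "foo") for item in doc])  # ['missing', 'null']
```

Whether a JSONPath filter should expose this distinction is exactly what the issue asks.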

@cburgmer

cburgmer commented Sep 6, 2021

I think the corresponding test in the comparison project is https://cburgmer.github.io/json-path-comparison/results/filter_expression_with_equals_null.html. Currently no consensus exists.

@danielaparker

danielaparker commented Sep 6, 2021

I think this is related to #74. Or, what does the left hand side of @.foo==null mean? It means different things in different implementations.

For JSONPath implementations in interpreted languages such as JavaScript, Python or PHP, and whose filter expressions are these languages, @.foo in the case of no match may mean undefined, null, false, or a type error.

In JavaScript, @.foo in the case of no match means undefined. So in Goessner JavaScript, we have:

$[?(@.foo==null)] returns [ { "id": 2 }]
$[?(@.foo==false)] returns [ { "id": 2 }]
$[?(@.foo===null)] returns false
$[?(@.foo===false)] returns false
$[?(@.foo===undefined)] returns [ { "id": 2 }]

For implementations such as Jayway, jsoncons, and JsonCons.JsonPath, where @.foo means a JSONPath expression that returns a single JSON value (rather than an array of values), it depends on what the implementation specifies @.foo to evaluate to when there is no match. For some, it's null, for others false, and for others it's a type error. In Jayway, by default it's a type error, and the item being evaluated is omitted from the results (Jayway has an option to evaluate to a null value instead.)
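To make the three outcomes concrete, here is a minimal Python sketch of how the no-match policy alone decides the answer to `$[?(@.foo==null)]`. It is not taken from Jayway, jsoncons, or any other implementation. The policy names and function names are hypothetical, and equality is modeled as strict comparison of JSON values, which is itself an assumption.

```python
def lookup(item, key, on_missing):
    """Evaluate @.key against one item under a given no-match policy."""
    if isinstance(item, dict) and key in item:
        return item[key]
    if on_missing == "null":
        return None
    if on_missing == "false":
        return False
    if on_missing == "error":
        raise LookupError(key)
    raise ValueError(f"unknown policy: {on_missing}")

def filter_eq_null(items, key, on_missing):
    """Emulate $[?(@.key==null)] with strict JSON-value equality (assumed)."""
    selected = []
    for item in items:
        try:
            value = lookup(item, key, on_missing)
        except LookupError:
            continue  # type error: the item is omitted from the results
        if value is None:
            selected.append(item)
    return selected

doc = [{"id": 2}]
for policy in ("null", "false", "error"):
    print(policy, filter_eq_null(doc, "foo", policy))
# null [{'id': 2}]
# false []
# error []
```

Only the "null" policy selects the item, so the answer to the original question follows from the policy rather than from the comparison operator.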

In JMESPath, for comparison, evaluating an object with a missing key is specified to return a null value, so

[?foo==null]

returns

[  { "id": 2 }]

In JSONiq, for another comparison, evaluating an object with a missing key results in an empty sequence, viz.

let $doc := [
  {
      "id": 2
  }
]

for $x in $doc[]
where $x."foo" eq null

return $x

@gregsdennis
Collaborator Author

gregsdennis commented Sep 6, 2021

Yes, it does relate to that as well! I interpret literals as JSON values.

@danielaparker

danielaparker commented Sep 6, 2021

Yes, it does relate to that as well! I interpret literals as JSON values.

That and #64 (comment)

@danielaparker

danielaparker commented Sep 7, 2021

Yes, it does relate to that as well! I interpret literals as JSON values.

That and #64 (comment)

In any case, it's impossible to talk about what

?(@.foo==null)

means without first defining what @.foo means (including when foo is missing), and what == means. Different implementations define them differently; the draft doesn't currently define them at all.

Daniel

@gregsdennis
Collaborator Author

@danielaparker you're getting a little ahead of yourself. Operators and relative paths in expressions are discussed in other issues that deal with expressions themselves. These are quite well-defined in those issues.

@danielaparker

@gregsdennis, just noting that the answer to this issue you raised follows immediately once the meaning of @.foo and == have been defined. Nothing more.

@cabo
Member

cabo commented Sep 8, 2021

@gregsdennis, just noting that the answer to this issue you raised follows immediately once the meaning of @.foo and == have been defined. Nothing more.

Indeed, so the discussion here should inform the decisions we make for empty @.foo and for ==.

@goessner
Collaborator

goessner commented Sep 8, 2021

The latest draft seems to be pretty clear about that.

https://ietf-wg-jsonpath.github.io/draft-ietf-jsonpath-base/draft-ietf-jsonpath-base.html#name-filter-selector

A member or element value by itself is falsy only, if it does not exist. Otherwise it is truthy, resulting in its value. To be more specific explicit comparisons are necessary. This existence test -- as an exception of the general rule -- also works with complex values.

The problem here is that JSON lacks an undefined value (from JavaScript). So we decided to use standalone path expressions in filters as existence tests:

  • [?@foo] ... selected if member foo exists.
  • [?(!@foo)] ... selected if member foo does not exist.
  • but [?(!!@foo)] ... selected if the value of foo is truthy.
  • [?(@foo == null)] ... selected only if the member has exactly that value.
  • [?(!@foo == true)] ... not allowed with comparison; use [?(@foo != true)] instead.

The new in operator now also allows testing for existence, as in [?('foo' in @)].
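The existence and null-comparison rules in this comment can be sketched in Python as follows. This illustrates the semantics as described here and is not code from the draft or from any implementation. The function names and sample items are hypothetical, and the truthiness case `[?(!!@foo)]` is left out because it depends on truthiness rules not spelled out in this thread.

```python
def exists(item, key):
    """[?@foo]: existence test; the value itself is irrelevant."""
    return isinstance(item, dict) and key in item

def equals_null(item, key):
    """[?(@foo == null)]: true only if the member exists and is exactly null."""
    return exists(item, key) and item[key] is None

# Illustrative items: foo missing, foo null, foo false.
items = [{"id": 2}, {"id": 3, "foo": None}, {"id": 4, "foo": False}]

print([i["id"] for i in items if exists(i, "foo")])       # [3, 4]
print([i["id"] for i in items if not exists(i, "foo")])   # [2]
print([i["id"] for i in items if equals_null(i, "foo")])  # [3]
```

Under these rules the item with a missing key is not selected by the null comparison, and an item with `foo: false` still passes the existence test.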

Most of discussions were done in this pull request

#111

Am I missing something here?

-- sg

@danielaparker

danielaparker commented Sep 15, 2021

The latest draft seems to be pretty clear about that.

https://ietf-wg-jsonpath.github.io/draft-ietf-jsonpath-base/draft-ietf-jsonpath-base.html#name-filter-selector

Most of discussions were done in this pull request

#111

Am I missing something here?

What's missing, and what's missing generally in the draft, is any survey and analysis of prior experience in the great physical tarpit that is JSONPath. And if the purpose of this IETF effort really is to standardize existing practice, I think that's important. Of course, if that's not the purpose, then it doesn't matter, but everybody says that's the purpose.

It's not that the rules you've sketched here cannot be motivated by some prior experience; see, e.g., my analysis here. 11 of 42 implementations, including two that may be considered important, do follow this rule:

[?(@.foo)] ... selected, if member foo exists.

But Goessner Javascript doesn't follow that rule. Nor do a significant number of others.

If the draft really is about standardizing existing behaviour, it seems to me that it needs to justify its choices with reference to prior experience. Not just with what "we" (a handful of people) "agreed".

@goessner
Collaborator

hmm ... in my original proposal I included (as already someone mentioned):

//book[isbn]   | $..book[?(@.isbn)]   |  filter all books with isbn number |

It seems to be, this was of no significant relevance for existing implementations.

@gregsdennis
Collaborator Author

@goessner would your implementation match on isbn: null?

@goessner
Collaborator

of course it would ... as I reused the Javascript engine for simplicity then.

@danielaparker

danielaparker commented Sep 16, 2021

@goessner, maybe I've misunderstood Greg's question, but I think you meant to reply "of course it wouldn't". Using Goessner Javascript, and given the JSON document

{
    "store": {
        "book": [
            {
                "isbn": null
            }
        ]
    }
}

the query

$..[?(@.isbn)]

returns nothing. This is what we expect from JavaScript, and is consistent with the results reported here.

Jayway (Java), on the other hand, returns

[
   {
      "isbn" : null
   }
]
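The two results correspond to two readings of `?(@.isbn)`, which can be sketched in Python as below. This is an emulation, not code from either implementation, and the function names are hypothetical. Python truthiness stands in for JavaScript's here; the two agree for null but differ for empty arrays and objects.

```python
def select_truthy(items, key):
    """Evaluate @.key first, then apply a 'not falsy' rule to the result."""
    return [i for i in items if i.get(key)]

def select_existing(items, key):
    """Treat ?(@.key) as an existence test, regardless of the value."""
    return [i for i in items if key in i]

books = [{"isbn": None}]
print(select_truthy(books, "isbn"))    # []
print(select_existing(books, "isbn"))  # [{'isbn': None}]
```

The first reading matches the result reported for Goessner JavaScript above, the second the result reported for Jayway.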

@danielaparker

danielaparker commented Sep 16, 2021

hmm ... in my original proposal I included (as already someone mentioned):

//book[isbn]   | $..book[?(@.isbn)]   |  filter all books with isbn number |

It seems to be, this was of no significant relevance for existing implementations.

I don't think any of the JSONPath implementations that use a dynamic language for evaluation, whether JavaScript, PHP, Python, Ruby or Lua, evaluate ?(@.isbn) that way. They all evaluate @.isbn first and then apply "not falsey" rules of one sort or another to the result. It's the textbook way for interpreting an expression. That's a lot of implementations, including the original.

On the other hand a significant number of (not all) implementations that implemented their own expression evaluators do evaluate ?(@.isbn) that way, including Jayway JsonPath. It's messier to implement it this way because it's not following the textbook evaluation patterns. Jayway does it by treating @.isbn as a recoverable error if isbn is missing, and handling that error to allow other terms being evaluated to continue.
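The recoverable-error approach described above can be sketched in Python as follows. This shows only the general pattern and is not Jayway's actual code. The exception class `PathNotFound` and all function names are hypothetical, and the filter shape (an OR over several existence terms) is chosen just to show that one missing term does not abort the others.

```python
class PathNotFound(Exception):
    """Raised when a relative path has no match (hypothetical name)."""

def eval_path(item, key):
    """Evaluate @.key; a missing key is an error, not a value."""
    if not isinstance(item, dict) or key not in item:
        raise PathNotFound(key)
    return item[key]

def exists_term(item, key):
    """Existence term: a missing path is caught and becomes False."""
    try:
        eval_path(item, key)
    except PathNotFound:
        return False
    return True

def filter_any(items, keys):
    """Emulate ?(@.k1 || @.k2 ...): evaluation continues past a missing term."""
    return [i for i in items if any(exists_term(i, k) for k in keys)]

books = [{"isbn": None}, {"title": "x"}, {"price": 1}]
print(filter_any(books, ["isbn", "title"]))
# [{'isbn': None}, {'title': 'x'}]
```

The extra error-handling layer is what makes this style less "textbook" than evaluating the path to a value and testing it.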

@cabo
Member

cabo commented Sep 16, 2021

If the draft really is about standardizing existing behaviour, it seems to me that it needs to justify its choices with reference to prior experience. Not just with what "we" (a handful of people) "agreed".

The charter is quite specific:

The WG will develop a standards-track JSONPath specification that
is technically sound and complete, based on the common semantics
and other aspects of existing implementations. Where there are
differences, the working group will analyze those differences and
make choices that rough consensus considers technically best, with
an aim toward minimizing disruption among the different JSONPath
implementations.

Since we do have those differences here, we are asked to consider what is "technically best", but "with an aim toward minimizing disruption". This seems clear to me, and I think our charter writers did a very good job here.

@danielaparker

danielaparker commented Sep 16, 2021

Since we do have those differences here, we are asked to consider what is "technically best", but "with an aim toward minimizing disruption". This seems clear to me, and I think our charter writers did a very good job here.

I think it's hard to make a case that the draft is doing much to minimize disruption.

On the one hand, it's becoming incompatible with all of the implementations that use a dynamic language such as Javascript, PHP, Python, Ruby or Lua for evaluating expressions. For users of such implementations, having an expression language in their familiar language, with all of the functions that that language provides, is part of the appeal of these implementations. The users of these implementations all have a normalize-unicode function equivalent for comparing strings, and much else. These users don't care about interoperability, they care about familiarity and expressiveness.

On the other hand, the WG is sketching out an expression language that is different from the expression languages that were developed by other implementations mostly following Jayway Java. These expression languages specified @.relative-path as a JSONPath expression, with rules for "ExecuteSingle" to produce a single value for use in comparisons. @. means something different in the draft, where it's not a JSONPath expression. But Jayway Java is integrated into legacy Java enterprise software, and cannot change.

It seems to me that the draft is departing from both streams of JSONPath that evolved from the original article. I don't see anything in the draft that suggests that the editors consider prior experience important. There is no text in the draft that discusses the important implementations, and which features were motivated by one, which from another, and which are new or refinements.

Anyway, I realize that nobody here is interested in this topic, no doubt for the obvious reason :-)

@glyn
Collaborator

glyn commented Sep 16, 2021 via email

@gregsdennis
Collaborator Author

gregsdennis commented Sep 17, 2021

I agree that such references do not belong in the main text of the spec. An appendix is okay. If I had my way, I'd have a separate companion doc that summarized individual decisions and the reasoning behind them. Such a document would be a natural place to include references to behaviors in existing implementations.

@danielaparker

danielaparker commented Sep 24, 2021

@glyn wrote

Perhaps an appendix is the way to do it, in which case, please feel free to submit a PR to help lead the way forward.

If the purpose of the WG is to standardize existing behavior, as they say it is, I think it's up to the authors to justify how their choices relate to prior experience.

For example, there is plenty of prior experience with specifying filter expressions, starting with Jayway, the second most important and influential implementation after Goessner. Jayway, for example, has answers to all the questions like #118, #119 and #123. But the WG take on expressions appears to be inconsistent with that prior experience in significant respects. I think the authors should explain why. That is, if the authors really mean it when they say that their intent is to standardize existing behavior.

@cabo
Member

cabo commented Sep 24, 2021 via email

@danielaparker

Sent from mobile, sorry for terse
On 24. Sep 2021, at 14:53, Daniel Parker @.***> wrote:
If the purpose of the WG is to standardize existing behavior
It’s not.

Thanks for the clarification.

@cabo
Member

cabo commented Jan 17, 2022

Fixed in 76789d8

@cabo cabo closed this as completed Jan 17, 2022
@gregsdennis
Collaborator Author

Can you summarize the resolution, please?

@cabo
Member

cabo commented Jan 18, 2022

Since 76789d8, Section 3.5.9 now says:

   *  A member or element value by itself in a Boolean context is
      interpreted as false only if it does not exist.  Otherwise it is
      interpreted as true.  To be more specific about the actual value,
      explicit comparisons are necessary.  This existence test -- as an
      exception to the general rule -- also works with structured
      values.
