key missing vs key present with null #118

gregsdennis · 2021-09-06T05:14:17Z

A new StackOverflow question highlights the need to define whether these two cases are the same or different:

a key missing
a key present but with null as a value

I would expect that Javascript would handle these cases differently, but .Net generally wouldn't (it seems Newtonsoft has some special handling for this).

Posed in a more direct, testable way, given the data

[
  {
      "id": 2
  }
]

would

$[?(@.foo==null)]

return any results?

(related to #64 and #47)

The text was updated successfully, but these errors were encountered:

cburgmer · 2021-09-06T06:45:49Z

I think the corresponding test in the comparison project is https://cburgmer.github.io/json-path-comparison/results/filter_expression_with_equals_null.html. Currently no consensus exists.

danielaparker · 2021-09-06T12:12:28Z

I think this is related to #74. Or, what does the left hand side of @.foo==null mean? It means different things in different implementations.

For JSONPath implementations in interpreted languages such as JavaScript, Python or PHP, and whose filter expressions are these languages, @.foo in the case of no match may mean undefined, null, false, or a type error.

In Javascript, @.foo in the case of no match means undefined. So in Goessner Javascript, we have:

$[?(@.foo==null)] returns [ { "id": 2 }]
$[?(@.foo==false)] returns [ { "id": 2 }]
$[?(@.foo===null)] returns false
$[?(@.foo===false)] returns false
$[?(@.foo===undefined)] returns [ { "id": 2 }]

For implementations such as Jayway, jsoncons, and JsonCons.JsonPath, where @.foo means a JSONPath expression that returns a single JSON value (rather than an array of values), it depends on what the implementation specifies @.foo to evaluate to when there is no match. For some, it's null, for others false, and for others it's a type error. In Jayway, by default it's a type error, and the item being evaluated is omitted from the results (Jayway has an option to evaluate to a null value instead.)

In JMESPath, for comparison, evaluating an object with a missing key is specified to return a null value, so

[?foo==null]

returns

[  { "id": 2 }]

In JSONiq, for another comparison, evaluating an object with a missing key results in an empty sequence, viz.

let $doc := [
  {
      "id": 2
  }
]

for $x in $doc[]
where $x."foo" eq null

return $x

gregsdennis · 2021-09-06T12:56:06Z

Yes, it does relate to that as well! I interpret literals as JSON values.

danielaparker · 2021-09-06T14:17:15Z

Yes, it does relate to that as well! I interpret literals as JSON values.

That and #64 (comment)

danielaparker · 2021-09-07T11:32:09Z

Yes, it does relate to that as well! I interpret literals as JSON values.

That and #64 (comment)

In any case, it's impossible to talk about what

?(@.foo==null)

means without first defining what @.foo means (including when foo is missing), and what == means. Different implementations define them differently, the draft doesn't currently define them at all.

Daniel

gregsdennis · 2021-09-07T20:01:29Z

@danielaparker you're getting a little ahead of yourself. Operators and relative paths in expressions are discussed in other issues that deal with expressions themselves. These are quite well-defined in those issues.

danielaparker · 2021-09-07T21:26:10Z

@gregsdennis, just noting that the answer to this issue you raised follows immediately once the meaning of @.foo and == have been defined. Nothing more.

cabo · 2021-09-08T05:50:30Z

@gregsdennis, just noting that the answer to this issue you raised follows immediately once the meaning of @.foo and == have been defined. Nothing more.

Indeed, so the discussion here should inform the decisions we make for empty @.foo and for ==.

goessner · 2021-09-08T09:40:34Z

The latest draft seems to be pretty clear about that.

https://ietf-wg-jsonpath.github.io/draft-ietf-jsonpath-base/draft-ietf-jsonpath-base.html#name-filter-selector

A member or element value by itself is falsy only, if it does not exist. Otherwise it is truthy, resulting in its value. To be more specific explicit comparisons are necessary. This existence test -- as an exception of the general rule -- also works with complex values.

Problem here is, that JSON lacks undefined value (from JavaScript). So we decided to use standalone path expressions in filters as existence tests

[?@foo] ... selected, if member foo exists.
[?(!@foo)] ... selected, if member foo does not exists.
but [?(!!@foo)] ... selected, if value of foo is truthy.
[?(@foo == null)] ... selected only, if member exactly has that value.
[?(!@foo == true)] ... not allowed with comparison, rather use [?(@foo != true)].

Using new in operator also allows test for existence now as with [?('foo' in @)].

Most of discussions were done in this pull request

#111

Am I missing something here?

-- sg

danielaparker · 2021-09-15T18:46:33Z

The latest draft seems to be pretty clear about that.

https://ietf-wg-jsonpath.github.io/draft-ietf-jsonpath-base/draft-ietf-jsonpath-base.html#name-filter-selector

Most of discussions were done in this pull request

#111

Am I missing something here?

What's missing, and what's missing generally in the draft, is any survey and analysis of prior experience in the great physical tarpit that is JSONPath. And if the purpose of this IETF effort really is to standardize existing practice, I think that's important. Of course, if that's not the purpose, than it doesn't matter, but everybody says that's the purpose.

It's not that the rules you've sketched here cannot be motivated by some prior experience, see e.g my analysis here. 11 of 42 implementations, including two that may be considered important, do follow this rule:

[?(@.foo)] ... selected, if member foo exists.

But Goessner Javascript doesn't follow that rule. Nor do a significant number of others.

If the draft really is about standardizing existing behaviour, it seems to me that it needs to justify its choices with reference to prior experience. Not just with what "we" (a handful of people) "agreed".

goessner · 2021-09-16T07:26:19Z

hmm ... in my original proposal I included (as already someone mentioned):

//book[isbn]   | $..book[?(@.isbn)]   |  filter all books with isbn number |

It seems to be, this was of no significant relevance for existing implementations.

gregsdennis · 2021-09-16T08:25:15Z

@goessner would your implementation match on isbn: null?

goessner · 2021-09-16T10:00:26Z

of course it would ... as I reused the Javascript engine for simplicity then.

danielaparker · 2021-09-16T13:40:06Z

@goessner, maybe I've misunderstood Greg's question, but I think you meant to reply "of course it wouldn't". Using Goessner Javascript, and given the JSON document

{
    "store": {
        "book": [
            {
                "isbn" : null
            }
    ]
  }
}

the query

$..[?(@.isbn)]

returns nothing. This is what we expect from JavaScript, and is consistent with the results reported here.

Jayway (Java), on other other hand, returns

[
   {
      "isbn" : null
   }
]

danielaparker · 2021-09-16T14:32:21Z

hmm ... in my original proposal I included (as already someone mentioned):
//book[isbn]   | $..book[?(@.isbn)]   |  filter all books with isbn number |
It seems to be, this was of no significant relevance for existing implementations.

I don't think any of the JSONPath implementations that use a dynamic language for evaluation, whether JavaScript, PHP, Python, Ruby or Lua, evaluate ?(@.isbn) that way. They all evaluate @.isbn first and then apply "not falsey" rules of one sort or another to the result. It's the textbook way for interpreting an expression. That's a lot of implementations, including the original.

On the other hand a significant number of (not all) implementations that implemented their own expression evaluators do evaluate ?(@.isbn) that way, including Jayway JsonPath. It's messier to implement it this way because it's not following the textbook evaluation patterns. Jayway does it by treating @.isbn as a recoverable error if isbn is missing, and handling that error to allow other terms being evaluated to continue.

cabo · 2021-09-16T15:00:20Z

If the draft really is about standardizing existing behaviour, it seems to me that it needs to justify its choices with reference to prior experience. Not just with what "we" (a handful of people) "agreed".

The charter is quite specific:

The WG will develop a standards-track JSONPath specification that
is technically sound and complete, based on the common semantics
and other aspects of existing implementations. Where there are
differences, the working group will analyze those differences and
make choices that rough consensus considers technically best, with
an aim toward minimizing disruption among the different JSONPath
implementations.

Since we do have those differences here, we are asked to consider what is "technically best", but "with an aim toward minimizing disruption". This seems clear to me, and I think our charter writers did a very good job here.

danielaparker · 2021-09-16T19:47:52Z

Since we do have those differences here, we are asked to consider what is "technically best", but "with an aim toward minimizing disruption". This seems clear to me, and I think our charter writers did a very good job here.

I think it's hard to make a case that the draft is doing much to minimize disruption.

On the one hand, it's becoming incompatible with all of the implementations that use a dynamic language such as Javascript, PHP, Python, Ruby or Lua for evaluating expressions. For users of such implementations, having an expression language in their familiar language, with all of the functions that that language provides, is part of the appeal of these implementations. The users of these implementations all have a normalize-unicode function equivalent for comparing strings, and much else. These users don't care about interoperability, they care about familiarity and expressiveness.

On the other hand, the WG is sketching out an expression language that is different from the expression languages that were developed by other implementations mostly following Jayway Java. These expression languages specified @.relative-path as a JSONPath expression, with rules for "ExecuteSingle" to produce a single value for use in comparisons. @. means something different in the draft, where it's not a JSONPath expression. But Jayway Java is integrated into legacy Java enterprise software, and cannot change.

It seems to me that the draft is departing from both streams of JSONPath that evolved from the original article. I don't see anything in the draft that suggests that the editors consider prior experience important. There is no text in the draft that discusses the important implementations, and which features were motivated by one, which from another, and which are new or refinements.

Anyway, I realize that nobody here is interested in this topic, no doubt for the obvious reason :-)

glyn · 2021-09-16T20:38:48Z

Daniel wrote:

I don't see anything in the draft that suggests that the editors consider

prior experience important. There is no text in the draft that discusses the important implementations,

and which features were motivated by one, which from another, and which

are new or refinements. I consider prior experience to be very important and I'm sure Stefan does too. However, I am not sure to what extent the spec should reference particular implementations since such references will not necessarily stand the test of time. Perhaps an appendix is the way to do it, in which case, please feel free to submit a PR to help lead the way forward. Editors need not be the only authors.

Anyway, I realize that nobody here is interested in this topic, no doubt

for the obvious reason :-) Not so.

…

On Thu, 16 Sept 2021 at 20:48, Daniel Parker ***@***.***> wrote: Since we do have those differences here, we are asked to consider what is "technically best", but "with an aim toward minimizing disruption". This seems clear to me, and I think our charter writers did a very good job here. I think it's hard to make a case that the draft is doing much to minimize disruption. On the one hand, it's becoming incompatible with all of the implementations that use a dynamic language such as Javascript, PHP, Python, Ruby or Lua for evaluating expressions. For users of such implementations, having an expression language in their familiar language, with all of the functions that that language provides, is part of the appeal of these implementations. On the other hand, the WG is sketching out an expression language that is different from the expression languages that were developed by other implementations mostly following Jayway. These expression languages specified @.relative-path as a JSONPath expression, with rules for "ExecuteSingle" to produce a single value for use in comparisons. @. means something different in the draft, where it's not a JSONPath expression. It seems to me that the draft is departing from both streams of JSONPath that evolved from the original article. I don't see anything in the draft that suggests that the editors consider prior experience important. There is no text in the draft that discusses the important implementations, and which features were motivated by one, which from another, and which are new or refinements. Anyway, I realize that nobody here is interested in this topic, no doubt for the obvious reason :-) — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#118 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAAXF2OMWFKBT5DNTHL3FULUCJCXHANCNFSM5DPUZ5FA> . Triage notifications on the go with GitHub Mobile for iOS <https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675> or Android <https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.

gregsdennis · 2021-09-17T00:46:57Z

I agree that such references do not belong in the main text of the spec. An appendix is okay. I'd I had my way, if have a separate companion doc that summarized individual decisions and the reasoning behind them. Such a document would be a natural place to include references to behaviors in existing implementations.

danielaparker · 2021-09-24T12:52:41Z

@glyn wrote

Perhaps an appendix is the way to do it, in which case, please feel free to submit a PR to help lead the way forward.

If the purpose of the WG is to standardize existing behavior, as they say it is, I think it's up to the authors to justify how their choices relate to prior experience.

For example, there is plenty of prior experience with specifying filter expressions, starting with Jayway, the second most important and influential implementation after Goessner. Jayway, for example, has answers to all the questions like #118, #119 and #123. But the WG take on expressions appears to be inconsistent with that prior experience in significant respects. I think the authors should explain why. That is, if the authors really mean it when they say that their intent is to standardize existing behavior.

cabo · 2021-09-24T16:24:33Z

Sent from mobile, sorry for terse

On 24. Sep 2021, at 14:53, Daniel Parker ***@***.***> wrote: If the purpose of the WG is to standardize existing behavior

It’s not. Please reread the charter. RFC 2418 talks more about the role of authors (ie to implement the directions of the WG), and I’m certain the authors will do that. If the WG wants to have a big rationale section to document their decisions beyond what feels natural for the authors, all I can say is “send text”…

danielaparker · 2021-09-24T16:52:21Z

Sent from mobile, sorry for terse
On 24. Sep 2021, at 14:53, Daniel Parker @.***> wrote: If the purpose of the WG is to standardize existing behavior
It’s not.

Thanks for the clarification.

cabo · 2022-01-17T22:36:37Z

Fixed in 76789d8

gregsdennis · 2022-01-18T04:52:58Z

Can you summarize the resolution, please?

cabo · 2022-01-18T06:35:15Z

Since 76789d8, Section 3.5.9 now says:

   *  A member or element value by itself in a Boolean context is
      interpreted as false only if it does not exist.  Otherwise it is
      interpreted as true.  To be more specific about the actual value,
      explicit comparisons are necessary.  This existence test -- as an
      exception to the general rule -- also works with structured
      values.

cabo closed this as completed Jan 17, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

key missing vs key present with null #118

key missing vs key present with null #118

gregsdennis commented Sep 6, 2021

cburgmer commented Sep 6, 2021

danielaparker commented Sep 6, 2021 •

edited

Loading

gregsdennis commented Sep 6, 2021 •

edited

Loading

danielaparker commented Sep 6, 2021 •

edited

Loading

danielaparker commented Sep 7, 2021 •

edited

Loading

gregsdennis commented Sep 7, 2021

danielaparker commented Sep 7, 2021

cabo commented Sep 8, 2021

goessner commented Sep 8, 2021

danielaparker commented Sep 15, 2021 •

edited

Loading

goessner commented Sep 16, 2021

gregsdennis commented Sep 16, 2021

goessner commented Sep 16, 2021

danielaparker commented Sep 16, 2021 •

edited

Loading

danielaparker commented Sep 16, 2021 •

edited

Loading

cabo commented Sep 16, 2021

danielaparker commented Sep 16, 2021 •

edited

Loading

glyn commented Sep 16, 2021 via email

gregsdennis commented Sep 17, 2021 •

edited

Loading

danielaparker commented Sep 24, 2021 •

edited

Loading

cabo commented Sep 24, 2021 via email

danielaparker commented Sep 24, 2021

cabo commented Jan 17, 2022

gregsdennis commented Jan 18, 2022

cabo commented Jan 18, 2022

key missing vs key present with null #118

key missing vs key present with null #118

Comments

gregsdennis commented Sep 6, 2021

cburgmer commented Sep 6, 2021

danielaparker commented Sep 6, 2021 • edited Loading

gregsdennis commented Sep 6, 2021 • edited Loading

danielaparker commented Sep 6, 2021 • edited Loading

danielaparker commented Sep 7, 2021 • edited Loading

gregsdennis commented Sep 7, 2021

danielaparker commented Sep 7, 2021

cabo commented Sep 8, 2021

goessner commented Sep 8, 2021

danielaparker commented Sep 15, 2021 • edited Loading

goessner commented Sep 16, 2021

gregsdennis commented Sep 16, 2021

goessner commented Sep 16, 2021

danielaparker commented Sep 16, 2021 • edited Loading

danielaparker commented Sep 16, 2021 • edited Loading

cabo commented Sep 16, 2021

danielaparker commented Sep 16, 2021 • edited Loading

glyn commented Sep 16, 2021 via email

gregsdennis commented Sep 17, 2021 • edited Loading

danielaparker commented Sep 24, 2021 • edited Loading

cabo commented Sep 24, 2021 via email

danielaparker commented Sep 24, 2021

cabo commented Jan 17, 2022

gregsdennis commented Jan 18, 2022

cabo commented Jan 18, 2022

danielaparker commented Sep 6, 2021 •

edited

Loading

gregsdennis commented Sep 6, 2021 •

edited

Loading

danielaparker commented Sep 6, 2021 •

edited

Loading

danielaparker commented Sep 7, 2021 •

edited

Loading

danielaparker commented Sep 15, 2021 •

edited

Loading

danielaparker commented Sep 16, 2021 •

edited

Loading

danielaparker commented Sep 16, 2021 •

edited

Loading

danielaparker commented Sep 16, 2021 •

edited

Loading

gregsdennis commented Sep 17, 2021 •

edited

Loading

danielaparker commented Sep 24, 2021 •

edited

Loading