-
Notifications
You must be signed in to change notification settings - Fork 4
Examples
One of the most common usage scenarios for JMESPath is being able to take a complex JSON document and simplify it down. The main features at work here are filters and multiselects. In this example below, we're taking the array of people and, for any element with an age key whose value is greater than 20, we're creating a sub list of the name and age values.
people[?age > `20`].[name, age]
{
"people": [
{
"age": 20,
"other": "foo",
"name": "Bob"
},
{
"age": 25,
"other": "bar",
"name": "Fred"
},
{
"age": 30,
"other": "baz",
"name": "George"
}
]
}
In the previous example we were taking an array of hashes, and
simplifying down to an array of two element arrays containing a name and
an age. We're also only including list elements where the age
key is
greater than 20
. If instead we want to create the same hash structure
but only include the age
and name
key, we can instead say:
people[?age > `20`].{name: name, age: age}
{
"people": [
{
"age": 20,
"other": "foo",
"name": "Bob"
},
{
"age": 25,
"other": "bar",
"name": "Fred"
},
{
"age": 30,
"other": "baz",
"name": "George"
}
]
}
The last half of the above expression contains key value pairs which
have the general form keyname: <expression>
. In the above expression
we're just using a field as an expression, but they can be more
advanced expressions. For example:
people[*].{name: name, tags: tags[0]}
{
"people": [
{
"age": 20,
"tags": [
"a",
"b",
"c"
],
"name": "Bob"
},
{
"age": 25,
"tags": [
"d",
"e",
"f"
],
"name": "Fred"
},
{
"age": 30,
"tags": [
"g",
"h",
"i"
],
"name": "George"
}
]
}
Notice in the above example instead of applying a filter expression
([? <expr> ]
), we're selecting all array elements via [*]
.
reservations[].instances[].[tags[?Key=='Name'].Values[] | [0], type, state.name]
{
"reservations": [
{
"instances": [
{
"type": "small",
"state": {
"name": "running"
},
"tags": [
{
"Key": "Name",
"Values": [
"Web"
]
},
{
"Key": "version",
"Values": [
"1"
]
}
]
},
{
"type": "large",
"state": {
"name": "stopped"
},
"tags": [
{
"Key": "Name",
"Values": [
"Web"
]
},
{
"Key": "version",
"Values": [
"1"
]
}
]
}
]
},
{
"instances": [
{
"type": "medium",
"state": {
"name": "terminated"
},
"tags": [
{
"Key": "Name",
"Values": [
"Web"
]
},
{
"Key": "version",
"Values": [
"1"
]
}
]
},
{
"type": "xlarge",
"state": {
"name": "running"
},
"tags": [
{
"Key": "Name",
"Values": [
"DB"
]
},
{
"Key": "version",
"Values": [
"1"
]
}
]
}
]
}
]
}
The above example combines several JMESPath features including the
flatten
operator, multiselect lists, filters, and pipes.
The input data contains a top level key, "reservations", which is a list. Within each list, there is an "instances" key, which is also a list.
The first thing we're doing here is creating a single list from
multiple lists of instances. By using the flatten
we can take the two instances from the first list and the
two instances from the second list, and combine them into a single list.
Try changing the above expression to just reservations[].instances[]
to see what this flattened list looks like. Everything to the right of
the reservations[].instances[]
is about taking the flattened list and
paring it down to contain only the data that we want. This expression is
taking each element in the original list and transforming it into a
three element sublist. The three elements are:
- In the
tags
list, select the first element in the flattenedValues
list whoseKey
has a value ofName
. - The
type
- The
state.name
of each instance.
The most interesting of those three expressions is the
tags[?Key=='Name'].Values[] | [0]
part. Let's examine that further.
The first thing to notice is that we're filtering down the list
associated with the tags
key. The tags[?Key==`Name`]
tells us to
only include list elements that contain a Key
whose value is Name
.
From those filtered list elements we're going to take the Values
key
and flatten the list. Finally, the | [0]
will take the entire list and
extract the 0th element.
In this example, we're going to look at how you can filter nested hashes.
people[?general.id==`100`].general | [0]
{
"people": [
{
"general": {
"id": 100,
"age": 20,
"other": "foo",
"name": "Bob"
},
"history": {
"first_login": "2014-01-01",
"last_login": "2014-01-02"
}
},
{
"general": {
"id": 101,
"age": 30,
"other": "bar",
"name": "Bill"
},
"history": {
"first_login": "2014-05-01",
"last_login": "2014-05-02"
}
}
]
}
In this example we're searching through the people
array. Each
element in this array contains a hash of two elements, and each value in
the hash is itself a hash. We're trying to retrieve the value of the
general
key that contains an id
key with a value of 100
.
If we just had the expression people[?general.id==`100`]
, we'd
have a result of:
[{
"general": {
"id": 100,
"age": 20,
"other": "foo",
"name": "Bob"
},
"history": {
"first_login": "2014-01-01",
"last_login": "2014-01-02"
}
}]
Let's walk through how we arrived at this result. In words, the
people[?general.id==`100`]
expression is saying "for each element
in the people array, select the elements where the general.id
equals
100
". If we trace the execution of this filtering process we have:
# First element:
{
"general": {
"id": 100,
"age": 20,
"other": "foo",
"name": "Bob"
},
"history": {
"first_login": "2014-01-01",
"last_login": "2014-01-02"
}
},
# Applying the expression ``general.id`` to this hash::
100
# Does 100==100?
true
# Add this first element (in its entirety) to the result list.
# Second element:
{
"general": {
"id": 101,
"age": 30,
"other": "bar",
"name": "Bill"
},
"history": {
"first_login": "2014-05-01",
"last_login": "2014-05-02"
}
}
# Applying the expression ``general.id`` to this element::
101
# Does 101==100?
false
# Do not add this element to the results list.
# Result of this expression is a list containing the first element.
However, this still isn't the final value we want which is:
{
"id": 100,
"age": 20,
"other": "foo",
"name": "Bob"
}
In order to get to this value from our filtered results we need to first
select the general
key. This gives us a list of just the values of the
general
hash:
[{
"id": 100,
"age": 20,
"other": "foo",
"name": "Bob"
}]
From there, we then uses a pipe (|
) to stop projections so that we can
finally select the first element ([0]
). Note that we are making the
assumption that there's only one hash that contains an id
of 100
.
Given the way the data is structured, it's entirely possible to have
data such as:
{
"people": [
{
"general": {
"id": 100,
"age": 20
},
"history": {
}
},
{
"general": {
"id": 101,
"age": 30
},
"history": {
}
},
{
"general": {
"id": 100,
"age": 30
},
"history": {
}
}
]
}
Note here that the first and last elements in the people
array both
have an id
of 100
. Our expression would then select the first
element that matched.
Finally, it's worth mentioning there is more than one way to write this
expression. In this example we've decided that after we filter the list
we're going to select the value of the general
key and then select
the first element in that list. We could also reverse the order of those
operations, we could have taken the filtered list, selected the first
element, and then extracted the value associated with the general
key.
That expression would be:
people[?general.id==`100`] | [0].general
Both versions are equally valid.
JMESPath functions give you a lot of power and flexibility when working with JMESPath expressions. Below are some common expressions and functions used in JMESPath.
sort_by(Contents, &Date)[*].{Key: Key, Size: Size}
{
"Contents": [
{
"Date": "2014-12-21T05:18:08.000Z",
"Key": "logs/bb",
"Size": 303
},
{
"Date": "2014-12-20T05:19:10.000Z",
"Key": "logs/aa",
"Size": 308
},
{
"Date": "2014-12-20T05:19:12.000Z",
"Key": "logs/qux",
"Size": 297
},
{
"Date": "2014-11-20T05:22:23.000Z",
"Key": "logs/baz",
"Size": 329
},
{
"Date": "2014-12-20T05:25:24.000Z",
"Key": "logs/bar",
"Size": 604
},
{
"Date": "2014-12-20T05:27:12.000Z",
"Key": "logs/foo",
"Size": 647
}
]
}
The first interesting thing here if the use of the function sort_by
.
In this example we are sorting the Contents
array by the value of each
Date
key in each element in the Contents
array. The sort_by
function takes two arguments. The first argument is an array, and the
second argument describes the key that should be used to sort the array.
The second interesting thing in this expression is that the second
argument starts with &
, which creates an expression type. Think of
this conceptually as a reference to an expression that can be evaluated
later. If you are familiar with lambda and anonymous functions,
expression types are similar. The reason we use &Date
instead of
Date
is because if the expression is Date
, it would be evaluated
before calling the function, and given there's no Date
key in the
outer hash, the second argument would evaluate to null
. Check out
function-evaluation
{.interpreted-text role="ref"} in the specification
for more information on how functions are evaluated in JMESPath. Also,
note that we're taking advantage of the fact that the dates are in ISO
8601 format, which can be sorted lexicographically.
And finally, the last interesting thing in this expression is the [*]
immediately after the sort_by
function call. The reason for this is
that we want to apply the multiselect hash, the second half of the
expression, to each element in the sorted array. In order to do this we
need a projection. The [*]
does exactly that, it takes the input array
and creates a projection such that the multiselect hash
{Key: Key, Size: Size}
will be applied to each element in the list.
There are other functions that take expression types that are similar to
sort_by
including min_by
and max_by
.
Pipe expression are useful for stopping projections. They can also be used to group expressions.
Let's look at a modified version of the expression on the JMESPath front page.
locations[?state == 'WA'].name | sort(@)[-2:] | {WashingtonCities: join(', ', @)}
{
"locations": [
{
"name": "Seattle",
"state": "WA"
},
{
"name": "New York",
"state": "NY"
},
{
"name": "Bellevue",
"state": "WA"
},
{
"name": "Olympia",
"state": "WA"
}
]
}
We can think of this JMESPath expression as having three components,
each separated by the pipe character |
. The first expression is
familiar to us, it's similar to the first example on this page. The
second part of the expression, sort(@)
, is similar to the sort_by
function we saw in the previous section. The @
token is used to refer
to the current element. The sort
function takes a single parameter which is an array. If the input JSON
document was a hash, and we wanted to sort the foo
key, which was an
array, we could just use sort(foo)
. In this scenario, the input JSON
document is the array we want to sort. To refer to this value, we use
the current element, @
, to indicate this. We're also only taking a
subset of the sorted array. We're using a slice ([-2:]
) to indicate
that we only want the last two elements in the sorted array to be passed
through to the final third of this expression.
And finally, the third part of the expression,
{WashingtonCities: join(', ', @)}
, creates a multiselect hash. It
takes as input, the list of sorted city names, and produces a hash with
a single key, WashingtonCities
, whose values are the input list
(denoted by @
) as a string separated by a comma.