How to Parse JSON Files on the Linux Command Line with jq
JSON is one of the most popular formats for transferring text-based data around the web. It's everywhere, and you're bound to come across it. We'll show you how to handle it from the Linux command line using the jq command.
JSON and jq
JSON stands for JavaScript Object Notation. It's a scheme that allows data to be encoded into plain text files in a self-describing way. There are no comments in a JSON file; the contents should be self-explanatory. Each data value has a text string called a "name" or "key." This tells you what the data value is. Together, they're known as name:value pairs, or key:value pairs. A colon (:) separates a key from its value.
An object is a collection of key:value pairs. In a JSON file, an object begins with an open curly brace ({) and ends with a closing brace (}). JSON also supports arrays, which are ordered lists of values. An array begins with an opening bracket ([) and ends with a closing one (]).
From these simple definitions, of course, arbitrary complexity can arise. For example, objects can be nested within objects. Objects can contain arrays, and arrays can also contain objects. All of which can have open-ended levels of nesting.
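To make the nesting concrete, here is a minimal sketch that can be run once jq is installed (installation is covered further down). The sample data and the variable names are invented for illustration, and the example relies on jq's built-in type function and its -r (raw output) option, neither of which is covered elsewhere in this article.

```shell
# Hypothetical sample data (not one of the article's files): an object holding
# a string, a nested object, and an array of objects.
sample='{"station":"ISS","position":{"latitude":"12.5","longitude":"-45.1"},"crew":[{"name":"Ana"},{"name":"Ben"}]}'

# type reports the kind of JSON value a filter produces; -r prints it without quotes.
top_type=$(printf '%s' "$sample" | jq -r 'type')
position_type=$(printf '%s' "$sample" | jq -r '.position | type')
crew_type=$(printf '%s' "$sample" | jq -r '.crew | type')
member_type=$(printf '%s' "$sample" | jq -r '.crew[0] | type')

echo "$top_type / $position_type / $crew_type / $member_type"
```

The object contains an object and an array, and the array in turn contains objects, which is the kind of nesting described above.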
In practice, though, if the layout of JSON data is convoluted, the design of the data layout could probably use a rethink. Of course, if you're not generating the JSON data, just trying to use it, you have no say in its layout. In those cases, unfortunately, you just have to deal with it.
Most programming languages have libraries or modules that allow them to parse JSON data. Sadly, the Bash shell has no such functionality.
Necessity being the mother of invention, though, the jq utility was born! With jq, we can easily parse JSON in the Bash shell. And it doesn't matter whether you have to work with well-engineered, elegant JSON, or the stuff nightmares are made of.
How to Install jq
We had to install jq on all the Linux distributions we used to research this article.
To install jq on Ubuntu, type this command:
sudo apt-get install jq
To install jq on Fedora, type this command:
sudo dnf install jq
To install jq on Manjaro, type this command:
sudo pacman -Sy jq
How to Make JSON Readable
JSON doesn't care about white space, and layout doesn't affect it. As long as it follows the rules of JSON grammar, systems that process JSON can read and understand it. Because of this, JSON is often transmitted as a simple, long string, without any consideration of layout. This saves a bit of space because tabs, spaces, and new-line characters don't have to be included in the JSON. Of course, the downside to all this is when a human tries to read it.
Let's pull a short JSON object from the NASA site that tells us the position of the International Space Station. We'll use curl, which can download files, to retrieve the JSON object for us.
We don't care about any of the status messages curl usually generates, so we'll type the following, using the -s (silent) option:
curl -s diseinuak4web.net
Now, with a bit of effort, you can read this. You have to pick out the data values, but it isn't easy or convenient. Let's repeat this, but this time we'll pipe it through jq.
jq uses filters to parse JSON, and the simplest of these filters is a period (.), which means "print the entire object." By default, jq pretty-prints the output.
We put it all together and type the following:
curl -s diseinuak4web.net | jq .
That's much better! Now, we can see exactly what's going on.
The entire object is wrapped in curly braces. It contains two key:value pairs: message and timestamp. It also contains an object called iss_position, which contains two key:value pairs: latitude and longitude.
We'll try this once more. This time we'll type the following, and redirect the output into a file called diseinuak4web.net:
curl -s diseinuak4web.net | jq . > diseinuak4web.net
cat diseinuak4web.net
This gives us a well laid out copy of the JSON object on our hard drive.
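If you want to try the pretty-printing step without downloading anything, the following sketch feeds jq a one-line JSON string. The string and its values are made up and are only similar in shape to the ISS position object. The -c (compact output) option is not used in this article, but it is a standard jq option that reverses the pretty-printing.

```shell
# Hypothetical one-line JSON, similar in shape to the ISS position object.
raw='{"message":"success","timestamp":1570000000,"iss_position":{"latitude":"12.5","longitude":"-45.1"}}'

# The identity filter (.) pretty-prints the input: one value per line, indented.
pretty=$(printf '%s' "$raw" | jq .)
echo "$pretty"

# -c (compact output) goes the other way and puts everything back on one line.
compact=$(printf '%s' "$pretty" | jq -c .)
echo "$compact"
```

The layout changes, but the data does not: the compact output is the same string we started with.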
Accessing Data Values
As we saw above, jq can extract data values being piped through from JSON. It can also work with JSON stored in a file. We're going to work with local files so the command line isn't cluttered with curl commands. This should make it a bit easier to follow.
The simplest way to extract data from a JSON file is to provide a key name to obtain its data value. Type a period and the key name without a space between them. This creates a filter from the key name. We also need to tell jq which JSON file to use.
We type the following to retrieve the message value:
jq .message diseinuak4web.net
jq prints the text of the message value in the terminal window.
If you have a key name that includes spaces or punctuation, you have to wrap its filter in quotation marks. Care is usually taken to use characters, numbers, and underscores only so the JSON key names are not problematic.
Next, we type the following to retrieve the timestamp value:
jq .timestamp diseinuak4web.net
The timestamp value is retrieved and printed in the terminal window.
But how can we access the values inside the iss_position object? We can use the JSON dot notation. We'll include the object name in the path to the key value. To do this, the name of the object the key is inside will precede the name of the key itself.
We type the following, including the latitude key name (note there are no spaces between .iss_position and .latitude):
jq .iss_position.latitude diseinuak4web.net
To extract multiple values, you have to do the following:
- List the key names on the command line.
- Separate them with commas (,).
- Enclose them in quotation marks (") or apostrophes (').
With that in mind, we type the following:
jq ".iss_position.latitude, .timestamp" diseinuak4web.net
The two values print to the terminal window.
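The filters in this section can be tried against a small local file. The sketch below writes a hypothetical iss-sample.json to a temporary directory; the file name, the values, and the "crew size" key are all invented for illustration. The last command shows one way to handle a key name that contains a space.

```shell
# Write a hypothetical sample file to a temporary directory.
tmp=$(mktemp -d)
cat > "$tmp/iss-sample.json" <<'EOF'
{
  "message": "success",
  "timestamp": 1570000000,
  "iss_position": { "latitude": "12.5", "longitude": "-45.1" },
  "crew size": 6
}
EOF

# A single top-level key.
message=$(jq .message "$tmp/iss-sample.json")
# A key inside a nested object, using dot notation.
latitude=$(jq .iss_position.latitude "$tmp/iss-sample.json")
# Two values at once: comma-separated and wrapped in quotation marks.
both=$(jq ".iss_position.latitude, .timestamp" "$tmp/iss-sample.json")
# A key name containing a space is quoted inside the filter itself.
crew=$(jq '."crew size"' "$tmp/iss-sample.json")

printf '%s\n' "$message" "$latitude" "$both" "$crew"
```

Note that string values are printed with their surrounding quotation marks, while numbers are printed bare.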
Working with Arrays
Let's grab a different JSON object from NASA.
This time, we'll use a list of the astronauts who are in space right now:
curl -s diseinuak4web.net
Okay, that worked, so let's do it again.
We'll type the following to pipe it through jq and redirect it to a file called diseinuak4web.net:
curl -s diseinuak4web.net | jq . > diseinuak4web.net
Now let's type the following to check our file:
less diseinuak4web.net
We now see the list of astronauts in space, as well as their spacecraft.
This JSON object contains an array called people. We know it's an array because of the opening bracket ([). It's an array of objects that each contain two key:value pairs: one called name and one for the spacecraft.
Like we did earlier, we can use the JSON dot notation to access the values. We must also include the brackets ([]) in the name of the array.
With all that in mind, we type the following:
jq ".people[].name" diseinuak4web.net
This time, all the name values print to the terminal window. What we asked jq to do was print the name value for every object in the array. Pretty neat, huh?
We can retrieve the name of a single object if we put its position in the array in the brackets ([]) on the command line. The array uses zero-offset indexing, meaning the object in the first position of the array is zero.
To access the last object in the array you can use -1; to get the second to last object in the array, you can use -2, and so on.
Sometimes, the JSON object provides the number of elements in the array, which is the case with this one. Along with the array, it contains a key:value pair with a value of six.
Knowing there are six objects in this array, we can type the following commands to pick out individual names by position:
jq ".people[1].name" diseinuak4web.net
jq ".people[3].name" diseinuak4web.net
jq ".people[-1].name" diseinuak4web.net
jq ".people[-2].name" diseinuak4web.net
You can also provide a start and end object within the array. This is called "slicing," and it can be a little confusing. Remember the array uses a zero-offset.
To retrieve the objects from index position two, up to (but not including) the object at index position four, we type the following command:
jq ".people[2:4]" diseinuak4web.net
This prints the objects at array index two (the third object in the array) and three (the fourth object in the array). It stops processing at array index four, which is the fifth object in the array.
The way to better understand this is to experiment on the command line. You'll soon see how it works.
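As a starting point for that experimenting, here is a small sketch with a made-up five-person array. The crew-sample.json file name and the names in it are hypothetical; -c (compact output) is a standard jq option used here only to keep the slice on one line.

```shell
# Hypothetical sample file with a five-element "people" array.
tmp=$(mktemp -d)
cat > "$tmp/crew-sample.json" <<'EOF'
{
  "people": [
    { "name": "Ana",  "craft": "ISS" },
    { "name": "Ben",  "craft": "ISS" },
    { "name": "Chen", "craft": "ISS" },
    { "name": "Dee",  "craft": "ISS" },
    { "name": "Eli",  "craft": "ISS" }
  ]
}
EOF

# Index 1 is the second object because indexing starts at zero.
second=$(jq ".people[1].name" "$tmp/crew-sample.json")
# Negative indexes count back from the end of the array.
last=$(jq ".people[-1].name" "$tmp/crew-sample.json")
# A slice includes index 2 and index 3, and stops before index 4.
slice=$(jq -c ".people[2:4]" "$tmp/crew-sample.json")

printf '%s\n' "$second" "$last" "$slice"
```

Changing the numbers in the slice and rerunning the command is a quick way to get a feel for where a slice starts and stops.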
How to Use Pipes with Filters
You can pipe the output from one filter to another, and you don't have to learn a new symbol. The same as the Linux command line, jq uses the vertical bar (|) to represent a pipe.
We'll tell jq to pipe the people array into the .name filter, which should list the names of the astronauts in the terminal window.
We type the following:
jq ".people[] | .name" diseinuak4web.net
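A quick way to convince yourself that the pipe behaves as described is to compare the piped filter with the single-filter form used earlier. This sketch uses a hypothetical three-person file; the file name and names are invented.

```shell
# Hypothetical sample file with a three-element "people" array.
tmp=$(mktemp -d)
cat > "$tmp/crew-sample.json" <<'EOF'
{ "people": [ { "name": "Ana" }, { "name": "Ben" }, { "name": "Chen" } ] }
EOF

# Iterate over the array, then pipe each object into the .name filter.
piped=$(jq ".people[] | .name" "$tmp/crew-sample.json")
# The single-filter form from the previous section.
direct=$(jq ".people[].name" "$tmp/crew-sample.json")

echo "$piped"
```

Both forms print the same three names, one per line. The piped version becomes more useful as the filters on either side of the bar get longer.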
Creating Arrays and Modifying Results
We can use jq to create new objects, such as arrays. In this example, we'll extract three values and create a new array that contains those values. Note the opening ([) and closing brackets (]) are also the first and last characters in the filter string.
We type the following:
jq "[.iss_position.latitude, .iss_position.longitude, .timestamp]" diseinuak4web.net
The output is wrapped in brackets and separated by commas, making it a correctly formed array.
Numeric values can also be manipulated as they're retrieved. Let's pull the timestamp from the ISS position file, and then extract it again and change the value that's returned.
To do so, we type the following:
jq ".timestamp" diseinuak4web.net
jq ".timestamp - " diseinuak4web.net
This is useful if you need to add or remove a standard offset from an array of values.
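Both ideas, building an array and adjusting a number on the way out, can be sketched against a small local file. Everything here is illustrative: the iss-sample.json file, its values, and the offset of 1000 are assumptions chosen to make the arithmetic easy to check.

```shell
# Hypothetical sample file, similar in shape to the ISS position object.
tmp=$(mktemp -d)
cat > "$tmp/iss-sample.json" <<'EOF'
{
  "message": "success",
  "timestamp": 1570000000,
  "iss_position": { "latitude": "12.5", "longitude": "-45.1" }
}
EOF

# Wrapping the comma-separated filters in brackets collects the results into an array.
# -c keeps the array on a single line.
as_array=$(jq -c "[.iss_position.latitude, .iss_position.longitude, .timestamp]" "$tmp/iss-sample.json")

# Subtract an illustrative offset of 1000 from the numeric value.
shifted=$(jq ".timestamp - 1000" "$tmp/iss-sample.json")

printf '%s\n' "$as_array" "$shifted"
```

Arithmetic only works on numeric values. The latitude and longitude in this sample are strings, so they would need converting before you could do the same with them.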
Let's type the following to remind ourselves what the file contains:
jq . diseinuak4web.net
Let's say we want to get rid of the message key:value pair. It doesn't have anything to do with the position of the International Space Station. It's just a flag that indicates the location was retrieved successfully. If it's surplus to requirements, we can dispense with it. (You could also just ignore it.)
We can use jq's delete function, del(), to delete a key:value pair. To delete the message key:value pair, we type this command:
jq "del(.message)" diseinuak4web.net
Note this doesn't actually delete it from the diseinuak4web.net file; it just removes it from the output of the command. If you need to create a new file without the message key:value pair in it, run the command, and then redirect the output into a new file.
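The redirect step might look like the sketch below. The file names iss-sample.json and iss-trimmed.json are hypothetical, as is the data. Redirect to a different file than the one being read, because the shell empties the output file before jq gets a chance to read it.

```shell
# Hypothetical sample file.
tmp=$(mktemp -d)
cat > "$tmp/iss-sample.json" <<'EOF'
{"message":"success","timestamp":1570000000,"iss_position":{"latitude":"12.5","longitude":"-45.1"}}
EOF

# Write a copy without the message pair to a new file.
jq "del(.message)" "$tmp/iss-sample.json" > "$tmp/iss-trimmed.json"

# The new file no longer has the key...
trimmed=$(jq -c . "$tmp/iss-trimmed.json")
# ...while the original file is untouched (-r prints the string without quotes).
original_message=$(jq -r .message "$tmp/iss-sample.json")

printf '%s\n' "$trimmed" "$original_message"
```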
More Complicated JSON Objects
Let's retrieve some more NASA data. This time, we'll use a JSON object that contains information on meteor impact sites from around the world. This is a bigger file with a far more complicated JSON structure than those we've dealt with previously.
First, we'll type the following to redirect it to a file called diseinuak4web.net:
curl -s diseinuak4web.net | jq . > diseinuak4web.net
To see what the JSON looks like, we type the following:
less diseinuak4web.net
The file begins with an opening bracket ([), so the entire object is an array. The objects in the array are collections of key:value pairs, and each contains a nested object. The nested object contains further key:value pairs, and an array.
Let's retrieve the names of the meteor strikes from the object at a given index position through the end of the array.
We'll type the following to pipe the JSON through three filters:
jq ".[] | .[] | .name" diseinuak4web.net
The filters function in the following ways:
- The slice filter (a start index followed by a colon): This tells jq to process the objects from that array index through the end of the array. No number after the colon (:) is what tells jq to continue to the end of the array.
- .[]: This array iterator tells jq to process each object in the array.
- .name: This filter extracts the name value.
With a slight change, we can extract the last 10 objects from the array. A "-10" instructs jq to start processing objects 10 back from the end of the array.
We type the following:
jq ".[-10:] | .[] | .name" diseinuak4web.net
Just as we have in previous examples, we can type the following to select a single object:
jq ".[].name" diseinuak4web.net
We can also apply slicing to strings. To do so, we'll type the following to request the first four characters of the name of an object in the array:
jq ".[].name[0:4]" diseinuak4web.net
We can also see a specific object in its entirety. To do this, we type the following and include an array index without any key:value filters:
jq ".[]" diseinuak4web.net
If you want to see only the values, you can do the same thing without the key names.
For our example, we type this command:
jq ".[][]" diseinuak4web.net
To retrieve multiple values from each object, we separate them with commas in the following command:
jq ".[] | .[] | .name, .mass" diseinuak4web.net
If you want to retrieve nested values, you have to identify the objects that form the path to them.
For example, to reference the values in the nested array, we have to include the all-encompassing array, the nested object, and the nested array.
To see these values for a single object in the array, we type the following command:
jq ".[]diseinuak4web.netnates[]" diseinuak4web.net
The length Function
The length function gives different metrics according to what it's been applied to, such as:
- Strings: The number of Unicode codepoints in the string (the same as its length in bytes if the string is pure ASCII).
- Objects: The number of key:value pairs in the object.
- Arrays: The number of array elements in the array.
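The three cases in the list above can be sketched with a tiny sample array. The strikes-sample.json file and its contents are hypothetical. The last command uses jq's -n (null input) option, which is not covered in this article, to run a filter without reading any file.

```shell
# Hypothetical sample file: an array of three objects with two keys each.
tmp=$(mktemp -d)
cat > "$tmp/strikes-sample.json" <<'EOF'
[
  { "name": "Alpha",   "mass": "21" },
  { "name": "Bravo",   "mass": "720" },
  { "name": "Charlie", "mass": "107" }
]
EOF

# Array: the number of elements.
array_len=$(jq "length" "$tmp/strikes-sample.json")
# Object: the number of key:value pairs.
object_len=$(jq ".[0] | length" "$tmp/strikes-sample.json")
# String: the number of characters in "Alpha".
string_len=$(jq ".[0].name | length" "$tmp/strikes-sample.json")
# \u00e9 is a single codepoint (an accented e), even though it takes two bytes in UTF-8.
accented_len=$(jq -n '"h\u00e9llo" | length')

echo "$array_len $object_len $string_len $accented_len"
```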
The following command returns the length of the name value in 10 of the objects in the JSON array:
jq ".[] | .[].name | length" diseinuak4web.net
To see how many key:value pairs are in the first object in the array, we type this command:
jq ".[0] | length" diseinuak4web.net
The keys Function
You can use the keys function to find out about the JSON you've got to work with. It can tell you what the names of the keys are, and how many objects there are in an array.
To find the keys in the first object of the people array in the diseinuak4web.net file, we type this command:
jq ".people[0] | keys" diseinuak4web.net
To see how many elements are in the array, we type this command:
jq ".people | keys" diseinuak4web.net
This shows there are six, zero-offset array elements, numbered zero to five.
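Here is a small sketch of both uses of keys. The crew-sample.json file and its three made-up entries are assumptions for illustration. The filter is written as .people[0], without a dot before the bracket, which is the form accepted across jq versions.

```shell
# Hypothetical sample file with a three-element "people" array.
tmp=$(mktemp -d)
cat > "$tmp/crew-sample.json" <<'EOF'
{
  "people": [
    { "name": "Ana",  "craft": "ISS" },
    { "name": "Ben",  "craft": "ISS" },
    { "name": "Chen", "craft": "ISS" }
  ]
}
EOF

# On an object, keys returns the key names, sorted alphabetically.
object_keys=$(jq -c ".people[0] | keys" "$tmp/crew-sample.json")
# On an array, keys returns the valid index positions.
array_keys=$(jq -c ".people | keys" "$tmp/crew-sample.json")

printf '%s\n' "$object_keys" "$array_keys"
```

Note that the key names come back in sorted order, not in the order they appear in the file.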
The has() Function
You can use the has() function to interrogate the JSON and see whether an object has a particular key name. Note the key name must be wrapped in quotation marks. We'll wrap the filter command in single quotes ('), as follows:
jq '.[] | has("nametype")' diseinuak4web.net
Each object in the array is checked, and jq prints true or false for each one.
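To see what the true and false results look like, this sketch runs has() over a made-up three-object array in which the second object deliberately lacks the key. The strikes-sample.json file and its contents are hypothetical.

```shell
# Hypothetical sample file: the second object has no "nametype" key.
tmp=$(mktemp -d)
cat > "$tmp/strikes-sample.json" <<'EOF'
[
  { "name": "Alpha",   "nametype": "Valid" },
  { "name": "Bravo" },
  { "name": "Charlie", "nametype": "Valid" }
]
EOF

# One true/false result per object in the array.
every=$(jq '.[] | has("nametype")' "$tmp/strikes-sample.json")
# A single result for the object at index 1.
one=$(jq '.[1] | has("nametype")' "$tmp/strikes-sample.json")

printf '%s\n' "$every" "$one"
```

The single quotes around the filter stop the shell from interpreting the double quotes that has() needs around the key name.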
If you want to check a specific object, you include its index position in the array filter, as follows:
jq '.[] | has("nametype")' diseinuak4web.net
Don't Go Near JSON Without It
The jq utility is the perfect example of the professional, powerful, fast software that makes living in the Linux world such a pleasure.
This was just a brief introduction to the common functions of this command; there's a whole lot more to it. Be sure to check out the comprehensive jq manual if you want to dig deeper.
Dave McKay first used computers when punched paper tape was in vogue, and he has been programming ever since. After over 30 years in the IT industry, he is now a full-time technology journalist. During his career, he has worked as a freelance programmer, manager of an international software development team, an IT services project manager, and, most recently, as a Data Protection Officer. Dave is a Linux evangelist and open source advocate.