Filtering and Selecting Substrings in JSON with jq
jq
is an incredibly versatile tool for parsing JSON from the command line.
This snippet describes how to use it filter based on the presence (or absence) of a substring within a JSON object's attribute. It also describes how to use a regular expression to extract that substring with an example of printing to a CSV.
Details
- Language: MISC
Snippet
# Filtering based on presence of substring in property select(.property1 | contains("channel: ")) # Filtering based on absence of substring in property select(.property1 | contains("channel: ") | not) # Extracting substring value .property1 | capture("channel: \\\"(?<channel>[a-z,0-9,-]+)\\\"","i").channel
Usage Example
JSON_STR='[{"ID": "1", "Name": "obj1", "prop1" : "foo\nbar\nchannel: \"sed\"\nzoo"}, {"ID": "2", "Name": "obj2", "prop1" : "foo\nbar\nchannel: \"\"\nzoo"}, {"ID": "3", "Name": "obj2", "prop1" : "foo\nbar\n\nzoo"}, {"ID": "4", "Name": "obj2", "prop1" : "foo\nbar\nchannel: \"zoo\"\nzoo"}]' # Items containing channel: echo "$JSON_STR" | jq '.[] | select(.prop1 | contains("channel: "))' # Items not containing channel echo "$JSON_STR" | jq '.[] | select(.prop1 | contains("channel: ") | not)' # Items with a non-empty channel echo "$JSON_STR" | jq '.[] | select(.prop1 | contains("channel: ")) | select(.prop1 | contains("channel: \"\"") | not)' # Extracting channel names echo "$JSON_STR" | jq '.[] | select(.prop1 | contains("channel: ")) | .prop1 | capture("channel: \\\"(?<channel>[a-z,0-9,-]+)\\\"","i").channel' # Generating a Tab seperated CSV of ID, Name and channel name # Skip empty channels echo "$JSON_STR" | jq -r '.[] | select(.prop1 | contains("channel: ")) | select(.prop1 | contains("channel: \"\"") | not) | [.ID, .Name, (.prop1 | capture("channel: \\\"(?<channel>[a-z,0-9,-]+)\\\"","i").channel)] | @tsv'