Tutorial

GitHub has a JSON API, so let’s play with that. This URL gets us the last 5 commits from the jq repo:

curl 'https://api.github.com/repos/stedolan/jq/commits?per_page=5'
Show result
[
  {
    "sha": "d25341478381063d1c76e81b3a52e0592a7c997f",
    "commit": {
      "author": {
        "name": "Stephen Dolan",
        "email": "mu@netsoc.tcd.ie",
        "date": "2013-06-22T16:30:59Z"
      },
      "committer": {
        "name": "Stephen Dolan",
        "email": "mu@netsoc.tcd.ie",
        "date": "2013-06-22T16:30:59Z"
      },
      "message": "Merge pull request #162 from stedolan/utf8-fixes\n\nUtf8 fixes. Closes #161",
      "tree": {
        "sha": "6ab697a8dfb5a96e124666bf6d6213822599fb40",
        "url": "https://api.github.com/repos/stedolan/jq/git/trees/6ab697a8dfb5a96e124666bf6d6213822599fb40"
      },
      "url": "https://api.github.com/repos/stedolan/jq/git/commits/d25341478381063d1c76e81b3a52e0592a7c997f",
      "comment_count": 0
    },
    "url": "https://api.github.com/repos/stedolan/jq/commits/d25341478381063d1c76e81b3a52e0592a7c997f",
    "html_url": "https://github.com/stedolan/jq/commit/d25341478381063d1c76e81b3a52e0592a7c997f",
    "comments_url": "https://api.github.com/repos/stedolan/jq/commits/d25341478381063d1c76e81b3a52e0592a7c997f/comments",
    "author": {
      "login": "stedolan",
    ...

Github returns a ton of info here, and it’s relatively hard to read as it is. To syntax highlight it and order the attributes we pipe it through jq, telling jq to just spit the input back at us using the expression .:

curl 'https://api.github.com/repos/stedolan/jq/commits?per_page=5' | jq '.'
Show result
[
  {
    "parents": [
      {
        "html_url": "https://github.com/stedolan/jq/commit/54b9c9bdb225af5d886466d72f47eafc51acb4f7",
        "url": "https://api.github.com/repos/stedolan/jq/commits/54b9c9bdb225af5d886466d72f47eafc51acb4f7",
        "sha": "54b9c9bdb225af5d886466d72f47eafc51acb4f7"
      },
      {
        "html_url": "https://github.com/stedolan/jq/commit/8b1b503609c161fea4b003a7179b3fbb2dd4345a",
        "url": "https://api.github.com/repos/stedolan/jq/commits/8b1b503609c161fea4b003a7179b3fbb2dd4345a",
        "sha": "8b1b503609c161fea4b003a7179b3fbb2dd4345a"
      }
    ],
    "committer": {
      "type": "User",
      "received_events_url": "https://api.github.com/users/stedolan/received_events",
      "events_url": "https://api.github.com/users/stedolan/events{/privacy}",
      "repos_url": "https://api.github.com/users/stedolan/repos",
      "organizations_url": "https://api.github.com/users/stedolan/orgs",
...

There’s still far too much info for our purposes, so we can try just grab the first commit:

curl 'https://api.github.com/repos/stedolan/jq/commits?per_page=5' | jq '.[0]'
Show result
{
  "parents": [
    {
      "html_url": "https://github.com/stedolan/jq/commit/54b9c9bdb225af5d886466d72f47eafc51acb4f7",
      "url": "https://api.github.com/repos/stedolan/jq/commits/54b9c9bdb225af5d886466d72f47eafc51acb4f7",
      "sha": "54b9c9bdb225af5d886466d72f47eafc51acb4f7"
    },
    {
      "html_url": "https://github.com/stedolan/jq/commit/8b1b503609c161fea4b003a7179b3fbb2dd4345a",
      "url": "https://api.github.com/repos/stedolan/jq/commits/8b1b503609c161fea4b003a7179b3fbb2dd4345a",
      "sha": "8b1b503609c161fea4b003a7179b3fbb2dd4345a"
    }
  ],
  "committer": {
    "type": "User",
    "received_events_url": "https://api.github.com/users/stedolan/received_events",
    "events_url": "https://api.github.com/users/stedolan/events{/privacy}",
    "repos_url": "https://api.github.com/users/stedolan/repos",
    "organizations_url": "https://api.github.com/users/stedolan/orgs",
    "subscriptions_url": "https://api.github.com/users/stedolan/subscriptions",
    "starred_url": "https://api.github.com/users/stedolan/starred{/owner}{/repo}",
    "gists_url": "https://api.github.com/users/stedolan/gists{/gist_id}",
    "login": "stedolan",
    "id": 79765,
    "avatar_url": "https://1.gravatar.com/avatar/31de909d8e55dd07ed782d92ece59842?d=https%3A%2F%2Fidenticons.github.com%2Ffc5b6765b1c9cfaecea48ae71df4d279.png",
    "gravatar_id": "31de909d8e55dd07ed782d92ece59842",
    "url": "https://api.github.com/users/stedolan",
    "html_url": "https://github.com/stedolan",
    "followers_url": "https://api.github.com/users/stedolan/followers",
    "following_url": "https://api.github.com/users/stedolan/following{/other_user}"
  },
  "author": {
    "type": "User",
    "received_events_url": "https://api.github.com/users/stedolan/received_events",
    "events_url": "https://api.github.com/users/stedolan/events{/privacy}",
    "repos_url": "https://api.github.com/users/stedolan/repos",
    "organizations_url": "https://api.github.com/users/stedolan/orgs",
    "subscriptions_url": "https://api.github.com/users/stedolan/subscriptions",
    "starred_url": "https://api.github.com/users/stedolan/starred{/owner}{/repo}",
    "gists_url": "https://api.github.com/users/stedolan/gists{/gist_id}",
    "login": "stedolan",
    "id": 79765,
    "avatar_url": "https://1.gravatar.com/avatar/31de909d8e55dd07ed782d92ece59842?d=https%3A%2F%2Fidenticons.github.com%2Ffc5b6765b1c9cfaecea48ae71df4d279.png",
    "gravatar_id": "31de909d8e55dd07ed782d92ece59842",
    "url": "https://api.github.com/users/stedolan",
    "html_url": "https://github.com/stedolan",
    "followers_url": "https://api.github.com/users/stedolan/followers",
    "following_url": "https://api.github.com/users/stedolan/following{/other_user}"
  },
  "comments_url": "https://api.github.com/repos/stedolan/jq/commits/d25341478381063d1c76e81b3a52e0592a7c997f/comments",
  "html_url": "https://github.com/stedolan/jq/commit/d25341478381063d1c76e81b3a52e0592a7c997f",
  "url": "https://api.github.com/repos/stedolan/jq/commits/d25341478381063d1c76e81b3a52e0592a7c997f",
  "commit": {
    "comment_count": 0,
    "url": "https://api.github.com/repos/stedolan/jq/git/commits/d25341478381063d1c76e81b3a52e0592a7c997f",
    "tree": {
      "url": "https://api.github.com/repos/stedolan/jq/git/trees/6ab697a8dfb5a96e124666bf6d6213822599fb40",
      "sha": "6ab697a8dfb5a96e124666bf6d6213822599fb40"
    },
    "message": "Merge pull request #162 from stedolan/utf8-fixes\n\nUtf8 fixes. Closes #161",
    "committer": {
      "date": "2013-06-22T16:30:59Z",
      "email": "mu@netsoc.tcd.ie",
      "name": "Stephen Dolan"
    },
    "author": {
      "date": "2013-06-22T16:30:59Z",
      "email": "mu@netsoc.tcd.ie",
      "name": "Stephen Dolan"
    }
  },
  "sha": "d25341478381063d1c76e81b3a52e0592a7c997f"
}

For the rest of the examples, I’ll leave out the curl command - it’s not going to change.

There’s still a lot of info we don’t care about there, so we’ll restrict it down to the most interesting fields.

jq '.[0] | {message: .commit.message, name: .commit.committer.name}'
Show result
{
  "name": "Stephen Dolan",
  "message": "Merge pull request #162 from stedolan/utf8-fixes\n\nUtf8 fixes. Closes #161"
}

The | operator in jq feeds the output of one filter (.[0] which gets the first element of the array in the response) into the input of another ({...} which builds an object out of those fields). You can access nested attributes, such as .commit.message:

Now let’s get the rest of the commits:

jq '.[] | {message: .commit.message, name: .commit.committer.name}'
Show result
{
  "name": "Stephen Dolan",
  "message": "Merge pull request #162 from stedolan/utf8-fixes\n\nUtf8 fixes. Closes #161"
}
{
  "name": "Stephen Dolan",
  "message": "Reject all overlong UTF8 sequences."
}
{
  "name": "Stephen Dolan",
  "message": "Fix various UTF8 parsing bugs.\n\nIn particular, parse bad UTF8 by replacing the broken bits with U+FFFD\nand resychronise correctly after broken sequences."
}
{
  "name": "Stephen Dolan",
  "message": "Fix example in manual for `floor`. See #155."
}
{
  "name": "Nicolas Williams",
  "message": "Document floor"
}

.[] returns each element of the array returned in the response, one at a time, which are all fed into {message: .commit.message, name: .commit.committer.name}.

Data in jq is represented as streams of JSON values - every jq expression runs for each value in its input stream, and can produce any number of values to its output stream.

Streams are serialised by just separating JSON values with whitespace. This is a cat-friendly format - you can just join two JSON streams together and get a valid JSON stream.

If you want to get the output as a single array, you can tell jq to “collect” all of the answers by wrapping the filter in square brackets:

jq '[.[] | {message: .commit.message, name: .commit.committer.name}]'
Show result
[
  {
    "name": "Stephen Dolan",
    "message": "Merge pull request #162 from stedolan/utf8-fixes\n\nUtf8 fixes. Closes #161"
  },
  {
    "name": "Stephen Dolan",
    "message": "Reject all overlong UTF8 sequences."
  },
  {
    "name": "Stephen Dolan",
    "message": "Fix various UTF8 parsing bugs.\n\nIn particular, parse bad UTF8 by replacing the broken bits with U+FFFD\nand resychronise correctly after broken sequences."
  },
  {
    "name": "Stephen Dolan",
    "message": "Fix example in manual for `floor`. See #155."
  },
  {
    "name": "Nicolas Williams",
    "message": "Document floor"
  }
]

Next, let’s try getting the URLs of the parrent commits out of the API results as well. In each commit, the GitHub API includes information about “parent” commits. There can be one or many.

"parents": [
  {
    "html_url": "https://github.com/stedolan/jq/commit/54b9c9bdb225af5d886466d72f47eafc51acb4f7",
    "url": "https://api.github.com/repos/stedolan/jq/commits/54b9c9bdb225af5d886466d72f47eafc51acb4f7",
    "sha": "54b9c9bdb225af5d886466d72f47eafc51acb4f7"
  },
  {
    "html_url": "https://github.com/stedolan/jq/commit/8b1b503609c161fea4b003a7179b3fbb2dd4345a",
    "url": "https://api.github.com/repos/stedolan/jq/commits/8b1b503609c161fea4b003a7179b3fbb2dd4345a",
    "sha": "8b1b503609c161fea4b003a7179b3fbb2dd4345a"
  }
]

We want to pull out all of the “html_url” fields inside that array of parent commits and make a simple list of strings to go along with the “message” and “author” fields we already have.

jq '[.[] | {message: .commit.message, name: .commit.committer.name, parents: [.parents[].html_url]}]'
Show result
[
  {
    "parents": [
      "https://github.com/stedolan/jq/commit/54b9c9bdb225af5d886466d72f47eafc51acb4f7",
      "https://github.com/stedolan/jq/commit/8b1b503609c161fea4b003a7179b3fbb2dd4345a"
    ],
    "name": "Stephen Dolan",
    "message": "Merge pull request #162 from stedolan/utf8-fixes\n\nUtf8 fixes. Closes #161"
  },
  {
    "parents": [
      "https://github.com/stedolan/jq/commit/ff48bd6ec538b01d1057be8e93b94eef6914e9ef"
    ],
    "name": "Stephen Dolan",
    "message": "Reject all overlong UTF8 sequences."
  },
  {
    "parents": [
      "https://github.com/stedolan/jq/commit/54b9c9bdb225af5d886466d72f47eafc51acb4f7"
    ],
    "name": "Stephen Dolan",
    "message": "Fix various UTF8 parsing bugs.\n\nIn particular, parse bad UTF8 by replacing the broken bits with U+FFFD\nand resychronise correctly after broken sequences."
  },
  {
    "parents": [
      "https://github.com/stedolan/jq/commit/3dcdc582ea993afea3f5503a78a77675967ecdfa"
    ],
    "name": "Stephen Dolan",
    "message": "Fix example in manual for `floor`. See #155."
  },
  {
    "parents": [
      "https://github.com/stedolan/jq/commit/7c4171d414f647ab08bcd20c76a4d8ed68d9c602"
    ],
    "name": "Nicolas Williams",
    "message": "Document floor"
  }
]

Here we’re making an object as before, but this time the parents field is being set to [.parents[].html_url]}], which collects all of the parent commit urls defined in the parents object.


Here endeth the tutorial! There’s lots more to play with, go read the manual if you’re interested, and download jq if you haven’t already.