Tips for productive DevOps workflows: JSON formatting with jq and CI/CD linting automation

Apr 21, 2021 · 26 min read
Michael Friedrich

What is JSON linting?

To understand JSON linting, let’s quickly break down the two concepts of JSON and linting.

JSON is an acronym for JavaScript Object Notation, a lightweight, text-based, open standard format for representing structured data based on JavaScript object syntax. It is most commonly used for transmitting data in web applications. It is typically faster to parse than XML and is easy for humans to read and write.

Linting is a process that automatically checks and analyzes static source code for programming and stylistic errors, bugs and suspicious constructs.

JSON has become popular because it is human-readable and doesn’t require a complete markup structure like XML. It is easy to parse into its logical syntactic components, especially in JavaScript, and JSON libraries exist for most programming languages.

Benefits of JSON linting

Finding an error in JSON code can be challenging and time-consuming. The best way to find and correct errors while saving time is to use a linting tool. When JSON code is copied and pasted into a linting editor, the editor validates and reformats the JSON. Such editors are easy to use and support a wide range of browsers, so developing applications that work with JSON does not require much effort to make them browser-compatible.

JSON linting is an efficient way to reduce errors and it improves the overall quality of the JSON code. This can help accelerate development and reduce costs because errors are discovered earlier.

Some common JSON linting errors

In instances where a JSON transaction fails, the error information is conveyed to the user by the API gateway. By default, the API gateway returns a very basic fault to the client when a message filter has failed.

One common JSON linting error is a parsing error. A "parse: unexpected character" error occurs when passing a value that is not a valid JSON string, for example a native JavaScript object, to the JSON.parse method. To solve the error, make sure to only pass valid JSON strings to JSON.parse.
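
The same kind of parse error can be reproduced outside of JavaScript. The following is a minimal sketch in Python, assuming only the standard library; the helper name lint_json is hypothetical and not part of any tool mentioned in this post.

```python
import json


def lint_json(text):
    """Return None if text is valid JSON, else a short error description."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno are the 1-based position reported by the parser.
        return f"{err.msg} (line {err.lineno}, column {err.colno})"
    return None


print(lint_json('{"name": "project1"}'))  # valid JSON string
print(lint_json("{'name': 'project1'}"))  # single quotes are not valid JSON
```

A return value of None means the input parsed cleanly; anything else is the parser's own message plus the position of the problem.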

Other common errors are NULL or inaccurate data, not using the right data type per column, not using the right extension for JSON files, and not ensuring that every row in the JSON table is in the JSON format.

How to fix JSON linting errors

If you encounter a NULL or inaccurate data error in parsing, the first step is to make sure you use the right data type per column. For example, in the case of “age,” use 12 instead of twelve.
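
A type check like this can be automated. The sketch below is one possible approach in Python, assuming a hand-written mapping of field names to expected types; EXPECTED_TYPES and find_type_errors are hypothetical names chosen for illustration.

```python
import json

# Hypothetical schema: field name -> expected Python type after parsing.
EXPECTED_TYPES = {"name": str, "age": int}


def find_type_errors(record, expected_types):
    """Return the field names whose values are missing or have the wrong type."""
    errors = []
    for field, expected in expected_types.items():
        value = record.get(field)
        # bool is a subclass of int in Python, so reject it for non-bool fields.
        is_stray_bool = isinstance(value, bool) and expected is not bool
        if is_stray_bool or not isinstance(value, expected):
            errors.append(field)
    return errors


good = json.loads('{"name": "Ada", "age": 12}')
bad = json.loads('{"name": "Ada", "age": "twelve"}')
print(find_type_errors(good, EXPECTED_TYPES))
print(find_type_errors(bad, EXPECTED_TYPES))
```

A null value parses to None in Python, so it is reported the same way as a wrong type.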

Also make sure you are using the right extension for JSON files. A compressed JSON file must end with “.json” followed by the extension of the compression format, such as “.gz”.

Next, make sure the JSON format is used for every row in the JSON table. Create a table with a delimiter that is not in the input files. Then run a query that returns the file name, row details, and file path for the null JSON rows.
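
If the rows are available as plain text lines, the per-row check can also be sketched in Python. This is an illustration under the assumption that each row is meant to be one JSON object per line; find_invalid_rows and the sample rows are hypothetical.

```python
import json


def find_invalid_rows(lines):
    """Return (row_number, reason) for every row that is not a JSON object."""
    problems = []
    for number, line in enumerate(lines, start=1):
        try:
            row = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append((number, err.msg))
            continue
        if not isinstance(row, dict):
            problems.append((number, "row is not a JSON object"))
    return problems


# Hypothetical sample rows: the second is plain text and the third is null.
rows = ['{"age": 12}', "age=12", "null"]
for number, reason in find_invalid_rows(rows):
    print(f"row {number}: {reason}")
```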

Sometimes you may find files that are not your source code files but are generated by the system when compiling your project. When such a file has a .js extension, ESLint needs to exclude it when searching for errors. One way to do this is to use “ignorePatterns” in the .eslintrc.json file, either before or after the “rules” tag.

"ignorePatterns": ["temp.js", "**/vendor/*.js"],

"rules": {

Alternatively, you can create a separate file named .eslintignore and list the files to be excluded, for example:

**/*.js

If you opt to correct errors instead of ignoring them, look for the error code in the last column. Correct all the errors of that type in one file, rerun npx eslint . >errfile, and ensure all the errors of that type are cleared. Then look for the next error code and repeat the procedure until all errors are cleared.

Of course, there will be instances when you won’t understand an error. In that case, open the ESLint documentation and type the error code in the Search field at the top of the page. There you will find detailed instructions on why that error is raised and how to fix it.

Finally, you can forcibly fix errors automatically while generating the error list using:

npx eslint . --fix

This is not recommended until you become more well-versed with lint errors and how to fix them. Also, you should keep a backup of the files you are linting because while fixing errors, certain code may get overwritten, which could cause your program to fail.

JSON linting best practices

Here are some tips for helping your consumers use your output:

First, always enclose keys and string values in double quotes. It may be convenient to generate JSON with single quotes, but JSON parsers don’t accept JSON objects with single quotes.

Numerical values are written without quotes. Enclosing them in double quotes turns them into strings, so only do this when your consumers expect string values.

Next, avoid hyphens in your key fields because they can break field access in Python and Scala parsers. Use underscores (_) instead.

It’s a good idea to always create a root element, especially when you’re creating a complicated JSON.
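
Some of these tips are easy to verify with a quick experiment. The sketch below uses Python's standard json module; the payload and its key names are hypothetical and only serve as an illustration.

```python
import json

# Hypothetical payload with a root element and underscore-separated keys.
payload = {"project": {"project_name": "demo", "star_count": 3}}

# json.dumps always emits double-quoted keys and strings.
encoded = json.dumps(payload)
print(encoded)

# Quoting a number changes its type for the consumer: "3" parses as a string.
print(type(json.loads("3")).__name__, type(json.loads('"3"')).__name__)

# Single-quoted input is rejected by a strict JSON parser.
try:
    json.loads("{'project_name': 'demo'}")
    single_quotes_ok = True
except json.JSONDecodeError:
    single_quotes_ok = False
print(single_quotes_ok)
```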

Modern web applications come with a REST API which returns JSON. The format needs to be parsed, and often feeds into scripts and service daemons polling the API for automation.

Starting with a new REST API and its endpoints can often be overwhelming. Documentation may suggest looking into a set of SDKs and libraries for various languages, or instruct you to use curl or wget on the CLI to send a request. Both CLI tools come with a variety of parameters which help to download and print the response string, for example in JSON format.

The response string retrieved from curl may get long and confusing. It can require parsing the JSON format and filtering for a smaller subset of results. This helps with viewing the results on the CLI, and minimizes the data to process in scripts. The following example retrieves all projects from GitLab and returns a paginated result set with the first 20 projects:

$ curl ""

Raw JSON as API response

The GitLab REST API documentation guides you through the first steps with error handling and authentication. In this blog post, we will be using the Personal Access Token as the authentication method. Alternatively, you can use project access tokens for automated authentication that avoids the use of personal credentials.

REST API authentication

Not all endpoints are accessible anonymously, so some require authentication. Try fetching user profile data with this request:

$ curl ""
{"message":"401 Unauthorized"}

The API request against the /user endpoint requires passing the personal access token into the request, for example as a request header. To avoid exposing credentials on the terminal, you can export the token and its value into the user's environment. You can automate the variable export with ZSH and the .env plugin in your shell environment. You can also source the .env once in the existing shell environment.

$ vim ~/.env

export GITLAB_TOKEN="..."

$ source ~/.env

Scripts and commands run in your shell environment can reference the $GITLAB_TOKEN variable. Try querying the user API endpoint again, adding the authorization header to the request:

$ curl -H "Authorization: Bearer $GITLAB_TOKEN" ""
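
The same header can be set from a script. The following is a minimal sketch with Python's standard urllib, assuming the token is exported as GITLAB_TOKEN; the host in API_URL is a placeholder and build_request is a hypothetical helper name.

```python
import os
import urllib.request

# Placeholder URL for illustration only; use your own instance's endpoint.
API_URL = "https://gitlab.example.com/api/v4/user"


def build_request(url, token):
    """Create a GET request that carries the token as a Bearer header."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Bearer {token}")
    return request


# Read the token from the environment instead of hard-coding it.
token = os.environ.get("GITLAB_TOKEN", "")
request = build_request(API_URL, token)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```

The request is only prepared here, not sent, so the sketch runs without network access.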

A reminder that only administrators can see the attributes of all users, and the individual can only see their user profile – for example, email is hidden from the public domain.

How to request responses in JSON

The GitLab API provides many resources and URL endpoints. You can manage almost anything with the API that you’d otherwise configure using the graphical user interface.

After sending the API request, the response message contains the body as string, for example as a JSON content type. curl can provide more information about the response headers which is helpful for debugging. Multiple verbose levels enable the full debug output with -vvv:

$ curl -vvv ""
* SSL connection using TLSv1.2 / ECDHE-RSA-CHACHA20-POLY1305
* ALPN, server accepted to use h2
* Server certificate:
*  subject:
*  start date: Jan 21 00:00:00 2021 GMT
*  expire date: May 11 23:59:59 2021 GMT
*  subjectAltName: host "" matched cert's ""
*  issuer: C=GB; ST=Greater Manchester; L=Salford; O=Sectigo Limited; CN=Sectigo RSA Domain Validation Secure Server CA
*  SSL certificate verify ok.
> GET /api/v4/projects HTTP/2
> Host:
> User-Agent: curl/7.64.1
> Accept: */*
< HTTP/2 200
< date: Mon, 19 Apr 2021 11:25:31 GMT
< content-type: application/json
[{"id":25993690,"description":"project for adding issues","name":"project-for-issues-1e1b6d5f938fb240","name_with_namespace":"gitlab-qa-sandbox-group / qa-test-2021-04-19-11-13-01-d7d873fd43cd34b6 / project-for-issues-1e1b6d5f938fb240","path":"project-for-issues-1e1b6d5f938fb240","path_with_namespace":"gitlab-qa-sandbox-group/qa-test-2021-04-19-11-13-01-d7d873fd43cd34b6/project-for-issues-1e1b6d5f938fb240"

[... JSON content ...]

* Closing connection 0

The curl command output provides helpful insights into TLS ciphers and versions, the request lines starting with > and response lines starting with <. The response body string is encoded as JSON.

How to see the structure of the returned JSON

To get a quick look at the structure of the returned JSON file, try these tips:

The values in JSON consist of specific types - a string value is put in double-quotes. Boolean true/false, numbers, and floating-point numbers are also present as types. If a key exists but its value is not set, REST APIs often return null.
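
To see how these types arrive in a program, here is a small sketch with Python's json module. The sample text is a hypothetical response fragment, not real API output.

```python
import json

# Hypothetical response fragment covering the JSON value types.
text = (
    '{"name": "project1", "archived": false, "star_count": 3, '
    '"score": 0.5, "description": null}'
)

data = json.loads(text)
for key, value in data.items():
    # null becomes None, true/false become bool, numbers become int or float.
    print(f"{key}: {type(value).__name__}")
```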

Verify the data structure by running "linters". Python's JSON module can parse and lint JSON strings. The example below misses a closing square bracket to showcase the error:

$ echo '[{"key": "broken"}' | python -m json.tool
Expecting object: line 1 column 19 (char 18)

jq – a lightweight and flexible CLI processor – can be used as a standalone tool to parse and validate JSON data.

$ echo '[{"key": "broken"}' | jq
parse error: Unfinished JSON term at EOF at line 2, column 0

jq is available in the package managers of most operating systems.

$ brew install jq
$ apt install jq
$ dnf install jq
$ zypper in jq
$ pacman -S jq
$ apk add jq

Dive deep into JSON data structures

The true power of jq lies in how it can be used to parse JSON data:

jq is like sed for JSON data. It can be used to slice, filter, map, and transform structured data with the same ease that sed, awk, grep etc., let you manipulate text.

The output below shows how it looks to run the request against the project API again, but this time, the output is piped to jq.

$ curl "" | jq
    "id": 25994891,
    "description": "...",
    "name": "...",


    "forks_count": 0,
    "star_count": 0,
    "last_activity_at": "2021-04-19T11:50:24.292Z",
    "namespace": {
      "id": 11528141,
      "name": "...",



The first difference is the format of the JSON data structure, which is now pretty-printed. New lines and indents in data structure scopes help your eyes and allow you to identify the inner and outer data structures involved. This format is needed to determine which jq filters and methods you want to apply next.
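
Pretty-printing is not specific to jq. As a rough equivalent, the sketch below re-indents a compact string with Python's json module; the sample data is hypothetical.

```python
import json

# Hypothetical compact response body.
compact = '[{"id":1,"name":"project1","namespace":{"id":7,"name":"demo"}}]'

# indent=2 puts every key on its own line and indents nested scopes.
pretty = json.dumps(json.loads(compact), indent=2)
print(pretty)
```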

About arrays and dictionaries

The set of results from an API often is returned as a list (or "array") of items. An item itself can be a single value or a JSON object. The following example mimics the response from the GitLab API and creates an array of dictionaries as a nested result set.

$ vim result.json
[
  {
    "id": 1,
    "name": "project1"
  },
  {
    "id": 2,
    "name": "project2"
  },
  {
    "id": 3,
    "name": "project-internal-dev",
    "namespace": {
      "name": "🦊"
    }
  }
]

Use cat to print the file content on stdout and pipe it into jq. The outer data structure is an array – use -c .[] to access and print all items.

$ cat result.json | jq -c '.[]'
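
For comparison, the same iteration can be sketched in Python. The data is inlined as RESULT_JSON (a name chosen for this illustration) so that the snippet is self-contained; it mimics the example array of dictionaries.

```python
import json

# Inlined copy of the example data: an array of dictionaries.
RESULT_JSON = """
[
  {"id": 1, "name": "project1"},
  {"id": 2, "name": "project2"},
  {"id": 3, "name": "project-internal-dev", "namespace": {"name": "🦊"}}
]
"""

items = json.loads(RESULT_JSON)

# Compact separators give one-line output per item, similar to jq -c.
lines = [
    json.dumps(item, separators=(",", ":"), ensure_ascii=False) for item in items
]
for line in lines:
    print(line)
```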

How to filter data structures with jq

Filter items by passing | select (...) to jq. The filter takes a lambda callback function as a comparator condition. When the item matches the condition, it is returned to the caller.

Use the dot indexer . to access dictionary keys and their values. Try to filter for all items where the name is project2:

$ cat result.json | jq -c '.[] | select (.name == "project2")'

Practice this example by selecting the id with the value 2 instead of the name.
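
The idea of passing a condition as a callback carries over to other languages. Here is a hedged Python sketch of the same filter, using a hypothetical select helper and an inlined copy of the example data.

```python
import json

RESULT_JSON = """
[
  {"id": 1, "name": "project1"},
  {"id": 2, "name": "project2"},
  {"id": 3, "name": "project-internal-dev", "namespace": {"name": "🦊"}}
]
"""


def select(items, predicate):
    """Keep the items for which the predicate returns True."""
    return [item for item in items if predicate(item)]


items = json.loads(RESULT_JSON)
by_name = select(items, lambda item: item.get("name") == "project2")
by_id = select(items, lambda item: item.get("id") == 2)
print(by_name)
print(by_id)
```

Both calls return the same single item, once matched by name and once by id.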

Filter with matching a string

During tests, you may need to match different patterns instead of knowing the full name. Think of projects that match a specific path or are located in a group where you only know the prefix. Simple string matches can be achieved with the | contains (...) function. It allows you to check whether the given string is inside the target string – which requires the selected attribute to be of the string type.

For a filter with the select chain, the comparison condition needs to be changed from the equal operator == to checking the attribute .name with | contains ("dev").

$ cat result.json | jq -c '.[] | select (.name | contains ("dev") )'

Simple matches can be achieved with the contains function.
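
A substring match looks similar in Python. The sketch below assumes the same example data, inlined here, and checks that the attribute is a string before testing it.

```python
import json

RESULT_JSON = """
[
  {"id": 1, "name": "project1"},
  {"id": 2, "name": "project2"},
  {"id": 3, "name": "project-internal-dev", "namespace": {"name": "🦊"}}
]
"""

items = json.loads(RESULT_JSON)

# The isinstance check guards against missing or non-string names.
matches = [
    item
    for item in items
    if isinstance(item.get("name"), str) and "dev" in item["name"]
]
print(matches)
```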

Filter with matching regular expressions

For advanced string pattern matching, it is recommended to use regular expressions. jq provides the test function for this use case. Try to filter for all projects which end with a number, represented by \d+. Note that the backslash \ needs to be escaped as \\ because the pattern is written as a jq string literal. ^ tests for the beginning of the string, $ is the ending check.

$ cat result.json | jq -c '.[] | select (.name | test ("^project\\d+$") )'

Tip: You can test and build the regular expression with regex101 before test-driving it with jq.
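
The same pattern can be tried in Python, where a raw string avoids the double backslash. This is a sketch on an inlined copy of the example data, not a replacement for the jq filter.

```python
import json
import re

RESULT_JSON = """
[
  {"id": 1, "name": "project1"},
  {"id": 2, "name": "project2"},
  {"id": 3, "name": "project-internal-dev", "namespace": {"name": "🦊"}}
]
"""

# Raw string: the backslash in \d reaches the regex engine unchanged.
pattern = re.compile(r"^project\d+$")

items = json.loads(RESULT_JSON)
matches = [item["name"] for item in items if pattern.search(item["name"])]
print(matches)
```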

Access nested values

Key-value pairs in a dictionary may have a dictionary or array as a value. jq filters need to take this factor into account when filtering or transforming the result. The example data structure provides project-internal-dev which has the key namespace and a value of a dictionary type.

  {
    "id": 3,
    "name": "project-internal-dev",
    "namespace": {
      "name": "🦊"
    }
  }

jq defines a sort order across types in which null sorts before dictionaries. This allows the empty dictionary {} to be used in select chains with greater-than and less-than comparisons. The >={} comparison selects items whose namespace attribute is a dictionary, while the <={} comparison selects items where it is null (or an empty dictionary).

$ cat result.json | jq -c '.[] | select (.namespace >={} )'

$ cat result.json | jq -c '.[] | select (.namespace <={} )'

These methods can be used to access the name attribute of the namespace, but only if the namespace contains values. Tip: You can chain multiple jq calls by piping the result into another jq call. .name is a subkey of the primary .namespace key.

$ cat result.json | jq -c '.[] | select (.namespace >={} )' | jq -c ''

The additional select command for non-empty namespaces ensures that only initialized values are returned. This is a safety check and avoids null values in the result, which you would otherwise need to filter again.

$ cat result.json| jq -c '.[]' | jq -c ''

By using the additional check with | select (.namespace >={} ), you only get the expected results and do not have to filter empty null values.
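
The difference between the guarded and unguarded access can be illustrated in Python as well. The sketch below is an approximation on an inlined copy of the example data; it reads the name key of the namespace dictionary, which is an assumption made for this illustration.

```python
import json

RESULT_JSON = """
[
  {"id": 1, "name": "project1"},
  {"id": 2, "name": "project2"},
  {"id": 3, "name": "project-internal-dev", "namespace": {"name": "🦊"}}
]
"""

items = json.loads(RESULT_JSON)

# Guarded: only items whose namespace is a dictionary are considered.
guarded = [
    item["namespace"].get("name")
    for item in items
    if isinstance(item.get("namespace"), dict)
]

# Unguarded: items without a namespace yield None and need filtering later.
unguarded = [(item.get("namespace") or {}).get("name") for item in items]

print(guarded)
print(unguarded)
```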

How to expand the GitLab endpoint response

Save the result from the API projects call and retry the examples above with jq.

$ curl "" -o result.json >/dev/null 2>&1

Validate CI/CD YAML with jq for Git hooks

While writing this blog post, I learned that you can escape and encode YAML into JSON with jq. This trick comes in handy when automating YAML linting on the CLI, for example as a Git pre-commit hook.

Let’s take a look at the simplest way to test GitLab CI/CD from our community meetup workshops. A common mistake with the first steps of the process can be missing the two spaces indent or missing whitespace between the dash and following command. The following examples use .gitlab-ci.error.yml as a filename to showcase errors and .gitlab-ci.main.yml for working examples.

$ vim .gitlab-ci.error.yml

image: alpine:latest

test:
script:
  -exit 1

Committing the change and waiting for the CI/CD pipeline to validate at runtime can be time-consuming. The GitLab API provides a resource endpoint /ci/lint. A POST request with JSON-encoded YAML content will return a linting result faster.
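
To make "JSON-encoded YAML content" more concrete, here is a minimal sketch in Python. It assumes the endpoint expects the YAML under a content key, as the jq examples in this post do; LINT_URL uses a placeholder host, and the small configuration is inlined for illustration.

```python
import json
import urllib.request

# Placeholder host for illustration; point this to your GitLab instance.
LINT_URL = "https://gitlab.example.com/api/v4/ci/lint"

# Small CI/CD configuration, inlined here instead of reading a file.
yaml_text = "image: alpine:latest\n\ntest:\n  script:\n    - exit 1\n"

# json.dumps escapes newlines and quotes so the YAML travels as one JSON string.
payload = json.dumps({"content": yaml_text})

request = urllib.request.Request(
    LINT_URL,
    data=payload.encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(payload)
# Sending it would be: urllib.request.urlopen(request)
```

The request is prepared but not sent, so the snippet runs without network access.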

Parse CI/CD YAML into JSON with jq

You can use jq to encode the raw YAML string as a JSON string:

$ jq --raw-input --slurp < .gitlab-ci.error.yml
"image: alpine:latest\n\ntest:\nscript:\n  -exit 1\n"

The /ci/lint API endpoint requires a JSON dictionary with content as key, and the raw YAML string as a value. You can use jq to format the input by using the arg parser:

$ jq --null-input --arg yaml "$(<.gitlab-ci.error.yml)" '.content=$yaml'
{
  "content": "image: alpine:latest\n\ntest:\nscript:\n  -exit 1"
}

Send POST request to /ci/lint

The next building block is to send a POST request to the /ci/lint endpoint. The request needs to specify the Content-Type header for the body. Using the pipe | character, the JSON-encoded YAML configuration is fed into the curl command call.

$ jq --null-input --arg yaml "$(<.gitlab-ci.error.yml)" '.content=$yaml' \
| curl "" \
--header 'Content-Type: application/json' --data @-
{"status":"invalid","errors":["jobs test config should implement a script: or a trigger: keyword","jobs script config should implement a script: or a trigger: keyword","jobs config should contain at least one visible job"],"warnings":[],"merged_yaml":"---\nimage: alpine:latest\ntest: \nscript: \"-exit 1\"\n"}

The CLI command returns JSON output. You can use jq again to format the response in a more readable way.

$ jq --null-input --arg yaml "$(<.gitlab-ci.error.yml)" '.content=$yaml' \
| curl "" \
--header 'Content-Type: application/json' --data @- \
| jq --raw-output '.errors'
[
  "jobs test config should implement a script: or a trigger: keyword",
  "jobs script config should implement a script: or a trigger: keyword",
  "jobs config should contain at least one visible job"
]

Expanded CI/CD configuration

When you are using GitLab 13.8+ and the pipeline editor, the API endpoint also includes the merged YAML output for further processing. This feature has a limitation: Remote includes work while other include types do not. Push the changes to the repository in a draft MR and trigger a remote full lint as an alternative.

The example below shows CI/CD job templates with extends.

$ vim .gitlab-ci.main.yml

  image: alpine:latest
    BUILD_TYPE: "Debug"
    - echo "Hello from GitLab 🦊"

  extends: .job-tmpl

  extends: .job-tmpl
    BUILD_TYPE: "Release"
    - echo "Hello from GitLab 🦊🌈"

Validate and extract the .merged_yaml attribute by sending the YAML config to the GitLab API.

$ jq --null-input --arg yaml "$(<.gitlab-ci.main.yml)" '.content=$yaml' \
| curl "" \
--header 'Content-Type: application/json' --data @- \
| jq --raw-output '.merged_yaml'
  image: alpine:latest
    BUILD_TYPE: Debug
  - "echo \"Hello from GitLab \U0001F98A\""
  image: alpine:latest
    BUILD_TYPE: Debug
  - "echo \"Hello from GitLab \U0001F98A\""
  extends: ".job-tmpl"
  image: alpine:latest
    BUILD_TYPE: Release
  - "echo \"Hello from GitLab \U0001F98A\U0001F308\""
  extends: ".job-tmpl"

Do more with jq

You can use the CI lint command for your own ideas. For example: Wrapping it in a Git pre-commit hook which triggers an API call to /ci/lint on your GitLab host. Make sure to edit the variables fitting your environment. In this case, GITLAB_URL needs to point to your self-managed instance.

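$ vim

#!/usr/bin/env bash

# Placeholder: point GITLAB_URL to your GitLab instance.
GITLAB_URL="https://gitlab.example.com"
# Assumption: the lint endpoint lives under the v4 API path.
GITLAB_CI_LINT_URL="${GITLAB_URL}/api/v4/ci/lint"
GITLAB_CI_YML_CONTENT=$(<.gitlab-ci.yml)

# Collect each lint error returned by the API into an array.
errors=()
while read -r value; do
        errors+=("${value}")
done < <(jq --null-input --arg yaml "${GITLAB_CI_YML_CONTENT}" '.content=$yaml' \
| curl "${GITLAB_CI_LINT_URL}?include_merged_yaml=true" \
--header 'Content-Type: application/json' --data @- --silent \
| jq --raw-output '.errors' | jq -c '.[]')

echo -e "Analysing CI/CD config lint results ..."

count_err=${#errors[@]}

for error in "${errors[@]}"; do
        echo "${error}"
done

if [[ $count_err -gt 0 ]]; then
        echo -e "GitLab CI/CD linting errors found. Aborting."
        exit 1
else
        echo -e "GitLab CI/CD linting ok."
        exit 0
fi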

Save the file and make it executable with chmod.

$ chmod +x

When the script is run with the working .gitlab-ci.main.yml file, the output looks like this:

$ rm .gitlab-ci.yml
$ ln -s .gitlab-ci.main.yml .gitlab-ci.yml

$ ./
Analysing CI/CD config lint results ...
GitLab CI/CD linting ok.

If you change the symlink to the .gitlab-ci.error.yml file and run the script again you can see the error and exit code:

$ rm .gitlab-ci.yml
$ ln -s .gitlab-ci.error.yml .gitlab-ci.yml

$ ./
Analysing CI/CD config lint results ...
"jobs test config should implement a script: or a trigger: keyword"
"jobs script config should implement a script: or a trigger: keyword"
"jobs config should contain at least one visible job"
GitLab CI/CD linting errors found. Aborting.
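
The decision logic of the hook (collect the errors, print them, exit non-zero when any exist) can also be sketched in Python. The response body below is copied from the example output earlier in this post, with merged_yaml left out for brevity; lint_errors is a hypothetical helper name.

```python
import json

# Response body from the earlier example (merged_yaml omitted for brevity).
response = (
    '{"status":"invalid","errors":['
    '"jobs test config should implement a script: or a trigger: keyword",'
    '"jobs script config should implement a script: or a trigger: keyword",'
    '"jobs config should contain at least one visible job"],"warnings":[]}'
)


def lint_errors(body):
    """Return the list of errors from a /ci/lint response body."""
    result = json.loads(body)
    # Treat a missing "errors" key as no errors.
    return result.get("errors", [])


errors = lint_errors(response)
for error in errors:
    print(error)

# A Git hook would pass this value to sys.exit().
exit_code = 1 if errors else 0
print(exit_code)
```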

The Git Hook is located in the CI/CD API lint hook repository in the Developer Evangelism group.

Git hook with CI/CD YAML linting with the GitLab API

Use cases for programmatic API Clients

Sometimes shell programming cannot solve a requirement or a specific language integration is required for communicating with the API. Our community provides awesome API clients for many different programming languages.

Status and error handling

The GitLab API is designed to return different status codes depending on the context and request. The HTTP response headers and response body indicate possible errors, and API clients provide a programmatic interface to them.
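
As a small illustration of handling status codes programmatically, the sketch below classifies a code with Python's standard http module. The helper describe_status is hypothetical, and the grouping into success, client error and server error follows general HTTP conventions rather than anything specific to GitLab.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a readable description for a standard HTTP status code."""
    # HTTPStatus raises ValueError for codes it does not know.
    status = HTTPStatus(code)
    if 200 <= code < 300:
        kind = "success"
    elif 400 <= code < 500:
        kind = "client error"
    elif code >= 500:
        kind = "server error"
    else:
        kind = "other"
    return f"{code} {status.phrase} ({kind})"


print(describe_status(200))
print(describe_status(401))
```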

Large result sets and pagination

The REST API can return a lot of results, which stresses both the server and the client on each new request. Returning a smaller subset of results, a page with a defined number of results, limits the response size and helps save resources. This is called "pagination" in the context of a REST API.

Pagination is enabled by default for the GitLab API. It requires you to fetch multiple pages to retrieve a full result set. The Link headers specify the next/previous page to follow.

Parsing the response header with Bash and jq can get complicated and is prone to error. Programming languages like Python, Perl, etc., provide abstract interfaces for HTTP requests and responses, header parsing and error handling. API client libraries are available that provide full support for pagination in a few lines of code.
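
To give an idea of what such header parsing involves, here is a simplified sketch in Python. It assumes a Link header in the common format of comma-separated entries with a rel attribute; the header value and its URLs are hypothetical, and parse_link_header is a name chosen for this illustration.

```python
import re


def parse_link_header(value):
    """Map each rel name (for example next or first) to its URL."""
    links = {}
    # Simplification: splitting on commas assumes the URLs contain none.
    for part in value.split(","):
        match = re.match(r'\s*<([^>]+)>\s*;\s*rel="([^"]+)"', part)
        if match:
            links[match.group(2)] = match.group(1)
    return links


# Hypothetical header value; real URLs depend on your instance and query.
header = (
    '<https://gitlab.example.com/api/v4/projects?page=2&per_page=20>; rel="next", '
    '<https://gitlab.example.com/api/v4/projects?page=1&per_page=20>; rel="first"'
)

links = parse_link_header(header)
print(links.get("next"))
```

A client would follow the next URL until the header no longer contains one. API client libraries hide this loop behind an iterator.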

The monitoring scripts for Docker Hub rate limits use a similar approach in Python where parsing the response headers is required to determine the rate limit values.

The following code provides an example with pagination using the python-gitlab library and works with Python 3:

$ vim requirements.txt


$ pip3 install -r requirements.txt

$ vim

#!/usr/bin/env python

import gitlab
import os


# Prefer keyset pagination
gl = gitlab.Gitlab(SERVER, private_token=os.environ['GITLAB_TOKEN'], pagination="keyset", order_by="id", per_page=100)

# Iterate over the list, and fire new API calls in case the result set does not match yet
groups = gl.groups.list(as_list=False)

found_page = 0

for group in groups:
    if GROUP_NAME in
        found_page = groups.current_page

print("Pagination API example for Python with %s %s - result on page %d" % ("GitLab", "🦊", found_page))

Run the script with the Python interpreter shown below. Adjust the python as needed for your environment.

$ python3

Pagination API example for Python with GitLab 🦊 - result on page 5

The full code example can be found in my API playground repository.

What's next?

Programming language libraries and SDKs provide abstractions for requests, response, and error handling. Depending on the use case, language libraries and SDKs can help with tests and code quality and be used instead of CLI calls. CLI, curl, and jq are a great combination to quickly test the response on a remote server shell. There are many more API endpoints and tips and tricks beyond what is described in this blog post. Read the posts below to learn more about API endpoint strategies.

What cool API integrations have you built with jq and/or a programming language (library)? Tweet your favorites to @dnsmichi @gitlab :)

Cover image by Gert Boers on Unsplash

“Learn JSON superpowers with jq, the @GitLab API & automated CI/CD Linting” – Michael Friedrich
