How to parse a JSON file in JavaScript?

What Is JSON And Why Parse It?

JSON stands for "JavaScript Object Notation". It is one of the most popular formats for storing and exchanging data as key-value pairs, which may be nested or arranged in lists. For many applications that work with data, including web scraping, it is important to be able to write and parse data in the JSON format.

Here is a sample JSON string:

{
    "name": "John Doe",
    "age": 32,
    "address": {
        "street": "123 Main St",
        "city": "Anytown",
        "state": "CA"
    }
}
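
Going the other way, JSON.stringify() serializes a JavaScript object into a JSON string. As a minimal sketch, the sample above can be produced from a plain object:

```javascript
// a plain JavaScript object mirroring the sample JSON above
const person = {
  name: 'John Doe',
  age: 32,
  address: {
    street: '123 Main St',
    city: 'Anytown',
    state: 'CA'
  }
};

// the third argument controls indentation (4 spaces here)
const json = JSON.stringify(person, null, 4);
console.log(json);
```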

How To Read A JSON File?

In JavaScript, you can parse a JSON string using the JSON.parse() method. A JSON file is essentially a text file containing a JSON string. Therefore, to read a JSON file, you first need to read the file as a string and then parse it into an object that contains key-value pairs.
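
As a quick illustration, JSON.parse() turns a JSON string into a live object whose values can be accessed as regular properties. It also accepts an optional reviver function that can transform values while parsing:

```javascript
const jsonString = '{"name": "John Doe", "age": 32}';

// parse the string into a JavaScript object
const obj = JSON.parse(jsonString);
console.log(obj.name); // access parsed values as regular properties

// the optional reviver is called for every key-value pair;
// here it upper-cases all string values
const upper = JSON.parse(jsonString, (key, value) =>
  typeof value === 'string' ? value.toUpperCase() : value
);
console.log(upper.name);
```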

Reading A JSON File Using fs.readFile

Here is some sample code that uses the readFile method of the Node.js fs module to load the above file from disk and then uses JSON.parse to parse it into a JavaScript object:

const fs = require('fs');

fs.readFile('file.json', 'utf-8', function (err, data) {
    if (err) throw err;

    const obj = JSON.parse(data);

    console.log(obj);
});

Output:

{
  name: 'John Doe',
  age: 32,
  address: { street: '123 Main St', city: 'Anytown', state: 'CA' }
}

Reading A JSON File Using fs.readFileSync

The fs module also has a readFileSync method, which reads the file synchronously: execution blocks until the whole file has been read, unlike readFile, which reads the file asynchronously without blocking the event loop. The code for this would look like:

const fs = require('fs');

const data = fs.readFileSync('file.json', 'utf-8');

const obj = JSON.parse(data);
console.log(obj);

The output is the same as in the previous case, and the code is more straightforward, as we don't need to provide a callback. However, because the call blocks the event loop, it can be inefficient for loading large JSON files in programs that have other work to do while the file loads.
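
Note that JSON.parse() throws a SyntaxError when its input is not valid JSON, so in code that reads files from unknown sources it is worth wrapping the call in a try/catch. A minimal sketch:

```javascript
const badInput = '{"name": "John Doe", '; // truncated, invalid JSON

let parsed = null;
try {
  parsed = JSON.parse(badInput);
} catch (err) {
  // err is a SyntaxError describing where parsing failed
  console.error('Could not parse JSON:', err.message);
}
```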

Reading A JSON File In The Browser

If you want to parse a JSON string in the browser, you can use the same JSON.parse method. However, if you are using the fetch API to load a remote file/API, you can simply use the json method of the response object. Here is an example:

<script>
    fetch('http://httpbin.org/ip')
      .then(res => res.json())
      .then((data) => {
        console.log(data);
      }).catch(err => console.error(err));
</script>

Importing JSON File As A Module

In Node.js, you can also import a JSON file directly as an object, using the require() function (CommonJS):

const obj = require('./file.json');

Or in an ES module, using an import attribute (required by recent Node.js versions):

import obj from './file.json' with { type: 'json' };

Note that require() caches the module, so repeated calls return the same object.

Parsing A JSON Object To Print All The Key-Value Pairs

Let's write some code that parses a JSON file and prints all the key-value pairs (including lists and nested objects) in the file:

const fs = require('fs');

function printKeyValList(obj, prefix) {
  Object.keys(obj).forEach((key) => {
    const val = obj[key];
    // typeof null is also 'object', so guard against null values
    if (val !== null && typeof val === 'object') {
      printKeyValList(val, prefix + key + '.');
    } else {
      console.log(`${prefix}${key}: ${val}`);
    }
  });
}

const data = fs.readFileSync('file.json', 'utf-8');
const obj = JSON.parse(data);

printKeyValList(obj, '');

The above code reads a JSON file and iterates over the key-value pairs in the object. Whenever a value is a list or a nested object, it recurses until only primitive values remain. Let's look at the output for a sample JSON file:

INPUT: (file.json)
{
    "name": "John Doe",
    "age": 32,
    "address": {
        "street": "123 Main St",
        "city": "Anytown",
        "state": "CA"
    },
    "emails": ["john@doe.com", "johndoe1993@gmail.com"]
}

OUTPUT:
name: John Doe
age: 32
address.street: 123 Main St
address.city: Anytown
address.state: CA
emails.0: john@doe.com
emails.1: johndoe1993@gmail.com

We can see that a nested JSON object is printed as a flat list of key-value pairs. The arrays in the JSON file are also treated as objects with array indices acting as object keys.
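
The same recursion can return a flat object instead of printing, which is handy if you want to use the flattened pairs elsewhere. A sketch (again checking for null explicitly, since typeof null is also 'object'):

```javascript
// recursively flatten nested objects and arrays into a single
// object whose keys are dot-separated paths
function flatten(obj, prefix = '', out = {}) {
  Object.keys(obj).forEach((key) => {
    const val = obj[key];
    if (val !== null && typeof val === 'object') {
      flatten(val, prefix + key + '.', out);
    } else {
      out[prefix + key] = val;
    }
  });
  return out;
}

const flat = flatten({
  name: 'John Doe',
  address: { city: 'Anytown' },
  emails: ['john@doe.com']
});
console.log(flat);
```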

Parsing A JSON File With An Array Of Objects

Let's say you have a JSON file containing a list of objects, like the one below:

[
  {"name": "John Doe", "age": 32},
  {"name": "Joe Doe", "age": 30},
  {"name": "Alice", "age": 28}
]

You can parse this using JSON.parse() just as we did above, except that the function will return a JavaScript array:

const fs = require('fs');

fs.readFile('file-with-array.json', 'utf-8', function (err, data) {
    if (err) throw err;

    const arr = JSON.parse(data);

    arr.forEach((obj) => {
      console.log(`${obj.name} is ${obj.age} years old`)
    })
});

This code gives us the following output:

John Doe is 32 years old
Joe Doe is 30 years old
Alice is 28 years old
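
Since JSON.parse() returns a real JavaScript array here, the usual array methods apply directly. For example, a small sketch using the same sample data:

```javascript
const arr = JSON.parse(
  '[{"name": "John Doe", "age": 32}, {"name": "Joe Doe", "age": 30}, {"name": "Alice", "age": 28}]'
);

// filter and map work on the parsed array like on any other array
const over29 = arr
  .filter((person) => person.age > 29)
  .map((person) => person.name);

console.log(over29); // names of people older than 29
```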

Reading A File With A Large Number Of JSON Objects

Let's say you have a file with a large number of JSON objects (a few million), with one JSON object per line, that looks something like this:

{"name": "Alice", "age": 22}
{"name": "Bob", "age": 23}
{"name": "Charlie", "age": 22}
{"name": "Dave", "age": 26}
... another few million name + age objects

If you try reading the file all at once and then parsing it into a JavaScript array, the process will most likely run out of memory, because the whole file is loaded at once. In this case, you can read the file as a stream, a few chunks at a time, and process each object on the fly. Notice that the file above does not have opening and closing square brackets or a comma after each object. This is not standard JSON, but a related format called JSONL (JSON Lines), with one JSON object per line. The Node.js code to read and parse it using streams is below:

const fs = require('fs');

// set the encoding so multi-byte characters split across
// chunks are reassembled correctly
const stream = fs.createReadStream('test.jsonl', 'utf-8');

let bigChunk = ''; // accumulates data until whole lines can be processed

stream.on('data', (streamChunk) => {
  // the stream will read the file in chunks. 
  // this function will process each chunk as it is read

  // add streamed chunk to bigChunk
  bigChunk += streamChunk

  // streamChunk is not necessarily one line
  // bigChunk may contain multiple lines and break in between a line
  // process all the whole lines in bigChunk

  // position of first newline in the string
  let i = bigChunk.indexOf('\n');

  // iterate until all whole lines in bigChunk are processed
  while (i >= 0) {
    if (i === 0) {
      bigChunk = bigChunk.slice(1);
    } else {
      const line = bigChunk.slice(0, i);

      // parse object from the line
      const obj = JSON.parse(line);
      
      // do whatever you want with the object
      console.log(obj);

      // remove the processed line from bigChunk
      bigChunk = bigChunk.slice(i+1);

    }
    // reset i to next index of newline
    i = bigChunk.indexOf('\n');
  }
  
  // bigChunk is now empty or has a partially loaded line
  // if it has a partially loaded line, that line will be processed when fully loaded
});
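
The accumulate-and-split logic above can also be factored into a small helper that is easy to unit-test without touching the filesystem: it accepts arbitrary chunks and invokes a callback once per complete JSON line. A sketch (the returned function could be passed straight to stream.on('data', ...); Node's built-in readline module offers a similar line-by-line interface):

```javascript
// returns a function that accepts stream chunks and calls onObject
// once for every complete JSON line it has accumulated
function makeLineParser(onObject) {
  let buffered = '';
  return (chunk) => {
    buffered += chunk;
    let i = buffered.indexOf('\n');
    while (i >= 0) {
      const line = buffered.slice(0, i);
      if (line.length > 0) {
        onObject(JSON.parse(line));
      }
      buffered = buffered.slice(i + 1);
      i = buffered.indexOf('\n');
    }
  };
}

// feed chunks that deliberately break in the middle of a line
const seen = [];
const feed = makeLineParser((obj) => seen.push(obj));
feed('{"name": "Alice", "age": 22}\n{"name": "B');
feed('ob", "age": 23}\n');
console.log(seen);
```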
