Chapter 2. Asynchronous Control Flow Patterns

Moving from a synchronous programming style to a platform such as Node.js, where continuation-passing style and asynchronous APIs are the norm, can be frustrating. Writing asynchronous code is a different experience, especially when it comes to control flow. Simple problems such as iterating over a set of files, executing tasks in sequence, or waiting for a set of operations to complete require the developer to adopt new approaches and techniques to avoid writing inefficient and unreadable code. One common mistake is to fall into the trap of callback hell and see the code grow horizontally rather than vertically, with nesting that makes even simple routines hard to read and maintain.

In this chapter, we will see how it's actually possible to tame callbacks and write clean, manageable asynchronous code by using some discipline and with the aid of some patterns. We will see how control flow libraries, such as async, can significantly simplify our problems, and we will also discover that the continuation-passing style is not the only way to implement asynchronous APIs. In fact, we will learn how Promises and ECMAScript 6 generators can be powerful and flexible alternatives. For each one of these paradigms, we will learn about patterns that will help us implement the most common control flows, and by the end of the chapter, we should be ready and confident to write clean and efficient asynchronous code.

The difficulties of asynchronous programming

Losing control of asynchronous code in JavaScript is undoubtedly easy. Closures and in-place definitions of anonymous functions allow a smooth programming experience that doesn't require the developer to jump to other points in the code base. This is perfectly in line with the KISS principle; it's simple, it keeps the code flowing, and we get it working in less time. Unfortunately, sacrificing qualities such as modularity, reusability, and maintainability will sooner or later lead to the uncontrolled proliferation of callback nesting, to ever-larger functions, and to poor code organization. Most of the time, creating closures is not functionally needed, so it's more a matter of discipline than a problem related to asynchronous programming. Recognizing that our code is becoming unwieldy (or, even better, knowing in advance that it might become unwieldy) and then acting accordingly with the most adequate solution is what differentiates a novice from an expert.

Creating a simple web spider

To explain the problem, we will create a little web spider, a command-line application that takes in a web URL as input and downloads its contents locally into a file. In the code presented in this chapter, we are going to use a few npm dependencies:

request: A library to streamline HTTP calls
mkdirp: A small utility to create directories recursively

Also, we will often refer to a local module named ./utilities, which contains some helpers that we will use in our application. We omit the contents of this file for brevity, but you can find the full implementation, along with a package.json containing the full list of dependencies, in the download pack for this book available at http://www.packtpub.com.
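If the download pack is not at hand, a stand-in for the one helper used in this section can be sketched as follows. This is not the book's implementation; it is a hypothetical urlToFilename() that assumes a Node.js version where the URL class is available as a global, and it simply maps the host name and path of a URL to a relative file path, treating extension-less paths as directories that contain an index.html file.

```javascript
var path = require('path');

// Hypothetical stand-in for utilities.urlToFilename(); NOT the book's implementation.
// Assumes the global URL class is available (recent Node.js versions).
function urlToFilename(url) {
  var parsed = new URL(url);
  // Split the path into its non-empty segments
  var segments = parsed.pathname.split('/').filter(function(segment) {
    return segment !== '';
  });
  var last = segments[segments.length - 1];
  // Treat extension-less paths as directories and store them as index.html
  if (!last || path.extname(last) === '') {
    segments.push('index.html');
  }
  // Build a relative path rooted at the host name
  return path.join.apply(path, [parsed.hostname].concat(segments));
}

console.log(urlToFilename('http://www.example.com'));
console.log(urlToFilename('http://www.example.com/docs/intro.html'));
```

Saved in a file named utilities.js and exported with module.exports.urlToFilename = urlToFilename, a helper of this shape should be enough to experiment with the code that follows.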

The core functionality of our application is contained inside a module named spider.js. Let's see how it looks. To start with, let's load all the dependencies that we are going to use:

var request = require('request');
var fs = require('fs');
var mkdirp = require('mkdirp');
var path = require('path');
var utilities = require('./utilities');

Next, we create a new function named spider(), which takes in the URL to download and a callback function that will be invoked when the download process completes:

function spider(url, callback) {
  var filename = utilities.urlToFilename(url);
  fs.exists(filename, function(exists) {        //[1]
    if(!exists) {
      console.log("Downloading " + url);
      request(url, function(err, response, body) {      //[2]
        if(err) {
          callback(err);
        } else {
          mkdirp(path.dirname(filename), function(err) {    //[3]
            if(err) {
              callback(err);
            } else {
              fs.writeFile(filename, body, function(err) { //[4]
                if(err) {
                  callback(err);
                } else {
                  callback(null, filename, true);
                }
              });
            }
          });
        }
      });
    } else {
      callback(null, filename, false);
    }
  });
}

The preceding function executes the following tasks:

Checks whether the URL was already downloaded by verifying that the corresponding file does not already exist:
```
fs.exists(filename, function(exists) …
```
If the file is not found, the URL is downloaded using the following line of code:
```
request(url, function(err, response, body) …
```
Then, we make sure that the directory that will contain the file exists, creating it if necessary:
```
mkdirp(path.dirname(filename), function(err) …
```
Finally, we write the body of the HTTP response to the filesystem:
```
fs.writeFile(filename, body, function(err) …
```

To complete our web spider application, we just need to invoke the spider() function by providing a URL as an input (in our case, we read it from the command-line arguments):

spider(process.argv[2], function(err, filename, downloaded) {
  if(err) {
    console.log(err);
  } else if(downloaded){
    console.log('Completed the download of "'+ filename +'"');
  } else {
    console.log('"'+ filename +'" was already downloaded');
  }
});

Now, we are ready to try our web spider application, but first, make sure you have the utilities.js module and the package.json containing the full list of dependencies in your project directory. Then, install all the dependencies by running the following command:

npm install

Next, we can execute the spider module to download the contents of a web page, with a command like this:

node spider http://www.example.com

Note

Our web spider application requires that we always include the protocol (for example, http://) in the URL we provide. Also, do not expect HTML links to be rewritten or resources such as images to be downloaded, as this is just a simple example to demonstrate how asynchronous programming works.

The callback hell

Looking at the spider() function we defined earlier, we can see that even though the algorithm we implemented is really straightforward, the resulting code has several levels of indentation and is very hard to read. Implementing a similar function with a direct-style blocking API would be straightforward, and there would be very few chances to make it look so wrong. However, using asynchronous CPS is another story, and making bad use of closures can lead to incredibly bad code.
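To make the comparison concrete, here is a minimal sketch of how a direct-style version might look. It rests on some assumptions: Node.js core has no blocking HTTP client, so the download step is a hypothetical downloadSync function passed in by the caller; the target filename is passed in explicitly instead of being derived from the URL; and the recursive option of fs.mkdirSync (available in recent Node.js versions) replaces mkdirp. Errors are reported by throwing exceptions rather than through a callback.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical direct-style counterpart of spider().
// downloadSync is an assumed blocking helper supplied by the caller.
function spiderSync(url, filename, downloadSync) {
  if (fs.existsSync(filename)) {
    return false;                         // the file was already downloaded
  }
  var body = downloadSync(url);           // blocks until the body is available
  fs.mkdirSync(path.dirname(filename), { recursive: true });
  fs.writeFileSync(filename, body);
  return true;                            // the file was downloaded by this call
}

// Usage with a stub downloader, so the sketch runs without network access
var tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'spider-'));
var target = path.join(tmpDir, 'site', 'index.html');
var fakeDownload = function(url) {
  return '<html>' + url + '</html>';
};

var first = spiderSync('http://www.example.com', target, fakeDownload);
var second = spiderSync('http://www.example.com', target, fakeDownload);
console.log(first, second);
```

Every step sits at the same indentation level and the control flow reads from top to bottom, which is exactly what the nested callbacks of spider() lose.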

The situation where the abundance of closures and in-place callback definitions transforms the code into an unreadable and unmanageable blob is known as callback hell. It's one of the most well-recognized and severe anti-patterns in Node.js and JavaScript in general. The typical structure of code affected by this problem looks like the following:

asyncFoo(function(err) {
  asyncBar(function(err) {
    asyncFooBar(function(err) {
      [...]
    });
  });
});

We can see how code written in this way assumes the shape of a pyramid due to the deep nesting, which is why it is also colloquially known as the pyramid of doom.

The most evident problem with code such as the preceding one is the poor readability. Due to the nesting being too deep, it's almost impossible to keep track of where a function ends and where another one begins.

Another issue is caused by the overlapping of the variable names used in each scope. Often, we have to use similar or even identical names to describe the content of a variable. The best example is the error argument received by each callback. Some people try to use variations of the same name to differentiate the object in each scope (for example, err, error, err1, err2, and so on); others prefer to shadow the variable defined in the outer scope by always using the same name (for example, err). Both alternatives are far from perfect; they cause confusion and increase the probability of introducing defects.
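The following sketch shows how reusing the same name can silently hide an error. The two task functions are hypothetical and invoke their callbacks synchronously only to keep the example short; real asynchronous functions would behave the same way as far as scoping is concerned.

```javascript
// Hypothetical tasks, invoked synchronously to keep the sketch short
function failingTask(callback) {
  callback(new Error('first task failed'));
}
function succeedingTask(callback) {
  callback(null);
}

var reported = [];

failingTask(function(err) {
  // The outer err holds an Error here, but nothing checks it
  succeedingTask(function(err) {
    // This err shadows the outer one, so the first failure is unreachable
    reported.push(err);
  });
});

console.log(reported);
```

Inside the inner callback, err refers only to the result of the second task, so the failure of the first task goes unnoticed unless it is checked before the nested call.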

Also, we have to keep in mind that closures come at a small price in terms of performance and memory consumption. In addition, they can create memory leaks that are not easy to identify, because any context referenced by an active closure cannot be reclaimed by the garbage collector.
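As a minimal sketch of this retention effect, consider the following example. The names createHandler and handlers are made up for illustration, and the 10 MB buffer simply stands in for any large object captured by a closure.

```javascript
// Hypothetical factory: the returned closure captures bigBuffer
function createHandler() {
  var bigBuffer = Buffer.alloc(10 * 1024 * 1024);   // 10 MB kept alive by the closure
  return function handler() {
    return bigBuffer.length;
  };
}

var handlers = [];
handlers.push(createHandler());

// As long as handlers references the closure, bigBuffer cannot be reclaimed
var retainedBytes = handlers[0]();
console.log(retainedBytes);

// Dropping the last reference to the closure makes the buffer eligible for collection
handlers.length = 0;
```

In a long-running process, a forgotten reference to a callback (for example, a listener that is never removed) can keep an entire context alive in the same way.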

Note

For a great introduction to how closures work in V8 you can refer to the blog post by Vyacheslav Egorov, a software engineer at Google working on V8, at http://mrale.ph/blog/2012/09/23/grokking-v8-closures-for-fun.html.

If we look at our spider() function, we will notice that it clearly represents a callback hell situation and has all the problems we just described. That's exactly what we are going to fix with the patterns and techniques we will learn in this chapter.