The callback pattern

Callbacks are the materialization of the handlers of the reactor pattern, and they are one of the features that give Node.js its distinctive programming style. Callbacks are functions that are invoked to propagate the result of an operation, which is exactly what we need when dealing with asynchronous operations. They effectively replace the return instruction, which always executes synchronously. JavaScript is a great language for representing callbacks because functions are first-class objects: they can be assigned to variables, passed as arguments, returned from another function invocation, or stored in data structures. Closures are also an ideal construct for implementing callbacks. With closures, we can reference the environment in which a function was created, so we can always maintain the context in which the asynchronous operation was requested, no matter when or where its callback is invoked.

In this section, we will analyze this particular style of programming, which is built on callbacks instead of return instructions.

In JavaScript, a callback is a function that is passed as an argument to another function and is invoked with the result when the operation completes. In functional programming, this way of propagating the result is called continuation-passing style (CPS). It is a general concept that is not always associated with asynchronous operations: it simply indicates that a result is propagated by passing it to another function (the callback) instead of being returned directly to the caller.
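To make the distinction concrete, here is a minimal sketch of the same addition written in direct style and in synchronous CPS. The name addCps() is chosen only for this illustration so that both versions can coexist in one file.

```javascript
// Direct style: the result is handed back to the caller with return.
function add(a, b) {
  return a + b;
}

// Continuation-passing style: the result is handed to a callback instead.
// addCps is a hypothetical name used only in this sketch.
function addCps(a, b, callback) {
  callback(a + b);
}

console.log('before');
addCps(1, 2, (result) => console.log('Result: ' + result));
console.log('after');
// Because addCps() is synchronous, this prints:
// before
// Result: 3
// after
```

Since addCps() is synchronous, it returns only after its callback has completed, so the order of the output matches the order of the statements.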

Now, let's consider the case where the add() function is asynchronous, which is as follows:
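A minimal sketch of such a function might look like this; the 100 ms delay is an arbitrary value assumed for the illustration.

```javascript
// Asynchronous CPS: the callback is invoked later, after addAsync() has returned.
// The 100 ms delay is an arbitrary value chosen for this sketch.
function addAsync(a, b, callback) {
  setTimeout(() => callback(a + b), 100);
}
```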

In the previous code, we simply use setTimeout() to simulate an asynchronous invocation of the callback. Now, let's try to use this function and see how the order of the operations changes:

The preceding code will print the following:
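As a sketch of that usage and of the order it produces, assuming the addAsync() shape described above with an arbitrary 100 ms delay, the printed lines are shown as comments at the end of the block:

```javascript
// Repeated here so the block runs on its own.
function addAsync(a, b, callback) {
  setTimeout(() => callback(a + b), 100);
}

console.log('before');
addAsync(1, 2, (result) => console.log('Result: ' + result));
console.log('after');
// Prints:
// before
// after
// Result: 3
```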

Since setTimeout() triggers an asynchronous operation, it does not wait for the callback to be executed; instead, it returns immediately, giving control back to addAsync(), and then back to its caller. This property is crucial in Node.js, as it allows the stack to unwind and control to be given back to the event loop as soon as an asynchronous request is sent, thus allowing a new event from the queue to be processed.

The following image shows how this works:

Asynchronous continuation-passing style

When the asynchronous operation completes, execution resumes from the callback provided to the asynchronous function that caused the unwinding. The execution starts from the event loop, so it has a fresh stack. This is where JavaScript comes in really handy: thanks to closures, it is trivial to maintain the context of the caller of the asynchronous function, even if the callback is invoked at a different point in time and from a different location.
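The role of closures can be sketched as follows. handleRequest() and its label variable are hypothetical names introduced for this illustration; addAsync() is assumed to have the shape described earlier.

```javascript
// Assumed shape of addAsync(), with an arbitrary 100 ms delay.
function addAsync(a, b, callback) {
  setTimeout(() => callback(a + b), 100);
}

// handleRequest() is a hypothetical caller used only in this sketch.
function handleRequest(requestId, done) {
  const label = 'request #' + requestId;
  addAsync(1, 2, (result) => {
    // handleRequest() has already returned and its stack is gone,
    // yet label is still reachable here through the closure.
    done(label + ' -> ' + result);
  });
}

handleRequest(7, (message) => console.log(message));
// Prints: request #7 -> 3
```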

We have seen how the order of the instructions changes radically depending on the nature of a function, synchronous or asynchronous. This has strong repercussions on the flow of the entire application, in terms of both correctness and efficiency. The following is an analysis of these two paradigms and their pitfalls. In general, what must be avoided is creating inconsistency and confusion around the nature of an API, as doing so can lead to a set of problems that might be very hard to detect and reproduce. To drive our analysis, we will take as an example the case of an inconsistently asynchronous function.
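A minimal sketch of such a function, assuming an in-memory cache in front of fs.readFile(), could look like the following. Error handling is left out to keep the focus on the timing of the callback.

```javascript
const fs = require('fs');

const cache = {};

function inconsistentRead(filename, callback) {
  if (cache[filename]) {
    // Cache hit: the callback is invoked synchronously.
    callback(cache[filename]);
  } else {
    // Cache miss: the callback is invoked asynchronously, once the read completes.
    // The err argument is ignored for brevity in this sketch.
    fs.readFile(filename, 'utf8', (err, data) => {
      cache[filename] = data;
      callback(data);
    });
  }
}
```

The function behaves asynchronously until the cache is populated and synchronously for every later request for the same file.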

Now, let's see how the use of an unpredictable function, such as the one that we defined previously, can easily break an application. Consider the following code:
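A sketch of such code, assuming the cache-backed inconsistentRead() described earlier (repeated so the block runs on its own), might be:

```javascript
const fs = require('fs');

const cache = {};

// The inconsistently asynchronous function: synchronous on a cache hit,
// asynchronous on a cache miss. Error handling is omitted for brevity.
function inconsistentRead(filename, callback) {
  if (cache[filename]) {
    callback(cache[filename]);
  } else {
    fs.readFile(filename, 'utf8', (err, data) => {
      cache[filename] = data;
      callback(data);
    });
  }
}

function createFileReader(filename) {
  const listeners = [];
  // Start the read and notify every registered listener when data arrives.
  inconsistentRead(filename, (value) => {
    listeners.forEach((listener) => listener(value));
  });
  return {
    // Listeners can only be added after createFileReader() has returned.
    onDataReady: (listener) => listeners.push(listener),
  };
}
```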

When the preceding function is invoked, it creates a new object that acts as a notifier, allowing us to set multiple listeners for a file read operation. All the listeners are invoked at once when the read operation completes and the data is available. The preceding function uses our inconsistentRead() function to implement this functionality. Let's now try to use the createFileReader() function:

The preceding code will print the following output:
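The usage and its output can be sketched as one runnable program. The temporary file name and its content ("some data") are assumptions made for this illustration, and both functions are repeated so the block is self-contained; the printed output is shown as a comment at the end.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const cache = {};

function inconsistentRead(filename, callback) {
  if (cache[filename]) {
    callback(cache[filename]); // synchronous on a cache hit
  } else {
    fs.readFile(filename, 'utf8', (err, data) => {
      cache[filename] = data;
      callback(data); // asynchronous on a cache miss
    });
  }
}

function createFileReader(filename) {
  const listeners = [];
  inconsistentRead(filename, (value) => {
    listeners.forEach((listener) => listener(value));
  });
  return { onDataReady: (listener) => listeners.push(listener) };
}

// Hypothetical input file, created here so the sketch is runnable.
const file = path.join(os.tmpdir(), 'zalgo-demo.txt');
fs.writeFileSync(file, 'some data');

const reader1 = createFileReader(file);
reader1.onDataReady((data) => {
  console.log('First call data: ' + data);

  // Some time later, the same file is read again. The cache is now populated,
  // so the read completes before onDataReady() can register the listener.
  const reader2 = createFileReader(file);
  reader2.onDataReady((data) => {
    console.log('Second call data: ' + data); // never reached
  });
});
// Prints only:
// First call data: some data
```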

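As you can see, the callback of the second operation is never invoked. The reason is as follows. While the first reader is being created, inconsistentRead() behaves asynchronously because no cached result is available yet, so there is time to register the listener before the read completes. The second reader is created in a cycle of the event loop in which the cache for the requested file already exists, so inconsistentRead() invokes its callback immediately, while the list of listeners is still empty. The listener registered afterwards is therefore never invoked.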

The callback behavior of our inconsistentRead() function is really unpredictable, as it depends on many factors, such as the frequency of its invocation, the filename passed as argument, and the amount of time taken to load the file.

The bug that we've just seen might be extremely complicated to identify and reproduce in a real application. Imagine using a similar function in a web server, where there can be multiple concurrent requests; imagine seeing some of those requests hanging, without any apparent reason and without any error being logged. This definitely falls under the category of nasty defects.

Isaac Z. Schlueter, creator of npm and former Node.js project lead, compared the use of this type of unpredictable function to unleashing Zalgo in one of his blog posts. If you're not familiar with Zalgo, you are invited to find out what it is.

The lesson to learn from the unleashing Zalgo example is that it is imperative for an API to clearly define its nature, either synchronous or asynchronous.

One suitable fix for our inconsistentRead() function is to make it totally synchronous. This is possible because Node.js provides a set of synchronous, direct-style APIs for most of the basic I/O operations. For example, we can use the fs.readFileSync() function in place of its asynchronous counterpart. The code would now be as follows:
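A minimal sketch of the synchronous version, assuming the same in-memory cache as before, could be:

```javascript
const fs = require('fs');

const cache = {};

function consistentReadSync(filename) {
  if (cache[filename]) {
    // Cache hit: return the cached content.
    return cache[filename];
  }
  // Cache miss: read synchronously, store the content, and return it.
  cache[filename] = fs.readFileSync(filename, 'utf8');
  return cache[filename];
}
```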

We can see that the entire function was also converted to a direct style. There is no reason for the function to have a continuation-passing style if it is synchronous. In fact, we can state that it is always a good practice to implement a synchronous API using a direct style; this will eliminate any confusion around its nature and will also be more efficient from a performance perspective.

Please bear in mind that changing an API from CPS to a direct style, or from asynchronous to synchronous (or vice versa), might also require a change to the style of all the code using it. For example, in our case, we will have to totally change the interface of our createFileReader() API and adapt it to always work synchronously.

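Also, using a synchronous API instead of an asynchronous one has some caveats. A synchronous API might not always be available for the functionality we need, and a synchronous API blocks the event loop, putting any concurrent request on hold until it completes, which slows down the whole application.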

In our consistentReadSync() function, the risk of blocking the event loop is partially mitigated, because the synchronous I/O API is invoked only once per filename, while the cached value is used for all subsequent invocations. If we have a limited number of static files, then using consistentReadSync() won't have a big effect on our event loop. Things can change quickly if we have to read many files, each only once. Using synchronous I/O in Node.js is strongly discouraged in many circumstances; however, in some situations, it might be the easiest and most efficient solution. Always evaluate your specific use case in order to choose the right alternative.

Another alternative for fixing our inconsistentRead() function is to make it purely asynchronous. The trick here is to schedule the synchronous callback invocation to be executed "in the future" instead of running it immediately, while the caller is still on the stack. In Node.js, this is possible using process.nextTick(), which defers the execution of a function until the currently running operation has completed. It works very simply: it takes a callback as an argument, queues it ahead of any pending I/O event, and returns immediately. The callback is then invoked as soon as the current operation yields control back to the event loop.

Let's apply this technique to fix our inconsistentRead() function as follows:
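A sketch of the fixed function, using the name consistentReadAsync() as an assumption and the same cache as before, could be:

```javascript
const fs = require('fs');

const cache = {};

// consistentReadAsync is a name assumed for this sketch.
function consistentReadAsync(filename, callback) {
  if (cache[filename]) {
    // Cache hit: defer the callback instead of invoking it synchronously.
    process.nextTick(() => callback(cache[filename]));
  } else {
    // Cache miss: fs.readFile() is already asynchronous.
    // The err argument is ignored for brevity in this sketch.
    fs.readFile(filename, 'utf8', (err, data) => {
      cache[filename] = data;
      callback(data);
    });
  }
}
```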

Now, our function is guaranteed to invoke its callback asynchronously, under any circumstances.

Another API for deferring the execution of code is setImmediate(), which, despite the name, might actually be slower than process.nextTick(). While their purpose is very similar, their semantics are quite different. Callbacks deferred with process.nextTick() run before any other I/O event is fired, while with setImmediate(), the execution is queued behind any I/O event that is already in the queue. Since process.nextTick() runs before any already scheduled I/O, it might cause I/O starvation under certain circumstances, for example, a recursive invocation; this can never happen with setImmediate(). We will learn to appreciate the difference between these two APIs when we analyze the use of deferred invocation for running synchronous CPU-bound tasks later in the book.
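The relative ordering of the two APIs can be observed with a short sketch. The order array is a helper introduced only for this illustration.

```javascript
const order = [];

// Scheduled first, but runs only after the current operation and the
// process.nextTick() queue have been processed.
setImmediate(() => order.push('setImmediate'));

// Scheduled second, but runs as soon as the current operation completes.
process.nextTick(() => order.push('nextTick'));

order.push('synchronous');

// A second immediate, queued after the first, reports the final order.
setImmediate(() => console.log(order.join(', ')));
// Prints: synchronous, nextTick, setImmediate
```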

In Node.js, continuation-passing style APIs and callbacks follow a set of specific conventions. These conventions apply to the Node.js core API, but they are also followed by virtually every userland module and application. So, it's very important that we understand them and make sure that we comply whenever we need to design an asynchronous API.
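As an illustration, the sketch below shows a readJSON() function written in the style used by the Node.js core API: the callback is the last argument, any error is passed as its first argument, and errors are propagated to the callback rather than thrown. The body is an assumption reconstructed from the description in this section, not a definitive implementation.

```javascript
const fs = require('fs');

// The callback comes last; it receives (err, result), with the error first.
function readJSON(filename, callback) {
  fs.readFile(filename, 'utf8', (err, data) => {
    if (err) {
      // Propagate the read error and stop here.
      return callback(err);
    }
    let parsed;
    try {
      // JSON.parse() throws on invalid input, so it is guarded.
      parsed = JSON.parse(data);
    } catch (parseError) {
      // Propagate the parse error instead of letting it escape.
      return callback(parseError);
    }
    // No error: the first argument is null, followed by the result.
    callback(null, parsed);
  });
}
```

Invoking the success callback outside the try block means that an exception thrown by the callback itself is not mistaken for a parsing error.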

You might have seen from the readJSON() function defined previously that, in order to prevent any exception from being thrown inside the fs.readFile() callback, we put a try-catch block around JSON.parse(). Throwing inside an asynchronous callback will, in fact, cause the exception to jump up to the event loop and never be propagated to the next callback.

In Node.js, this is an unrecoverable state, and the application will simply shut down, printing the error to stderr. To demonstrate this, let's try to remove the try-catch block from the readJSON() function defined previously:
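A sketch of the unguarded variant, with a body assumed from the description of readJSON(), might be:

```javascript
const fs = require('fs');

function readJSONThrows(filename, callback) {
  fs.readFile(filename, 'utf8', (err, data) => {
    if (err) {
      return callback(err);
    }
    // No try-catch: if data is not valid JSON, the exception is thrown
    // from inside this callback and nothing above it can handle it.
    callback(null, JSON.parse(data));
  });
}
```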

Now, in the function we just defined, there is no way of catching an eventual exception coming from JSON.parse(). Let's try, for example, to parse an invalid JSON file with the following code:

This would result in the application being abruptly terminated and the following exception being printed on the console:
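The exact trace depends on the Node.js version, so rather than reproducing one, the following sketch runs the failing call in a child process and prints whatever the runtime reports. The file name and its content are assumptions made for this illustration, and the body of readJSONThrows() is the one assumed above.

```javascript
const { spawnSync } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical file with invalid JSON content, created for this sketch.
const badFile = path.join(os.tmpdir(), 'nonJSON.txt');
fs.writeFileSync(badFile, 'this is not JSON');

// Program run in the child process: readJSONThrows() plus one call to it.
const program = `
  const fs = require('fs');
  function readJSONThrows(filename, callback) {
    fs.readFile(filename, 'utf8', (err, data) => {
      if (err) return callback(err);
      callback(null, JSON.parse(data));
    });
  }
  readJSONThrows(${JSON.stringify(badFile)}, (err) => console.log(err));
`;

// Run the program with the same Node.js binary and capture its output.
const child = spawnSync(process.execPath, ['-e', program], { encoding: 'utf8' });

console.log('exit code:', child.status); // non-zero: the process was terminated
console.log(child.stderr);               // the SyntaxError and its stack trace
```

Running the failing code in a child process keeps the parent alive, which makes it possible to inspect both the exit code and the text written to stderr.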

Now, if we look at the preceding stack trace, we will see that it starts somewhere in the fs.js module, at the point where the native API has completed reading and returned its result back to the fs.readFile() function via the event loop. This clearly shows us that the exception traveled from our callback up the stack that we saw, and then straight into the event loop, where it is finally caught and reported on the console.

This also means that wrapping the invocation of readJSONThrows() with a try-catch block will not work, because the stack in which the block operates is different from the one in which our callback is invoked. The following code shows the anti-pattern that we just described:
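The anti-pattern can be sketched as follows. So that the sketch can report where the error ends up instead of terminating, it registers a one-off uncaughtException listener; that listener, the outcome object, and the temporary file are assumptions added for this illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

function readJSONThrows(filename, callback) {
  fs.readFile(filename, 'utf8', (err, data) => {
    if (err) return callback(err);
    callback(null, JSON.parse(data)); // throws on invalid JSON
  });
}

// Hypothetical file with invalid JSON content, created for this sketch.
const badFile = path.join(os.tmpdir(), 'nonJSON-antipattern.txt');
fs.writeFileSync(badFile, 'this is not JSON');

const outcome = { caughtByTryCatch: false, reachedEventLoop: false };

// Present only so the sketch can observe the error instead of terminating.
process.once('uncaughtException', (err) => {
  outcome.reachedEventLoop = true;
  console.log('Reached the event loop: ' + err.name);
});

try {
  readJSONThrows(badFile, (err, result) => {
    // Not reached for invalid JSON: the throw happens before this call.
  });
} catch (err) {
  // Never runs for the parse error: this stack has already unwound
  // by the time JSON.parse() throws.
  outcome.caughtByTryCatch = true;
  console.log('This will not catch the JSON parsing exception');
}
```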

The preceding catch statement will never receive the JSON parsing exception, because the exception travels up the stack in which it was thrown, and we just saw that this stack ends in the event loop and not in the function that triggered the asynchronous operation.

We already said that the application is aborted the moment an exception reaches the event loop; however, we still have a last chance to perform some cleanup or logging before the application terminates. In fact, when this happens, Node.js emits a special event called uncaughtException just before exiting the process. The following code shows a sample use case:
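A minimal sketch of such a handler follows. The log message is illustrative, and the exit code 1 is a conventional choice for signaling an error.

```javascript
process.on('uncaughtException', (err) => {
  // Last chance to log or clean up before the process goes away.
  console.error('This will catch at last the JSON parsing exception: ' + err.message);
  // Terminate with a non-zero exit code. Without this line,
  // the application would continue running.
  process.exit(1);
});
```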

It's important to understand that an uncaught exception leaves the application in a state that is not guaranteed to be consistent, which can lead to unforeseeable problems. For example, there might still be incomplete I/O requests running, or closures might have become inconsistent. That's why it is always advised, especially in production, to exit from the application anyway after an uncaught exception is received.