We already mentioned at the beginning of the chapter that CPS is not the only way to write asynchronous code. In fact, the JavaScript ecosystem provides alternatives to the traditional callback pattern. One of these in particular is gaining a lot of momentum, especially now that it is going to be part of the ECMAScript 6 specification (also known as ES6 or Harmony), the upcoming version of the JavaScript language. We are talking, of course, about promises, and in particular about those implementations that follow the Promises/A+ specification (https://promisesaplus.com).
In very simple terms, promises are an abstraction that allows an asynchronous function to return an object called a promise, which represents the eventual result of the operation. In the promises jargon, we say that a promise is pending when the asynchronous operation is not yet complete, fulfilled when the operation successfully completes, and rejected when the operation terminates with an error. Once a promise is either fulfilled or rejected, it's considered settled.
To receive the fulfillment value or the error (reason) associated with the rejection, we can use the then() method of the promise. The following is its signature:

    promise.then([onFulfilled], [onRejected])

Where onFulfilled() is a function that will eventually receive the fulfillment value of the promise, and onRejected() is another function that will receive the reason of the rejection (if any). Both functions are optional.
To get an idea of how promises can transform our code, let's consider the following:

    asyncOperation(arg, function(err, result) {
      if(err) {
        //handle error
      }
      //do stuff with result
    });

Promises allow us to transform this typical CPS code into better structured and more elegant code, such as the following:

    asyncOperation(arg)
      .then(function(result) {
        //do stuff with result
      }, function(err) {
        //handle error
      });
One crucial property of the then() method is that it synchronously returns another promise. If any of the onFulfilled() or onRejected() functions return a value x, the promise returned by the then() method will behave as follows:
- It fulfills with x if x is a value
- It fulfills with the fulfillment value of x if x is a promise or a thenable
- It rejects with the eventual rejection reason of x if x is a promise or a thenable
This feature allows us to build chains of promises, allowing easy aggregation and arrangement of asynchronous operations in several configurations. Also, if we don't specify an onFulfilled() or onRejected() handler, the fulfillment value or rejection reason is automatically forwarded to the next promise in the chain. This allows us, for example, to automatically propagate errors across the whole chain until they are caught by an onRejected() handler. With a promise chain, sequential execution of tasks suddenly becomes a trivial operation:
    asyncOperation(arg)
      .then(function(result1) {
        //returns another promise
        return asyncOperation(arg2);
      })
      .then(function(result2) {
        //returns a value
        return 'done';
      })
      .then(undefined, function(err) {
        //any error in the chain is caught here
      });
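As a runnable counterpart to the chain above, here is a minimal sketch. It assumes a runtime with a native Promise (with Bluebird, a require('bluebird') line would be added), and delay() is a hypothetical helper standing in for asyncOperation().

```javascript
// Hypothetical helper: fulfills with `value` after `ms` milliseconds.
function delay(value, ms) {
  return new Promise(function(resolve) {
    setTimeout(function() { resolve(value); }, ms);
  });
}

var chainResult = delay('first', 10)
  .then(function(result1) {
    // Returning a promise: the next handler waits for it to fulfill.
    return delay(result1 + ' -> second', 10);
  })
  .then(function(result2) {
    // Returning a plain value: the next promise fulfills with it.
    return result2 + ' -> done';
  })
  .then(undefined, function(err) {
    // Not invoked here, because nothing in the chain rejected.
    return 'recovered from ' + err.message;
  });

chainResult.then(function(value) {
  console.log(value);
});
```

Because the last then() has no onFulfilled() handler, the fulfillment value of the previous step is forwarded unchanged to chainResult.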
The following diagram provides another perspective on how a promise chain works:
Another important property of promises is that the onFulfilled() and onRejected() functions are guaranteed to be invoked asynchronously, even if we resolve the promise synchronously with a value, as we did in the preceding example, where we returned the string done from the second then() handler of the chain. This behavior shields our code against all those situations where we could unintentionally release Zalgo, making our asynchronous code more consistent and robust with no effort.
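The asynchronous invocation guarantee can be observed with a short sketch such as the following. It assumes a native Promise; the order and logged variables are names made up for this illustration.

```javascript
var order = [];

// The promise is already fulfilled when then() is called...
var logged = Promise.resolve('value').then(function(value) {
  order.push('handler: ' + value);
  return order;
});

// ...and yet this line runs first, because handlers are never
// invoked synchronously.
order.push('synchronous code');

logged.then(function(finalOrder) {
  console.log(finalOrder);
});
```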
Now comes the best part. If an exception is thrown (using the throw statement) from the onFulfilled() or onRejected() handler, the promise returned by the then() method will automatically reject with the exception as the rejection reason. This is a tremendous advantage over CPS, as it means that with promises, exceptions will propagate automatically across the chain, and that the throw statement is not an enemy anymore.
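A small sketch of this behavior, assuming a native Promise: the exception thrown by JSON.parse() inside a handler turns into a rejection that skips the next onFulfilled() handler and reaches the first onRejected() handler in the chain.

```javascript
var outcome = Promise.resolve('{ not valid JSON')
  .then(function(text) {
    // JSON.parse() throws a SyntaxError synchronously...
    return JSON.parse(text);
  })
  .then(function(data) {
    // ...so this handler is skipped.
    return 'parsed';
  })
  .then(undefined, function(err) {
    // The thrown exception arrives here as the rejection reason.
    return 'caught ' + err.name;
  });

outcome.then(function(message) {
  console.log(message);
});
```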
For a detailed description of the Promises/A+ specification, you can refer to the official website http://promises-aplus.github.io/promises-spec/.
In Node.js, and in general in JavaScript, there are several libraries implementing the Promises/A+ specifications. The following are the most popular:
What really differentiates them is the additional set of features they provide on top of the Promises/A+ standard. The standard, in fact, defines only the behavior of the then() method and the promise resolution procedure, but it does not specify other functionalities, for example, how a promise is created from a callback-based asynchronous function.
In our examples, we will try to use the set of APIs implemented by the ES6 promises, as they will be natively available in JavaScript without the support of any external library. Luckily, the libraries listed previously are gradually adapting to support the ES6 API, so using any one of them should not force us into any strong implementation lock-in, as long as we use only the feature set of the ES6 standard.
Please bear in mind that the ECMAScript 6 specification is still a draft at the time of writing. So there might be some differences from what will be the final standard. Also, consider that at the time of writing, the version of V8 used by Node.js still does not support promises natively. So, for our examples, we are going to use one of the preceding listed implementations, namely, Bluebird. Of course, we will use only the part of its API that is compatible with ES6 promises.
For reference, here is the list of the APIs currently provided by the ES6 promises:
- Constructor (new Promise(function(resolve, reject) {})): This creates a new promise that fulfills or rejects based on the behavior of the function passed as an argument. The arguments passed to that function are explained as follows:
  - resolve(obj): This resolves the promise with a fulfillment value, which is obj itself if obj is a value, or the fulfillment value of obj if obj is a promise or a thenable.
  - reject(err): This rejects the promise with err as the reason.
- Static methods of the Promise object:
  - Promise.resolve(obj): This creates a new promise from a thenable or a value.
  - Promise.reject(err): This creates a promise that rejects with err as the reason.
  - Promise.all(array): This creates a promise that fulfills with an array of fulfillment values when every item in the array fulfills, and rejects with the first rejection reason if any item rejects. Each item in the array can be a promise, a generic thenable, or a value.
- Methods of a promise instance:
  - promise.then(onFulfilled, onRejected): This is the essential method of a promise, with the behavior described earlier in this section.
  - promise.catch(onRejected): This is syntactic sugar for promise.then(undefined, onRejected).
It is worth mentioning that some promise implementations offer another mechanism to create new promises, called deferreds. We are not going to describe it here, because it's not part of the ES6 standard, but if you want to know more, you can read the documentation for Q (https://github.com/kriskowal/q#using-deferreds) or When.js (https://github.com/cujojs/when/wiki/Deferred).
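To make the ES6 API a little more concrete, the following sketch exercises the constructor, Promise.resolve(), Promise.reject(), and catch(), which is shorthand for then(undefined, onRejected). It assumes a native Promise; wait() and the thenable object are hypothetical names introduced only for this example.

```javascript
// Constructor: fulfill after a timer, reject on invalid input.
function wait(ms) {
  return new Promise(function(resolve, reject) {
    if (typeof ms !== 'number' || ms < 0) {
      return reject(new Error('ms must be a non-negative number'));
    }
    setTimeout(function() { resolve(ms); }, ms);
  });
}

var waited = wait(10);

// A rejection created by the constructor, handled with catch().
var invalid = wait(-1).catch(function(err) {
  return err.message;
});

// Promise.resolve() adopts the state of any object with a then() method.
var thenable = {
  then: function(onFulfilled) { onFulfilled('from thenable'); }
};
var adopted = Promise.resolve(thenable);

// Promise.reject() creates an already rejected promise.
var rejected = Promise.reject(new Error('boom')).catch(function(err) {
  return 'handled: ' + err.message;
});

Promise.all([waited, invalid, adopted, rejected]).then(function(values) {
  console.log(values);
});
```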
In Node.js, and in general in JavaScript, there are only a few libraries supporting promises out-of-the-box. Most of the time, in fact, we have to convert a typical callback-based function into one that returns a promise; this is also known as promisification.
Fortunately, the callback conventions used in Node.js allow us to create a reusable function that we can utilize to promisify any Node.js style API. We can do this easily by using the constructor of the Promise object. Let's then create a new function called promisify() and include it in the utilities.js module (so we can use it later in our web spider application):

    var Promise = require('bluebird');

    module.exports.promisify = function(callbackBasedApi) {
      return function promisified() {
        var args = [].slice.call(arguments);
        return new Promise(function(resolve, reject) {   //[1]
          args.push(function(err, result) {              //[2]
            if(err) {
              return reject(err);                        //[3]
            }
            if(arguments.length <= 2) {                  //[4]
              resolve(result);
            } else {
              resolve([].slice.call(arguments, 1));
            }
          });
          callbackBasedApi.apply(null, args);            //[5]
        });
      }
    };
The preceding function returns another function called promisified(), which represents the promisified version of the callbackBasedApi given in the input. This is how it works:
1. The promisified() function creates a new promise using the Promise constructor and immediately returns it back to the caller.
2. In the function passed to the Promise constructor, we make sure to pass a special callback to callbackBasedApi. As we know that the callback always comes last, we simply append it to the argument list (args) provided to the promisified() function.
3. In the special callback, if we receive an error, we immediately reject the promise.
4. If no error is received, we resolve the promise with a single value or with an array of values, depending on how many results are passed to the callback.
5. Finally, we simply invoke callbackBasedApi with the list of arguments we have built.
After a little bit of necessary theory, we are now ready to convert our web spider application to use promises. Let's start directly from version 2, the one downloading in sequence the links of a web page.
In the spider.js module, the very first step required is to load our promises implementation (we will use it later) and promisify the callback-based functions that we plan to use:

    var path = require('path');   //needed by download() for path.dirname()
    var Promise = require('bluebird');
    var utilities = require('./utilities');
    var request = utilities.promisify(require('request'));
    var mkdirp = utilities.promisify(require('mkdirp'));
    var fs = require('fs');
    var readFile = utilities.promisify(fs.readFile);
    var writeFile = utilities.promisify(fs.writeFile);
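To see what a promisified function hands back, here is a minimal sketch that applies the same promisify() logic to a hypothetical Node.js-style function, divmod(), whose callback receives two results. It is only an illustration: divmod() is not part of the web spider, and the helper is repeated inline with the native Promise (an assumption, since the chapter uses Bluebird) so that the snippet runs on its own.

```javascript
// Same logic as the promisify() helper, repeated here with the native Promise.
function promisify(callbackBasedApi) {
  return function promisified() {
    var args = [].slice.call(arguments);
    return new Promise(function(resolve, reject) {
      args.push(function(err, result) {
        if (err) {
          return reject(err);
        }
        if (arguments.length <= 2) {
          resolve(result);
        } else {
          resolve([].slice.call(arguments, 1));
        }
      });
      callbackBasedApi.apply(null, args);
    });
  };
}

// Hypothetical callback-based API that passes two results to its callback.
function divmod(a, b, callback) {
  setImmediate(function() {
    if (b === 0) {
      return callback(new Error('division by zero'));
    }
    callback(null, Math.floor(a / b), a % b);
  });
}

var divmodAsync = promisify(divmod);

// Two results are delivered as a single array.
var quotientAndRemainder = divmodAsync(17, 5);

// An error passed to the callback becomes a rejection.
var failure = divmodAsync(1, 0).catch(function(err) {
  return err.message;
});
```

A callback that receives more than one result, as the one used by request does (response and body), therefore ends up fulfilling the promise with an array.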
Now, we can start converting the download() function:

    function download(url, filename) {
      console.log('Downloading ' + url);
      var body;
      return request(url)
        .then(function(results) {
          body = results[1];
          return mkdirp(path.dirname(filename));
        })
        .then(function() {
          return writeFile(filename, body);
        })
        .then(function() {
          console.log('Downloaded and saved: ' + url);
          return body;
        });
    }
We can see straightaway how elegant some sequential code implemented with promises is; we simply have an intuitive chain of then() functions. The final return value of the download() function is the promise returned by the last then() invocation in the chain. This makes sure that the caller receives a promise that fulfills with body only after all the operations (request, mkdirp, writeFile) have completed.
Next, it's the turn of the spider() function:

    function spider(url, nesting) {
      var filename = utilities.urlToFilename(url);
      return readFile(filename, 'utf8')
        .then(
          function(body) {
            return spiderLinks(url, body, nesting);
          },
          function(err) {
            if(err.code !== 'ENOENT') {
              throw err;
            }
            return download(url, filename)
              .then(function(body) {
                return spiderLinks(url, body, nesting);
              });
          }
        );
    }
The important thing to notice here is that we also registered an onRejected() function for the promise returned by readFile(), to handle the case when a page was not already downloaded (file does not exist). Also, it's interesting to see how we were able to use throw to propagate the error from within the handler.
Now that we have converted our spider() function as well, we can modify its main invocation as follows:

    spider(process.argv[2], 1)
      .then(function() {
        console.log('Download complete');
      })
      .catch(function(err) {
        console.log(err);
      });
Note how we used, for the first time, the syntactic sugar catch to handle any error situation originating from the spider() function. If we look again at all the code we have written so far in this section, we will be pleasantly surprised by the fact that we didn't include any error propagation logic, as we would be forced to do when using callbacks. This is clearly an enormous advantage, as it greatly reduces the boilerplate in our code and the chances of missing any asynchronous error.
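The two error-handling styles seen so far, recovering inside an onRejected() handler and letting everything else fall through to a final catch(), can be reproduced in isolation with a sketch like the following. It assumes a native Promise; readCache() and load() are hypothetical stand-ins and not part of the web spider.

```javascript
// Hypothetical async step that always fails with a Node.js-style error code.
function readCache(key) {
  var err = new Error('no cache entry for ' + key);
  err.code = 'ENOENT';
  return Promise.reject(err);
}

function load(key) {
  return readCache(key).then(undefined, function(err) {
    if (err.code !== 'ENOENT') {
      throw err;                      // unexpected error: keep propagating
    }
    return 'fresh copy of ' + key;    // expected error: recover with a value
  });
}

var steps = [];

var finished = load('page.html')
  .then(function(content) {
    steps.push(content);
    throw new Error('later failure');
  })
  .then(function() {
    steps.push('never reached');      // skipped because of the rejection
  })
  .catch(function(err) {
    steps.push('caught: ' + err.message);
    return steps;
  });

finished.then(function(result) {
  console.log(result);
});
```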
Now, the only missing bit to complete the version 2 of our web spider application is the spiderLinks() function, which we are going to see in a moment.
The web spider code so far was mainly an overview of what promises are and how they are used, demonstrating how simple and elegant it is to implement a sequential execution flow using promises. However, the code we considered so far involves only the execution of a known set of asynchronous operations. So, the missing piece that will complete our exploration of sequential execution flows is to see how we can implement an iteration using promises. Again, the spiderLinks() function of web spider version 2 is a perfect example to show that.
Let's add the missing piece to the code we wrote so far:
    function spiderLinks(currentUrl, body, nesting) {
      var promise = Promise.resolve();        //[1]
      if(nesting === 0) {
        return promise;
      }
      var links = utilities.getPageLinks(currentUrl, body);
      links.forEach(function(link) {          //[2]
        promise = promise.then(function() {
          return spider(link, nesting - 1);
        });
      });
      return promise;
    }
To iterate asynchronously over all the links of a web page, we had to dynamically build a chain of promises:
1. First, we defined an "empty" promise, one that resolves to undefined. This promise is just used as a starting point to build our chain.
2. Then, in the loop, we update the promise variable with a new promise obtained by invoking then() on the previous promise in the chain. This is actually our asynchronous iteration pattern using promises.
This way, at the end of the loop, the promise variable will contain the promise of the last then() invocation in the loop, so it will resolve only when all the promises in the chain have been resolved.
With this, we completely converted our web spider version 2 to use promises. We should now be able to try it out again.
To conclude this section on sequential execution, let's extract the pattern to iterate over a set of promises in sequence:
    var tasks = [...]
    var promise = Promise.resolve();
    tasks.forEach(function(task) {
      promise = promise.then(function() {
        return task();
      });
    });
    promise.then(function() {
      //All tasks completed
    });
An alternative to using the forEach() loop is to use reduce(), which allows even more compact code:

    var tasks = [...]
    var promise = tasks.reduce(function(prev, task) {
      return prev.then(function() {
        return task();
      });
    }, Promise.resolve());

    promise.then(function() {
      //All tasks completed
    });
As always, with simple adaptations of this pattern, we could collect all the tasks' results in an array; we could implement a mapping algorithm, or build a filter, and so on.
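One possible adaptation, collecting the results of every task in an array, might look like the following sketch. It assumes a native Promise; series() and makeTask() are hypothetical names, and the tasks are simple timers rather than real downloads.

```javascript
// Hypothetical task factory: each task is a function returning a promise.
function makeTask(name, ms, log) {
  return function() {
    log.push('start ' + name);
    return new Promise(function(resolve) {
      setTimeout(function() {
        log.push('end ' + name);
        resolve(name.toUpperCase());
      }, ms);
    });
  };
}

// Runs the tasks one at a time and fulfills with an array of their results.
function series(tasks) {
  return tasks.reduce(function(prev, task) {
    return prev.then(function(results) {
      return task().then(function(result) {
        return results.concat([result]);
      });
    });
  }, Promise.resolve([]));
}

var log = [];
var seriesResults = series([
  makeTask('a', 30, log),
  makeTask('b', 10, log),
  makeTask('c', 20, log)
]);

seriesResults.then(function(results) {
  console.log(results, log);
});
```

The accumulator travels along the chain as the fulfillment value of each step, so no shared mutable array is needed for the results; a task starts only after the previous one has ended, regardless of how long each one takes.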
Another execution flow that becomes trivial with promises is the parallel execution flow. In fact, all that we need to do is use the built-in Promise.all() helper that creates another promise, which fulfills only when all the promises received as input are fulfilled. That's essentially a parallel execution because no order between the various promises' resolutions is enforced.
To demonstrate this, let's consider version 3 of our web spider application, the one downloading all the links of a page in parallel. Let's update the spiderLinks() function again to implement a parallel flow, using promises:

    function spiderLinks(currentUrl, body, nesting) {
      if(nesting === 0) {
        return Promise.resolve();
      }
      var links = utilities.getPageLinks(currentUrl, body);
      var promises = links.map(function(link) {
        return spider(link, nesting - 1);
      });
      return Promise.all(promises);
    }
Trivially, the pattern consists of starting the spider() tasks all at once inside the links.map() loop, which also collects all their promises. This time, in the loop, we are not waiting for the previous download to complete before starting a new one; all the download tasks are started in the loop at once, one after the other. Afterwards, we leverage the Promise.all() method, which returns a new promise that will be fulfilled when all the promises in the array are fulfilled. In other words, it fulfills when all the download tasks have completed; exactly what we wanted.
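The same pattern can be tried without the web spider using a sketch like the following. It assumes a native Promise; fakeDownload() is a hypothetical stand-in for a download that simply waits on a timer.

```javascript
var completionOrder = [];

// Hypothetical stand-in for an asynchronous download.
function fakeDownload(name, ms) {
  return new Promise(function(resolve) {
    setTimeout(function() {
      completionOrder.push(name);
      resolve(name + '.html');
    }, ms);
  });
}

var names = ['slow', 'fast', 'medium'];
var delays = { slow: 60, fast: 10, medium: 30 };

// map() starts every download immediately and collects the promises.
var downloads = names.map(function(name) {
  return fakeDownload(name, delays[name]);
});

var allDownloads = Promise.all(downloads);

allDownloads.then(function(files) {
  console.log(files, completionOrder);
});
```

The tasks complete in whatever order their timers fire, while the array produced by Promise.all() follows the order of the input array.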
Unfortunately, the ES6 Promise API does not provide a way to implement a limited parallel control flow natively, but we can always rely on what we learned about limiting the concurrency with plain JavaScript. In fact, the pattern we implemented inside the TaskQueue class can be easily adapted to support tasks that return a promise. This can be done trivially by modifying the next() method:

    TaskQueue.prototype.next = function() {
      var self = this;
      while(self.running < self.concurrency && self.queue.length) {
        var task = self.queue.shift();
        task().then(function() {
          self.running--;
          self.next();
        });
        self.running++;
      }
    }
So now, instead of handling the task with a callback, we simply invoke then() on the promise it returns. The rest of the code is practically identical to the old version of TaskQueue.
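For a self-contained view of the queue at work, here is a sketch that pairs the next() method with an assumed constructor and pushTask() method. Their exact shape in the earlier TaskQueue class is not shown in this section, so treat them as a plausible reconstruction rather than the original code. A native Promise is assumed as well.

```javascript
// Assumed shape of the constructor and pushTask(); only next() is
// taken from the text.
function TaskQueue(concurrency) {
  this.concurrency = concurrency;
  this.running = 0;
  this.queue = [];
}

TaskQueue.prototype.pushTask = function(task) {
  this.queue.push(task);
  this.next();
};

TaskQueue.prototype.next = function() {
  var self = this;
  while (self.running < self.concurrency && self.queue.length) {
    var task = self.queue.shift();
    task().then(function() {
      self.running--;
      self.next();
    });
    self.running++;
  }
};

var queue = new TaskQueue(2);
var active = 0;      // tasks currently in progress
var maxActive = 0;   // highest value ever reached by `active`

// Pushes five timer-based tasks and fulfills with maxActive when all are done.
var queueDone = new Promise(function(resolve) {
  var remaining = 5;
  for (var i = 0; i < 5; i++) {
    queue.pushTask(function() {
      active++;
      maxActive = Math.max(maxActive, active);
      return new Promise(function(done) { setTimeout(done, 10); })
        .then(function() {
          active--;
          if (--remaining === 0) {
            resolve(maxActive);
          }
        });
    });
  }
});

queueDone.then(function(max) {
  console.log('maximum tasks running at once: ' + max);
});
```

Note that next() releases a slot only when a task's promise fulfills, so tasks pushed to this queue are expected to handle their own rejections.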
Now, we can go back to the spider.js module, modifying it to support our new version of the TaskQueue class. First, we make sure to define a new instance of TaskQueue:

    var TaskQueue = require('./taskQueue');
    var downloadQueue = new TaskQueue(2);
Then, it's the turn of the spiderLinks() function again. The change here is also pretty straightforward:

    function spiderLinks(currentUrl, body, nesting) {
      if(nesting === 0) {
        return Promise.resolve();
      }
      var links = utilities.getPageLinks(currentUrl, body);
      //we need the following because the Promise we create next
      //will never settle if there are no tasks to process
      if(links.length === 0) {
        return Promise.resolve();
      }
      return new Promise(function(resolve, reject) {   //[1]
        var completed = 0;
        links.forEach(function(link) {
          var task = function() {                      //[2]
            return spider(link, nesting - 1)
              .then(function() {
                if(++completed === links.length) {
                  resolve();
                }
              })
              .catch(reject);
          };
          downloadQueue.pushTask(task);
        });
      });
    }
There are a couple of things in the preceding code that merit our attention:
1. First, we returned a new promise created with the Promise constructor. This lets us resolve the promise manually once all the tasks pushed to the queue have completed.
2. Second, look at how we defined the task. We attached an onFulfilled() callback to the promise returned by spider(), so we could count the number of download tasks completed. When the number of completed downloads matches the number of links in the current page, we know that we are done processing, so we can invoke the resolve() function of the outer promise.
The Promises/A+ specification states that the onFulfilled() and onRejected() callbacks of the then() method have to be invoked only once and exclusively (only one or the other is invoked). A compliant promises implementation makes sure that even if we call resolve or reject multiple times, the promise is either fulfilled or rejected only once.
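The settle-once guarantee is easy to check with a short sketch such as this one, assuming a native Promise. Only the first call to resolve() has any effect, and exactly one of the two handlers runs.

```javascript
var settledOnce = new Promise(function(resolve, reject) {
  resolve('first');
  resolve('second');               // ignored: the promise is already resolved
  reject(new Error('too late'));   // ignored as well
});

var calls = [];

var observed = settledOnce.then(
  function(value) {
    calls.push('fulfilled: ' + value);
    return calls;
  },
  function(err) {
    calls.push('rejected: ' + err.message);
    return calls;
  }
);

observed.then(function(result) {
  console.log(result);
});
```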
Version 4 of the web spider application using promises should now be ready to be tried out. We might notice once again how the download tasks now run in parallel, with a concurrency limit of 2.