When your web application requires heavy lifting or background processing on the JavaScript side, the Web Workers API is your answer.
The Web Workers interface spawns real OS-level threads and lets you pass data back and forth between any given threads (or workers). Furthermore, because communication points between threads are carefully controlled, concurrency problems are rare. You cannot access the DOM or other components that are not thread safe, and you have to pass specific data in and out of a thread through serialized objects. So you have to work extremely hard to cause problems in your code. Regardless of how you plan to use Web Workers in your application, the main idea behind processing data behind the scenes is to create multiple workers (or threads) in the browser.
As of this writing, Safari, Safari for iOS5, Chrome, Opera, and Mozilla Firefox support the Web Workers API, but Internet Explorer does not. (Internet Explorer 10 did add support for Web Workers in Platform Preview 2.) Android versions 2.0 and 2.1 support Web Workers as well, but later versions of Android do not. The only shim currently available for Web Workers makes use of Google Gears. If the core Web Workers API is not supported on a device or browser, you can detect whether Google Gears is installed. For more details, see http://html5-shims.googlecode.com/svn/trunk/demo/workers.html.
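A feature check along the following lines is one way to decide between native workers and a fallback. This is a minimal sketch: the helper name supportsWorkers and the idea of passing in the global object are assumptions made for illustration and are not taken from any shim.

```javascript
// Hypothetical helper: reports whether the given global scope exposes Worker.
// Passing the scope in (instead of reading window directly) keeps the check
// usable in any environment.
function supportsWorkers(scope) {
  return !!scope && typeof scope.Worker !== 'undefined';
}

// In a page you would typically call supportsWorkers(window).
// Two stand-in scopes are used here so the sketch runs anywhere.
var pageLike = { Worker: function () {} };
var legacyLike = {};

console.log(supportsWorkers(pageLike));   // true
console.log(supportsWorkers(legacyLike)); // false
```

When the check fails, the application can run the same work on the main thread or look for a shim.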
With Web Workers and its multithreaded approach, you do not have access to the DOM (which is not thread safe) or to the window, document, or parent objects. You do, however, have access to quite a few other features and objects, starting with the navigator object:
appCodeName //the code name of the browser
appName //the name of the browser
appVersion //the version information of the browser
cookieEnabled //Determines whether cookies are enabled in the browser
platform //Returns for which platform the browser is compiled
userAgent //the user-agent header sent by the browser to the server
Although you can access the location object, it is read only:
hash //the anchor portion of a URL
host //the hostname and port of a URL
hostname //the hostname of a URL
href //the entire URL
pathname //the path name of a URL
port //the port number the server uses for a URL
protocol //the protocol of a URL
search //the query portion of a URL
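To make the two lists concrete, here is a minimal sketch of code that reads a few of those properties from a worker's global scope. The helper name describeScope and the stand-in scope object are assumptions for illustration; inside a real worker you would pass self.

```javascript
// Hypothetical helper meant to run inside a worker, where `self` is the
// worker's global scope. It only reads properties from the lists above.
function describeScope(scope) {
  return {
    userAgent: scope.navigator.userAgent,
    platform: scope.navigator.platform,
    href: scope.location.href,
    protocol: scope.location.protocol
  };
}

// Inside worker.js you might call: self.postMessage(describeScope(self));
// A stand-in scope is used here so the sketch runs outside a browser.
var sample = describeScope({
  navigator: { userAgent: 'ExampleBrowser/1.0', platform: 'ExampleOS' },
  location: { href: 'http://example.com/worker.js', protocol: 'http:' }
});

console.log(sample.href); // http://example.com/worker.js
```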
You can use XMLHttpRequest to make AJAX calls within a worker (subject to the usual same-origin policy), as well as import external scripts using the importScripts() method. You can also schedule work by setting and clearing timeouts and intervals with setTimeout(), clearTimeout(), setInterval(), and clearInterval(). Finally, you can access the Application cache and spawn other workers.
Creating a worker is quite easy; you need only a JavaScript file’s URL. The Worker() constructor is invoked with the URL to that file as its only argument:
var worker = new Worker('worker.js');
Worker scripts must be external files with the same scheme as their calling page. Thus, you cannot load a script from a data URL and an HTTPS page cannot start worker scripts that begin with HTTP URLs.
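The scheme rule described here can be expressed as a small check. This is only a sketch of the rule as stated in the text, not how a browser enforces it: the helper name hasSameScheme is hypothetical, and the standard URL constructor it relies on is newer than the browsers discussed in this chapter.

```javascript
// Hypothetical helper: true when workerUrl, resolved against pageUrl,
// uses the same scheme (protocol) as the page.
function hasSameScheme(pageUrl, workerUrl) {
  var page = new URL(pageUrl);
  var script = new URL(workerUrl, pageUrl); // resolves relative URLs
  return page.protocol === script.protocol;
}

var page = 'https://example.com/app/';
console.log(hasSameScheme(page, 'worker.js'));                         // true
console.log(hasSameScheme(page, 'http://example.com/worker.js'));      // false
console.log(hasSameScheme(page, 'data:text/javascript,self.close()')); // false
```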
The worker script begins loading as soon as the worker is created, but it has nothing to process until you call postMessage(), such as by sending some object data to the worker:
worker.postMessage({'haz': 'foo'}); // Send data to the worker.
Next, add an EventListener to listen for data the worker returns:
worker.addEventListener('message', function (e) {
  console.log('returned data from worker', e.data);
}, false);
In the actual worker.js file, you could have something simple like:
self.addEventListener('message', function (e) {
  var data = e.data;
  //Manipulate data and send back to parent
  self.postMessage(data.haz); //posts 'foo' to parent DOM
}, false);
The previous example simply relays serialized JSON from the parent DOM to the spawned worker instance, and back again.
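In practice you will probably also want to know when a worker script throws. The sketch below wires up both the message event and the standard error event; the helper name listenToWorker and its callback parameters are assumptions for illustration.

```javascript
// Hypothetical helper: attaches a message listener and an error listener to
// anything that exposes addEventListener, such as a Worker instance.
function listenToWorker(worker, onData, onError) {
  worker.addEventListener('message', function (e) {
    onData(e.data);
  }, false);
  worker.addEventListener('error', function (e) {
    // The error event carries message, filename, and lineno.
    onError(e.message + ' (' + e.filename + ':' + e.lineno + ')');
  }, false);
}

// In a page you might write:
//   var worker = new Worker('worker.js');
//   listenToWorker(worker,
//     function (data) { console.log('returned data from worker', data); },
//     function (text) { console.log('worker failed: ' + text); });
//   worker.postMessage({'haz': 'foo'});
```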
In newer browsers (like Chrome), you can take your data types a step further and pass binary data between workers. With transferable objects, data is transferred from one context to another. It is zero-copy, which vastly improves the performance of sending data to a worker.
When you transfer an ArrayBuffer from your main app to a worker, the original ArrayBuffer is cleared and is made no longer usable by the browser. Its contents are transferred to the worker context.
Chrome version 8 and above also includes a new version of postMessage() that supports transferable objects:
var uInt8Array = new Uint8Array(new ArrayBuffer(10));
for (var i = 0; i < uInt8Array.length; ++i) {
  uInt8Array[i] = i * 2; // [0, 2, 4, 6, 8,...]
}
worker.webkitPostMessage(uInt8Array.buffer, [uInt8Array.buffer]);
Figure 9-1 shows how much faster data can travel between threads using transferable objects. For example, 32MB of data makes a round trip from the worker back to the parent in 2ms. Using previous methods, such as structured cloning, took upward of 300ms to copy the data between threads. To try this test for yourself, visit http://html5-demos.appspot.com/static/workers/transferables/index.html.
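The "cleared" state of the original buffer can be observed directly. The sketch below assumes an environment that provides the standard structuredClone() function with its transfer option (current browsers and recent Node.js releases), which moves an ArrayBuffer the same way a transferring postMessage() call does. It is not available in the browsers discussed in this chapter, and the helper name transferBuffer is hypothetical.

```javascript
// Hypothetical helper: moves a buffer to a new owner instead of copying it.
// Assumes structuredClone() with the `transfer` option is available.
function transferBuffer(buffer) {
  return structuredClone(buffer, { transfer: [buffer] });
}

var original = new Uint8Array([0, 2, 4, 6, 8]).buffer;
var moved = transferBuffer(original);

console.log(original.byteLength); // 0, the original is detached
console.log(moved.byteLength);    // 5, the contents now live here
```

With a real worker, the equivalent call is worker.postMessage(buffer, [buffer]) in browsers that support the unprefixed form.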
The following example, originally inspired by Jos Dirksen’s thread pool example, gives you a way to specify the number of concurrent workers (or threads). With this method, browsers like Chrome can use multiple CPU cores when processing data concurrently, and you can significantly increase your rendering speed, by up to 300%. You can view the full demo at http://html5e.org/example/workers, but the basic worker1.js file contains:
self.onmessage = function (event) {
  var myobj = event.data;
  search: while (myobj.foo < 200) {
    myobj.foo += 1;
    for (var i = 2; i <= Math.sqrt(myobj.foo); i += 1)
      if (myobj.foo % i == 0)
        continue search;
    // found a prime!
    self.postMessage(myobj);
  }
  // close this worker
  self.close();
};
The above code simply spits out prime numbers and ends at 200. You could set the while loop to while(true) for endless output of prime numbers, but this is a simple example to demonstrate how you can process data in chunks and parallelize the code to reach a common goal with multiple worker threads.
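One way to picture "processing data in chunks" is to split the number range up front and hand each piece to a different worker. The sketch below does the splitting and runs the same trial-division test on each chunk in a single thread. The helper names splitRange and primesInChunk and the chunk object shape are assumptions for illustration, not part of the demo.

```javascript
// Hypothetical helper: splits the range (start, end] into at most `parts`
// contiguous chunks of equal size (the last one may be shorter).
function splitRange(start, end, parts) {
  var chunks = [];
  var size = Math.ceil((end - start) / parts);
  for (var from = start; from < end; from += size) {
    chunks.push({ from: from, to: Math.min(from + size, end) });
  }
  return chunks;
}

// The same trial-division test the worker uses, applied to one chunk.
// Numbers below 2 are skipped because they are not prime.
function primesInChunk(chunk) {
  var primes = [];
  search: for (var n = Math.max(chunk.from + 1, 2); n <= chunk.to; n += 1) {
    for (var i = 2; i <= Math.sqrt(n); i += 1)
      if (n % i == 0) continue search;
    primes.push(n);
  }
  return primes;
}

// Each chunk could be posted to its own worker; here they run in sequence.
var chunks = splitRange(0, 200, 4);
var allPrimes = [];
chunks.forEach(function (chunk) {
  allPrimes = allPrimes.concat(primesInChunk(chunk));
});

console.log(chunks.length, allPrimes.length); // 4 46
```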
From your main index.html (the place you want all the data to be displayed), initialize your thread pool and give the workers a callback:
slidfast({
  workers: {
    script: 'worker1.js',
    threads: 9,
    mycallback: workerCallback
  }
});
To view a live demo of this technique, visit https://github.com/html5e/slidfast/blob/master/example/workers/index.html.
When the workers parameter initializes, the following code creates the thread pool and begins each task concurrently:
function Pool(size) {
  var _this = this;

  // set some defaults
  this.taskQueue = [];
  this.workerQueue = [];
  this.poolSize = size;

  this.addWorkerTask = function (workerTask) {
    if (_this.workerQueue.length > 0) {
      // get the worker from the front of the queue
      var workerThread = _this.workerQueue.shift();
      //get an index for tracking
      slidfast.worker.obj().index = _this.workerQueue.length;
      workerThread.run(workerTask);
    } else {
      // no free workers, so queue the task
      _this.taskQueue.push(workerTask);
    }
  };

  this.init = function () {
    // create 'size' number of worker threads
    for (var i = 0; i < size; i++) {
      _this.workerQueue.push(new WorkerThread(_this));
    }
  };

  this.freeWorkerThread = function (workerThread) {
    if (_this.taskQueue.length > 0) {
      // don't put back in queue, but execute next task
      var workerTask = _this.taskQueue.shift();
      workerThread.run(workerTask);
    } else {
      // no pending tasks, so return the worker to the idle queue
      _this.workerQueue.push(workerThread);
    }
  };
}

// runner work tasks in the pool
function WorkerThread(parentPool) {
  var _this = this;
  this.parentPool = parentPool;
  this.workerTask = {};

  this.run = function (workerTask) {
    this.workerTask = workerTask;
    // create a new web worker
    if (this.workerTask.script !== null) {
      var worker = new Worker(workerTask.script);
      worker.addEventListener('message', function (event) {
        mycallback(event);
        _this.parentPool.freeWorkerThread(_this);
      }, false);
      worker.postMessage(slidfast.worker.obj());
    }
  };
}

function WorkerTask(script, callback, msg) {
  this.script = script;
  this.callback = callback;
  console.log(msg);
  this.obj = msg;
}

var pool = new Pool(workers.threads);
pool.init();
var workerTask = new WorkerTask(workers.script, mycallback, slidfast.worker.obj());
After initializing the worker threads, add the actual workerTasks to process the data:
pool.addWorkerTask(workerTask);
slidfast.worker.obj().foo = 10;
pool.addWorkerTask(workerTask);
slidfast.worker.obj().foo = 20;
pool.addWorkerTask(workerTask);
slidfast.worker.obj().foo = 30;
pool.addWorkerTask(workerTask);
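The queueing behavior of a pool like this is easier to follow in a stripped-down form. The following is a minimal sketch, not the slidfast implementation: TaskPool, submit, run, and the injected createWorker factory are names invented here, and it assumes each task produces exactly one reply (the prime worker above posts many). A stand-in worker is used so the sketch runs without a browser.

```javascript
// Hypothetical pool: keeps idle workers in one queue and waiting tasks in another.
function TaskPool(size, createWorker) {
  this.idle = [];
  this.pending = [];
  for (var i = 0; i < size; i += 1) {
    this.idle.push(createWorker());
  }
}

// Run the task now if a worker is idle, otherwise queue it.
TaskPool.prototype.submit = function (message, callback) {
  var task = { message: message, callback: callback };
  if (this.idle.length > 0) {
    this.run(this.idle.shift(), task);
  } else {
    this.pending.push(task);
  }
};

// Send the task to the worker; on reply, pick up the next task or go idle.
TaskPool.prototype.run = function (worker, task) {
  var pool = this;
  worker.onmessage = function (event) {
    task.callback(event.data);
    if (pool.pending.length > 0) {
      pool.run(worker, pool.pending.shift());
    } else {
      pool.idle.push(worker);
    }
  };
  worker.postMessage(task.message);
};

// Stand-in worker that doubles its last input when reply() is called.
function createFakeWorker() {
  return {
    postMessage: function (message) { this.last = message; },
    reply: function () { this.onmessage({ data: this.last * 2 }); }
  };
}

var fake;
var demoPool = new TaskPool(1, function () {
  fake = createFakeWorker();
  return fake;
});
var results = [];
demoPool.submit(1, function (value) { results.push(value); });
demoPool.submit(2, function (value) { results.push(value); }); // queued
fake.reply(); // finishes the first task and starts the queued one
fake.reply(); // finishes the second task; the worker goes back to idle
console.log(results); // [ 2, 4 ]
```

In a page, the factory would instead return new Worker('worker1.js'), and replies would arrive asynchronously.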
As you can see in Figure 9-2, each thread brings data back to the main page and renders it with the supplied callback. The thread order varies on each refresh and there is no guarantee on how the browser will process the data. To see a demo, visit http://html5e.org/example/workers. Use the latest version of Chrome or another browser that supports actual CPU core usage per web worker.
Crunching prime numbers may not be the best real-world example of using thread pooling, but you can use the same technique for processing image data. For more information, see http://www.smartjava.org/examples/webworkers2 and Figure 9-3.
Web Workers could be put into action within your app for additional scenarios as well. For example, you could parse wiki text as the user types, and then generate the HTML. You can find an example of this at http://www.cach.me/blog/2011/01/javascript-web-workers-tutorial-parse-wiki-text-in-real-time. Or, you could use it for visualizations and business graphs. For a visualization framework, see https://github.com/samizdatco/arbor.