Chapter 5. WebSockets

Every HTTP request sent from the browser includes headers, whether you want them or not. Nor are they small headers. Uncompressed request and response headers can vary in size from 200 bytes to over 2K. Although the typical size is somewhere between 700 and 900 bytes, those numbers will grow as user agents expand their features.

WebSockets give you minimal overhead and a much more efficient way of delivering data to the client and server with full duplex communication through a single socket. The WebSocket connection is made after a small HTTP handshake occurs between the client and the server, over the same underlying TCP/IP connection. This gives you an open connection between the client and the server, and both parties can start sending data at any time.
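To make the handshake a little more concrete, here is a minimal sketch of the one computation a server performs during it, assuming the RFC 6455 version of the protocol and Node.js with its built-in crypto module. The function name computeAcceptKey is made up for this illustration; the GUID and the sample key come from the RFC.

```javascript
const crypto = require('crypto');

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Compute the Sec-WebSocket-Accept value a server returns for the
// Sec-WebSocket-Key header sent by the client.
function computeAcceptKey(clientKey) {
  return crypto
    .createHash('sha1')
    .update(clientKey + WEBSOCKET_GUID)
    .digest('base64');
}

// Sample key from RFC 6455.
console.log(computeAcceptKey('dGhlIHNhbXBsZSBub25jZQ=='));
// prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

Once the client has verified this value, the HTTP exchange is over and both sides switch to WebSocket frames on the same TCP connection.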

A few of WebSockets’ many advantages are:

No HTTP headers on messages after the initial handshake
No lag due to keep-alive issues
Low latency, better throughput and responsiveness
Easier on mobile device batteries
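To put the header savings in numbers, the following sketch estimates the framing overhead of a single WebSocket message, assuming the frame layout defined in RFC 6455 and a message sent as one unfragmented frame. The function name frameOverhead is hypothetical.

```javascript
// Bytes of framing added to one WebSocket message sent as a single frame,
// following the RFC 6455 frame layout.
function frameOverhead(payloadLength, masked) {
  var bytes = 2;              // FIN/opcode byte plus mask-bit/length byte
  if (payloadLength > 65535) {
    bytes += 8;               // 64-bit extended payload length
  } else if (payloadLength > 125) {
    bytes += 2;               // 16-bit extended payload length
  }
  if (masked) {
    bytes += 4;               // masking key, required on client-to-server frames
  }
  return bytes;
}

console.log(frameOverhead(11, true));    // 6
console.log(frameOverhead(11, false));   // 2
console.log(frameOverhead(1000, true));  // 8
```

Compare those few bytes with the 700 to 900 bytes of headers that typically accompany an HTTP request.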

Building the Stack

To effectively develop any application with WebSockets, you must accept the idea of the “real-time Web” in which the client-side code of your web application communicates continuously with a corresponding real-time server during every user connection. To accomplish this, you can use a capable protocol such as WebSockets or SPDY to build the stack yourself. Or you can choose a service or project to manage the connections and graceful degradation for you. In this chapter, you’ll learn how to implement a raw WebSocket server and the best practices surrounding the details of setting one up.

If you opt to leave the management to someone else, you have choices. Freemium services, such as Pusher (http://pusher.com), are starting to emerge to do this, and companies like Kaazing, which offers the Kaazing Gateway, have been providing adapters for STOMP and Apache ActiveMQ for years. In addition, you can find plenty of wrapper frameworks around WebSockets that provide graceful degradation, from Socket.IO to CometD to whatever’s hot right now. Graceful degradation is the process of falling back to older technologies, such as Flash or long polling, within the browser if the WebSocket protocol is not supported. Compared with WebSockets, Comet, push technology, and long polling are slow, inefficient, and inelegant, and they are more prone to unreliability. In this book, I cover only the core WebSocket specification to avoid confusion and to keep things simple.

Note

As of August 2012, the WebSocket specification was in Working Draft status. Implementers and editors were working to bring the spec into Candidate Recommendation status. Until that status is declared, be aware that things could change in regard to the underlying protocol.

On the Server, Behind the Scenes

Keeping a large number of connections open at the same time requires an architecture that permits other processing to continue before the transmission has finished. Such architectures are usually designed around threading or asynchronous nonblocking IO (NIO). As for the debates between NIO and threading, some might say that NIO does not actually perform better than threading, but only allows you to write single-threaded event loops for multiple clients as with select on Unix. Others argue that choosing NIO or threading depends on your expected workloads. If you have lots of long-term idle connections, NIO wins due to not having thousands of threads “blocking on a read” operation. Again, there are many debates over whether threads are faster or easier to write than event loops (or the opposite) so it all depends on the type of use case you are trying to handle. Don’t worry, I’ll show examples of both event loops and threads in this chapter.
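As a rough illustration of the event-loop side of that debate, here is a sketch using Node.js and its built-in net module. It works with plain TCP sockets rather than WebSockets, and the name createEchoHub and the port number are assumptions made for this example.

```javascript
const net = require('net');

// One event loop, many sockets: no thread sits parked on a blocking read.
function createEchoHub() {
  const clients = new Set();
  const server = net.createServer(function (socket) {
    clients.add(socket);
    // This callback runs only when data has actually arrived.
    socket.on('data', function (chunk) {
      clients.forEach(function (client) {
        client.write(chunk);
      });
    });
    socket.on('close', function () {
      clients.delete(socket);
    });
    socket.on('error', function () {
      clients.delete(socket);
    });
  });
  return { server: server, clients: clients };
}

const hub = createEchoHub();
// hub.server.listen(9000); // uncomment to accept connections; 9000 is arbitrary
```

Idle connections cost only an entry in the set here, which is the property that favors nonblocking IO when most connections are long-lived and quiet.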

Programming Models

As mentioned earlier, WebSockets present a new development model for server- and client-side applications: the “real-time” Web. During every user connection under this model, your web application’s client side needs to communicate continuously with the corresponding real-time server. Although most server-side frameworks provide eventing mechanisms, few extend the events all the way through to the web browser to support this real-time model. As a result, you are faced with retrofitting your current solutions and architectures into this real-time model.

For example, suppose your server-side framework is capable of sending an event and you have observers of this event in your code. WebSockets gives you the ability to extend that event so that it carries all the way from the server side into the connected client’s browser. A good example would be to notify all WebSocket connections that a user has registered on your site.

The first step towards implementing such a solution is to wire up the three main listeners associated with WebSockets: onopen, onmessage, and onclose. The browser fires these events automatically: onopen when the connection opens successfully, onmessage each time the server sends data, and onclose when the connection closes. For example:

objWebSocket.onopen = function(evt)
{
   alert("WebSocket connection opened successfully");
};
objWebSocket.onmessage = function(evt)
{
   alert("Message : " + evt.data);
};
objWebSocket.onclose = function(evt)
{
   alert("WebSocket connection closed");
};

After the WebSocket connection opens, the onmessage event fires whenever the server sends data to the client. If the client wants to send data to the server, it can do so as follows:

objWebSocket.send("Hello World");

Sending messages in the form of strings over raw WebSockets isn’t very appealing, however, when you want to develop enterprise-style web applications. Because current WebSocket implementations deal mostly with strings, you can use JSON to transfer data to and from the server.
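A minimal sketch of that idea follows. It assumes a simple envelope with type and payload fields; that envelope, and the function names encodeMessage and decodeMessage, are illustrative choices rather than part of any WebSocket API.

```javascript
// Wrap an application message in a JSON envelope before sending it.
function encodeMessage(type, payload) {
  return JSON.stringify({ type: type, payload: payload });
}

// Parse incoming data; fall back to a plain-text message when the
// data is not a JSON envelope.
function decodeMessage(data) {
  try {
    var parsed = JSON.parse(data);
    if (parsed && typeof parsed.type === 'string') {
      return parsed;
    }
  } catch (e) {
    // Not JSON: treated as text below.
  }
  return { type: 'text', payload: data };
}

// In the browser this could be wired up as:
//   objWebSocket.send(encodeMessage('chat', { text: 'Hello World' }));
//   objWebSocket.onmessage = function (evt) {
//     var msg = decodeMessage(evt.data);
//   };
console.log(decodeMessage(encodeMessage('chat', { text: 'Hello World' })).type); // chat
console.log(decodeMessage('Hello World').type);                                  // text
```

Keeping a type field in every message lets one onmessage handler route different kinds of data without guessing at their shape.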

But how do you propagate the server-side events that are fired on the server and then bubble them up on the client? One approach is to relay the events. When a specific server-side event is fired, use a listener or observer to translate the data to JSON and send it to all connected clients.
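The relay idea can be sketched in a few lines, assuming Node.js and its built-in EventEmitter as a stand-in for the server-side event system. The function createRelay, the event name memberRegistered, and the stand-in connection objects are all hypothetical.

```javascript
const EventEmitter = require('events');

// Relay every `eventName` event from `source` to all registered connections.
// A "connection" here is any object with a send(string) method.
function createRelay(source, eventName) {
  const connections = new Set();
  source.on(eventName, function (payload) {
    const message = JSON.stringify({ event: eventName, data: payload });
    connections.forEach(function (connection) {
      connection.send(message);
    });
  });
  return connections;
}

// Usage with stand-in connections that just record what they receive.
const serverEvents = new EventEmitter();
const connections = createRelay(serverEvents, 'memberRegistered');
const received = [];
connections.add({ send: function (msg) { received.push(msg); } });
connections.add({ send: function (msg) { received.push(msg); } });
serverEvents.emit('memberRegistered', { name: 'Ada' });
console.log(received.length); // 2
```

The observer does the translation to JSON once per event, and every open connection gets the same string.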

Relaying Events from the Server to the Browser

Before you can successfully communicate with a server, you need to know what you’re talking to and how. For the chapter’s examples, I’m using the JBoss AS7 application server and embedding Jetty within the web application. The main reasoning behind this approach is to take advantage of a lightweight Java EE 6.0 [Full Profile] application server. There are a few other Java options out there, such as GlassFish or running Jetty standalone, but this solution offers contexts and dependency injection (CDI), distributed transactions, scalable JMS messaging, and data grid support out of the box. Such support is extremely valuable in cutting-edge enterprise initiatives and private cloud architectures.

Because this approach embeds one server (Jetty) with another server (JBoss), we can use it with any app server, even one that may not support WebSockets, and enable existing, older applications to take advantage of real-time connections.

The full deployable source code for this example is on the “embedded-jetty branch”. A few things are worth noting here:

Security: Because the WebSocket server is running on a different port (8081) than the JBoss AS7 server (8080), we must account for not having authentication cookies, and so on. A reverse proxy can handle this problem, however, as you’ll see in the last section of this chapter.
Proxies: As if existing proxy servers weren’t already a huge problem for running WebSockets and HTTP over the same port, in this example, we are now running them separately.
Threading: Because we’re observing and listening for CDI events, the WebSocket connections must be shared across threads, and some operations must run on the same thread.

The code below first sets up the WebSocket server using Jetty’s WebSocketHandler and embeds it inside a ServletContextListener. Because the app shares one set of WebSocket connections across threads, access to that set goes through a method marked with the synchronized keyword, which ensures that only a single thread can execute it at one time. To relay the CDI event to the browser, we must store all the WebSocket connections in a ConcurrentHashSet and write new connections to it as they come online. At any time, the ConcurrentHashSet may be read on a different thread so we know where to relay the CDI events. The ChatWebSocketHandler contains a global set of WebSocket connections and adds each new connection within the Jetty server.

public class ChatWebSocketHandler extends WebSocketHandler {

private static Set<ChatWebSocket> websockets =
    new ConcurrentHashSet<ChatWebSocket>();

    public WebSocket doWebSocketConnect(HttpServletRequest request,
            String protocol) {
        return new ChatWebSocket();
    }

    public class ChatWebSocket implements WebSocket.OnTextMessage {

        private Connection connection;

        public void onOpen(Connection connection) {
            // The client (browser) WebSocket has opened a connection.
            // 1) Store the opened connection
            this.connection = connection;
            // 2) Add this ChatWebSocket to the global set of ChatWebSocket
            // instances.
            getWebsockets().add(this);
        }

        public void onMessage(String data) {
            // Loop over every ChatWebSocket instance and send the message
            // from the server to each client WebSocket.
            try {
                for (ChatWebSocket webSocket : getWebsockets()) {
                    // send a message to the current client WebSocket.
                    webSocket.connection.sendMessage(data);
                }
            } catch (IOException x) {
                // Error was detected, close the ChatWebSocket client side
                this.connection.disconnect();
            }

        }

        public void onClose(int closeCode, String message) {
            // Remove ChatWebSocket in the global list of ChatWebSocket
            // instance.
            getWebsockets().remove(this);
        }
    }

    public static synchronized Set<ChatWebSocket> getWebsockets() {
        return websockets;
    }

}
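In the handler above, a single IOException ends the loop and disconnects the sender, even if the failure came from another client. One possible variation, sketched here in JavaScript rather than Java and without using Jetty’s API, drops only the connections that fail. The function broadcast and the stand-in connection objects are hypothetical.

```javascript
// Send `data` to every connection and drop only the connections that fail.
// Returns the number of connections that were removed.
function broadcast(connections, data) {
  const failed = [];
  connections.forEach(function (connection) {
    try {
      connection.send(data);
    } catch (err) {
      failed.push(connection);
    }
  });
  failed.forEach(function (connection) {
    connections.delete(connection);
  });
  return failed.length;
}

// Usage with one healthy and one broken stand-in connection.
const delivered = [];
const healthy = { send: function (msg) { delivered.push(msg); } };
const broken = { send: function () { throw new Error('socket closed'); } };
const openConnections = new Set([broken, healthy]);
console.log(broadcast(openConnections, 'hello')); // 1
console.log(openConnections.size);                // 1
```

The healthy client still receives the message even though the broken one was tried first.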

Next, we embed the Jetty WebSocket-capable server within the web application:

private Server server = null;
    /**
     * Start Embedding Jetty server when WEB Application is started.
     *
     */
    public void contextInitialized(ServletContextEvent event) {
        try {
            // 1) Create a Jetty server with the 8081 port.
            InetAddress addr = InetAddress.getLocalHost();
            this.server = new Server();
            Connector connector = new SelectChannelConnector();
            connector.setPort(8081);
            connector.setHost(addr.getHostAddress());

            server.addConnector(connector);

            // 2) Register ChatWebSocketHandler in the
            //Jetty server instance.
            ChatWebSocketHandler chatWebSocketHandler =
                                       new ChatWebSocketHandler();
            chatWebSocketHandler.setHandler(new DefaultHandler());

            server.setHandler(chatWebSocketHandler);

            // 3) Start the Jetty server.
            server.start();
        } catch (Throwable e) {
            e.printStackTrace();
        }
    }

....
}

Now we’ll create a method to observe CDI events and send the fired Member events to all active connections. This relays a very simple cdievent JavaScript object, which will be pushed to all connected clients and then evaluated on the browser through a JavaScript interpreter.

public void observeItemEvent(@Observes Member member) {
        try {
            for (ChatWebSocket webSocket : websockets) {

                webSocket.connection.sendMessage("{\"cdievent\":{\"fire\":function(){" +
                    "eventObj.initEvent(\'memberEvent\', true, true);" +
                    "eventObj.name = '" +  member.getName() + "';\n" +
                    "document.dispatchEvent(eventObj);" +
                    "}}}");
            }
        } catch (IOException x) {
            //...
        }
    }

The above code observes the following event when a new Member is registered through the web interface. As you can see below, memberEventSrc.fire(member) is fired when a user registers through the provided RESTful URL.

@POST
@Consumes(MediaType.APPLICATION_FORM_URLENCODED)
@Produces(MediaType.APPLICATION_JSON)
public Response createMember(@FormParam("name") String name,
                            @FormParam("email") String email,
                            @FormParam("phoneNumber") String phone) {
     ...

      //Create a new member class from fields
      Member member = new Member();
      member.setName(name);
      member.setEmail(email);
      member.setPhoneNumber(phone);

      try {

         //Fire the CDI event
         memberEventSrc.fire(member);

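Finally, we set up the WebSocket JavaScript client. It avoids calling eval() directly by passing the received JavaScript to the Function constructor. Keep in mind that this still executes whatever code the server sends, so use it only with a server you trust.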

        ...
        var location = "ws://192.168.1.101:8081/"
        this._ws = new WebSocket(location);
        ....
        _onmessage : function(m) {
            if (m.data) {
                //check to see if this message is a CDI event
                if(m.data.indexOf('cdievent') > 0){
                    try{
                        //$('log').innerHTML = m.data;
                        //avoid use of eval...
                        var event = (m.data);
                        event = (new Function("return " + event))();
                        event.cdievent.fire();
                    }catch(e){
                        alert(e);
                    }
                }else{
                    //... append data in the DOM
                }
            }
        },

Here is the JavaScript code that listens for the CDI event and executes the necessary client-side code:

window.addEventListener('memberEvent', function(e) {
    alert(e.name + ' just registered!');
}, false);

As you can see, this is very much a prototype of a running WebSocket server, but it’s a step forward in adding a usable programming layer on top of the WebSocket protocol.
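One way the prototype could be hardened is to send data only and let the client build the event itself, so that no received code is ever executed. The sketch below assumes the server is changed to send a message such as {"cdievent":{"type":"memberEvent","name":"Ada"}}; that message format and the function names parseCdiEvent and fireCdiEvent are assumptions for illustration, not part of the example project.

```javascript
// Turn a data-only message into an event description, or return null
// if the message is not a well-formed CDI event.
function parseCdiEvent(raw) {
  var message;
  try {
    message = JSON.parse(raw);
  } catch (e) {
    return null;
  }
  var evt = message && message.cdievent;
  if (!evt || typeof evt.type !== 'string') {
    return null;
  }
  return { type: evt.type, name: String(evt.name) };
}

// Dispatch the event in a browser; outside a browser nothing is dispatched.
function fireCdiEvent(raw) {
  var evt = parseCdiEvent(raw);
  if (evt && typeof document !== 'undefined') {
    var eventObj = document.createEvent('Event');
    eventObj.initEvent(evt.type, true, true);
    eventObj.name = evt.name;
    document.dispatchEvent(eventObj);
  }
  return evt;
}

console.log(parseCdiEvent('{"cdievent":{"type":"memberEvent","name":"Ada"}}'));
console.log(parseCdiEvent('alert(1)')); // null
```

The memberEvent listener shown earlier would work unchanged with this version, because the dispatched event still carries a name property.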

Using the New and Shiny

As of this writing, JBoss has just begun to implement WebSockets natively on JBoss AS7. The same example from above has been converted for native WebSocket support (without embedding Jetty) on JBoss AS 7.1.2 and beyond. This gives you the benefit of having both HTTP and WS traffic over the same port without needing to worry about managing data across threads. To see a chat room example that uses native WebSocket, check out https://github.com/html5e/HTML5-Mobile-WebSocket. You can find the JBoss WebSocket source at https://github.com/mikebrock/jboss-websockets.

Binary Data Over WebSockets

Another cool use of WebSockets is the ability to use binary data instead of just JSON strings. For example:

objWebSocket.onopen = function(evt)
{
   var array = new Float32Array(5);
   for (var i = 0; i < array.length; ++i) array[i] = i / 2;
   objWebSocket.send(array.buffer); // an ArrayBuffer is sent as a binary message
};

Why send binary data? This allows you to stream audio to connected clients using the Web Audio API. Or you could give users the ability to collaborate with a real-time screen sharing application using canvas and avoid the need to base64-encode the images. The possibilities are limitless!
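On the receiving side, binary messages have to be decoded back into numbers. The following sketch shows the round trip with typed arrays, assuming both ends agree that the payload is a sequence of 32-bit floats; encodeFloats and decodeFloats are hypothetical helper names.

```javascript
// Pack an array of numbers into an ArrayBuffer of 32-bit floats.
function encodeFloats(values) {
  return new Float32Array(values).buffer;
}

// Unpack an ArrayBuffer of 32-bit floats into a regular array.
function decodeFloats(buffer) {
  return Array.prototype.slice.call(new Float32Array(buffer));
}

// In the browser, ask for ArrayBuffer data before decoding:
//   objWebSocket.binaryType = 'arraybuffer';
//   objWebSocket.onmessage = function (evt) {
//     if (evt.data instanceof ArrayBuffer) {
//       var values = decodeFloats(evt.data);
//     }
//   };
console.log(decodeFloats(encodeFloats([0, 0.5, 1, 1.5, 2])));
// prints [ 0, 0.5, 1, 1.5, 2 ]
```

Five floats travel as 20 bytes this way, with no text encoding step on either side.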

The following code sets up a Node.js server to demo an example of sending audio over a WebSocket connection. See https://github.com/einaros/ws-audio-example for the full example.

var express = require('express');
var WebSocketServer = require('ws').Server;
var app = express.createServer();

function getSoundBuffer(samples) {
  var header = new Buffer([
      0x52,0x49,0x46,0x46, // "RIFF"
      0, 0, 0, 0,          // put total size here
      0x57,0x41,0x56,0x45, // "WAVE"
      0x66,0x6d,0x74,0x20, // "fmt "
      16,0,0,0,            // size of the following
      1, 0,                // PCM format
      1, 0,                // Mono: 1 channel
      0x44,0xAC,0,0,       // 44,100 samples per second
      0x88,0x58,0x01,0,    // byte rate: two bytes per sample
      2, 0,                // aligned on every two bytes
      16, 0,               // 16 bits per sample
      0x64,0x61,0x74,0x61, // "data"
      0, 0, 0, 0           // put number of samples here
  ]);
  header.writeUInt32LE(36 + samples.length, 4, true);
  header.writeUInt32LE(samples.length, 40, true);
  var data = new Buffer(header.length + samples.length);
  header.copy(data);
  samples.copy(data, header.length);
  return data;
}

function makeSamples(frequency, duration) {
  var samplespercycle = 44100 / frequency;
  var samples = new Uint16Array(44100 * duration);
  var da = 2 * Math.PI / samplespercycle;
  for (var i = 0, a = 0; i < samples.length; i++, a += da) {
    samples[i] = Math.floor(Math.sin(a / 300000) * 32768);
  }
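  // Keep the expression on the same line as "return"; otherwise automatic
  // semicolon insertion makes the function return undefined.
  return getSoundBuffer(
      new Buffer(Array.prototype.slice.call(samples, 0)));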
}

app.use(express.static(__dirname + '/public'));
app.listen(8080);
var wss = new WebSocketServer({server: app, path: '/data'});

var samples = makeSamples(20000, 10);

wss.on('connection', function(ws) {
  ws.on('message', function(message) {
    ws.send('pong');
  });
  ws.send(samples, {binary: true});
});
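The hardcoded bytes in the WAV header above are easier to check when they are derived from the audio parameters. This sketch recomputes the byte rate and alignment fields for 44,100 Hz, mono, 16-bit audio. The function name wavFormatFields is hypothetical, and Buffer.alloc assumes a newer Node.js release than the listing above targets.

```javascript
// Derive the "fmt " chunk fields that the header above hardcodes.
function wavFormatFields(sampleRate, channels, bitsPerSample) {
  var blockAlign = channels * bitsPerSample / 8;  // bytes per sample frame
  var byteRate = sampleRate * blockAlign;         // bytes per second
  var fields = Buffer.alloc(8);
  fields.writeUInt32LE(byteRate, 0);
  fields.writeUInt16LE(blockAlign, 4);
  fields.writeUInt16LE(bitsPerSample, 6);
  return { byteRate: byteRate, blockAlign: blockAlign, bytes: fields };
}

var fmt = wavFormatFields(44100, 1, 16);
console.log(fmt.byteRate);               // 88200
console.log(fmt.bytes.toString('hex'));  // 8858010002001000
```

The hex output matches the 0x88,0x58,0x01,0 byte rate, the 2, 0 alignment, and the 16, 0 bits-per-sample entries in the header, all stored in little-endian order.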

Managing Proxies

With new technology comes a new set of problems. In the case of WebSockets, the challenges relate to compatibility with the proxy servers that mediate HTTP connections in most company networks. Firewalls, proxy servers, and switches are linchpins of enterprise networks, and these devices and servers limit the kind of traffic you’re allowed to send to and from the server.

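The WebSocket protocol uses the HTTP upgrade system (which is normally used for HTTPS/SSL) to “upgrade” an HTTP connection to a WebSocket connection. Some proxy servers are not able to handle this handshake and will drop the connection. So, even if a given client supports the WebSocket protocol, it may not be possible to establish a connection. A reverse proxy placed in front of both servers can route WebSocket traffic to the WebSocket server and everything else to the HTTP server. The following HAProxy configuration shows one way to set this up: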

global
    maxconn     4096 # Total Max Connections. This is dependent on ulimit
    nbproc      1

defaults
    mode        http

frontend all 0.0.0.0:8080
    timeout client 86400000
    default_backend www_backend
    acl is_websocket hdr(Upgrade) -i WebSocket
    acl is_websocket hdr_beg(Host) -i ws

    use_backend socket_backend if is_websocket

backend www_backend
    balance roundrobin
    option forwardfor # This sets X-Forwarded-For
    timeout server 30000
    timeout connect 4000
    server apiserver 192.168.1.101:8080 weight 1 maxconn 4096 check

backend socket_backend
    balance roundrobin
    option forwardfor # This sets X-Forwarded-For
    timeout queue 5000
    timeout server 86400000
    timeout connect 86400000
    server apiserver 192.168.1.101:8081 weight 1 maxconn 4096 check

This approach is universal to any HTTP server that embeds a separate WebSocket server on a different port.
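The routing decision made by the two is_websocket ACLs in the configuration above can be sketched in JavaScript. This is only an illustration of the matching logic, not how the proxy is implemented; the function chooseBackend is hypothetical, and the header names are assumed to be lowercased as Node.js does for incoming requests.

```javascript
// Mirror of the two is_websocket ACLs: match on the Upgrade header value
// or on a Host header that begins with "ws" (both case-insensitive).
function chooseBackend(headers) {
  var upgrade = String(headers.upgrade || '').toLowerCase();
  var host = String(headers.host || '').toLowerCase();
  var isWebSocket = upgrade === 'websocket' || host.indexOf('ws') === 0;
  return isWebSocket ? 'socket_backend' : 'www_backend';
}

console.log(chooseBackend({ upgrade: 'WebSocket', host: 'example.com' })); // socket_backend
console.log(chooseBackend({ host: 'ws.example.com' }));                    // socket_backend
console.log(chooseBackend({ host: 'www.example.com' }));                   // www_backend
```

Either condition is enough to select the WebSocket backend, which is why the configuration can declare the same ACL name twice.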

Frameworks

There are just about as many Comet, AJAX push-based, WebSocket frameworks and servers as there are mobile web frameworks. So sorting out which ones are built for lightweight mobile environments and which ones may be suitable only for desktop browsers is essential. Keep in mind that graceful degradation comes at a cost. If you choose a WebSocket framework that degrades in 10 different ways, you do not want your mobile clients to be penalized with a heavy framework download. To provide real-time connectivity to every browser, you need a framework that will detect the most capable transport at runtime.
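Runtime transport detection can be as small as a few feature checks. The sketch below is a simplified illustration of the idea and not the logic of any particular framework; the function pickTransport and the transport labels it returns are made up for this example.

```javascript
// Pick the most capable transport that the given environment exposes.
// `env` is the global object to inspect (window in a browser).
function pickTransport(env) {
  if (env.WebSocket) {
    return 'websocket';
  }
  if (env.EventSource) {
    return 'eventsource';
  }
  if (env.XMLHttpRequest) {
    return 'long-polling';
  }
  return 'none';
}

// Stand-in environments; in a browser you would call pickTransport(window).
console.log(pickTransport({ WebSocket: function () {}, XMLHttpRequest: function () {} })); // websocket
console.log(pickTransport({ XMLHttpRequest: function () {} }));                            // long-polling
```

A framework that ships code for every fallback still has to download all of it, which is the cost to weigh for mobile clients.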

You may already be familiar with projects such as Node.js, Ruby EventMachine, or Python Twisted. These projects use an event-based API to allow you to create network-aware applications in just a few lines of code. But what about enterprise-grade performance and concurrency? Take a look at how a few of your options stack up.

Vert.x

A fully asynchronous, general-purpose application container for JVM languages, Vert.x takes inspiration from such event-driven frameworks as Node.js, combines it with a distributed event bus, and sticks it all on the JVM. The result is a runtime with real concurrency and unrivalled performance. Vert.x then exposes the API in Ruby, JavaScript, Groovy, and Java. Vert.x supports TCP, HTTP, WebSockets, and many more modules. You can think of it as Node.js for JVM languages.

Vert.x recommends SockJS to provide a WebSocket-like object on the client. Under the hood, SockJS tries to use native WebSockets first. If that fails, it can use a variety of browser-specific transport protocols and presents them through WebSocket-like abstractions. SockJS is intended to work for all modern browsers and in environments that don’t support the WebSocket protocol, such as behind restrictive corporate proxies.

Vert.x requires JDK 1.7.0. It uses such open source projects as Netty, JRuby, Mozilla Rhino, and Hazelcast, and is available under the MIT and Apache 2.0 licenses.

The code for SockJS page set-up is:

<!DOCTYPE html>
<html>
<head>
    <title>my app</title>
</head>
<body>
   <script src="http://cdn.sockjs.org/sockjs-0.1.min.js"></script>
</body>
</html>

To use SockJS:

var sock = new SockJS('http://mydomain.com/my_prefix');
   sock.onopen = function() {
       console.log('open');
   };
   sock.onmessage = function(e) {
       console.log('message', e.data);
   };
   sock.onclose = function() {
       console.log('close');
   };

Socket.IO

Specifically built for use with a Node.js server, Socket.IO (http://socket.io) can be used with any backend after you set fallback capabilities via Flash. Socket.IO aims to make real-time apps possible in every browser and mobile device, blurring the differences between the different transport mechanisms. Specifically, Socket.IO supports iOS, Android, webOS, and WebKit, and is released under the MIT license.

The page setup for Socket.IO is simple:

<!DOCTYPE html>
<html>
<head>
    <title>my app</title>
</head>
<body>
   <script src="http://cdn.socket.io/stable/socket.io.js"></script>
</body>
</html>

To set up a server, use:

var io = require('socket.io').listen(80);

io.sockets.on('connection', function (socket) {
  socket.emit('news', { hello: 'world' });
  socket.on('my other event', function (data) {
    console.log(data);
  });
});

Finally, set up your client with:

var socket = io.connect('http://localhost');
  socket.on('news', function (data) {
    console.log(data);
    socket.emit('my other event', { my: 'data' });
  });

Atmosphere

Atmosphere is the only portable WebSocket/Comet framework supporting Scala, Groovy, and Java. Atmosphere (https://github.com/Atmosphere) can run on any Java-based web server, including Tomcat, Jetty, GlassFish, WebLogic, Grizzly, JBoss, Resin, and more. The Atmosphere framework has both client (JavaScript, jQuery, GWT) and server components. You can find many examples of how to use Atmosphere in your project at https://github.com/Atmosphere/atmosphere/tree/master/samples (Figure 5-1).

Note

The main concern when using WebSockets is graceful degradation, because most mobile browsers and servers have mixed support. All the frameworks mentioned (plus many more) support some kind of fallback when WebSockets is not available within the browser. All of these fallbacks, however, share the same problem: they carry the overhead of HTTP, which doesn’t make them well suited for low-latency mobile applications. Until all mobile browsers support WebSockets, this is a problem users and developers are forced to deal with.

Figure 5-1. A few of many examples listed in Atmosphere’s GitHub repo