Chapter 5. WebSockets

Every HTTP request sent from the browser includes headers, whether you want them or not. Nor are they small headers. Uncompressed request and response headers can vary in size from 200 bytes to over 2 KB. Although the typical size is somewhere between 700 and 900 bytes, those numbers will grow as user agents expand their features.

WebSockets give you minimal overhead and a much more efficient way of delivering data to the client and server with full duplex communication through a single socket. The WebSocket connection is made after a small HTTP handshake occurs between the client and the server, over the same underlying TCP/IP connection. This gives you an open connection between the client and the server, and both parties can start sending data at any time.
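To put the difference in overhead into numbers, the following sketch computes the size of a WebSocket frame header as laid out in RFC 6455 and prints it next to an assumed HTTP header size. The function name frameHeaderBytes and the 800-byte figure (the midpoint of the typical range quoted above) are illustrative assumptions, not part of any library.

```javascript
// Size in bytes of a WebSocket frame header (RFC 6455, section 5.2).
// payloadLength: number of payload bytes
// masked: true for client-to-server frames, which must carry a masking key
function frameHeaderBytes(payloadLength, masked) {
  var bytes = 2;                              // opcode byte + 7-bit length byte
  if (payloadLength > 65535) bytes += 8;      // 64-bit extended length field
  else if (payloadLength > 125) bytes += 2;   // 16-bit extended length field
  if (masked) bytes += 4;                     // 32-bit masking key
  return bytes;
}

var ASSUMED_HTTP_HEADER_BYTES = 800;          // assumption: midpoint of 700-900
var message = 'Hello World';                  // 11 bytes of ASCII payload

console.log('WebSocket framing, server to client:',
            frameHeaderBytes(message.length, false), 'bytes');
console.log('WebSocket framing, client to server:',
            frameHeaderBytes(message.length, true), 'bytes');
console.log('HTTP headers (assumed):', ASSUMED_HTTP_HEADER_BYTES, 'bytes');
```

For a short message, the framing is a handful of bytes in either direction, compared with hundreds of bytes of headers on every HTTP request.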

A few of WebSockets’ many advantages are:

To effectively develop any application with WebSockets, you must accept the idea of the “real-time Web” in which the client-side code of your web application communicates continuously with a corresponding real-time server during every user connection. To accomplish this, you can use a capable protocol such as WebSockets or SPDY to build the stack yourself. Or you can choose a service or project to manage the connections and graceful degradation for you. In this chapter, you’ll learn how to implement a raw WebSocket server and the best practices surrounding the details of setting one up.

If you opt to leave the management to someone else, you have choices. Freemium services, such as Pusher (http://pusher.com), are starting to emerge to do this, and companies like Kaazing, which offers the Kaazing Gateway, have been providing adapters for STOMP and Apache ActiveMQ for years. In addition, you can find plenty of wrapper frameworks around WebSockets that provide graceful degradation, from Socket.IO to CometD to whatever’s hot right now. Graceful degradation is the process of falling back to older technologies, such as Flash or long polling, within the browser if the WebSocket protocol is not supported. Comet, push technology, and long polling in web apps are slow, inefficient, and inelegant, and they are more prone to unreliability. In this book, I cover only the core WebSocket specification to avoid confusion and to keep things simple.

As mentioned earlier, WebSockets present a new development model for server- and client-side applications: the “real-time” Web. During every user connection under this model, your web application’s client side needs to communicate continuously with the corresponding real-time server. Although most server-side frameworks provide eventing mechanisms, few extend the events all the way through to the web browser to support this real-time model. As a result, you are faced with retrofitting your current solutions and architectures into this real-time model.

For example, suppose your server-side framework is capable of sending an event and you have observers of this event in your code. WebSockets give you the ability to extend that event so that it carries all the way from the server side into the connected client’s browser. A good example would be to notify all WebSocket connections that a user has registered on your site.

The first step toward implementing such a solution is to wire up the three main listeners associated with WebSockets: onopen, onmessage, and onclose. The browser fires the corresponding events automatically when the connection opens successfully, when a message arrives from the server, and when the connection closes. For example:

objWebSocket.onopen = function(evt)
{
   alert("WebSocket connection opened successfully");
};
objWebSocket.onmessage = function(evt)
{
   alert("Message : " + evt.data);
};
objWebSocket.onclose = function(evt)
{
   alert("WebSocket connection closed");
};

After the WebSocket connection opens, the onmessage event fires whenever the server sends data to the client. If the client wants to send data to the server, it can do so as follows:

objWebSocket.send("Hello World");

Sending messages in the form of strings over raw WebSockets isn’t very appealing, however, when you want to develop enterprise-style web applications. Because current WebSocket implementations deal mostly with strings, you can use JSON to transfer data to and from the server.
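As a minimal sketch of that idea, the helpers below wrap each payload in a small JSON envelope so the receiver can tell message kinds apart. The envelope shape ({type, data}) and the function names are assumptions made for illustration; the WebSocket specification does not prescribe any particular format.

```javascript
// Wrap a payload in an envelope so the receiver can dispatch on "type".
// The {type, data} shape is an assumption made for this sketch.
function encodeMessage(type, data) {
  return JSON.stringify({ type: type, data: data });
}

// Parse an incoming frame; return null instead of throwing on bad input.
function decodeMessage(raw) {
  try {
    var msg = JSON.parse(raw);
    if (msg && typeof msg.type === 'string') return msg;
    return null;
  } catch (e) {
    return null;
  }
}

// Browser usage with the objWebSocket instance from the earlier listing:
// objWebSocket.send(encodeMessage('chat', { text: 'Hello World' }));
// objWebSocket.onmessage = function (evt) {
//   var msg = decodeMessage(evt.data);
//   if (msg) { console.log(msg.type, msg.data); }
// };
```

Returning null for anything that is not a well-formed envelope keeps a single malformed frame from breaking the message handler.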

But how do you propagate the server-side events that are fired on the server and then bubble them up on the client? One approach is to relay the events. When a specific server-side event is fired, use a listener or observer to translate the data to JSON and send it to all connected clients.
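The relay idea can be sketched in a few lines of JavaScript, independent of any server framework. Here, relayEvent is a hypothetical helper, and the clients are assumed to be objects that expose a send(string) method, as WebSocket connection objects typically do.

```javascript
// Relay one server-side event to every connected client as a JSON string.
// clients: an Array or Set of objects exposing send(string).
// Returns the number of clients that accepted the message.
function relayEvent(clients, eventName, payload) {
  var frame = JSON.stringify({ event: eventName, payload: payload });
  var delivered = 0;
  clients.forEach(function (client) {
    try {
      client.send(frame);
      delivered++;
    } catch (e) {
      // A failed send should not stop delivery to the remaining clients.
    }
  });
  return delivered;
}

// Demo with stand-in clients: one records what it is sent, one always fails.
var received = [];
var fakeClients = [
  { send: function (s) { received.push(s); } },
  { send: function () { throw new Error('socket closed'); } }
];
console.log(relayEvent(fakeClients, 'memberRegistered', { name: 'Ada' }));
// prints 1
```

Catching the error per client is a design choice: one broken connection should not prevent the others from receiving the event.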

Before you can successfully communicate with a server, you need to know what you’re talking to and how. For the chapter’s examples, I’m using the JBoss AS7 application server and embedding Jetty within the web application. The main reasoning behind this approach is to take advantage of a lightweight Java EE 6 Full Profile application server. There are a few other Java options out there, such as GlassFish or running Jetty standalone, but this solution offers Contexts and Dependency Injection (CDI), distributed transactions, scalable JMS messaging, and data grid support out of the box. Such support is extremely valuable in cutting-edge enterprise initiatives and private cloud architectures.

Because this approach embeds one server (Jetty) with another server (JBoss), we can use it with any app server, even one that may not support WebSockets, and enable existing, older applications to take advantage of real-time connections.

The full deployable source code for this example is on the “embedded-jetty branch”. A few things are worth noting here:

The code below first sets up the WebSocket server using Jetty’s WebSocketHandler and embeds it inside a ServletContextListener. Because the app shares one set of WebSocket connections across threads, that set must be safe for concurrent use: the connections are stored in a ConcurrentHashSet, and the accessor method is marked with the synchronized keyword so that only a single thread can execute it at one time. To relay the CDI event to the browser, we write new connections to the set as they come online. At any time, the set may be read on a different thread so we know where to relay the CDI events. The ChatWebSocketHandler contains this global set of WebSocket connections and adds each new connection within the Jetty server.

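import java.io.IOException;
import java.util.Set;

import javax.servlet.http.HttpServletRequest;

import org.eclipse.jetty.util.ConcurrentHashSet;
import org.eclipse.jetty.websocket.WebSocket;
import org.eclipse.jetty.websocket.WebSocketHandler;

public class ChatWebSocketHandler extends WebSocketHandler {

    // Shared by all connections; the set itself is safe for concurrent access.
    private static Set<ChatWebSocket> websockets =
        new ConcurrentHashSet<ChatWebSocket>();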

    public WebSocket doWebSocketConnect(HttpServletRequest request,
            String protocol) {
        return new ChatWebSocket();
    }

    public class ChatWebSocket implements WebSocket.OnTextMessage {

        private Connection connection;

        public void onOpen(Connection connection) {
            // The client (browser) WebSocket has opened a connection.
            // 1) Store the opened connection.
            this.connection = connection;
            // 2) Add this ChatWebSocket to the global set of ChatWebSocket
            // instances.
            getWebsockets().add(this);
        }

        public void onMessage(String data) {
            // Loop over every ChatWebSocket instance and send the message
            // from the server to each connected client.
            try {
                for (ChatWebSocket webSocket : getWebsockets()) {
                    // send a message to the current client WebSocket.
                    webSocket.connection.sendMessage(data);
                }
            } catch (IOException x) {
                // An error was detected; close this client's connection.
                this.connection.disconnect();
            }

        }

        public void onClose(int closeCode, String message) {
            // Remove this ChatWebSocket from the global set of ChatWebSocket
            // instances.
            getWebsockets().remove(this);
        }
    }

    public static synchronized Set<ChatWebSocket> getWebsockets() {
        return websockets;
    }

}

Next, we embed the Jetty WebSocket-capable server within the web application:

    private Server server = null;

    /**
     * Start the embedded Jetty server when the web application starts.
     */
    public void contextInitialized(ServletContextEvent event) {
        try {
            // 1) Create a Jetty server listening on port 8081.
            InetAddress addr = InetAddress.getLocalHost();
            this.server = new Server();
            Connector connector = new SelectChannelConnector();
            connector.setPort(8081);
            connector.setHost(addr.getHostAddress());

            server.addConnector(connector);

            // 2) Register ChatWebSocketHandler in the
            // Jetty server instance.
            ChatWebSocketHandler chatWebSocketHandler =
                                       new ChatWebSocketHandler();
            chatWebSocketHandler.setHandler(new DefaultHandler());

            server.setHandler(chatWebSocketHandler);

            // 3) Start the Jetty server.
            server.start();
        } catch (Throwable e) {
            e.printStackTrace();
        }
    }

....
}

Now we’ll create a method to observe CDI events and send the fired Member events to all active connections. This relays a very simple cdievent JavaScript object, which is pushed to all connected clients and then evaluated in the browser by the JavaScript interpreter. Because the object contains a function, it is a JavaScript object literal rather than strict JSON.

public void observeItemEvent(@Observes Member member) {
        try {
            for (ChatWebSocket webSocket : websockets) {
                // Send a JavaScript object literal whose fire() function
                // dispatches a memberEvent DOM event in the browser.
                webSocket.connection.sendMessage(
                    "{\"cdievent\":{\"fire\":function(){" +
                    "eventObj.initEvent(\'memberEvent\', true, true);" +
                    "eventObj.name = '" +  member.getName() + "';\n" +
                    "document.dispatchEvent(eventObj);" +
                    "}}}");
            }
        } catch (IOException x) {
            //...
        }
    }

The above code observes the following event, which is fired when a new Member is registered through the web interface. As you can see below, memberEventSrc.fire(member) is called when a user registers through the provided RESTful URL.

@POST
@Consumes(MediaType.APPLICATION_FORM_URLENCODED)
@Produces(MediaType.APPLICATION_JSON)
public Response createMember(@FormParam("name") String name,
                            @FormParam("email") String email,
                            @FormParam("phoneNumber") String phone) {
     ...

      //Create a new Member instance from the form fields
      Member member = new Member();
      member.setName(name);
      member.setEmail(email);
      member.setPhoneNumber(phone);

      try {

         //Fire the CDI event
         memberEventSrc.fire(member);

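Finally, we set up the WebSocket JavaScript client. Instead of calling eval() directly, it uses the Function constructor to execute the received JavaScript. Keep in mind that the Function constructor still runs whatever code the server sends, so this approach is appropriate only when you trust the server and the connection.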

        ...
        var location = "ws://192.168.1.101:8081/";
        this._ws = new WebSocket(location);
        ....
        _onmessage : function(m) {
            if (m.data) {
                //check to see if this message is a CDI event
                if(m.data.indexOf('cdievent') !== -1){
                    try{
                        //$('log').innerHTML = m.data;
                        //avoid calling eval() directly; note that the
                        //Function constructor still executes the payload
                        var event = (m.data);
                        event = (new Function("return " + event))();
                        event.cdievent.fire();
                    }catch(e){
                        alert(e);
                    }
                }else{
                    //... append data in the DOM
                }
            }
        },

Here is the JavaScript code that listens for the CDI event and executes the necessary client-side code:

window.addEventListener('memberEvent', function(e) {
    alert(e.name + ' just registered!');
}, false);
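If you would rather not execute code received over the wire at all, one alternative is to send the event as plain data and build the DOM event on the client. The sketch below assumes a wire format of {"cdievent": {"type": "memberEvent", "name": "..."}}, which differs from the object literal sent by the Java listing above; the function names parseCdiEvent and dispatchCdiEvent are hypothetical.

```javascript
// Parse a data-only CDI event message. Returns {type, name} or null.
// Assumed wire format: {"cdievent": {"type": "memberEvent", "name": "..."}}
function parseCdiEvent(raw) {
  var msg;
  try {
    msg = JSON.parse(raw);
  } catch (e) {
    return null;                     // not JSON: treat as an ordinary message
  }
  var evt = msg && msg.cdievent;
  if (!evt || typeof evt.type !== 'string') return null;
  return { type: evt.type, name: evt.name == null ? '' : String(evt.name) };
}

// Browser only: turn the parsed data into a DOM event for existing listeners.
function dispatchCdiEvent(parsed) {
  var eventObj = document.createEvent('Event');
  eventObj.initEvent(parsed.type, true, true);   // bubbles, cancelable
  eventObj.name = parsed.name;
  document.dispatchEvent(eventObj);
}

// Inside an onmessage handler:
// var parsed = parseCdiEvent(m.data);
// if (parsed) { dispatchCdiEvent(parsed); }
```

Because JSON.parse only produces data, a malformed or malicious payload can at worst yield an unexpected event name, never running code in the page.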

As you can see, this is very much a prototype of a running WebSocket server, but it’s a step forward in adding a usable programming layer on top of the WebSocket protocol.

Another cool use of WebSockets is the ability to send binary data instead of just JSON strings. For example:

objWebSocket.onopen = function(evt)
{
   var array = new Float32Array(5);
   for (var i = 0; i < array.length; ++i) array[i] = i / 2;
   // The browser's send() takes a single argument; pass the underlying buffer.
   objWebSocket.send(array.buffer);
};

Why send binary data? This allows you to stream audio to connected clients using the Web Audio API. Or you could give users the ability to collaborate with a real-time screen sharing application using canvas and avoid the need to base64-encode the images. The possibilities are limitless!
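As a small illustration of working with binary payloads, the helpers below pack an array of numbers into an ArrayBuffer and unpack it again on the receiving side. The names encodeSamples and decodeSamples are made up for this sketch; the typed-array calls are standard JavaScript.

```javascript
// Pack numeric samples into an ArrayBuffer suitable for WebSocket.send().
function encodeSamples(values) {
  var array = new Float32Array(values.length);
  for (var i = 0; i < values.length; ++i) array[i] = values[i];
  return array.buffer;
}

// Unpack a received ArrayBuffer back into a plain array of numbers.
function decodeSamples(buffer) {
  return Array.prototype.slice.call(new Float32Array(buffer));
}

// Browser usage (objWebSocket as in the earlier listings):
// objWebSocket.binaryType = 'arraybuffer';  // receive ArrayBuffer, not Blob
// objWebSocket.send(encodeSamples([0, 0.5, 1, 1.5, 2]));
// objWebSocket.onmessage = function (evt) {
//   if (evt.data instanceof ArrayBuffer) console.log(decodeSamples(evt.data));
// };
```

Five 32-bit floats travel as 20 bytes, with no text encoding step on either end.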

The following code sets up a Node.js server that demonstrates sending audio over a WebSocket connection. See https://github.com/einaros/ws-audio-example for the full example.

var express = require('express');
var WebSocketServer = require('ws').Server;
var app = express.createServer();

function getSoundBuffer(samples) {
  var header = new Buffer([
      0x52,0x49,0x46,0x46, // "RIFF"
      0, 0, 0, 0,          // put total size here
      0x57,0x41,0x56,0x45, // "WAVE"
      0x66,0x6d,0x74,0x20, // "fmt "
      16,0,0,0,            // size of the following
      1, 0,                // PCM format
      1, 0,                // Mono: 1 channel
      0x44,0xAC,0,0,       // 44,100 samples per second
      0x88,0x58,0x01,0,    // byte rate: two bytes per sample
      2, 0,                // aligned on every two bytes
      16, 0,               // 16 bits per sample
      0x64,0x61,0x74,0x61, // "data"
      0, 0, 0, 0           // put number of samples here
  ]);
  header.writeUInt32LE(36 + samples.length, 4, true);
  header.writeUInt32LE(samples.length, 40, true);
  var data = new Buffer(header.length + samples.length);
  header.copy(data);
  samples.copy(data, header.length);
  return data;
}

function makeSamples(frequency, duration) {
  var samplespercycle = 44100 / frequency;
  var samples = new Uint16Array(44100 * duration);
  var da = 2 * Math.PI / samplespercycle;
  for (var i = 0, a = 0; i < samples.length; i++, a += da) {
    samples[i] = Math.floor(Math.sin(a / 300000) * 32768);
  }
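  // Keep the expression on the same line as "return"; otherwise automatic
  // semicolon insertion makes the function return undefined.
  return getSoundBuffer(new Buffer(Array.prototype.slice.call(samples, 0)));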
}

app.use(express.static(__dirname + '/public'));
app.listen(8080);
var wss = new WebSocketServer({server: app, path: '/data'});

var samples = makeSamples(20000, 10);

wss.on('connection', function(ws) {
  ws.on('message', function(message) {
    ws.send('pong');
  });
  ws.send(samples, {binary: true});
}); 
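On the receiving side, it can be useful to sanity-check the header before handing the bytes to an audio API. The following sketch reads the main fields of a 44-byte PCM WAV header laid out like the one built in getSoundBuffer above. The function name parseWavHeader is hypothetical, and the demo bytes simply repeat the header values from the listing with the size fields left at zero.

```javascript
// Read the key fields of a 44-byte PCM WAV header from an ArrayBuffer.
// All multi-byte fields are little-endian.
function parseWavHeader(arrayBuffer) {
  var view = new DataView(arrayBuffer);
  function tag(offset) {             // read a 4-character ASCII tag
    return String.fromCharCode(
      view.getUint8(offset), view.getUint8(offset + 1),
      view.getUint8(offset + 2), view.getUint8(offset + 3));
  }
  if (tag(0) !== 'RIFF' || tag(8) !== 'WAVE') {
    throw new Error('not a WAV file');
  }
  return {
    channels: view.getUint16(22, true),
    sampleRate: view.getUint32(24, true),
    byteRate: view.getUint32(28, true),
    bitsPerSample: view.getUint16(34, true),
    dataBytes: view.getUint32(40, true)
  };
}

// Demo: the same header bytes as in the server listing.
var headerBytes = new Uint8Array([
  0x52, 0x49, 0x46, 0x46, 0, 0, 0, 0, 0x57, 0x41, 0x56, 0x45,
  0x66, 0x6d, 0x74, 0x20, 16, 0, 0, 0, 1, 0, 1, 0,
  0x44, 0xAC, 0, 0, 0x88, 0x58, 0x01, 0, 2, 0, 16, 0,
  0x64, 0x61, 0x74, 0x61, 0, 0, 0, 0
]);
var wavInfo = parseWavHeader(headerBytes.buffer);
console.log(wavInfo.sampleRate, wavInfo.channels, wavInfo.bitsPerSample);
```

In a browser, the ArrayBuffer would come from evt.data in an onmessage handler once binaryType is set to 'arraybuffer'.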

With new technology comes a new set of problems. In the case of WebSockets, the challenges relate to compatibility with the proxy servers that mediate HTTP connections in most company networks. A firewall, proxy server, or switch is always the linchpin of an enterprise, and these devices and servers limit the kind of traffic you’re allowed to send to and from the server.

The WebSocket protocol uses the HTTP Upgrade mechanism (originally defined so that a connection could switch protocols, for example to TLS) to “upgrade” an HTTP connection to a WebSocket connection. Some proxy servers are not able to handle this handshake and will drop the connection. So, even if a given client supports the WebSocket protocol, it may not be possible to establish a connection.
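One concrete piece of that handshake is the Sec-WebSocket-Accept response header, which the server derives from the Sec-WebSocket-Key header sent by the client. The sketch below shows the calculation from RFC 6455 using the Node.js crypto module. The function name computeAcceptKey is an assumption for illustration; the GUID and the sample key come from the RFC itself.

```javascript
var nodeCrypto = require('crypto');  // Node.js built-in module

// GUID fixed by RFC 6455; every conforming server uses this exact value.
var WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Compute the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key:
// base64(SHA-1(key + GUID)).
function computeAcceptKey(secWebSocketKey) {
  return nodeCrypto.createHash('sha1')
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest('base64');
}

// Sample key taken from RFC 6455, section 1.3.
console.log(computeAcceptKey('dGhlIHNhbXBsZSBub25jZQ=='));
// prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

A proxy that strips or rewrites the Upgrade, Connection, or Sec-WebSocket-* headers breaks this exchange, which is why the connection is dropped.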

Some proxy servers are harmless and work fine with WebSockets. Others will prevent WebSockets from working correctly, causing the connection to fail. In some cases, additional proxy server configuration may be required, and certain proxy servers may need to be upgraded to support WebSocket connections.

If unencrypted WebSocket traffic flows through an explicit or a transparent proxy server on its way to the WebSocket server, then, whether or not the proxy server behaves as it should, the connection is almost certainly bound to fail. Therefore, unencrypted WebSocket connections should be used only in the simplest topologies. As WebSockets become more mainstream, proxy servers will become WebSocket aware.

If you use an encrypted WebSocket connection (WebSocket Secure, with the wss:// scheme), the use of Transport Layer Security (TLS) ensures that an HTTP CONNECT command is issued when the browser is configured to use an explicit proxy server. This sets up a tunnel, which provides low-level end-to-end TCP communication through the HTTP proxy, between the WebSocket Secure client and the WebSocket server. In the case of transparent proxy servers, the browser is unaware of the proxy server, so no HTTP CONNECT is sent. Because the wire traffic is encrypted, however, intermediate transparent proxy servers may simply allow the encrypted traffic through, so there is a much better chance that the WebSocket connection will succeed if you use WebSocket Secure. Encryption is not free of resource cost, but it often provides the highest success rate.
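On the client, opting into WebSocket Secure comes down to the URL scheme. The helper below is a minimal sketch that defaults to wss:// unless the caller explicitly asks for an unencrypted connection. The function name webSocketUrl and the host example.com are placeholders, and the server must of course be configured with a TLS certificate for the secure URL to work.

```javascript
// Build a WebSocket URL, preferring the encrypted wss:// scheme.
// Pass preferSecure === false only for simple topologies without proxies.
function webSocketUrl(host, path, preferSecure) {
  var scheme = preferSecure === false ? 'ws://' : 'wss://';
  return scheme + host + path;
}

console.log(webSocketUrl('example.com', '/data'));
// prints "wss://example.com/data"

// Browser usage:
// var socket = new WebSocket(webSocketUrl('example.com', '/data'));
```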

Things have changed since the days of fronting our servers with Apache for tasks like static resource serving. Apache configuration changes can kill hundreds of active connections, which in turn hurts service availability.

With today’s private cloud architectures, there is a high demand for throughput and availability. If we want our services like Apache or Tomcat to come up or go down at any time, then we simply have to put something in front of those services that can handle routing the traffic correctly, based on the cloud topology at the moment. One way to take down servers and bring up new ones without affecting service availability is to use a proxy. In most cases, HAProxy is the go-to choice for high throughput and availability.

HAProxy is a lightweight proxy server that advertises obscenely high throughput. Companies such as GitHub, Fedora, Stack Overflow, and Twitter all use HAProxy for load balancing and scaling their infrastructure. Not only can HAProxy handle HTTP traffic, but it’s also a general-purpose TCP/IP proxy. Best of all, it’s dead simple to use.

The code that follows adds HAProxy to the previous example. The result is a reverse proxy in front of the WebSocket port (8081), which allows all traffic (HTTP and WS) to be sent across a common port (8080, in this case). Here is a simple reverse proxy configuration for the example WebSocket server:

global
    maxconn     4096 # Total Max Connections. This is dependent on ulimit
    nbproc      1

defaults
    mode        http

frontend all 0.0.0.0:8080
    timeout client 86400000
    default_backend www_backend
    acl is_websocket hdr(Upgrade) -i WebSocket
    acl is_websocket hdr_beg(Host) -i ws

    use_backend socket_backend if is_websocket

backend www_backend
    balance roundrobin
    option forwardfor # This sets X-Forwarded-For
    timeout server 30000
    timeout connect 4000
    server apiserver 192.168.1.101:8080 weight 1 maxconn 4096 check

backend socket_backend
    balance roundrobin
    option forwardfor # This sets X-Forwarded-For
    timeout queue 5000
    timeout server 86400000
    timeout connect 86400000
    server apiserver 192.168.1.101:8081 weight 1 maxconn 4096 check

This approach is universal to any HTTP server that embeds a separate WebSocket server on a different port.
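To make the routing rule in that configuration easier to follow, here is the same decision expressed as a small JavaScript function. It is only a model of the two is_websocket ACL lines (HAProxy combines ACL lines that share a name with a logical OR), not a replacement for the proxy. The function name chooseBackend is made up for this sketch.

```javascript
// Model of the is_websocket ACL from the HAProxy configuration:
// match if the Upgrade header equals "WebSocket" (case-insensitive)
// OR the Host header begins with "ws" (case-insensitive).
// headers: object keyed by lower-cased header names.
function chooseBackend(headers) {
  var upgrade = (headers['upgrade'] || '').toLowerCase();
  var host = (headers['host'] || '').toLowerCase();
  var isWebSocket = upgrade === 'websocket' || host.indexOf('ws') === 0;
  return isWebSocket ? 'socket_backend' : 'www_backend';
}

console.log(chooseBackend({ host: 'example.com', upgrade: 'WebSocket' }));
// prints "socket_backend"
console.log(chooseBackend({ host: 'www.example.com' }));
// prints "www_backend"
```

Ordinary page and API requests carry no Upgrade header, so they fall through to the default www_backend on port 8080.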

There are just about as many Comet, AJAX push, and WebSocket frameworks and servers as there are mobile web frameworks. So sorting out which ones are built for lightweight mobile environments and which ones may be suitable only for desktop browsers is essential. Keep in mind that graceful degradation comes at a cost. If you choose a WebSocket framework that degrades in 10 different ways, you do not want your mobile clients to be penalized with a heavy framework download. To provide real-time connectivity to every browser, you need a framework that will detect the most capable transport at runtime.
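Runtime detection of this kind can be very small. The sketch below checks the environment for the most capable transport and falls back step by step. The three tiers and their names are assumptions chosen for illustration (the middle tier, server-sent events, is not covered in this chapter), and real frameworks typically probe more options and also verify that a connection actually succeeds.

```javascript
// Pick the most capable transport the environment offers.
// env is the global object to inspect (window in a browser); passing it in
// keeps the function testable outside a browser.
function pickTransport(env) {
  if (env.WebSocket) return 'websocket';
  if (env.EventSource) return 'server-sent-events';
  return 'long-polling';             // lowest common denominator
}

// Browser usage:
// var transport = pickTransport(window);
```

Keeping the check this small means mobile clients that do support WebSockets pay almost nothing for the fallback logic.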

You may already be familiar with projects such as Node.js, Ruby EventMachine, or Python Twisted. These projects use an event-based API to allow you to create network-aware applications in just a few lines of code. But what about enterprise-grade performance and concurrency? Take a look at how a few of your options stack up.

Atmosphere is the only portable WebSocket/Comet framework supporting Scala, Groovy, and Java. Atmosphere (https://github.com/Atmosphere) can run on any Java-based web server, including Tomcat, Jetty, GlassFish, WebLogic, Grizzly, JBoss, Resin, and more. The Atmosphere framework has both client (JavaScript, jQuery, GWT) and server components. You can find many examples of how to use Atmosphere in your project at https://github.com/Atmosphere/atmosphere/tree/master/samples (Figure 5-1).

Note

The main concern when using WebSockets is graceful degradation, because most mobile browsers and servers have mixed support. All the frameworks mentioned (plus many more) support some kind of fallback when WebSockets are not available within the browser. All of these fallbacks, however, share the same problem: they carry the overhead of HTTP, which makes them poorly suited to low-latency mobile applications. Until all mobile browsers support WebSockets, this is a problem users and developers are forced to deal with.