Every HTTP request sent from the browser includes headers, whether you want them or not, and those headers are not small. Uncompressed request and response headers can vary in size from 200 bytes to over 2K. Typical size is somewhere between 700 and 900 bytes, and those numbers will grow as user agents expand their features.
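To put that overhead in perspective, here is a rough back-of-the-envelope calculation. The client count and polling interval are made-up numbers for illustration, not figures from the text:

```javascript
// Rough, illustrative math: header overhead for HTTP polling.
// The 800 bytes/request and 1,000 clients are assumed values.
var headerBytes = 800;   // typical uncompressed header size (700-900 bytes)
var clients = 1000;      // hypothetical number of connected clients
var pollsPerSecond = 1;  // each client polls once per second

var overheadPerSecond = headerBytes * clients * pollsPerSecond;
console.log(overheadPerSecond); // 800000 bytes (~800 KB) per second, headers alone
```

That is bandwidth spent before a single byte of useful payload is delivered, which is exactly the cost WebSockets avoid.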
WebSockets give you minimal overhead and a much more efficient way of delivering data to the client and server with full duplex communication through a single socket. The WebSocket connection is made after a small HTTP handshake occurs between the client and the server, over the same underlying TCP/IP connection. This gives you an open connection between the client and the server, and both parties can start sending data at any time.
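For reference, that small HTTP handshake looks roughly like the following exchange. The path and Host are placeholder values, and the key/accept pair is the sample from the later protocol drafts:

```
GET /chat HTTP/1.1
Host: example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

After the 101 response, the TCP connection stops speaking HTTP entirely and carries WebSocket frames in both directions.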
A few of WebSockets’ many advantages are:
No HTTP headers
No lag due to keep-alive issues
Low latency, better throughput and responsiveness
Easier on mobile device batteries
To effectively develop any application with WebSockets, you must accept the idea of the “real-time Web” in which the client-side code of your web application communicates continuously with a corresponding real-time server during every user connection. To accomplish this, you can use a capable protocol such as WebSockets or SPDY to build the stack yourself. Or you can choose a service or project to manage the connections and graceful degradation for you. In this chapter, you’ll learn how to implement a raw WebSocket server and the best practices surrounding the details of setting one up.
If you opt to leave the management to someone else, you have choices. Freemium services, such as Pusher (http://pusher.com), are starting to emerge to do this, and companies like Kaazing, which offers the Kaazing Gateway, have been providing adapters for STOMP and Apache ActiveMQ for years. In addition, you can find plenty of wrapper frameworks around WebSockets to provide graceful degradation—from Socket.IO to CometD to whatever’s hot right now. Graceful degradation is the process of falling back to use older technologies, such as Flash or long polling, within the browser if the WebSocket protocol is not supported. Comet, push technology, and long-polling in web apps are slow, inefficient, inelegant and have a higher potential magnitude for unreliability. For this book, I am only covering the core WebSocket specification to avoid confusion and to keep things simple.
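The detection half of graceful degradation is straightforward. Here is a minimal sketch; the transport names returned are illustrative labels, not a real library's API:

```javascript
// Pick a transport based on what the runtime supports.
// The global object is passed in as a parameter so the check is easy
// to test; in a browser you would call pickTransport(window).
function pickTransport(global) {
  if (typeof global.WebSocket !== 'undefined') {
    return 'websocket';    // native WebSocket support
  }
  if (typeof global.EventSource !== 'undefined') {
    return 'sse';          // server-sent events as a fallback
  }
  return 'long-polling';   // last resort: plain HTTP long polling
}

console.log(pickTransport({ WebSocket: function () {} })); // "websocket"
```

The hard part, and the reason the wrapper frameworks exist, is not the detection but implementing and maintaining every fallback transport behind a single API.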
As of August 2012, the WebSocket specification was in Working Draft status. Implementers and editors were working to bring the spec into Candidate Release status. Until that status is declared, be aware that things could change in regard to the underlying protocol.
Keeping a large number of connections open at the same time requires an architecture that permits other processing to continue before the transmission has finished. Such architectures are usually designed around threading or asynchronous nonblocking IO (NIO). As for the debates between NIO and threading, some might say that NIO does not actually perform better than threading, but only allows you to write single-threaded event loops for multiple clients as with select on Unix. Others argue that choosing NIO or threading depends on your expected workloads. If you have lots of long-term idle connections, NIO wins due to not having thousands of threads “blocking on a read” operation. Again, there are many debates over whether threads are faster or easier to write than event loops (or the opposite) so it all depends on the type of use case you are trying to handle. Don’t worry, I’ll show examples of both event loops and threads in this chapter.
As mentioned earlier, WebSockets present a new development model for server- and client-side applications: the “real-time” Web. During every user connection under this model, your web application’s client side needs to communicate continuously with the corresponding real-time server. Although most server-side frameworks provide eventing mechanisms, few extend the events all the way through to the web browser to support this real-time model. As a result, you are faced with retrofitting your current solutions and architectures into this real-time model.
For example, suppose your server-side framework is capable of sending an event and you have observers of this event in your code. WebSockets gives you the ability to extend that event so that it carries all the way from the server side into the connected client’s browser. A good example would be to notify all WebSocket connections that a user has registered on your site.
The first step toward implementing such a solution is to wire up the three main listeners associated with WebSockets: onopen, onmessage, and onclose. Basically, the following events will be fired automatically when a WebSocket connection opens successfully. For example:
```javascript
objWebSocket.onopen = function(evt) {
  alert("WebSocket connection opened successfully");
};

objWebSocket.onmessage = function(evt) {
  alert("Message : " + evt.data);
};

objWebSocket.onclose = function(evt) {
  alert("WebSocket connection closed");
};
```
After the WebSocket connection opens, the onmessage event fires whenever the server sends data to the client. If the client wants to send data to the server, it can do so as follows:
```javascript
objWebSocket.send("Hello World");
```
Sending messages in the form of strings over raw WebSockets isn’t very appealing, however, when you want to develop enterprise-style web applications. Because current WebSocket implementations deal mostly with strings, you can use JSON to transfer data to and from the server.
But how do you propagate the server-side events that are fired on the server and then bubble them up on the client? One approach is to relay the events. When a specific server-side event is fired, use a listener or observer to translate the data to JSON and send it to all connected clients.
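A minimal sketch of that relay step, assuming a made-up JSON envelope (the event and data field names here are illustrative, not from the chapter's server code):

```javascript
// Translate a server-side event into a JSON string for the wire,
// and parse it back on the client. The envelope shape is hypothetical.
function encodeEvent(name, payload) {
  return JSON.stringify({ event: name, data: payload });
}

function decodeEvent(message) {
  return JSON.parse(message);
}

// On the server, relay the encoded event to every open connection.
// "connections" is any array of objects with a send(string) method,
// mirroring what WebSocket server libraries expose.
function broadcast(connections, name, payload) {
  var message = encodeEvent(name, payload);
  connections.forEach(function (conn) {
    conn.send(message);
  });
}

var received = [];
broadcast([{ send: function (m) { received.push(m); } }],
          'memberEvent', { name: 'Bob' });
console.log(decodeEvent(received[0]).data.name); // "Bob"
```

The Java examples later in this section implement exactly this pattern: an observer fires, the payload is serialized to a string, and the string is written to every connection in a shared set.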
Before you can successfully communicate with a server, you need to know what you’re talking to and how. For the chapter’s examples, I’m using the JBoss AS7 application server and embedding Jetty within the web application. The main reasoning behind this approach is to take advantage of a lightweight Java EE 6.0 [Full Profile] application server. There are a few other Java options out there, such as GlassFish or running Jetty standalone, but this solution offers contexts and dependency injection (CDI), distributed transactions, scalable JMS messaging, and data grid support out of the box. Such support is extremely valuable in cutting-edge enterprise initiatives and private cloud architectures.
Because this approach embeds one server (Jetty) with another server (JBoss), we can use it with any app server, even one that may not support WebSockets, and enable existing, older applications to take advantage of real-time connections.
The full deployable source code for this example is on the “embedded-jetty branch”. A few things are worth noting here:
Because the WebSocket server is running on a different port (8081) than the JBoss AS7 server (8080), we must account for not having authentication cookies, and so on. A reverse proxy can handle this problem, however, as you’ll see in the last section of this chapter.
As if existing proxy servers weren’t already a huge problem for running WebSockets and HTTP over the same port, in this example, we are now running them separately.
Because we’re observing and listening for CDI events, we must perform some operations on the same thread and share connections across threads.
The code below first sets up the WebSocket server using Jetty’s WebSocketHandler and embeds it inside a ServletContextListener. Although the app shares a synchronized set of WebSocket connections across threads, we ensure that only a single thread can execute a method or block at one time by using the synchronized keyword. To relay the CDI event to the browser, we must store all the WebSocket connections in a ConcurrentHashSet and write new connections to it as they come online. At any time, the ConcurrentHashSet may be read on a different thread so we know where to relay the CDI events. The ChatWebSocketHandler contains a global set of WebSocket connections and adds each new connection within the Jetty server.
```java
public class ChatWebSocketHandler extends WebSocketHandler {

    private static Set<ChatWebSocket> websockets =
            new ConcurrentHashSet<ChatWebSocket>();

    public WebSocket doWebSocketConnect(HttpServletRequest request,
            String protocol) {
        return new ChatWebSocket();
    }

    public class ChatWebSocket implements WebSocket.OnTextMessage {

        private Connection connection;

        public void onOpen(Connection connection) {
            // The client (browser) WebSocket has opened a connection.
            // 1) Store the opened connection.
            this.connection = connection;
            // 2) Add this ChatWebSocket to the global list of
            //    ChatWebSocket instances.
            getWebsockets().add(this);
        }

        public void onMessage(String data) {
            // Loop over each ChatWebSocket instance to relay the
            // message from the server to every client WebSocket.
            try {
                for (ChatWebSocket webSocket : getWebsockets()) {
                    // Send a message to the current client WebSocket.
                    webSocket.connection.sendMessage(data);
                }
            } catch (IOException x) {
                // An error was detected; close the client side of
                // this ChatWebSocket.
                this.connection.disconnect();
            }
        }

        public void onClose(int closeCode, String message) {
            // Remove this ChatWebSocket from the global list of
            // ChatWebSocket instances.
            getWebsockets().remove(this);
        }
    }

    public static synchronized Set<ChatWebSocket> getWebsockets() {
        return websockets;
    }
}
```
Next, we embed the Jetty WebSocket-capable server within the web application:
```java
private Server server = null;

/**
 * Start the embedded Jetty server when the web application is started.
 */
public void contextInitialized(ServletContextEvent event) {
    try {
        // 1) Create a Jetty server on port 8081.
        InetAddress addr = InetAddress.getLocalHost();
        this.server = new Server();
        Connector connector = new SelectChannelConnector();
        connector.setPort(8081);
        connector.setHost(addr.getHostAddress());
        server.addConnector(connector);

        // 2) Register ChatWebSocketHandler with the Jetty
        //    server instance.
        ChatWebSocketHandler chatWebSocketHandler = new ChatWebSocketHandler();
        chatWebSocketHandler.setHandler(new DefaultHandler());
        server.setHandler(chatWebSocketHandler);

        // 3) Start the Jetty server.
        server.start();
    } catch (Throwable e) {
        e.printStackTrace();
    }
}
...
}
```
Now we’ll create a method to observe CDI events and send the fired Member events to all active connections. This relays a very simple cdievent JavaScript object, which will be pushed to all connected clients and then evaluated on the browser through a JavaScript interpreter.
```java
public void observeItemEvent(@Observes Member member) {
    try {
        for (ChatWebSocket webSocket : websockets) {
            webSocket.connection.sendMessage(
                "{\"cdievent\":{\"fire\":function(){"
                + "eventObj.initEvent('memberEvent', true, true);"
                + "eventObj.name = '" + member.getName() + "';\n"
                + "document.dispatchEvent(eventObj);"
                + "}}}");
        }
    } catch (IOException x) {
        //...
    }
}
```
The above code observes the following event when a new Member is registered through the web interface. As you can see below, memberEventSrc.fire(member) is fired when a user registers through the provided RESTful URL.
```java
@POST
@Consumes(MediaType.APPLICATION_FORM_URLENCODED)
@Produces(MediaType.APPLICATION_JSON)
public Response createMember(@FormParam("name") String name,
        @FormParam("email") String email,
        @FormParam("phoneNumber") String phone) {
    ...
    // Create a new member from the form fields.
    Member member = new Member();
    member.setName(name);
    member.setEmail(email);
    member.setPhoneNumber(phone);
    try {
        // Fire the CDI event.
        memberEventSrc.fire(member);
        ...
```
Finally, we set up the WebSocket JavaScript client and safely avoid using the eval() method to execute the received JavaScript.
```javascript
...
var location = "ws://192.168.1.101:8081/";
this._ws = new WebSocket(location);
...

_onmessage : function(m) {
  if (m.data) {
    // Check to see if this message is a CDI event.
    if (m.data.indexOf('cdievent') > 0) {
      try {
        //$('log').innerHTML = m.data;
        // Avoid use of eval...
        var event = (m.data);
        event = (new Function("return " + event))();
        event.cdievent.fire();
      } catch (e) {
        alert(e);
      }
    } else {
      //... append data in the DOM
    }
  }
},
```
Here is the JavaScript code that listens for the CDI event and executes the necessary client-side code:
```javascript
window.addEventListener('memberEvent', function(e) {
  alert(e.name + ' just registered!');
}, false);
```
As you can see, this is a very prototyped approach to achieve a running WebSocket server, but it’s a step forward in adding a usable programming layer on top of the WebSocket protocol.
As of this writing, JBoss has just begun to implement WebSockets natively on JBoss AS7. The same example from above has been converted for native WebSocket support (without embedding Jetty) on JBoss AS 7.1.2 and beyond. This gives you the benefit of having both HTTP and WS traffic over the same port without needing to worry about managing data across threads. To see a chat room example that uses native WebSocket, check out https://github.com/html5e/HTML5-Mobile-WebSocket. You can find the JBoss WebSocket source at https://github.com/mikebrock/jboss-websockets.
Another cool use of WebSockets is the ability to use binary data instead of just JSON strings. For example:
```javascript
objWebSocket.onopen = function(evt) {
  var array = new Float32Array(5);
  for (var i = 0; i < array.length; ++i) {
    array[i] = i / 2;
  }
  ws.send(array, { binary: true });
};
```
Why send binary data? This allows you to stream audio to connected clients using the Web Audio API. Or you could give users the ability to collaborate with a real-time screen sharing application using canvas and avoid the need to base64-encode the images. The possibilities are limitless!
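As a sketch of what the receiving side does with binary audio frames, signed 16-bit PCM samples have to be scaled into the [-1, 1) floating-point range that the Web Audio API consumes. The helper name here is made up:

```javascript
// Convert signed 16-bit PCM samples into Float32 values in [-1, 1),
// the sample format a Web Audio AudioBuffer channel expects.
// pcm16ToFloat32 is a hypothetical helper, not a browser API.
function pcm16ToFloat32(int16Samples) {
  var out = new Float32Array(int16Samples.length);
  for (var i = 0; i < int16Samples.length; i++) {
    out[i] = int16Samples[i] / 32768; // 32768 = 2^15, the int16 magnitude
  }
  return out;
}

var floats = pcm16ToFloat32(new Int16Array([0, 16384, -32768]));
console.log(floats[1]); // 0.5
```

In a browser you would set ws.binaryType = 'arraybuffer' on the client socket and wrap the received event.data in an Int16Array before converting.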
The following code sets up a Node.js server to demo an example of sending audio over a WebSocket connection. See https://github.com/einaros/ws-audio-example for the full example.
```javascript
var express = require('express');
var WebSocketServer = require('ws').Server;
var app = express.createServer();

function getSoundBuffer(samples) {
  var header = new Buffer([
    0x52, 0x49, 0x46, 0x46, // "RIFF"
    0, 0, 0, 0,             // put total size here
    0x57, 0x41, 0x56, 0x45, // "WAVE"
    0x66, 0x6d, 0x74, 0x20, // "fmt "
    16, 0, 0, 0,            // size of the following
    1, 0,                   // PCM format
    1, 0,                   // Mono: 1 channel
    0x44, 0xAC, 0, 0,       // 44,100 samples per second
    0x88, 0x58, 0x01, 0,    // byte rate: two bytes per sample
    2, 0,                   // aligned on every two bytes
    16, 0,                  // 16 bits per sample
    0x64, 0x61, 0x74, 0x61, // "data"
    0, 0, 0, 0              // put number of samples here
  ]);
  header.writeUInt32LE(36 + samples.length, 4, true);
  header.writeUInt32LE(samples.length, 40, true);
  var data = new Buffer(header.length + samples.length);
  header.copy(data);
  samples.copy(data, header.length);
  return data;
}

function makeSamples(frequency, duration) {
  var samplespercycle = 44100 / frequency;
  var samples = new Uint16Array(44100 * duration);
  var da = 2 * Math.PI / samplespercycle;
  for (var i = 0, a = 0; i < samples.length; i++, a += da) {
    samples[i] = Math.floor(Math.sin(a / 300000) * 32768);
  }
  return getSoundBuffer(new Buffer(Array.prototype.slice.call(samples, 0)));
}

app.use(express.static(__dirname + '/public'));
app.listen(8080);

var wss = new WebSocketServer({ server: app, path: '/data' });
var samples = makeSamples(20000, 10);

wss.on('connection', function(ws) {
  ws.on('message', function(message) {
    ws.send('pong');
  });
  ws.send(samples, { binary: true });
});
```
With new technology comes a new set of problems. In the case of WebSockets, the challenges relate to compatibility with the proxy servers that mediate HTTP connections in most company networks. A firewall, proxy server, or switch is often the linchpin of an enterprise network, and these devices and servers limit the kind of traffic you’re allowed to send to and from the server.
The WebSocket protocol uses the HTTP upgrade system (which is normally used for HTTPS/SSL) to “upgrade” an HTTP connection to a WebSocket connection. Some proxy servers are not able to handle this handshake and will drop the connection. So, even if a given client uses the WebSocket protocol, it may not be possible to establish a connection.
When you use WebSocket Secure (wss://), wire traffic is encrypted and intermediate transparent proxy servers may simply allow the encrypted traffic through, so there is a much better chance that the WebSocket connection will succeed. Using encryption is not free of resource costs, but often provides the highest success rate.
Some proxy servers are harmless and work fine with WebSockets. Others will prevent WebSockets from working correctly, causing the connection to fail. In some cases, additional proxy server configuration may be required, and certain proxy servers may need to be upgraded to support WebSocket connections.
If unencrypted WebSocket traffic flows through an explicit or a transparent proxy server on its way to the WebSocket server, then, whether or not the proxy server behaves as it should, the connection is almost certainly bound to fail. Therefore, unencrypted WebSocket connections should be used only in the simplest topologies. As WebSockets become more mainstream, proxy servers will become WebSocket aware.
Encrypted WebSocket connections use Transport Layer Security (TLS), and an HTTP CONNECT command is issued when the browser is configured to use an explicit proxy server. This sets up a tunnel between the WebSocket Secure client and the WebSocket server, providing low-level, end-to-end TCP communication through the HTTP proxy. In the case of transparent proxy servers, the browser is unaware of the proxy server, so no HTTP CONNECT is sent; as noted above, because the wire traffic is encrypted, the connection usually passes through anyway.
A mid-2010 draft (version hixie-76) broke compatibility with reverse proxies and gateways by including 8 bytes of key data after the headers, but not advertising that data in a Content-Length: 8 header. This data was not forwarded by all intermediates, which could lead to protocol failure. More recent drafts (such as hybi-09) put the key data in a Sec-WebSocket-Key header, solving this problem.
Things have changed since the days of fronting our servers with Apache for tasks like static resource serving. Apache configuration changes result in killing hundreds of active connections, which in turn, kills service availability.
With today’s private cloud architectures, there is a high demand for throughput and availability. If we want services like Apache or Tomcat to come up or go down at any time, then we simply have to put something in front of those services that can route the traffic correctly, based on the cloud topology at the moment. One way to take down servers and bring up new ones without affecting service availability is to use a proxy. In most cases, HAProxy is the go-to choice for high throughput and availability.
HAProxy is a lightweight proxy server that advertises obscenely high throughput. Companies such as GitHub, Fedora, Stack Overflow, and Twitter all use HAProxy for load balancing and scaling their infrastructure. Not only can HAProxy handle HTTP traffic, but it’s also a general-purpose TCP/IP proxy. Best of all, it’s dead simple to use.
The code that follows adds HAProxy to the previous example. The result is a reverse proxy on the WebSocket port (8081), which allows all traffic (HTTP and WS) to be sent across a common port (8080, in this case). Here is a simple reverse proxy from the example WebSocket server:
```
global
    maxconn 4096 # Total Max Connections. This is dependent on ulimit
    nbproc 1

defaults
    mode http

frontend all 0.0.0.0:8080
    timeout client 86400000
    default_backend www_backend
    acl is_websocket hdr(Upgrade) -i WebSocket
    acl is_websocket hdr_beg(Host) -i ws
    use_backend socket_backend if is_websocket

backend www_backend
    balance roundrobin
    option forwardfor # This sets X-Forwarded-For
    timeout server 30000
    timeout connect 4000
    server apiserver 192.168.1.101:8080 weight 1 maxconn 4096 check

backend socket_backend
    balance roundrobin
    option forwardfor # This sets X-Forwarded-For
    timeout queue 5000
    timeout server 86400000
    timeout connect 86400000
    server apiserver 192.168.1.101:8081 weight 1 maxconn 4096 check
```
This approach is universal to any HTTP server that embeds a separate WebSocket server on a different port.
There are just about as many Comet, AJAX push-based, WebSocket frameworks and servers as there are mobile web frameworks. So sorting out which ones are built for lightweight mobile environments and which ones may be suitable only for desktop browsers is essential. Keep in mind that graceful degradation comes at a cost. If you choose a WebSocket framework that degrades in 10 different ways, you do not want your mobile clients to be penalized with a heavy framework download. To provide real-time connectivity to every browser, you need a framework that will detect the most capable transport at runtime.
You may already be familiar with projects such as Node.js, Ruby EventMachine, or Python Twisted. These projects use an event-based API to allow you to create network-aware applications in just a few lines of code. But what about enterprise-grade performance and concurrency? Take a look at how a few of your options stack up.
A fully asynchronous, general-purpose application container for JVM languages, Vert.x takes inspiration from event-driven frameworks such as Node.js, combines that with a distributed event bus, and sticks it all on the JVM. The result is a runtime with real concurrency and unrivalled performance. Vert.x exposes its API in Ruby, JavaScript, Groovy, and Java, and supports TCP, HTTP, WebSockets, and many more modules. You can think of it as Node.js for JVM languages.
Vert.x recommends SockJS to provide a WebSocket-like object on the client. Under the hood, SockJS tries to use native WebSockets first. If that fails, it can use a variety of browser-specific transport protocols and presents them through WebSocket-like abstractions. SockJS is intended to work in all modern browsers and in environments that don’t support the WebSocket protocol, such as behind restrictive corporate proxies.
Vert.x requires JDK 1.7.0. It builds on open source projects such as Netty, JRuby, Mozilla Rhino, and Hazelcast, and is licensed under the MIT and Apache 2.0 licenses.
The code for SockJS page set-up is:
```html
<!DOCTYPE html>
<html>
  <head>
    <title>my app</title>
  </head>
  <body>
    <script src="http://cdn.sockjs.org/sockjs-0.1.min.js"></script>
  </body>
</html>
```
To use SockJS:
```javascript
var sock = new SockJS('http://mydomain.com/my_prefix');

sock.onopen = function() {
  console.log('open');
};

sock.onmessage = function(e) {
  console.log('message', e.data);
};

sock.onclose = function() {
  console.log('close');
};
```
Specifically built for use with a Node.js server, Socket.IO (http://socket.io) can be used with any backend after you set fallback capabilities via Flash. Socket.IO aims to make real-time apps possible in every browser and mobile device, blurring the differences between the different transport mechanisms. Socket.IO supports iOS, Android, WebOS, and WebKit, and is released under the MIT license.
The page setup for Socket.IO is simple:
```html
<!DOCTYPE html>
<html>
  <head>
    <title>my app</title>
  </head>
  <body>
    <script src="http://cdn.socket.io/stable/socket.io.js"></script>
  </body>
</html>
```
To set up a server, use:
```javascript
var io = require('socket.io').listen(80);

io.sockets.on('connection', function(socket) {
  socket.emit('news', { hello: 'world' });
  socket.on('my other event', function(data) {
    console.log(data);
  });
});
```
Finally, set up your client with:
```javascript
var socket = io.connect('http://localhost');

socket.on('news', function(data) {
  console.log(data);
  socket.emit('my other event', { my: 'data' });
});
```
Atmosphere is the only portable WebSocket/Comet framework supporting Scala, Groovy, and Java. Atmosphere (https://github.com/Atmosphere) can run on any Java-based web server, including Tomcat, Jetty, GlassFish, Weblogic, Grizzly, JBoss, Resin, and more. The Atmosphere framework has both client (JavaScript, jQuery, GWT) and server components. You can find many examples of how to use Atmosphere in your project at https://github.com/Atmosphere/atmosphere/tree/master/samples (Figure 5-1).
The main concern when using WebSockets is graceful degradation, because most mobile browsers and servers have mixed support. All the frameworks mentioned (plus many more) support some kind of fallback when WebSockets is not available within the browser. All of these fallbacks, however, share the same problem: they carry the overhead of HTTP, which doesn’t make them well suited for low-latency mobile applications. Until all mobile browsers support WebSockets, this is a problem users and developers are forced to deal with.