Chapter 9. Web Sockets

HTTP is a request and response protocol. It was designed to request files and still operates around the idea of file requests. For the type of application that has to load data and then save it later, this works pretty well.

However, for an application that needs real-time data from the server, this works quite poorly. Many classes of applications require real-time or semi-real-time access to the server. Applications such as chat, or those that share data in real time like many of Google’s office applications, need a way for the server to push data to the browser when things happen on the server. There are a few ways to do this with HTTP, but none of them works well.

Some applications, such as Gmail, simply make a large sequence of HTTP requests, more than one per second, as shown in Figure 9-1. This is not a particularly efficient way to poll the server. It can create a huge amount of server load, as each request involves a setup and teardown that may need to happen on the server. Plus, there is the network overhead of HTTP headers, as well as user authentication. The HTTP headers can add a few hundred bytes to each request. On a busy server this can add a significant amount of load to both the servers and the network.

A second method is to open up an HTTP request to the server and let it hang open. When the server needs to send some data, it sends it to the client and then closes the HTTP request. At this point the browser will open up a new connection and repeat. Depending on the specific server technology employed, this can still cause a significant load on the server, as a large pool of threads and connections is kept running, even if in a waiting state. This would be less of an issue with a nonblocking server such as Node.js. A further complication is that the browser may allow only a limited number of Ajax requests to a given server at a time, so holding a request or two open may cause other things to block, making this a less-than-optimal way to do things.
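The hanging-request approach can be sketched as a loop that reissues a request as soon as the previous one completes. This is only an illustration: `request` is a hypothetical caller-supplied function that starts one HTTP request and returns a Promise resolving with the server's data, and `shouldContinue` is an assumed hook for stopping the loop.

```javascript
// Long-polling sketch. `request`, `onData`, and `shouldContinue` are all
// supplied by the caller; none of them is part of any browser API.
function longPoll(request, onData, shouldContinue) {
  function next() {
    if (!shouldContinue()) {
      return Promise.resolve();
    }
    // The request hangs until the server has something to say.
    return request().then(function (data) {
      onData(data);   // handle the pushed data
      return next();  // immediately open a replacement request
    });
  }
  return next();
}
```

Every message still costs a full HTTP request and response, which is the overhead that web sockets avoid.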

Figure 9-1. Firebug in Gmail

HTML5 introduces the idea of web sockets, which work a lot like classic TCP/IP sockets. A socket is opened by the browser to a server, typically the one from which the page was loaded, and can be kept open until it is no longer needed, whereupon it is explicitly closed. A socket is a bidirectional real-time data channel, whereas with HTTP the server can only answer requests, so the browser has to poll. If you were to send each keystroke to the server over HTTP with Ajax, you would incur an overhead of 300–400 bytes at a minimum, maybe as much as a kilobyte or two with cookies, for each keystroke. Once a socket is open, messages carry no HTTP headers, so the overhead is reduced to just a few bytes per message.
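To put rough numbers on that difference, the following sketch totals the bytes on the wire for a run of keystrokes. The 350-byte figure is the middle of the range quoted above, and the 2-byte socket overhead is an assumed round number standing in for "a few bytes".

```javascript
// Per-message overhead figures. Both constants are illustrative assumptions.
var HTTP_OVERHEAD_BYTES = 350;   // middle of the 300-400 byte range
var SOCKET_OVERHEAD_BYTES = 2;   // assumption: "just a few bytes" of framing

function totalBytes(messageCount, payloadBytes, overheadBytes) {
  return messageCount * (payloadBytes + overheadBytes);
}

// Sending 1,000 single-byte keystrokes:
var overHttp = totalBytes(1000, 1, HTTP_OVERHEAD_BYTES);     // 351000
var overSocket = totalBytes(1000, 1, SOCKET_OVERHEAD_BYTES); // 3000
```

Under these assumptions the HTTP version moves more than a hundred times as many bytes to deliver the same keystrokes.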

As of this writing (August 2011), web sockets are supported by Chrome version 8 and later and Safari version 5. As of Firefox version 6, web sockets are available, but the constructor is named MozWebSocket. Opera has implemented the web sockets spec but leaves it turned off by default, pending work on security issues. For browsers that do not support web sockets, fallbacks using classic HTTP can work, and browsers that support Flash can emulate web sockets through it. Libraries such as socket.io provide a consistent interface over web sockets and fall back to older-style HTTP communications where web sockets are unavailable.
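Because the constructor name varies between browsers, a page can look it up before trying to connect. The helper below is a minimal sketch; the function name is made up for illustration, and the global object is passed in as `root` (it would be `window` in a browser) so the function is easy to exercise outside one.

```javascript
// Returns the web socket constructor exposed by the environment, or null
// if there is none. findWebSocketConstructor is a hypothetical helper name.
function findWebSocketConstructor(root) {
  // Firefox 6 exposes the constructor under the prefixed name MozWebSocket.
  return root.WebSocket || root.MozWebSocket || null;
}

// In a browser:
//   var Socket = findWebSocketConstructor(window);
//   if (Socket === null) { /* fall back to HTTP polling or Flash */ }
```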

The Web Sockets specification document also appears to be a work in progress. While web sockets have been deployed in several browsers, there is still very little documentation on how to implement them. There have also been several earlier versions of the web sockets standard that are not always compatible.

The Web Sockets Interface

To use a web socket, start by creating a WebSocket object. As a parameter, pass a web socket URL. Unlike an HTTP URL, a web socket URL starts with ws:// or wss://. The latter is a secure web socket that uses SSL, just as HTTPS does for HTTP:

var socket = new WebSocket("ws://example.com/socket");

Once a socket connection is opened, the socket’s onopen handler will be called to let the program know that everything is ready. When the socket closes, the onclose handler will be called. To close the socket from the browser side, call socket.close().

To send data over the socket, use the socket.send("data") method. As of this writing, data is limited to text strings, so more complex data must be encoded as JSON, XML, or some other data interchange format, and binary data must be encoded into text through some method such as Base64.
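Put together, the handlers and the send call might be wired up as in the sketch below. The function name, the `log` callback, and the shape of the greeting message are all assumptions made for illustration; `socket` can be any object with the interface just described.

```javascript
// Attaches open/close handlers to a socket and sends a greeting once the
// connection is ready. startSession and log are hypothetical names.
function startSession(socket, log) {
  socket.onopen = function () {
    // Structured data is encoded as a JSON string before it is sent.
    socket.send(JSON.stringify({ type: "hello" }));
    log("open");
  };
  socket.onclose = function () {
    log("closed");
  };
}

// In a browser:
//   startSession(new WebSocket("ws://example.com/socket"), console.log);
```

Sending from inside onopen guarantees that the connection is ready before the first message goes out.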

Setting Up a Web Socket

A web socket connection starts out much like an HTTP connection. It opens a connection on port 80 (ws) or 443 (wss) to the server. In addition to the standard HTTP headers, it also includes some new headers that tell the server that this is a web socket connection and not an ordinary HTTP request. It also includes some handshaking bytes to provide some security. Since the WebSocket protocol uses ports 80 and 443, most proxies and firewalls should deal with it correctly. A web socket URL can also specify a different port in the same way that an HTTP URL can. Unlike an Ajax call, a web socket is not bound by the browser’s same-origin policy: the browser sends an Origin header with the handshake, and it is up to the server to decide whether to accept the connection.
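The port rules can be made concrete with a small helper that reports which port a given web socket URL will use. This is a sketch only; `webSocketPort` is a made-up name, and it relies on the standard `URL` class available in current browsers and in Node.js.

```javascript
// Default ports from the text: 80 for ws, 443 for wss.
var DEFAULT_PORTS = { "ws:": 80, "wss:": 443 };

// webSocketPort is a hypothetical helper, not part of the WebSocket API.
function webSocketPort(urlString) {
  var url = new URL(urlString);
  if (!Object.prototype.hasOwnProperty.call(DEFAULT_PORTS, url.protocol)) {
    throw new Error("Not a web socket URL: " + urlString);
  }
  // url.port is an empty string when the URL uses the scheme's default port.
  return url.port === "" ? DEFAULT_PORTS[url.protocol] : Number(url.port);
}
```

For instance, `webSocketPort("ws://example.com/socket")` gives 80, while a URL with an explicit `:8080` gives 8080.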

Once a connection is set up, both ends can send data over it. Any valid UTF-8 string can be sent. It is up to the server and the client to agree on a data format. Normally, data will probably be JSON or XML, but there is no reason that some other format could not be used if desired.

Web Socket Example

To illustrate web sockets, consider Example 9-1. Here a very simple JavaScript function opens up a socket to a server that serves up stock prices. The JavaScript sends a stock ticker symbol that it is interested in (IBM). The server will find a price for that stock and send it back to the client as JSON. The server could be set up to poll every five seconds for a new price and send it to the client when it changes. The client will just refresh the element every time the price changes.

Example 9-1. Socket client example

$(function ()
{
  // Port 8080 matches the port the server in Example 9-2 listens on.
  var socket = new WebSocket('ws://localhost:8080/stockprice');
 
  // wait for socket to open 
  socket.onopen = function ()
  {
    socket.send(JSON.stringify(
    {
      ticker: "ibm"
    }));
  };

  socket.onmessage = function (msg)
  {
    var prices = $.parseJSON(msg.data);
    var html = "IBM: " + prices.ibm;
    $('div.prices').html(html);
  };
});

The browser code for working with web sockets should look pretty familiar to any programmer who has worked with Ajax. A web socket object is created with the appropriate URL. Once the socket is opened (be sure to wait for it to open), data can be sent to the server with the socket.send method. When the server sends data back to the browser, the socket.onmessage handler is called with the string in the data field of the event object. In this case, the data is JSON, so it can be parsed (here with jQuery’s $.parseJSON) and then displayed in the browser.

The web socket client does not make much sense without a server to go with it. In general, web sockets lend themselves well to event-driven data such as a shared document, stock ticker, or chat service. Although PHP has often been the standby of web server development, in this case having a language with a programming model set up for long-running processes and events makes more sense.

There are several good choices here. Node.js works well, and has the advantage of being JavaScript, which the web programmer will already be familiar with. Other possibilities include Erlang and Yaws, which have a web socket interface and a multiprocessor model that could be ideal for this kind of programming. There are also a number of options for Java and the other languages based on the JVM, including Scala and Clojure. In addition, there are implementations for Ruby and probably most of the .NET/CLR languages. Most languages that are used for web server programming will be able to use web sockets.

In the following example of server code, done in Node.js (see Example 9-2), a server is set up using the websocket-server package, which can be found via NPM or on GitHub. The server waits for a connection on port 8080, and when one comes in, it calls the callback. That connection callback waits for a message to arrive via the connection object. In this example, it then calls a function called tickerUpdate, which somehow finds stock prices. When the relevant stock symbol has changed, the server invokes the callback, which sends the new price back to the client. For a more complete guide to programming Node.js, see Node: Up and Running by Tom Hughes-Croucher (O’Reilly).

Example 9-2. Socket server example

var ws = require("websocket-server");

var server = ws.createServer();

server.addListener("connection", function (connection)
{
  connection.addListener("message", function (msg)
  {
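    // Messages arrive as strings, so decode the JSON sent by the client.
    var tickerSymbol = JSON.parse(msg).ticker;
    // tickerUpdate is assumed to be defined elsewhere.
    tickerUpdate(tickerSymbol, function (price)
    {
      var update =
      {
      };
      update[tickerSymbol] = price;
      server.send(connection.id, JSON.stringify(update));
    });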
  });
});

server.listen(8080);
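The tickerUpdate function is left undefined in the example. One way it might work, following the earlier description of polling for a price and notifying the client only when it changes, is sketched here. Everything in it is hypothetical: `fetchPrice` stands in for whatever synchronous price lookup a real service would provide, and `makeTickerUpdate` and `onlyOnChange` are names invented for the sketch.

```javascript
// Wraps `callback` so that it fires only when the value differs from the
// last one seen.
function onlyOnChange(callback) {
  var last;
  return function (value) {
    if (value !== last) {
      last = value;
      callback(value);
    }
  };
}

// Builds a tickerUpdate function around a caller-supplied price lookup.
function makeTickerUpdate(fetchPrice, intervalMs) {
  return function tickerUpdate(symbol, callback) {
    var report = onlyOnChange(callback);
    report(fetchPrice(symbol));   // report the first price right away
    // Return the timer handle so the caller can stop polling later.
    return setInterval(function () {
      report(fetchPrice(symbol));
    }, intervalMs);
  };
}
```

With a lookup function in hand, `var tickerUpdate = makeTickerUpdate(lookupPrice, 5000);` would poll every five seconds. A production server would also clear the timer when the connection closes.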

Web Socket Protocol

Most of the time, the low-level details of web sockets will not be of great concern to the programmer. The interfaces in the browser and on the server will take care of the details and just provide an API that can send data.

That being said, sometimes it may be useful to know the low-level details of how things operate, to understand why something is not working, or to implement a web socket client in some other environment. In particular, it is important to understand how a socket is set up.

Web sockets carry data between browser and server using a TCP socket instead of an HTTP envelope. When the browser tries to open a socket, it sends what looks like an HTTP GET request but with a few extra headers (see Example 9-3).

Example 9-3. Socket headers

GET /socket HTTP/1.1
Upgrade: WebSocket
Connection: Upgrade
Origin: http://www.test.com
Host: www.test.com
Content-Length: 0
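On the server, telling such a request apart from ordinary HTTP comes down to inspecting the Upgrade header. The check below is a minimal sketch with a made-up function name. It takes the request headers as a plain object and compares names and values case-insensitively.

```javascript
// Returns true when the headers ask for an upgrade to a web socket.
// isWebSocketUpgrade is a hypothetical helper, not a library function.
function isWebSocketUpgrade(headers) {
  var upgrade = null;
  Object.keys(headers).forEach(function (name) {
    if (name.toLowerCase() === "upgrade") {
      upgrade = String(headers[name]);
    }
  });
  return upgrade !== null && upgrade.toLowerCase() === "websocket";
}
```

A server that receives a request failing this check can answer it as plain HTTP.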

After the connection has been set up, frames of data are sent back and forth. In the version of the protocol described here, each frame starts with a 0x00 byte and ends with a 0xFF byte. Between those two bytes is the data, encoded as UTF-8.
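That framing is simple enough to sketch in a few lines using Node.js buffers. The function names are made up for illustration, and the code covers only the 0x00/0xFF text framing described in this chapter; the finalized protocol (RFC 6455) uses a different frame layout.

```javascript
// Wraps a string in the 0x00 ... 0xFF text framing described above.
function frameText(text) {
  var payload = Buffer.from(text, "utf8");
  return Buffer.concat([Buffer.from([0x00]), payload, Buffer.from([0xff])]);
}

// Extracts the string from one complete frame.
function unframeText(frame) {
  if (frame.length < 2 || frame[0] !== 0x00 || frame[frame.length - 1] !== 0xff) {
    throw new Error("Not a complete text frame");
  }
  return frame.subarray(1, frame.length - 1).toString("utf8");
}
```

The 0xFF byte works as a terminator because it never occurs in valid UTF-8 text.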

There are server-side implementations for web sockets that work with Python, Ruby, Erlang, Node.js, and Java, as well as other languages. The state of libraries for web sockets is advancing, and there are packages in various states of development for pretty much all the major languages used in web development. In general, the choice of a server-side web socket implementation will be dictated by the other needs of a project. So it makes sense to find the web sockets package for the environment that is being used by a given project.

Ruby Event Machine

Ruby’s Event Machine also provides an ideal platform for working with web sockets, as the programmer is given an event-based interface from which a stream of data can be sent to the client. The EventMachine::WebSocket interface closely matches the interface in JavaScript. As in the client, the EventMachine interface has standard event handlers for onopen, onmessage, and onclose, as well as a ws.send method to send data back to the client.

Example 9-4 shows a very trivial “hello world” type of web socket interface in Ruby.

Example 9-4. Ruby Event Machine web socket handler

require 'em-websocket'
 
EventMachine::WebSocket.start(:host => "0.0.0.0", :port => 8080) do |ws|
  ws.onopen    { ws.send "Hello Client!"}
  ws.onmessage { |msg| ws.send "Pong: #{msg}" }
  ws.onclose   { puts "WebSocket closed" }
end

Erlang Yaws

Erlang is a functional language that was developed several decades ago for telephone switches and has found acceptance in many other areas where massive parallelism and strong robustness are desired. The language is concurrent, fault-tolerant, and very scalable. In recent years it has moved into the web space because all of the traits that make it useful in phone switches are very useful in a web server.

The Erlang Yaws web server also supports web sockets right out of the box. The documentation can be found at the Web Sockets in Yaws web page, along with code for a simple echo server. Example 9-5 shows the handler for such an echo server.

Example 9-5. Erlang Yaws web socket handler

out(A) -> 
    case get_upgrade_header(A#arg.headers) of 
    undefined ->
        {content, "text/plain", "You're not a web sockets client! Go away!"};
    "WebSocket" ->
        WebSocketOwner = spawn(fun() -> websocket_owner() end),
        {websocket, WebSocketOwner, passive}
    end.

websocket_owner() ->
    receive
    {ok, WebSocket} ->
        %% This is how we read messages (plural!!) from websockets on passive mode
        case yaws_api:websocket_receive(WebSocket) of
        {error,closed} ->
            io:format("The websocket got disconnected right from the start. "
                  "This wasn't supposed to happen!!~n");
        {ok, Messages} ->
            case Messages of
            [<<"client-connected">>] ->
                yaws_api:websocket_setopts(WebSocket, [{active, true}]),
                echo_server(WebSocket);
            Other ->
                io:format("websocket_owner got: ~p. Terminating~n", [Other])
            end
        end;
    _ -> ok
    end.

echo_server(WebSocket) ->
    receive
    {tcp, WebSocket, DataFrame} ->
        Data = yaws_api:websocket_unframe_data(DataFrame),
        io:format("Got data from Websocket: ~p~n", [Data]),
        yaws_api:websocket_send(WebSocket, Data),
        echo_server(WebSocket);
    {tcp_closed, WebSocket} ->
        io:format("Websocket closed. Terminating echo_server...~n");
    Any ->
        io:format("echo_server received msg:~p~n", [Any]),
        echo_server(WebSocket)
    end.

get_upgrade_header(#headers{other=L}) ->
    lists:foldl(fun({http_header,_,K0,_,V}, undefined) ->
                        K = case is_atom(K0) of
                                true ->
                                    atom_to_list(K0);
                                false ->
                                    K0
                            end,
                        case string:to_lower(K) of
                            "upgrade" ->
                                V;
                            _ ->
                                undefined
                        end;
                   (_, Acc) ->
                        Acc
                end, undefined, L).

The code is obscure if you don’t know Erlang’s syntax, but the key point is that each receive block pattern-matches on the shape of the incoming message (such as a {tcp, WebSocket, DataFrame} tuple carrying data, or a {tcp_closed, WebSocket} tuple when the socket closes), so each message is handled by the clause that fits it.
