Chapter 10: WebSocket

So far, most website and web application developers are accustomed to communicating exclusively with a server by making HTTP requests that are followed by HTTP responses.

The model of requesting a resource by specifying its URL, Content-Type, and other attributes that you saw in previous chapters works well if you keep in mind the use case that the World Wide Web was crafted to solve. The web was created to deliver documents that were heavily interlinked to each other. URLs have paths because documents typically have hierarchies in file systems. And each level of hierarchy can contain indexes with hyperlinks.

Consider the following, for example:

GET /animals/index.html

GET /animals/mammals/index.html

GET /animals/mammals/ferrets.html

With time, however, the web became more and more interactive. The traditional web that was about retrieving entire documents every time the user clicked is less common nowadays, especially with all the tools that HTML5 makes available. You can now create very sophisticated web applications, such as games and text editors, that often completely replace their desktop counterparts.

AJAX

Web 2.0 marked the rise of the web application. One of its key ingredients was AJAX, which translated into a snappier user experience for a fundamental reason: you no longer had to retrieve an entire HTML document every time the user interacted with the server.

For example, if you are updating your profile on a social networking application, you can make an asynchronous POST request and get a simple OK in return. Then with one of the readily available JavaScript frameworks, you can alter the view to represent the user action.

Alternatively, when you click Remove on a table, you can send a DELETE request and erase the row (<tr>) element without having the browser fetch a lot of unnecessary data, images, scripts, and stylesheets and then rerender the entire page.

In essence, AJAX was important because it allowed you to get rid of a lot of data transfer and rendering overhead that you didn’t need for many of the things that web applications were trying to do.

In recent times, however, many applications have been transferring data in ways in which the traditional HTTP request+response model results in significant overhead. Consider the example of the application you are going to build in this chapter. Say you want to show where the cursors of every visitor of your website are in real time. Every time a visitor moves her mouse, you send her coordinates.

Say you use jQuery to send AJAX requests. The first idea that comes to mind is using $.post to send a POST request with the cursor location every time the mousemove event is triggered, as shown here:

$(document).mousemove(function (ev) {

  $.post('/position', { x: ev.clientX, y: ev.clientY });

});

This code, despite looking straightforward, has a fundamental problem: you have no control over the order in which the server receives requests.

When your code makes a request, the browser can send it through any of its available sockets because browsers open multiple sockets to the target server to enhance performance. For example, while an image is being downloaded, an AJAX request can still be sent. If the browser operated with only one socket, this would be impossible and websites would be extremely slow to interact with.

If three requests are made in parallel through three different sockets, you have no guarantee of the order in which they are received. As a result, you need to adjust your code to send only one request at a time and wait for the response to send the next one:

var sending = false;


$(document).mousemove(function (ev) {

  if (sending) return;

  sending = true;

  $.post('/position', { x: ev.clientX, y: ev.clientY }, function () {

    sending = false;

  });

});

Now consider what the TCP traffic would look like using Firefox as an example:

Request

POST / HTTP/1.1

Host: localhost:3000

User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:8.0.1) Gecko/20100101 Firefox/8.0.1

Accept: */*

Accept-Language: en-us,en;q=0.5

Accept-Encoding: gzip, deflate

Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7

Content-Type: application/x-www-form-urlencoded; charset=UTF-8

X-Requested-With: XMLHttpRequest

Referer: http://localhost:3000/

Content-Length: 7

Pragma: no-cache

Cache-Control: no-cache


x=6&y=7

Response

HTTP/1.1 200 OK

Content-Type: text/plain

Content-Length: 2

Connection: keep-alive


OK

As you can see, a great amount of text surrounds a minimal amount of data. A lot of unneeded headers for this particular use case are sent back and forth, and they greatly outweigh how much data we’re sending.
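To put a rough number on that overhead, the following sketch rebuilds the request from the trace above and compares the size of the headers with the size of the payload. It assumes you run it with Node, and that each header line ends with CRLF, as HTTP/1.1 requires. The exact figure will vary with the browser and the headers it sends.

```javascript
// Header lines copied from the Firefox trace above
var headers = [
  'POST / HTTP/1.1',
  'Host: localhost:3000',
  'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:8.0.1) Gecko/20100101 Firefox/8.0.1',
  'Accept: */*',
  'Accept-Language: en-us,en;q=0.5',
  'Accept-Encoding: gzip, deflate',
  'Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7',
  'Content-Type: application/x-www-form-urlencoded; charset=UTF-8',
  'X-Requested-With: XMLHttpRequest',
  'Referer: http://localhost:3000/',
  'Content-Length: 7',
  'Pragma: no-cache',
  'Cache-Control: no-cache'
];

var body = 'x=6&y=7';

// each header line ends with CRLF, and an empty line closes the header section
var headerBytes = Buffer.byteLength(headers.join('\r\n') + '\r\n\r\n');
var bodyBytes = Buffer.byteLength(body);

console.log('headers: ' + headerBytes + ' bytes, payload: ' + bodyBytes + ' bytes');
console.log('overhead: ' + (headerBytes / bodyBytes).toFixed(1) + 'x the payload');
```

And that is only the request. The response adds its own headers on top.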

Even if you could remove some of those headers, do you really need a response in this case? If you’re sending something as volatile and unimportant as the position of a mouse, you don’t really need to wait for an OK to send more.

The ideal case for this particular web application starts to resemble raw TCP (like that in the chat application in Chapter 6) more than it resembles HTTP. Ideally, you would want to write the positions to a socket sequentially with minimal framing (that is, the data that surrounds the data you care about).

If you think in terms of telnet, ideally you would like the browser to send

x=6&y=7

x=10&y=15

. . .

Now, thanks to HTML5, you have a solution: WebSocket. WebSocket is the TCP of the web, a low-level bidirectional socket that gives control of the communication back to you.

HTML5 WebSocket

When you discuss WebSocket, you’re talking about two distinct parts: the WebSocket API implemented by browsers, and the WebSocket Protocol implemented by servers. Both have been designed and developed in conjunction with other technologies as part of the HTML5 initiative and movement, but are not a formal part of the HTML5 specification. The former is being standardized by the W3C, and the latter has been standardized by the IETF as RFC 6455.

The API as it’s implemented by the browser looks like this:

var ws = new WebSocket('ws://host/path');

ws.onopen = function () {

  ws.send('data');

}

ws.onclose = function () {}

ws.onmessage = function (ev) {

  alert(ev.data);

}

The simplicity of its API is, not coincidentally, reminiscent of the TCP client you wrote in Chapter 6. As you can see, unlike XMLHttpRequest (AJAX), it’s not oriented around requests and responses, but around messages. You send them with the send method and receive them through the message event, as UTF-8 text or binary data, and you learn about the connection being opened or closed through the open and close events.

The connection must first be established with a handshake. The handshake looks like a normal HTTP request, but after the server responds to it, the client and server begin exchanging data with minimal framing:

Request

GET /ws HTTP/1.1

Host: example.com

Upgrade: websocket

Connection: Upgrade

Sec-WebSocket-Version: 6

Sec-WebSocket-Origin: http://pmx

Sec-WebSocket-Extensions: deflate-stream

Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==

Response

HTTP/1.1 101 Switching Protocols

Upgrade: websocket

Connection: Upgrade

Sec-WebSocket-Accept: HSmrc0sMlYUkAGmm5OPpG2HaGWk=
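The Sec-WebSocket-Accept value in the response is not arbitrary. Under RFC 6455, the server derives it from the Sec-WebSocket-Key the client sent: it appends a fixed GUID to the key, hashes the result with SHA-1, and base64-encodes the digest. The sketch below uses Node’s crypto module. The acceptFor name is only for illustration, and the sketch assumes the handshake shown here uses the same derivation as the final RFC.

```javascript
var crypto = require('crypto');

// GUID fixed by RFC 6455 for the opening handshake
var WS_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// acceptFor is a hypothetical helper name, not part of any library
function acceptFor (key) {
  return crypto.createHash('sha1').update(key + WS_GUID).digest('base64');
}

// the key from the request above
console.log(acceptFor('x3JJHMbDL1EzLkh9GBhXDw=='));
```

Running it with the key from the request should print the value the server returned in the response. A client can use the same computation to confirm that the server really understood the WebSocket handshake.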

WebSockets are still based on HTTP, which means it’s fairly easy to implement the protocol on top of existing servers. The main difference is that as soon as the handshake is complete, a minimalistic TCP-like socket is available to you.
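To give a sense of how small that framing is, here is a sketch of how a short text message could be encoded as a single unmasked frame, following the layout in RFC 6455. The encodeTextFrame helper is hypothetical and handles only payloads of up to 125 bytes. In practice the browser and a module such as websocket.io do this for you.

```javascript
// encodeTextFrame is a hypothetical helper for illustration only
function encodeTextFrame (text) {
  var payload = Buffer.from(text, 'utf8');

  if (payload.length > 125) {
    throw new Error('this sketch only handles payloads of up to 125 bytes');
  }

  // byte 0: FIN bit (0x80) plus the text opcode (0x1)
  // byte 1: mask bit cleared, followed by the 7-bit payload length
  var header = Buffer.from([0x81, payload.length]);

  return Buffer.concat([header, payload]);
}

var frame = encodeTextFrame('x=6&y=7');
console.log(frame.length + ' bytes on the wire for a 7-byte payload');
```

Frames sent from the client to the server also carry a 4-byte masking key, so the overhead in that direction is 6 bytes for a short message. Either way, it is a small fraction of the HTTP headers you saw earlier.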

To better understand these concepts, let’s write an example app.

An Echo Example

The first example will consist of a server and a client exchanging simple ping and pong strings. When the client sends a ping, you’ll record the time, and measure how many milliseconds it takes for the server to respond.

Setting it up

For this example, you use websocket.io, which I created while working at LearnBoost.

It’s important to keep in mind that websocket.io handles only the WebSocket requests. All the other requests in your website or application are still handled by a regular web server, which is why you also include express in your package.json file:

{

    "name": "ws-echo"

  , "version": "0.0.1"

  , "dependencies": {

        "express": "2.5.1"

      , "websocket.io": "0.1.6"

    }

}

The server simply responds to messages by echoing them back to the browser. The browser measures how long it takes for the server to respond.

Setting up the server

The first thing you need to do is initialize express and attach websocket.io to it so that it can handle the WebSocket requests:

var express = require('express')

  , wsio = require('websocket.io')


/**

* Create express app.

*/


var app = express.createServer();


/**

* Attach websocket server.

*/


var ws = wsio.attach(app);


/**

* Serve your code

*/


app.use(express.static('public'));


/**

* Listening on connections

*/


ws.on('connection', function (socket) {

  // . . .

});


/**

* Listen

*/


app.listen(3000);

Now focus on the connection handler. I explicitly designed websocket.io to closely resemble how you would implement a net.Server. Because you want to echo back messages, all you need to do is listen on the message event and send it back.

ws.on('connection', function (socket) {

  socket.on('message', function (msg) {

    console.log('\033[96mgot:\033[39m ' + msg);

    socket.send('pong');

  });

});

Setting up the client

Now you’re ready to move on to the client code, which goes into the public folder:

index.html

<!doctype html>

<html>

  <head>

    <title>WebSocket echo test</title>

    <script>

      var lastMessage;


      window.onload = function () {

        // create socket

        var ws = new WebSocket('ws://localhost:3000');

        ws.onopen = function () {

          // send first ping

          ping();

        }

        ws.onmessage = function (ev) {

          console.log(' got: ' + ev.data);

          // you got echo back, measure latency

          document.getElementById('latency').innerHTML = new Date - lastMessage;

          // ping again

          ping();

        }

        function ping () {

          // record the timestamp

          lastMessage = +new Date;

          // send the message

          ws.send('ping');

        }

      };

    </script>

  </head>

  <body>

    <h1>WebSocket Echo</h1>

    <h2>Latency: <span id="latency"></span>ms</h2>

</body>

</html>

If you look at the HTML, it’s fairly self-explanatory. It just sets up a placeholder to display the latency (which is the number of milliseconds that a message takes to complete a round trip).

The JavaScript code is relatively straightforward also. You keep track of the last message timestamp:

var lastMessage

Initializing WebSocket opens the connection:

var ws = new WebSocket('ws://localhost:3000');

When the connection opens, you send the first message to the server:

ws.onopen = function () {

  ping();

}

When the server replies, you measure the latency and ping again:

ws.onmessage = function (ev) {

  console.log(' got: ' + ev.data);

  // you got echo back, measure latency

  document.getElementById('latency').innerHTML = new Date - lastMessage;

  // ping again

  ping();

}

Finally, let’s define the ping function, which tracks the timestamp to measure the response against (so that we can determine the latency), and sends a simple string:

function ping () {

  // record the timestamp

  lastMessage = +new Date;

  // send the message

  ws.send('ping');

};


Running the server

Now you run the server:

$ node server.js

Then you point the browser to http://localhost:3000 (see Figure 10-1). Make sure you test with a modern web browser that supports WebSocket, like Chrome 15+ or IE 10+. If unsure, go to http://websocket.org and look at the “Does your browser support WebSocket?” box.

You successfully created a single-user realtime application. Check the terminal output and your web browser’s console for a log of the messages exchanged. In most modern computers, it will take on average between 1 and 5 milliseconds to exchange this message. As an exercise, try writing this same example leveraging AJAX and Express routes, and compare how long it takes to complete a ping-pong cycle.


Figure 10-1: The time it takes for a packet to go to the server and back to the client

For the next example, you’ll write an application where the server’s role is to connect multiple users on a single screen.

Mouse cursors

You are going to display a cursor image for each connected user, showing where that user’s mouse is on the screen.

Through this example, you learn the concept of broadcasting, which consists of one user relaying a message to everyone but himself.

Setting up the example

The requirements for this project are the exact same as for the previous example. In your package.json include:

{

     "name": "ws-cursors"

   , "version": "0.0.1"

   , "dependencies": {

         "express": "2.5.1"

       , "websocket.io": "0.1.6"

     }

}

Setting up the server

The basic server setup is equivalent to the previous app. You serve static HTML with express, and you attach a websocket.io server to it in your server.js:

var express = require('express')

  , wsio = require('websocket.io')


/**

* Create express app.

*/


var app = express.createServer();


/**

* Attach websocket server.

*/


var ws = wsio.attach(app);


/**

* Serve your code

*/


app.use(express.static('public'));


/**

* Listening on connections

*/


ws.on('connection', function (socket) {

  // . . .

});


/**

* Listen

*/


app.listen(3000);

In this case, however, you want to take a different action when a user connects. You want to keep track of everyone’s positions in memory in a simple object. You also keep track of the total number of clients that have connected so that you can give each client a unique ID. That ID identifies the client’s position in the positions object:

var positions = {}

  , total = 0


ws.on('connection', function (socket) {

  // . . .

});

When a user first connects, you want to send everyone’s positions to him as the first message. That way, when the user first loads the page, he can see everyone who is connected.

To that end, you encode the positions object as JSON:

ws.on('connection', function (socket) {

  // you give the socket an id

  socket.id = ++total;


  // you send the positions of everyone else

  socket.send(JSON.stringify(positions));

});

When a client sends a message, you assume he’s sending his position as JSON (as an object with x and y coordinates). You then store it in the positions object:

socket.on('message', function (msg) {

  try {

    var pos = JSON.parse(msg);

  } catch (e) {

    return;

  }


  positions[socket.id] = pos;

});
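The handler above trusts that any valid JSON is a position. If you want to be stricter, a small validation step could look like the following sketch. The parsePosition helper is hypothetical, not something websocket.io provides, and the rule it enforces (an object with finite numeric x and y) is an assumption about what the client should send.

```javascript
// parsePosition is a hypothetical helper:
// it returns { x, y } or null if the message is unusable
function parsePosition (msg) {
  var obj;

  try {
    obj = JSON.parse(msg);
  } catch (e) {
    return null;
  }

  if (!obj || typeof obj.x !== 'number' || typeof obj.y !== 'number') {
    return null;
  }

  // JSON.parse can yield Infinity for huge literals such as 1e999
  if (!isFinite(obj.x) || !isFinite(obj.y)) {
    return null;
  }

  // copy only the fields you expect, dropping anything extra the client sent
  return { x: obj.x, y: obj.y };
}

console.log(parsePosition('{"x":6,"y":7}'));
console.log(parsePosition('not json'));
```

Inside the message handler you would then return early whenever parsePosition gives you null, and store the result otherwise.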

Finally, when the user disconnects, you clear his position:

socket.on('close', function () {

  delete positions[socket.id];

});

What is missing here? Broadcasting, of course. When a position is received, you want to send it to everyone else. And when the socket closes, you want to notify everyone else that the user disconnected so his cursor is removed from the screen.

You declare a broadcast function to go through the rest of the clients and send them a message. Declare it inside the connection handler, so that socket refers to the client that is broadcasting:

function broadcast (msg) {

  for (var i = 0, l = ws.clients.length; i < l; i++) {

    // you avoid sending a message to the same socket that broadcasts

    if (ws.clients[i] && socket.id != ws.clients[i].id) {

      // you call ‘send’ on the other clients

      ws.clients[i].send(msg);

    }

  }

}

Because you have two distinct types of data to send, you send a small JSON packet with a type identifier.

When you relay a position, you send an object that looks like this:

{

    type: 'position'

  , pos: { x: <x>, y: <y> }

  , id: <socket id>

}

When a user disconnects, you send

{

    type: 'disconnect'

  , id: <socket id>

}

Therefore,

socket.on('message', function (msg) {

  // . . .

  broadcast(JSON.stringify({ type: 'position', pos: pos, id: socket.id }));

});

And upon close, you send

socket.on('close', function () {

  // . . .

  broadcast(JSON.stringify({ type: 'disconnect', id: socket.id }));

});
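Because the server code was introduced in fragments, it can help to see the pieces assembled. The following sketch puts the connection handler together and exercises it without a network. It makes two assumptions that are not part of the example: a plain clients array stands in for the server’s own list of sockets, and fakeSocket builds EventEmitter objects that record what is sent to them.

```javascript
var EventEmitter = require('events').EventEmitter;

var positions = {}
  , total = 0
  , clients = []; // stand-in for the server's list of connected sockets

function onConnection (socket) {
  socket.id = ++total;
  clients.push(socket);
  socket.send(JSON.stringify(positions));

  // declared inside the handler so that `socket` refers to the sender
  function broadcast (msg) {
    clients.forEach(function (client) {
      if (client.id != socket.id) client.send(msg);
    });
  }

  socket.on('message', function (msg) {
    var pos;
    try { pos = JSON.parse(msg); } catch (e) { return; }
    positions[socket.id] = pos;
    broadcast(JSON.stringify({ type: 'position', pos: pos, id: socket.id }));
  });

  socket.on('close', function () {
    delete positions[socket.id];
    clients.splice(clients.indexOf(socket), 1);
    broadcast(JSON.stringify({ type: 'disconnect', id: socket.id }));
  });
}

// hypothetical fake socket: records every message passed to send()
function fakeSocket () {
  var s = new EventEmitter();
  s.sent = [];
  s.send = function (msg) { s.sent.push(msg); };
  return s;
}

var a = fakeSocket(), b = fakeSocket();
onConnection(a);
onConnection(b);
a.emit('message', JSON.stringify({ x: 6, y: 7 }));
console.log(b.sent); // the initial positions, then the position relayed from a
```

With the real server you would pass the same handler body to ws.on('connection', ...) instead of calling it by hand.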

Now you’re done with the server and can move on to the client.

Setting up the client

For the client, you start with a simple HTML document and an onload handler for the window in your index.html:

<!doctype html>

<html>

  <head>

    <title>WebSocket cursors</title>

    <script>

      window.onload = function () {

        var ws = new WebSocket('ws://localhost:3000');

        // . . .

      }

    </script>

  </head>

  <body>

    <h1>WebSocket cursors</h1>

  </body>

</html>

For this task, you want to concentrate on the two main events: open and message.

When the connection first opens, you attach a mousemove handler to start relaying the cursor position to others:

ws.onopen = function () {

  document.onmousemove = function (ev) {

    ws.send(JSON.stringify({ x: ev.clientX, y: ev.clientY }));

  }

}

When a message is received, as you saw in the previous section, it can signal either someone’s cursor moving or someone disconnecting:

// we instantiate a variable to keep track of initialization for this client

var initialized;


ws.onmessage = function (ev) {  

   var obj = JSON.parse(ev.data);


  // the first message is the position of all existing cursors

  if (!initialized) {

    initialized = true;

    for (var id in obj) {

      move(id, obj[id]);

    }

  } else {

    // other messages can either be a position change or

    // a disconnection

    if ('disconnect' == obj.type) {

      remove(obj.id);

    } else {

      move(obj.id, obj.pos);

    }

  }

}

You then declare the functions move and remove.

For the move function, you first want to make sure the element for the cursor exists. You look for a DOM element with the ID cursor-{id}. If the element is missing, you create the image element and set the image URL and a basic style to make it float around.

Then you adjust its position on the screen:

function move (id, pos) {

  var cursor = document.getElementById('cursor-' + id);


  if (!cursor) {

    cursor = document.createElement('img');

    cursor.id = 'cursor-' + id;

    cursor.src = '/cursor.png';

    cursor.style.position = 'absolute';

    document.body.appendChild(cursor);

  }


  cursor.style.left = pos.x + 'px';

  cursor.style.top = pos.y + 'px';

}

For removing, you simply detach the element from the DOM:

function remove (id) {

  var cursor = document.getElementById('cursor-' + id);

  // a user who never moved the mouse has no cursor element yet

  if (cursor) cursor.parentNode.removeChild(cursor);

}

Running the server

As in the other example, all you need to do is run the server and point your browser to it. Make sure to open multiple tabs (shown in Figure 10-2) to fully experience the real-time interaction.


Figure 10-2: Several clients connected, relaying their cursor positions in real time.

cursor.png courtesy of http://thenounproject.com

The Challenges Ahead

Even though you got the basic example functioning, these applications need some more work to become ready for real-world usage.

Close doesn’t mean disconnect

When a close event fires on a WebSocket server or client, it means something specific: the TCP connection was properly closed. In the real world, however, this is not always the case. Your computer can shut down unexpectedly, a network error can occur, or you can spill a glass of water on your motherboard. In a lot of scenarios, close might never fire!

The solution for this problem is to rely on timeouts and heartbeats. For your application to handle these scenarios, you need to send dummy messages every few seconds to make sure the client is alive, and otherwise consider him forcefully disconnected.
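As a rough illustration of the idea, the following sketch tracks when each client was last heard from and reports the ones that have been silent for too long. HeartbeatMonitor is a hypothetical helper, not part of websocket.io, and timestamps are passed in explicitly so that the logic is easy to follow and test.

```javascript
// HeartbeatMonitor is a hypothetical helper for illustration only
function HeartbeatMonitor (timeoutMs) {
  this.timeoutMs = timeoutMs;
  this.lastSeen = {}; // client id -> timestamp of the last message or heartbeat
}

// call this whenever any message (including a dummy heartbeat) arrives
HeartbeatMonitor.prototype.beat = function (id, now) {
  this.lastSeen[id] = now;
};

// returns the ids that have been silent for longer than the timeout
// and stops tracking them
HeartbeatMonitor.prototype.sweep = function (now) {
  var dead = [];

  for (var id in this.lastSeen) {
    if (now - this.lastSeen[id] > this.timeoutMs) {
      dead.push(id);
      delete this.lastSeen[id];
    }
  }

  return dead;
};

var monitor = new HeartbeatMonitor(10000);
monitor.beat('1', 0);
monitor.beat('2', 8000);

// at t = 12s, client 1 has been silent for 12s and client 2 for 4s
console.log(monitor.sweep(12000));
```

On a server you would typically call sweep(Date.now()) from a setInterval timer and treat every returned ID as if its close event had fired.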

JSON

As examples get more complicated, the variety of messages that the server and clients exchange increases.

The second example here relied heavily on encoding and decoding JSON packets manually. Because this is a common pattern in applications, that work should be done for you as part of another abstraction.
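To make that concrete, here is one way such an abstraction could look. This is only a sketch: jsonChannel is a hypothetical wrapper, and the { type, data } packet shape is an assumption chosen for the illustration, not the format any particular library uses. It works with any object that has a send method and emits message events.

```javascript
var EventEmitter = require('events').EventEmitter;

// jsonChannel is a hypothetical wrapper for illustration only
function jsonChannel (socket) {
  var channel = new EventEmitter();

  // outgoing: channel.send('position', { x: 1, y: 2 })
  channel.send = function (type, data) {
    socket.send(JSON.stringify({ type: type, data: data }));
  };

  // incoming: re-emit each well-formed packet under its type name
  socket.on('message', function (msg) {
    var packet;
    try { packet = JSON.parse(msg); } catch (e) { return; }

    // 'error' is skipped because EventEmitter throws when it is
    // emitted without a listener
    if (packet && typeof packet.type === 'string' && packet.type !== 'error') {
      channel.emit(packet.type, packet.data);
    }
  });

  return channel;
}
```

With a wrapper like this, application code can listen with channel.on('position', ...) and never touch JSON.parse or JSON.stringify directly.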

Reconnections

What happens if the client temporarily disconnects? Most applications try to get the user reconnected automatically. In these examples, if a disconnection occurs, the only way to reconnect is to refresh the browser.
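A common approach is to retry with an increasing delay, so that a server that is down is not flooded with connection attempts. The sketch below shows the idea. Both backoffDelay and connectWithRetry are hypothetical helpers, and the 500 ms base and 30-second cap are arbitrary values chosen for the illustration.

```javascript
// delay before the nth reconnection attempt (attempt starts at 0),
// doubling each time up to maxMs
function backoffDelay (attempt, baseMs, maxMs) {
  var delay = baseMs * Math.pow(2, attempt);
  return Math.min(delay, maxMs);
}

// `createSocket` is whatever opens the connection, for example
// function () { return new WebSocket('ws://localhost:3000'); } in a browser
function connectWithRetry (createSocket, onMessage) {
  var attempt = 0;

  function open () {
    var ws = createSocket();

    // a successful connection resets the backoff
    ws.onopen = function () { attempt = 0; };

    ws.onmessage = onMessage;

    // when the connection closes, schedule another attempt
    ws.onclose = function () {
      setTimeout(open, backoffDelay(attempt++, 500, 30000));
    };
  }

  open();
}

console.log(backoffDelay(0, 500, 30000), backoffDelay(3, 500, 30000));
```

Keep in mind that after reconnecting, the application still has to restore its state. In the cursors example, the server sends all positions as the first message, so the client would need to reset its initialized flag as well.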

Broadcasting

Broadcasting is a common pattern in real-time applications where interaction with other clients is expected. You shouldn’t need to define your own broadcasting mechanism manually.

WebSockets are HTML5: Older browsers don’t support them

WebSocket is a recent technology. Many browsers, proxies, firewalls, and antivirus software are still not ready to work completely with this new protocol and way of communicating. A solution for older browsers is needed.
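The first step toward any fallback is detecting whether the API is there at all. A minimal sketch, assuming a hypothetical supportsWebSocket helper that receives the global object (window in a browser):

```javascript
// supportsWebSocket is a hypothetical helper for illustration only;
// `root` is the global object, which is `window` in a browser
function supportsWebSocket (root) {
  return 'WebSocket' in root;
}

console.log(supportsWebSocket({ WebSocket: function () {} }));
console.log(supportsWebSocket({}));
```

Note that this check tells you only that the browser exposes the API. It says nothing about proxies or firewalls between the browser and your server, which can still prevent the connection from being established.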

The solution

Fortunately, all these problems have solutions. In the next chapter, you work with a module called socket.io whose goal is to fix all the aforementioned issues while retaining the simplicity and speed of WebSocket-based communication.

Summary

You now understand the fundamentals of the WebSocket API and the WebSocket protocol, and how you can use Node.js to leverage them for very fast message exchange. In the first example you familiarized yourself with their most basic usage.

You then created a multi-user application that exhibits the strengths of WebSocket: its minimal framing allows for sending lots of short messages that reach other clients as fast as possible.

Finally, I described the weaknesses in terms of API and browser support that we can improve upon thanks to the socket.io framework, described in the next chapter.
