For incoming streams, Node provides pause and resume methods, but not so for outbound streams. Essentially, this means we can easily throttle upload speeds in Node, but download throttling requires a more creative solution.
We'll need a new server.js along with a good-sized file to serve. With the dd command-line program, we can generate a file for testing purposes.
dd if=/dev/zero of=50meg count=50 bs=1048576
This will create a 50 MB file named 50meg, which we'll be serving.
For a similar Windows tool that can be used to generate a large file, check out http://www.bertel.de/software/rdfc/index-en.html.
To keep things as simple as possible, our download server will serve just one file, but we'll implement it in a way which would allow us to easily plug in some router code to serve multiple files. First, we will require our modules and set up an options object for file and speed settings.
var http = require('http'),
    fs = require('fs'),
    options = {};

options.file = '50meg';
options.fileSize = fs.statSync(options.file).size;
options.kbps = 32;
If we were serving multiple files, our options object would be largely redundant. However, we're using it here to emulate the concept of a user-determined file choice. In a multifile situation, we would be loading file specifics based upon the requested URL instead.
To see how this recipe could be configured to serve and throttle more than one file, check out the routing recipes in Chapter 1, Making a Web Server.
The http module is for the server, while the fs module is for creating a readStream and grabbing the size of our file. We're going to restrict how much data is sent out at once, but we first need to get the data in. So let's create our server and initialize a readStream.
http.createServer(function (request, response) {
  var download = Object.create(options);
  download.chunks = new Buffer(download.fileSize);
  download.bufferOffset = 0;

  response.writeHead(200, {'Content-Length': options.fileSize});

  fs.createReadStream(options.file)
    .on('data', function (chunk) {
      chunk.copy(download.chunks, download.bufferOffset);
      download.bufferOffset += chunk.length;
    })
    .once('open', function () {
      //this is where the throttling will happen
    });
}).listen(8080);
We've created our server and specified a new object called download, which inherits from our options object. We add two properties to our request-bound download object: a chunks property that collects the file chunks inside the readStream data event listener, and a bufferOffset property that will be used to keep track of the number of bytes loaded from disk.
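The accumulation pattern used in the data listener can be seen in isolation with two small buffers (Buffer.from and Buffer.alloc are the modern equivalents of the Buffer constructor used above):

```javascript
var target = Buffer.alloc(6); // stands in for download.chunks
var offset = 0;               // stands in for download.bufferOffset

['abc', 'def'].forEach(function (str) {
  var chunk = Buffer.from(str);
  chunk.copy(target, offset); // copy the chunk in at the current offset
  offset += chunk.length;     // advance past the bytes just copied
});

console.log(target.toString()); // 'abcdef'
console.log(offset);            // 6
```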
All we have to do now is the actual throttling. To achieve this, we simply apportion out the specified number of kilobytes from our buffer every second, thus achieving the specified kilobytes per second. We'll make a function for this, which will be placed outside of http.createServer, and we'll call our function throttle.
function throttle(download, cb) {
  var chunkOutSize = download.kbps * 1024,
      timer = 0;

  (function loop(bytesSent) {
    var remainingOffset;
    if (!download.aborted) {
      setTimeout(function () {
        var bytesOut = bytesSent + chunkOutSize;

        if (download.bufferOffset > bytesOut) {
          timer = 1000;
          cb(download.chunks.slice(bytesSent, bytesOut));
          loop(bytesOut);
          return;
        }

        if (bytesOut >= download.chunks.length) {
          remainingOffset = download.chunks.length - bytesSent;
          cb(download.chunks.slice(bytesSent, bytesSent + remainingOffset));
          return;
        }

        loop(bytesSent); //continue to loop, wait for enough data
      }, timer);
    }
  }(0));

  return function () { //return a function to handle an abort scenario
    download.aborted = true;
  };
}
throttle interacts with the download object created on each server request to measure out each chunk according to our predetermined options.kbps speed. For the second parameter (cb), throttle accepts a functional callback. cb in turn takes one parameter, which is the chunk of data that throttle has determined to send. Our throttle function returns a convenience function that can be used to end the loop on abort, avoiding infinite looping. We initialize download throttling by calling our throttle function in the server callback when the readStream opens.
//...previous code
fs.createReadStream(options.file)
  .on('data', function (chunk) {
    chunk.copy(download.chunks, download.bufferOffset);
    download.bufferOffset += chunk.length;
  })
  .once('open', function () {
    var handleAbort = throttle(download, function (send) {
      response.write(send);
    });
    request.on('close', function () {
      handleAbort();
    });
  });
}).listen(8080);
The key to this recipe is our throttle function. Let's walk through it. To achieve the specified speed, we send a chunk of data of a certain size every second. The size is determined by the desired number of kilobytes per second. So, if download.kbps is 32, we'll send 32 KB chunks every second.
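Assuming the settings above, the byte math works out as follows (a quick sanity check, not part of the server code):

```javascript
var kbps = 32;
var chunkOutSize = kbps * 1024; // bytes pushed per one-second tick
var fileSize = 50 * 1048576;    // the 50meg test file

console.log(chunkOutSize);                       // 32768
console.log(Math.ceil(fileSize / chunkOutSize)); // 1600 seconds to send the whole file
```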
Buffers work in bytes, so we set a new variable called chunkOutSize and multiply download.kbps by 1024 to realize the appropriate chunk size in bytes. Next, we set a timer variable, which is passed into setTimeout. It is initially set to 0 for two reasons. First, it eliminates an unnecessary initial 1000-millisecond overhead, allowing our server the opportunity to send the first chunk of data immediately, if available. Second, if the download.chunks buffer is not full enough to accommodate the demand of chunkOutSize, the embedded loop function recurses without changing timer. This causes the loop to spin via zero-delay timeouts until the buffer loads enough data to deliver a whole chunk (a process that should take less than a second). Once we have enough data for the first chunk, timer is set to 1000, because from here on out we want to push a chunk every second.
loop is the guts of our throttling engine. It's a self-recursive function which calls itself with one parameter: bytesSent. The bytesSent parameter allows us to keep track of how much data has been sent so far, and we use it to determine which bytes to slice out of our download.chunks buffer using Buffer.slice. Buffer.slice takes two parameters, start and end. These two parameters are fulfilled with bytesSent and bytesOut, respectively. bytesOut is also checked against download.bufferOffset to ensure we have enough data loaded for a whole chunk to be sent out.
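Note that Buffer.slice does not copy memory: it returns a view onto the same bytes, which is why it is cheap to call every second. A quick illustration (using Buffer.from, the modern constructor):

```javascript
var buf = Buffer.from('abcdef');
var part = buf.slice(1, 4);   // start = 1, end = 4 (end is exclusive)

console.log(part.toString()); // 'bcd'

part[0] = 0x7a;               // 'z' - writes through to the parent buffer
console.log(buf.toString());  // 'azcdef'
```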
If there is enough data, we proceed to set the timer to 1000 to initiate our chunk-per-second policy, then pass the result of download.chunks.slice into cb, which becomes our send parameter.
Back inside our server, our send parameter is passed to response.write within our throttle callback, so each chunk is streamed to the client. Once we've passed our sliced chunk to cb, we call loop(bytesOut) for a new iteration (thus bytesOut becomes the next bytesSent), then we return from the function to prevent any further execution.
The third and final place bytesOut appears is in the second conditional statement of the setTimeout callback, where we check it against download.chunks.length. This is important for handling the last chunk of data. We don't want to loop again after the final chunk has been sent, and if options.kbps doesn't divide exactly into the total file size, the final bytesOut would be larger than the size of the buffer. If passed into the slice method unchecked, this would cause an out-of-bounds error.
So if bytesOut equals, or is greater than, the memory allocated to the download.chunks buffer (that is, the size of our file), we slice the remaining bytes from our download.chunks buffer and return from the function without calling loop, effectively terminating recursion.
To prevent infinite looping when the connection is closed unexpectedly (for instance, during connection failure or client abort), throttle returns another function, which is caught in the handleAbort variable and called in the close event of request. The function simply adds a property to the download object to say the download has been aborted. This is checked on each recursion of the loop function. As long as download.aborted isn't true, it continues to iterate; otherwise the looping stops short.
There are (configurable) limits on operating systems defining how many files can be opened at once. We would probably want to implement caching in a production download server to optimize file system access. For file limits on Unix systems, see http://www.stackoverflow.com/questions/34588/how-do-i-change-the-number-of-open-files-limit-in-linux.
If a connection breaks, or a user accidentally aborts a download, the client may initiate a resume request by sending a Range HTTP header to the server. A Range header would look something like this:

Range: bytes=512-1023

When a server agrees to handle a Range header, it sends a 206 Partial Content status and adds a Content-Range header to the response. Where the entire file is 1024 bytes, a Content-Range reply to the preceding Range header would look as follows:

Content-Range: bytes 512-1023/1024
Notice that there is no equals sign (=) after bytes in a Content-Range header. We can pass an object as the second parameter of fs.createReadStream, which specifies where to start and end reading. Since we are simply handling resumes, we only need to set the start property.
//requires, options object, throttle function, create server, etc...

download.readStreamOptions = {};
download.headers = {'Content-Length': download.fileSize};
download.statusCode = 200;

if (request.headers.range) {
  download.start = +request.headers.range.replace('bytes=', '').split('-')[0];
  download.readStreamOptions = {start: download.start};
  download.headers['Content-Range'] = 'bytes ' + download.start + '-' +
    (download.fileSize - 1) + '/' + download.fileSize;
  download.headers['Content-Length'] = download.fileSize - download.start; //partial length
  download.statusCode = 206; //partial content
}

response.writeHead(download.statusCode, download.headers);
fs.createReadStream(download.file, download.readStreamOptions)
//...rest of the code....
By adding some properties to download, and using them to conditionally respond to a Range header, we can now handle resume requests.