When we create a site with Node, the server and site logic are tied up in one process, whereas with other platforms the server code is already in place. On those platforms, if our site code has bugs, the server is very unlikely to crash, and thus in many cases the site can stay active even if one part of it is broken.
With a Node-based website, a small bug can crash the entire process, and this bug may only be triggered once in a blue moon.
As a hypothetical example, the bug could be related to character encoding on POST requests. When someone like Felix Geisendörfer completes and submits a form, suddenly our entire server crashes because it can't handle umlauts.
In this recipe, we'll look at using Upstart, an event-driven init service available for Linux servers, which isn't based upon Node, but is nevertheless a very handy accomplice.
We will need Upstart installed on our server. http://upstart.ubuntu.com contains instructions on how to download and install it. If we're already using an Ubuntu or Fedora remote server, then Upstart will already be integrated.
Let's make a new server that purposefully crashes when we access it via HTTP:
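var http = require('http');

http.createServer(function (req, res) {
  res.end("Oh oh! Looks like I'm going to crash...");
  // crashAhoy is deliberately undefined, so this line raises an uncaught
  // ReferenceError that brings down the whole process.
  throw crashAhoy;
}).listen(8080);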
After the first page loads, the server will crash and the site goes offline.
Let's call this code server.js, placing it on our remote server under /var/www/crashingserver.
Now we create our Upstart configuration file, saving it on our server as /etc/init/crashingserver.conf.
start on started network-services

respawn
respawn limit 100 5

setuid www-data

exec /usr/bin/node /var/www/crashingserver/server.js >> /var/log/crashingserver.log 2>&1

post-start exec echo "Server was (re)started on $(date)" | mail -s "Crashing Server (re)starting" [email protected]
Finally, we initialize our server as follows:
start crashingserver
When we access http://nodecookbook.com:8080 and refresh the page, our site is still accessible. A quick look at /var/log/crashingserver.log reveals that the server did indeed crash. We could also check our inbox to find the server restart notification.
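Refreshing in a browser can also be approximated from a script. The following is a minimal sketch, not part of the recipe itself: `probe` and its `attempts` and `delayMs` parameters are hypothetical names, and the retry is there on the assumption that a request might arrive in the short gap while Upstart is respawning the process.

```javascript
var http = require('http');

// Hypothetical helper: request `url`, retrying up to `attempts` times with
// `delayMs` between failed tries. Calls done(null, statusCode) on the first
// response, or done(err) once every attempt has failed.
function probe(url, attempts, delayMs, done) {
  var finished = false; // guards against reporting the same attempt twice

  http.get(url, function (res) {
    if (finished) { return; }
    finished = true;
    res.on('error', function () {}); // the server may die mid-response
    res.resume(); // discard the body so the socket is released
    done(null, res.statusCode);
  }).on('error', function (err) {
    if (finished) { return; }
    finished = true;
    if (attempts <= 1) { return done(err); }
    setTimeout(function () {
      probe(url, attempts - 1, delayMs, done);
    }, delayMs);
  });
}

// Example call (the URL is the one used in this recipe):
// probe('http://nodecookbook.com:8080', 5, 500, function (err, status) {
//   console.log(err ? 'site is down: ' + err.message : 'site answered: ' + status);
// });
```

Calling `probe` twice in a row mimics loading and then refreshing the page: if respawning works, both calls should eventually report a status code rather than an error.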
The name of the Upstart service is taken from the particular Upstart configuration filename. We initiate the /etc/init/crashingserver.conf Upstart service with start crashingserver.
The first line of the configuration ensures our web server automatically recovers even when the operating system on our remote server is restarted (for example, due to a power failure or a required reboot).
respawn is declared twice, once to turn on respawning and then to set a respawn limit: a maximum of 100 restarts every 5 seconds. The limit must be set according to our own scenario. If the website is low traffic, this number might be adjusted to, say, 10 restarts in 8 seconds.
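To get a feel for how a count and an interval interact, here is a simplified model of a respawn limit in JavaScript. It is only a sketch for reasoning about the two numbers, not Upstart's actual implementation: `exceedsRespawnLimit` is a hypothetical function, and the exact treatment of restarts that land on the window boundary is an assumption.

```javascript
// Hypothetical model: given restart timestamps in milliseconds (ascending),
// report whether more than `count` restarts ever fall inside a single
// window of `intervalSec` seconds.
function exceedsRespawnLimit(restartTimesMs, count, intervalSec) {
  var windowMs = intervalSec * 1000;
  var start = 0; // index of the oldest restart still inside the window

  for (var end = 0; end < restartTimesMs.length; end++) {
    // Slide the window forward until it spans less than windowMs.
    while (restartTimesMs[end] - restartTimesMs[start] >= windowMs) {
      start++;
    }
    if (end - start + 1 > count) {
      return true;
    }
  }
  return false;
}

// Example: a crash every 500 ms, checked against "10 restarts in 8 seconds".
var crashes = [];
for (var i = 0; i <= 10; i++) {
  crashes.push(i * 500);
}
console.log(exceedsRespawnLimit(crashes, 10, 8));
```

Under this model, a server that crashes once per second never trips a limit of 10 restarts in 8 seconds, while one that crashes twice per second does, which is the kind of persistent failure the limit is meant to catch.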
We want to stay alive if at all possible, but if an issue is persistent we can take that as a red flag that a bug is having a detrimental effect on user experience or system resources.
The next two lines run our server as the www-data user and append its output to /var/log/crashingserver.log.
The final line sends out an email just after our server has been started, or restarted. This is so we can be notified that there are probably issues to address with our server.
Let's implement another Upstart script that notifies us if the server crashes beyond its respawn limit, plus we'll look at another way to keep our server alive.
If our server exceeds the respawn limit, it's likely there is a serious issue that should be solved as soon as possible. We need to know about it immediately. To achieve this in Upstart, we can create another Upstart configuration file that monitors the crashingserver daemon, sending an email if the respawn limit is transgressed.
task

start on stopped crashingserver PROCESS=respawn

script
  if [ "$JOB" != '' ]
  then echo "Server "$JOB" has crashed on $(date)" | mail -s $JOB" site down!!" [email protected]
  fi
end script
Let's save this to /etc/init/sitedownmon.conf.
Then we do:
start crashingserver
start sitedownmon
We define this Upstart process as a task (it only has one thing to do, after which it exits). We don't want it to stay alive after our server has crashed.
The task is performed when the crashingserver daemon has stopped during a respawn (for example, when the respawn limit has been broken).
Our script stanza (directive) contains a small shell script that checks for the existence of the JOB environment variable (in our case, it would be set to crashingserver) and then sends an email accordingly. If we don't check its existence, sitedownmon seems to trigger false positives when it is first started and sends an email with an empty JOB variable.
We could later extend this script to include more web servers, simply by adding one line to sitedownmon.conf per server:
start on stopped anotherserver PROCESS=respawn
There is a simpler Node-based alternative to Upstart called forever:
npm -g install forever
If we simply initiate our server with forever as follows:
forever server.js
And then access our site, some of the terminal output will contain the following:
warn: Forever detected script exited with code: 1
warn: Forever restarting script for 1 time
But we'll still be able to access our site (although it will have crashed and been restarted).
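The restart-on-exit idea behind that output can be sketched in a few lines of Node. This is an illustration of the general technique only, and is not how forever is implemented: `keepAlive`, `maxRestarts`, and `onGiveUp` are hypothetical names chosen for this example.

```javascript
var spawn = require('child_process').spawn;

// Hypothetical helper: run `command` with `args` and start it again each time
// it exits, giving up after `maxRestarts` restarts. When it gives up it calls
// onGiveUp(err, restarts, lastExitCode).
function keepAlive(command, args, maxRestarts, onGiveUp) {
  var restarts = 0;

  function start() {
    var failed = false;
    var child = spawn(command, args, { stdio: 'inherit' });

    // Emitted when the process could not be spawned at all.
    child.on('error', function (err) {
      failed = true;
      onGiveUp(err, restarts, null);
    });

    child.on('exit', function (code) {
      if (failed) { return; }
      if (restarts >= maxRestarts) {
        return onGiveUp(null, restarts, code);
      }
      restarts += 1;
      console.warn('script exited with code ' + code + ', restart #' + restarts);
      start();
    });
  }

  start();
}

// Example call, supervising the crashing server from this recipe:
// keepAlive(process.execPath, ['server.js'], 100, function (err, restarts) {
//   console.error('giving up after ' + restarts + ' restarts');
// });
```

Note that a supervisor like this is itself just another Node process, so nothing restarts it if it dies or if the machine reboots.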
To deploy our site on a remote server, we log in to our server via SSH, install forever, and run:
forever start server.js
While this technique is certainly less complex, it's also less robust. Upstart runs as the system's init process (PID 1) and is therefore system critical. If Upstart fails, the kernel panics and the whole server restarts.
Nevertheless, forever is used widely in production on Nodejitsu's PaaS stack, and its attractive simplicity may be viable for less mission-critical production environments.