Chapter 2. Searching for Problems in Log Files

Nginx really is a breakthrough technology powering a great part of modern Web. And as with all great technologies, it stands on the shoulders of giants. Nginx would not be possible without Apache. One very important Unix tradition that Nginx embraces fully is thorough logging.

Logs are what you turn to the moment there is a problem with your Nginx instance. For a daemon, there are not really many ways to communicate its state to the administrator in a simple, reliable, and guaranteed to work way other than logs.

You will find the following topics in this chapter:

  • A comprehensive description of how Nginx logging is configured and what mistakes could be made in the configuration
  • A special section on how to log POST request bodies
  • A section on how log rotation works and why there is some potential for problems
  • A series of real-life error records from logs with analysis

Configuring Nginx logging

There are two types of logs that Nginx may write. One could also say that there are infinite types because of the log_format directive that allows you to create your own types of logs.

To refresh your memory about what directives are used to configure Nginx logging, here they are:

  • The error_log directive configures the logging of exceptional events that the developers of Nginx consider worth noting. Usually, this is all kinds of errors.

    The format of the directive is this:

    error_log <destination> <log level>;

    Note

    The first parameter is usually a path to the file with the log. Recent versions of Nginx starting with 1.7.1 also support logging via syslog, to a local or remote syslog server. There is also a rarely used misnamed special value stderr, which, by the way, does not redirect logging to stderr (the third standard stdio stream or &2 in shell terms) because it does not make much sense to log to stderr from a daemon—daemonization involves closing all standard file descriptors. The stderr value means "log into the file that was configured during compilation time" and that depends on the package or even the OS distribution you use. You will mostly want to specify an actual file instead of stderr just to be sure where it ends. By the way, to make things more confusing, there is a way to specify logging to actual stderr during compile time. It is not very useful for daemons; you will not probably ever use it.

    You often want several error logs. Remember that in Chapter 1, Searching for Problems in Nginx Configuration, we discussed multiline configuration directives named contexts. They provide a topic, that is, a narrow scope for the directives inside them. You may (and that is usually a very good idea) have different log files in different contexts. And using stderr prevents that because everything will get written to the same place.

    The log level parameter of the error_log directive is a way of specifying a threshold of severity of events that end up in the log. Most of the time, you will want to set this to warn, but feel free to increase up to debug whenever you have a reproducible problem that you want more information about.

    The debug level requires a special compile-time switch. The reason for this is that debug logging makes some performance compromises and the code for it should not be included in production systems, ideally. Unless Nginx is really your bottleneck (a rare situation), you may safely use --with-debug when compiling Nginx. See a little more about it at http://nginx.org/en/docs/debugging_log.html.

  • The other logging directive is access_log. And it includes much more functionality than error_log and also more potential for mistakes. Let's look at it more closely.

    This is how access logs are configured:

    access_log <destination> <log format> <misc arguments>

    The idea of access log is to have a journal of all request-response pairs processed by Nginx. As opposed to the error log, the records in access logs have a thoroughly specified format, usually a chain of whitespace-delimited values that contain some information about the current request-response pair or general state of Nginx. All access log records have this format. Access logs and error logs work together. In case Nginx has something unusual to say about a request it processes, you will find a strictly formatted line of data in your access log and then some warnings or errors of mostly free text nature in your error log.

    The destination parameter takes the same values as the respective parameter of the error_log directive. You may still log to syslog or a file.

    Note

    Modern Nginx also has an interesting performance feature of buffered access logging. You will find more information about turning buffered logging on with flush or gzip arguments at http://nginx.org/en/docs/http/ngx_http_log_module.html#access_log. Do understand what buffering means before turning it on. One of the expected features of all error reporting mechanisms is being real time and buffered logs are exactly the opposite, that is, log records are not written to disk and not made available for inspection immediately. They are held in the buffer for some time. You will need this only in high-load scenarios where writing logs starts to take noticeable time because of the disk waits.

    The log format parameter of the access_log directive is the heart of access logging. It expects a name of a template that models each record in the log. You create such templates with the log_format directive. There is a predefined format named combined, which is also a good example to show here:

    log_format combined '$remote_addr - $remote_user [$time_local] '
                        '"$request" $status $body_bytes_sent '
                        '"$http_referer" "$http_user_agent"';

As you can see, the log_format directive's second argument is a long line of variables with "talking" names. All characters between and around variables will be included in the log. Variables will be evaluated at the time of logging, and their values will take their places.

Let's look at a real example of a log record generated with this very template:

85.90.193.224 - - [01/Feb/2016:12:01:34 +0400] "GET / HTTP/1.0" 200 137426 "http://example.com/" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/20100101 Firefox/33.0"

You are probably very familiar with the combined log format from previous experience with Nginx, Apache, or some other web server software. Still, going through the individual items of each combined log line with us may provide you with some nonobvious insights. Let's parse the example record and learn some facts about those variables along the way:

$remote_addr 85.90.193.224

This is the IP address of the computer that made the request to our server. Never ever parse it as four decimal integers delimited by dots. Even the [0–9.]+ regexp is not good enough. Can you guess the reason? Here it is:

2001:470:1f10:1::2 - - [28/Jan/2015:02:28:19 +0300] "HEAD / HTTP/1.1" 200...

We are living in the age of IPv6 in production. Big websites see 1–7% of their traffic on IPv6 (data from the end of 2015). Nginx is fully ready, so make sure your log parsers are too.

"- -"

The first dash is legacy. When this particular log format was born long ago and long before Nginx, there was this interesting protocol named ident, which allowed a host to make a connection back to the client computer and ask for the name of the user that initiated a particular TCP connection. See RFC 1413 (https://tools.ietf.org/html/rfc1413) if you are curious, but we should say that ident is long dead and not used anywhere but IRC networks. Nginx didn't even bother with implementing it; this field should be hardcoded to - always.

The next dash is for "remote user" as identified by the HTTP auth mechanism. Which is a bit more popular than ident but not by a big margin. There is one case where HTTP auth is used relatively often, that is, closing test versions of websites from prying eyes (read: GoogleBot and other less discriminating crawlers). See the online documentation for how to configure HTTP auth at http://nginx.org/en/docs/http/ngx_http_auth_basic_module.html.

"[01/Feb/2016:12:01:34 +0400]" $time_local

This is the date/timestamp of the log record. Not the most convenient date/time format to parse, for sure. Be careful dealing with time zones. It still allows prefix matching, and you probably often do something along the lines of:

% fgrep "01/Feb/2016:12:01:" /var/log/nginx/access.log to filter all the page hits processed during a particular minute.

This is a more complex version that should be in your toolkit too:

% cat /var/log/nginx/access.log | awk '{print $4}' | awk -F : '{print $2 ":" $3}' | uniq -c

It will print the number of hits you had during each minute of the day. With this command, you can identify spikes that may signal a problem.

Interestingly, the original version of this was easier:

% cat /var/log/nginx/access.log | awk -F : '{print $2 ":" $3}' | uniq -c but then again, IPv6 came into our lives.

"GET / HTTP/1.0" "$request"

This is a string representation of the whole HTTP request. What to look for? You will be surprised by how often, along with GET and POST, you will see HEAD requests. It is a rarely discussed younger brother of GET, which is not supposed to return an actual body—only the headers of the response.

You will not see HTTP/1.0 as a protocol very often. Modern browsers will issue HTTP/1.1 requests. All other values here should raise a flag. You will see things such as SIP/2.0 or RTSP/1.0 there; these are legitimate protocols indeed but requests for those on a website and not a SIP or RTSP endpoint are signs of scanning from malicious actors (or researchers).

200  $status

This is the HTTP status code. Anything besides 2xx or 3xx here indicates an error. For a comprehensive, modern, and authoritative list of HTTP status codes, please look no further than RFC 7231 (https://tools.ietf.org/html/rfc7231)—a rather new and long-awaited update on the HTTP/1.1 specification released in June 2014.

137426 $body_bytes_sent

This one does not need an explanation. We should add that it already accounts for any compression. It may also be used as a quick indicator of problems on the backend. After some time, you will learn to spot unusually small response sizes, which mean that the backend tumbled and generated a short error page instead of a normal response. Proper backends will also send a non-2xx status code but not all (and not even many) backends behave.

This small Perl script searches for response sizes that are less than a tenth of the average for that URL and also less than some hard-coded chunk size threshold that is commonly used to download a part of a file: http://kapranoff.ru/~kappa/nginx-troubleshooting/blips.pl.

We will not go over it line by line; it is just an example anyway. The idea is to make two passes of the log. First, to calculate the average bytes sent for each URI served, and second, to actually find outsiders.

"http://example.com/" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/20100101 Firefox/33.0" '"$http_referer" "$http_user_agent"'

These are taken directly from HTTP request headers; they are sent by the HTTP client, usually a browser. This makes them interesting, but also non-reliable. They are, basically, strings sent to your server over the network. You cannot trust anything sent by the client. You will routinely see some fantastic user agent strings claiming to be from the future or from the past. You will also see referrer URLs that point to some totally bogus websites that do not contain any links to your site and instead try to infect you with all kinds of malware du jour.

On the bright side, we do remember the excitement of seeing the first iPhones in our access logs during the late summer of 2007. That was fun.

There is a lot of information that you can add to your access logs using different variables that Nginx provides during processing of each request.

The whole list of them is at http://nginx.org/en/docs/varindex.html.

There are also several variables that are available only during log record generation and are listed in the description of the log_format directive at http://nginx.org/en/docs/http/ngx_http_log_module.html#log_format.

The recommendation is to keep saving logs in the combined format to be able to use a huge number of tools that community has created over the years. In addition to these, you may create some extended logs with more data to help you debug problems.

Here is a list of variables that are often useful but not included in the default combined format:

$gzip_ratio

The ratio of compressed response size to the original or "-" if the response was not compressed. This does not seem important, but it makes $body_bytes_sent more useful. Having this variable helps you to spot clients that do not support gzip compression. For them, $body_bytes_sent will be higher than usual.

$msec

The exact timestamp up to milliseconds. This is the same information that is available in human-readable form with $time_local, but milliseconds are important once you have a lot of hits each second.

$request_length

The size of the HTTP request. GET requests are generally short, but once they get beyond a kilobyte, you should think about having too many cookies accompanying each request. POST requests may be of any size and if your application has to accept important data from users, such as files or filled forms, you will want to monitor the size of those requests. A technique to log the contents of POST requests is described later in this chapter.

$request_time

The time between the beginning of the request and the end of the response phases. Basically, this is your atom of performance data that includes both the network and processing delays.

$sent_http_content_type

This is the content type of the response in the familiar form of something like text/html or application/pdf. It is not essential but helps when looking at logs of modern web applications and spotting that some JSON handler suddenly emitted a simple text/html response. It is also useful to calculate the total traffic divided by types of data. There is a whole family of $sent_http_* variables that correspond to the generated HTTP response headers. You may want to research what else is there.

MIME types that we mention here are also discussed in Chapter 1, Searching for Problems in Nginx Configuration.

$cookie_*

The asterisk should be replaced by the name of one of your cookies. Most modern websites have some mechanism of stateful user sessions. Usually, there is a cookie named session or session_id that allows the restoration of a chain of requests that were made by one user inside one session. The remote IP address is used for that when analyzing standard combined format logs, but this may and will fail on users with the same IP or the same user hopping between IP addresses (both are absolutely normal situations).

$host

This one contains the hostname that processed the request. It may seem redundant because, generally, different hosts will log in to different files. However, you would be surprised to know how often logs from several hosts are processed together whether just on the same log storage cluster or even using the same log analyzer software. Having the hostname right there in the logs creates some additional freedom of not caring about filenames of the logs, and once you get tired of running greps against files and load everything into a database, you will remember the time you decided to include $host and thank yourself.

Logging POST requests

Once you start debugging a problem with a web application that runs behind one of your Nginx instances by tracing user requests and application responses via access logs, you will see that GET/HEAD requests are logged fully while POST request log records lack any information except the URI to which the data was posted. This is one of the questions that many system administrators ask, especially after trying to get away with tcpdumps only. tcpdump is a wonderful Swiss army knife of protocol tracing, but it requires active participation during the events that need to be traced. And tracing HTTPS with tcpdump is very hard.

Nginx is able to log POST request bodies and many more. You should already be fully equipped to at least try to implement such a logging yourself.

Remember that we talked about custom log formats and using variables to record the state of requests and responses. If you search through the list of variables available during request processing, you will see the variable named $request_body. See http://nginx.org/en/docs/http/ngx_http_core_module.html#var_request_body.

Let's invent a simple log format including $request_body:

log_format request_body_log 'body: "$request_body"';

Now we enable logging with this format by adding this directive:

access_log /var/log/nginx/requests.log request_body_log;

Remember that the log_format directive should be used in one of the higher contexts, for example, the http context. Multiple access_log directives may be in effect for all the requests, and because of this, we do not need to specify the rest of the variables in the template for the request_body_log format. Your usual preconfigured combined-formatted logs will still get written to.

What would we see in requests.log for some simple GET requests to download a couple of static files?

body: "-"
body: "-"
body: "-"

Make sure that you understand the result before proceeding.

Now, we need POST requests. And POST requests to static files are useless. They never happen in real life. Clients POST data to web applications, and for Nginx administrators, a web application is an upstream to which Nginx proxies the requests and from which it proxies back the responses.

Suppose that we build something like this. It will be a very simple Dancer application in Perl accepting a simple POST and responding with a piece of dynamic HTML.

The source code is at http://kapranoff.ru/~kappa/nginx-troubleshooting/simple-post.pl:

Logging POST requests

Now we will set up a proxy inside our Nginx instance:

location /simple-post {
    proxy_pass http://localhost:3000/;
}

We will point our browser to http://localhost/simple-post.

If the Dancer app is running, you will see a simple form of one field and a button. Type in something, click on the button and rush to your requests.log:

body: "-"
body: "a=Nginx+rules%21"

The first line is the empty body of the GET request for the form, whereas the second contains the body of the POST that the form generated with the help of your browser. There are two ways an HTML form may be encoded into a POST body; this one is the default application/x-www-form-urlencoded. The other one is multipart/form-data; it is widely used for forms that allow file uploads. This is a little bit out of scope of this book already. We should add that form encodings are quickly becoming a thing of the past because more and more POST bodies are constructed by the client-side JavaScript and the browsers themselves.

What is important here is that you now have a simple way to log what is coming your way via POST requests.

Conditional logging

This example will also allow demonstration of one of the more recent Nginx logging features named conditional logging.

The directive access_log has a number of optional parameters and among them is a parameter if that specifies a condition on which a record is appended to this particular access log. When we configured request body logging in the previous section, we still ended up with a log full of "-"; those are empty bodies of all the non-POST requests. Let's fix that. First, we add a condition to our access_log directive:

access_log /var/log/nginx/requests.log request_body_log if=$method_is_post;

The condition that we use is a simple custom variable. We intentionally show this technique using syntax very similar to what is documented in the official documentation at http://nginx.org/en/docs/http/ngx_http_log_module.html#access_log.

So the next step for us is to create this variable. There are several ways to create a variable in Nginx. The most straightforward is using the set directive inside an if context. But it is a good habit to cringe any time you see an if directive in Nginx configuration. if should always be the last choice. Remember that there is no programming inside configuration files; everything should be as declarative as possible.

And there is a good declarative way to create a variable:

map $request_method $method_is_post {
    POST 1;
    default 0;
}

This is everything you need to do to enable conditional logging. If your Nginx version is modern enough, you will get only bodies of POST requests in your requests.log from now on.

There is a probability that your Nginx is not modern enough (at least 1.7.0 is required). Use nginx -t to test the configuration. Can you think of a way to work around the problem without upgrading Nginx? This is not a hypothetical question. Running Nginx installed from packages provided by your distribution is highly recommended, and they are notoriously not up to date.

Logging big request bodies

There is one more thing to tell you about logging request bodies. Actually, two things that will manifest in exactly the same way while having different reasons.

The variable $request_body is not guaranteed to have any content even in the case of a good POST request with data inside. The first possible reason for an empty $request_body is a situation where Nginx has decided that parsing the body is not needed and optimized it away. That is a documented behavior that still strikes in the least expected moments. The documentation says clearly:

"The variable's value is made available in locations processed by the proxy_pass, fastcgi_pass, uwsgi_pass, and scgi_pass directives."

See for yourself: http://nginx.org/en/docs/http/ngx_http_core_module.html#var_request_body.

These are the only four cases in which Nginx populates the $request_body variable. Fortunately, POST requests to locations that do not contain any of those directives are very rare. POSTs are intended to accept data from clients and feed to server applications for which Nginx is acting as a proxy.

Be careful not to harm yourself debugging empty request bodies for some uncommon configuration with POST requests and no proxying directives in that context.

The other reason for empty $request_body is the request being too large. If the size of the request body exceeds the value set up by the client_body_buffer_size directive, it is not available via $request_body variable. Instead, the whole body is saved to a temporary file on the file system, and its name is written into the new $request_body_file variable.

There is also another very interesting directive named client_body_in_file_only that provides a way to always save requests to files. It may be used instead of the mechanism that we showed earlier altogether! You will add $request_body_file to one of your log formats and turn on client_body_in_file_only. After this, Nginx will create an ever-growing store of files containing all your request bodies. Do not forget to clean them up from a crontab or they will fill the filesystem.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.144.255.87