Chapter 3. Installing and Configuring HTTP Modules

In this chapter, we will explore the installation and configuration of standard HTTP modules. Standard HTTP modules are built into Nginx by default unless you disable them while running the configure script. The optional HTTP modules are only installed if you specify them explicitly while running configure. These modules deal with functionalities such as SSL, HTTP authentication, HTTP proxy, gzip compression, and many others. We will look at some optional HTTP modules in the next chapter.

All the configuration directives we have talked about so far, and the ones that we will be discussing in this and the remaining chapters, are specified in the nginx.conf file. The default location of this file is /usr/local/nginx/conf/nginx.conf.

Standard HTTP modules

As mentioned earlier, standard HTTP modules are built into Nginx by default unless you explicitly disable them. As the name suggests, these modules provide standard HTTP functionality to the web server. We will now have a look at some of the important standard HTTP modules.

The core module (HttpCoreModule)

The core module deals with the core HTTP features. This includes the protocol version, HTTP keepalive, location (different configurations based on URI), document roots, and so on. There are over 70 configuration directives and over 30 variables related to the HTTP core module. We will briefly discuss the most important ones.

Explaining directives

The following is an explanation of some of the key core module directives. This list is not exhaustive, and you can find the full list at http://wiki.nginx.org/HttpCoreModule.

server

The server directive defines the server context. It is written as a server {...} block in the configuration file, and each server block describes a virtual server. Inside a server block, you specify a listen directive to define the host IP and port for this virtual server, and a server_name directive to define the hostnames it responds to.

server {
  server_name www.acme.com *.acme.com www.acme.org;
  ....
  ....
}
server {
  listen myserver.com:8001;
  ....
  ....
}

server_name

The server_name directive defines the names of a virtual server. It can contain a list of hostnames, of which the first becomes the primary name of the server. The hostnames can be exact string literals, wildcards, regular expressions, or a combination of these. You can also define an empty hostname as "". This allows the processing of requests when the Host HTTP header is empty.

A wildcard name can use an asterisk (*) only on a dot border, at the beginning or end of the name. For example, *.example.com is a valid name, whereas ac*e.example.com is not.
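To illustrate, the following sketch (hostnames are placeholders) combines an exact name, wildcard names, and an empty name in a single server block:

```nginx
server {
  listen 80;
  # exact name, leading and trailing wildcards, plus an empty
  # name to catch requests that arrive without a Host header
  server_name example.com *.example.com www.example.* "";
}
```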

The regular expression server name can be any PCRE-compatible regular expression that must start with ~.

server_name ~^www\d+\.acme\.org$;

If you specify the variable $hostname in this directive, the hostname of the machine is used.

listen

The listen directive specifies the listen address of the server. The listen address can be a combination of an IP address and a port, hostname and port, or just a port.

server {
  listen 8001;
  server_name www.acme.com *.acme.com www.acme.org;
  ...
}

If no port is specified in the listen directive, port 80 is used by default if the Nginx server is running as a superuser; otherwise, port 8000 is used.

Nginx can also listen on a UNIX socket using the following syntax:

listen unix:/var/lock/nginx;

IPv6 addresses can be specified using the [] brackets:

listen [::]:80;
listen [2001:db8::1];

Specifying an IPv6 address can enable the corresponding IPv4 address as well. In the first of the preceding examples, binding port 80 on [::]:80 also enables IPv4 port 80 by default on Linux, because the listening socket accepts both protocols.
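If you prefer to bind the two protocols separately, you can disable the dual binding with the ipv6only parameter and add an explicit IPv4 listen directive. A sketch, with a placeholder hostname:

```nginx
server {
  # restrict this socket to IPv6 only
  listen [::]:80 ipv6only=on;
  # bind IPv4 explicitly on its own socket
  listen 80;
  server_name example.com;
}
```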

The listen directive accepts several parameters as well; a couple of important ones are stated in the following paragraphs.

ssl

The ssl parameter specifies that connections accepted on this listen address will work in the SSL mode.

default_server

The default_server parameter sets the listen address as the default location. If none of the listen addresses has a default specification, the first listen declaration becomes the default. For an HTTP request, Nginx tests the request's Host header field to determine which server the request should be routed to. If its value does not match any server name, or the request does not contain this header field at all, Nginx routes the request to the default server.

listen  8001;
listen  443 default_server ssl;

The ssl option specifies that all connections on this address should work with SSL. The ssl option will only work if the server was compiled using SSL support.

There are other parameters of the listen directive that correspond to the listen and bind system calls. For example, you can modify the send and receive buffers of the listening socket by providing the rcvbuf and sndbuf parameters. You can read about them in more detail in the official documentation at http://nginx.org/en/docs/http/ngx_http_core_module.html.

location

The location directive is a server context configuration. There can be several location blocks inside a server block, each referring to a unique URI pattern within that server. It is one of the most important and widely used directives, as it allows you to specify a configuration based on the request URI, and a location matching the request URI becomes the handler of that request. The pattern can be a string literal or a regular expression. Regular expressions perform either a case-sensitive comparison (prefixed with ~) or a case-insensitive comparison (prefixed with ~*). You can also prefix a string with ^~ to skip the regular expression check once that prefix has matched.

The order of matching is as follows:

  1. First, exact matches specified with = are checked, and the search terminates on a match.
  2. Remaining prefix strings are matched, and the longest matching prefix is remembered. If that prefix is marked with ^~, the search terminates here.
  3. Regular expressions are then checked in the order in which they appear in the nginx.conf file; the first match terminates the search.
  4. If no regular expression matches, the longest matching prefix string from step 2 is used.
    location = / matches only /
    location / matches any URI
    location ~ /index matches a lowercase /index as a substring in any position

It does not matter in which order the prefix locations are defined in the configuration file; they are always evaluated in the order described previously. The order of regular expression locations, however, does matter.

location ^~ /index/main.jpg
location ~ ^/index/.*\.jpg$

In the example, a URI such as /index/main.jpg will select the first rule even though both the patterns match. This is due to the ^~ prefix, which disables regular expression search.

It is also possible to define named locations with @, which are for internal use. For example:

location @internalerror {
  # proxy_pass inside a named location cannot have a URI part
  proxy_pass http://myserver;
}

You can then use @internalerror in another configuration, for example:

location / {
  error_page 500 @internalerror;
}

server_names_hash_bucket_size

Nginx stores static data, such as the list of server names, in hash tables for quick access. Names that hash to the same value go into the same hash bucket, and the server_names_hash_bucket_size directive controls the size of a bucket in the server names hash table.

This parameter (and other hash_bucket_size parameters) should be a multiple of the processor's cache line size. This allows for an optimized search within a hash bucket ensuring that any entry can be found in a maximum of two memory reads. On Linux, you can find the cache line size as follows:

$ getconf LEVEL1_DCACHE_LINESIZE

server_names_hash_max_size

The server_names_hash_max_size directive specifies the maximum size of the hash table, which contains the server names. The size of the hash table calculated using the server_names_hash_bucket_size parameter cannot exceed this value. The default value is 512.

http {
  ...
  ...
  server_names_hash_bucket_size 128;
  server_names_hash_max_size 1024;

  server {
  ...
  ...

  }
}

tcp_nodelay/tcp_nopush

The tcp_nodelay and tcp_nopush directives control the TCP_NODELAY and TCP_NOPUSH (TCP_CORK on Linux) socket options. tcp_nodelay is useful for servers that send frequent small bursts of packets without caring about the response; it essentially disables the Nagle algorithm on the TCP/IP socket. tcp_nopush only has an effect when sendfile is used.

sendfile

The sendfile directive activates or deactivates the use of the Linux kernel's sendfile() call. This offers a significant performance benefit to applications, such as web servers, that need to transfer files efficiently. A web server spends much of its time copying files stored on disk to a network connection. Typically, this involves read() and write() calls, which require context switching and copying data to and from user and kernel buffers. The sendfile() system call lets Nginx copy data from the disk directly to the socket on a fast path that stays entirely within kernel space. As of Linux 2.6.22, if you want to use AIO with direct I/O (O_DIRECT), you should turn sendfile off; this can be more efficient when the server serves large files (> 4 MB). On FreeBSD before 5.2.1, and in Nginx before 0.8.12, you must disable sendfile to use AIO as well.

sendfile_max_chunk

When set to a nonzero value, the sendfile_max_chunk directive limits the amount of data that can be transferred in a single sendfile() call.
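These transfer-related directives are commonly set together in the http block. A sketch; the values are illustrative, not tuned recommendations:

```nginx
http {
  sendfile           on;
  # cap each sendfile() call so one large download cannot
  # monopolize a worker process
  sendfile_max_chunk 512k;
  # send response headers and the start of the file in as
  # few packets as possible (TCP_CORK on Linux)
  tcp_nopush         on;
  tcp_nodelay        on;
}
```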

root

The root directive specifies the document root for requests; the request URI is appended to this path to form the name of the served file. For example, with the following configuration:

location  /images/ {
  root  /var/www;
}

A request for /images/logo.gif will return the file /var/www/images/logo.gif.

resolver/resolver_timeout

These directives allow you to specify the address or name of the DNS server used for name resolution, as well as the timeout for name resolution, for example:

resolver 192.168.220.1;
resolver_timeout 2s;

aio

The aio directive allows Nginx to use the asynchronous file I/O (AIO) support available on Linux and FreeBSD. This mechanism allows multiple nonblocking reads and writes.

location /audio {
  aio on;
  directio 512;
  output_buffers 1 128k;
}

On Linux, this will disable sendfile support. On FreeBSD before 5.2.1, and in Nginx before 0.8.12, you must disable sendfile explicitly:

location /audio {
  aio on;
  sendfile off;
}

As of FreeBSD 5.2.1 and Nginx 0.8.12, you can use it with sendfile.

alias

The alias directive is similar to the root directive, with a subtle difference: when you define an alias for a location, the alias path replaces the matched location part of the URI, whereas the root path has the full URI appended to it. For example:

location  /img/ {
  alias  /var/www/images/;
}

A request for /img/logo.gif will instruct Nginx to serve the file /var/www/images/logo.gif.

Aliases can also be used in a location specified by a regular expression.
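In a regular expression location, the alias should reference the captured groups from the pattern. A sketch, with placeholder paths:

```nginx
location ~ ^/img/(.+\.(?:png|jpg))$ {
  # $1 holds the filename captured by the regex above
  alias /var/www/images/$1;
}
```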

error_page

The error_page directive allows you to show error pages based on error code. For example:

error_page   404          /404.html;
error_page   502 503 504  /50x.html;

It is possible to return a different response code instead of the original error. It is also possible to point the error page at a script, such as a PHP file, which in turn generates the content of the error page. This allows you to write one generic error handler that creates a customized page depending on the error code and type:

error_page 404 =200 /empty.gif;
error_page 500 = /errors.php;

If there is no need to change the URL in the browser during redirection, it is possible to redirect the processing of error pages to a named location:

location / {
  error_page 404 @errorhandler;
}
location @errorhandler {
  # proxy_pass inside a named location cannot have a URI part
  proxy_pass http://backend;
}

keepalive_disable, keepalive_timeout, and keepalive_requests

The keepalive_disable directive allows you to disable the HTTP keepalive for certain browsers.

keepalive_timeout assigns the timeout for keepalive connections with the client; the server closes connections after this time. You can specify a zero value to disable keepalive for client connections. An optional second parameter sets the time advertised to the client in the Keep-Alive: timeout=time response header.

The keepalive_requests directive determines how many client requests can be served through a single keepalive connection. Once this limit is reached, the connection is closed and a new keepalive session has to be initiated.
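The three directives are typically configured together; the values below are illustrative only:

```nginx
http {
  # old MSIE versions mishandle keepalive with some requests
  keepalive_disable  msie6;
  # close idle client connections after 65 s, advertising a
  # 60 s timeout to the client in the Keep-Alive header
  keepalive_timeout  65s 60s;
  keepalive_requests 100;
}
```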

Controlling access (HttpAccessModule)

The HttpAccessModule allows IP-based access control. You can specify both IPv4 and IPv6 addresses. Another alternative is using the GeoIP module.

Rules are checked according to the order of their declaration. There are two directives called allow and deny which control the access. The first rule that matches a particular address or a set of addresses is the one that is obeyed.

location / {
  deny    192.168.1.1;
  allow   192.168.1.0/24;
  allow   10.1.0.0/16;
  allow   2620:100:e000::8001;
  deny    all;
}

In this example, access is granted to the networks 10.1.0.0/16 and 192.168.1.0/24, with the exception of the address 192.168.1.1, and to one specific IPv6 address. All other addresses are denied by the deny all rule that is matched last in this location block.

The order is of utmost importance. The rules are interpreted according to the order. So, if you move deny all to the top of the list, all requests will be denied because that's the first rule that is encountered, and therefore, it takes precedence.

Authenticating users (HttpBasicAuthModule)

You can use the HttpBasicAuthModule to protect your site, or parts of it, with a username and password based on HTTP Basic authentication. It is the simplest technique for enforcing access control on web resources because it doesn't require cookies, session identifiers, or login pages. Rather, HTTP Basic authentication uses static, standard HTTP headers, which means that no handshake has to be done in anticipation.

The following is an example configuration:

location  /  {
  auth_basic            "Registered Users Only";
  auth_basic_user_file  htpasswd;
}

Explaining directives

Now let us look at some of the important directives of this module.

auth_basic

The auth_basic directive enables validation of a username and password using HTTP Basic authentication. The assigned value is used as the authentication realm.

auth_basic_user_file

The auth_basic_user_file directive sets the password filename for the authentication realm. The path is relative to the directory of the Nginx configuration file.

The format of the file is as follows:

user:pass
user2:pass2:comment
user3:pass3

Passwords must be encrypted with the crypt(3) function. You can use the PLAIN, MD5, SSHA, and SHA1 password schemes. If you have Apache installed on your system, you can use its htpasswd utility to generate the password file.

This file should be readable by the Nginx worker processes, which run as an unprivileged user.

Load balancing (HttpUpstreamModule)

The HttpUpstreamModule allows simple load balancing based on a variety of techniques such as Round-robin, weight, IP address, and so on to a collection of upstream servers.

Example:

upstream servers  {
  server server1.example.com weight=5;
  server server2.example.com:8080;
  server unix:/tmp/server3;
}
server {
  location / {
    proxy_pass  http://servers;
  }
}

Explaining directives

Some of the important directives of the HttpUpstreamModule are as follows:

ip_hash

The ip_hash directive causes requests to be distributed between servers based on the IP address of the client.

The key for the hash is the IP address (IPv4 or IPv6) of the client. This method guarantees that the client request will always be transferred to the same server. If the server is not available, the request is transferred to another server.

You can combine ip_hash and weight based methods. If one of the servers needs to be taken offline, you must mark that server as down.

For example:

upstream backend {
  ip_hash;
  server   server1.example.com weight=2;
  server   server2.example.com;
  server   server3.example.com  down;
  server   server4.example.com;
}

server

The server directive is used to specify the name of an upstream server. It is possible to use a domain name, an address with an optional port, or a UNIX socket path. If the domain name resolves to several addresses, all of them are used.

This directive accepts several parameters, which are given as follows:

  • weight: This sets the weight of the server. If it is not set, weight is equal to one.
  • max_fails: This is the number of unsuccessful attempts at communicating with the server within the time period fail_timeout after which it is considered down. If it is not set, only one attempt is made. A value of 0 turns off this check. What is considered a failure is defined by proxy_next_upstream or fastcgi_next_upstream (except http_404 errors, which do not count toward max_fails).
  • fail_timeout: This is the time period within which max_fails failed attempts must occur for the server to be considered down. It is also the time for which the server is then considered inoperative (before another attempt is made). The default value is 10 seconds.
  • down: This parameter marks the server as offline.

If you use only one upstream server, Nginx will ignore the max_fails and fail_timeout parameters. This may cause your request to be lost if the upstream server is not available. You can use the same server name several times to simulate retries.
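For example, the following sketch (the hostname is a placeholder) lists the same backend twice so that a failed attempt is retried once instead of failing the request outright:

```nginx
upstream backend {
  # listing the only server twice simulates one retry, since
  # max_fails/fail_timeout are ignored for a single server
  server app.example.com:8080;
  server app.example.com:8080;
}
```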

upstream

The upstream directive describes a set of upstream or backend servers to which the requests are sent. These are the servers that can be used in the directives proxy_pass and fastcgi_pass as a single entity. Each of the defined servers can be on different ports. You can also specify servers listening on local sockets.

Servers can be assigned different weights. If it is not specified, the weight is equal to one.

upstream servers {
  server server1.example.com weight=5;
  server 127.0.0.1:8080       max_fails=3  fail_timeout=30s;
  server unix:/tmp/localserver;
}

Requests are distributed among the servers in a round-robin manner, with respect to the server weights.

For example, out of every seven requests in the previous configuration, five will be sent to server1.example.com, and one each to the second and third servers. If there is an error connecting to a server, the request is sent to the next one. In the previous example, the second server is allowed three failed attempts within 30 s before it is considered down and requests are forwarded to the third server.

Acting as a proxy (HttpProxyModule)

The HttpProxyModule allows Nginx to act as a proxy and pass requests to another server.

location / {
  proxy_pass        http://app.localhost:8000;
}

Note that when using the HttpProxyModule (and even when using FastCGI), the entire client request is buffered in Nginx before being passed on to the backend server.

Explaining directives

Some of the important directives of the HttpProxyModule are as follows:

proxy_pass

The proxy_pass directive sets the address of the proxy server and the URI to which the location will be mapped. The address may be given as a hostname or an address and port, for example:

proxy_pass http://localhost:8000/uri/;

Or, the address may be given as an UNIX socket path:

proxy_pass http://unix:/path/to/backend.socket:/uri/;

The path is given after the word unix, between two colons.

You can use the proxy_set_header directive to forward headers from the client request to the proxied server.

proxy_set_header Host $host;

While passing requests, Nginx replaces the location in the URI with the location specified by the proxy_pass directive.

If the URI is changed inside the proxied location by a rewrite directive, the rewritten URI is used to process the request. For example:

location  /name/ {
  rewrite      /name/([^/]+)  /users?name=$1  break;
  proxy_pass   http://127.0.0.1;
}

A request URI is passed to the proxy server after normalization as follows:

  • Double slashes are replaced by a single slash
  • Any references to current directory like "./" are removed
  • Any references to the previous directory like "../" are removed.

If proxy_pass is specified without a URI (for example, in http://example.com/request, /request is the URI part), the request URI is passed to the server in the same form as it was sent by the client.

 location /some/path/ {
     proxy_pass http://127.0.0.1;
 }
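Conversely, when proxy_pass does include a URI, the part of the request URI that matched the location is replaced by it. A sketch with placeholder paths:

```nginx
location /some/path/ {
  # a request for /some/path/page.html is proxied
  # to http://127.0.0.1/other/page.html
  proxy_pass http://127.0.0.1/other/;
}
```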

If you need the proxy connection to an upstream server group to use SSL, your proxy_pass rule should use https:// and you will also have to set your SSL port explicitly in the upstream definition. For example:

upstream https-backend {
  server 10.220.129.20:443;
}
 
server {
  listen 10.220.129.1:443;
  location / {
    proxy_pass https://https-backend;
  }
}

proxy_pass_header

The proxy_pass_header directive allows passing otherwise disabled header lines from the proxied server to the client.

For example:

location / {
  proxy_pass_header X-Accel-Redirect;
}

proxy_connect_timeout

The proxy_connect_timeout directive sets a connection timeout to the upstream server. You can't set this timeout value to be more than 75 seconds. Please remember that this is not the response timeout, but only a connection timeout.

This is not the time limit for the server to return a response; that is configured through the proxy_read_timeout directive. If your upstream server is up but hanging, this directive will not help, as the connection to the server has already been established.
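The connect and read timeouts are usually tuned together. A sketch; the upstream name is a placeholder and the values are illustrative:

```nginx
location / {
  proxy_pass            http://backend;
  # give up if a TCP connection cannot be made within 5 s
  proxy_connect_timeout 5s;
  # allow up to 30 s between two successive reads of the response
  proxy_read_timeout    30s;
  # allow up to 15 s between two successive writes of the request
  proxy_send_timeout    15s;
}
```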

proxy_next_upstream

The proxy_next_upstream directive determines in which cases the request will be transmitted to the next server:

  • error: An error occurred while connecting to the server, sending a request to it, or reading its response
  • timeout: The timeout occurred during the connection with the server, transferring the request, or while reading the response from the server
  • invalid_header: The server returned an empty or incorrect response
  • http_500: The server responded with code 500
  • http_502: The server responded with code 502
  • http_503: The server responded with code 503
  • http_504: The server responded with code 504
  • http_404: The server responded with code 404
  • off: Disables request forwarding

Transferring the request to the next server is only possible if nothing has been sent to the client yet. If the transmission of a response was interrupted partway through, due to an error or some other reason, the request will not be transferred.
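A typical configuration retries only on conditions that cannot have produced a partial response. A sketch, with a placeholder upstream name:

```nginx
location / {
  proxy_pass          http://backend;
  # try the next upstream server on connection problems
  # and on 502/503 responses
  proxy_next_upstream error timeout http_502 http_503;
}
```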

proxy_redirect

The proxy_redirect directive allows you to manipulate the HTTP redirection by replacing the text in the response from the upstream server. Specifically, it replaces text in the Location and Refresh headers.

The HTTP Location header field is returned in response from a proxied server for the following reasons:

  • To indicate that a resource has moved temporarily or permanently.
  • To provide information about the location of a newly created resource. This could be the result of an HTTP PUT.

Let us suppose that the proxied server returned the following:

Location: http://localhost:8080/images/new_folder

If you have the proxy_redirect directive set to the following:

proxy_redirect http://localhost:8080/images/ http://xyz/;

The Location header will be rewritten to the following:

Location: http://xyz/new_folder

It is possible to use some variables in the redirected address:

proxy_redirect http://localhost:8000/ http://$host:8000/;

You can also use regular expressions in this directive:

proxy_redirect ~^(http://[^:]+):\d+(/.+)$ $1$2;

The value off disables all the proxy_redirect directives at its level.

proxy_redirect off;

proxy_set_header

The proxy_set_header directive allows you to redefine and add new HTTP headers to the request sent to the proxied server.

You can use a combination of static text and variables as the value of the proxy_set_header directive.

By default, the following two headers will be redefined:

proxy_set_header Host $proxy_host;
proxy_set_header Connection close;

You can forward the original Host header value to the server as follows:

proxy_set_header Host $http_host;

However, if this header is absent in the client request, nothing will be transferred.

It is better to use the variable $host; its value is equal to the Host request header, or to the primary name of the server if the header is absent from the client request.

proxy_set_header Host $host;

You can transmit the name of the server together with the port of the proxied server:

proxy_set_header Host $host:$proxy_port;

If you set the value to an empty string, the header is not passed to the upstream proxied server. For example, if you want to disable the gzip compression on upstream, you can do the following:

proxy_set_header  Accept-Encoding  "";

proxy_store

The proxy_store directive sets the path under which upstream files are stored, with paths corresponding to the alias or root directives. The off value disables local file storage. Note that proxy_store is different from proxy_cache: it is just a method for storing proxied files on disk. It may be used to construct cache-like setups (usually involving error_page-based fallback). proxy_store is off by default. The value can contain a mix of static strings and variables.

proxy_store   /data/www$uri;

The modification date of the file will be set to the value of the Last-Modified header in the response. A response is first written to a temporary file in the path specified by proxy_temp_path and then renamed. It is recommended to keep this location path and the path to store files the same to make sure it is a simple renaming instead of creating two copies of the file.

Example:

location /images/ {
  root                 /data/www;
  error_page           404 = @fetch;
}
 
location /fetch {
  internal;
  proxy_pass           http://backend;
  proxy_store          on;
  proxy_store_access   user:rw  group:rw  all:r;
  proxy_temp_path      /data/temp;
  alias                /data/www;
}

In this example, proxy_store_access defines the access rights of the created file.

In the case of a 404 error, the internal /fetch location proxies the request to a remote server and stores the local copy under /data/www, using /data/temp for temporary files.

proxy_cache

The proxy_cache directive either turns off caching when you use the value off or sets the name of the cache. This name can then be used subsequently in other places as well. Let's look at the following example to enable caching on the Nginx server:

http {
  proxy_cache_path  /var/www/cache levels=1:2 keys_zone=my-cache:8m max_size=1000m inactive=600m;
  proxy_temp_path /var/www/cache/tmp;


  server {
    location / {
      proxy_pass http://example.net;
      proxy_cache my-cache;
      proxy_cache_valid  200 302  60m;
      proxy_cache_valid  404      1m;
    }
  }
}

The previous example creates a named cache called my-cache. It sets the validity of cached responses with codes 200 and 302 to 60 minutes, and with code 404 to 1 minute.

The cached data is stored in the /var/www/cache folder. The levels parameter sets the number of subdirectory levels in the cache. You can define up to three levels.

The keys_zone parameter gives the cache a name and sizes the shared memory zone that holds the cache keys (8 MB here). The inactive parameter that follows purges items that have not been accessed within the given interval; for my-cache, this is 600 minutes. The default inactive interval is 10 minutes.

Compressing content (HttpGzipModule)

The HttpGzipModule allows for on-the-fly gzip compression.

  gzip             on;
  gzip_min_length  1000;
  gzip_proxied     expired no-cache no-store private auth;
  gzip_types       text/plain application/xml;

The achieved compression ratio, computed as the ratio between the original and the compressed response size, is available via the variable $gzip_ratio.

Explaining directives

Some of the important directives of the HttpGzipModule are as follows:

gzip

The gzip directive enables or disables gzip compression.

gzip_buffers

The gzip_buffers directive assigns the number and size of the buffers in which the compressed response will be stored. If unset, the size of one buffer is equal to the size of the page; depending on the platform, this is either 4K or 8K.

gzip_comp_level

The gzip_comp_level directive sets the gzip compression level of a response, between 1 and 9, where 1 gives the least compression (fastest) and 9 the most compression (slowest).

gzip_disable

The gzip_disable directive disables gzip compression for browsers or user agents matching the given regular expression. For example, to disable gzip compression for Internet Explorer 6 use:

gzip_disable     "msie6";

This is a useful setting to have since some browsers such as MS Internet Explorer 6 don't handle the compressed response correctly.

gzip_http_version

The gzip_http_version directive sets the minimum HTTP version of a request (1.0 or 1.1) required for the response to be compressed.

gzip_min_length

The gzip_min_length directive sets the minimum length, in bytes, of the response that will be compressed. Responses shorter than this byte length will not be compressed. Length is determined from the Content-Length header.

gzip_proxied

The gzip_proxied directive enables or disables compression for proxied requests. The proxied requests are identified through the Via HTTP header. This header informs the server of proxies through which the request was sent. Depending on various HTTP headers, we can enable or disable the compression for proxied requests as follows:

  • off: This disables compression for requests having a Via header
  • expired: This enables compression if a response header includes the field Expires with a value that disables caching
  • no-cache: This enables compression if the Cache-Control header is set to no-cache
  • no-store: This enables compression if the Cache-Control header is set to no-store
  • private: This enables compression if the Cache-Control header is set to private
  • no_last_modified: This enables compression if Last-Modified isn't set
  • no_etag: This enables compression if there is no ETag header
  • auth: This enables compression if there is an Authorization header
  • any: This enables compression for all proxied requests

gzip_types

The gzip_types directive enables compression for additional MIME types besides the default. Responses with the text/html type are always compressed.

Controlling logging (HttpLogModule)

The HttpLogModule controls how Nginx logs the requests for resources, for example:

access_log  /var/log/nginx/access.log  gzip  buffer=32k;

Please note that this does not include logging errors.

Explaining directives

Some of the important directives of HttpLogModule are the following.

access_log

The access_log directive sets the path, format, and buffer size for the access logfile. Using off as the value disables logging at the current level. If the format is not specified, it defaults to combined. The size of the buffer must not exceed the size of an atomic write to the disk file (this size is not limited on FreeBSD 3.0-6.0). If you specify gzip, the log is compressed before it is written to disk; the default buffer size is 64K, with a compression level of 1.

The atomic size that can be written is called PIPE_BUF. The capacity of a pipe buffer varies across systems.

Mac OS X, for example, uses a capacity of 16,384 bytes by default but can switch to 65,336 byte capacities if large writes are made to the pipe. Or it will switch to a capacity of a single system page if too much kernel memory is already being used by pipe buffers (see xnu/bsd/sys/pipe.h and xnu/bsd/kern/sys_pipe.c; since these are from FreeBSD, the same behavior may happen here too).


According to the Linux pipe(7) man page, pipe capacity is 65,536 bytes since Linux 2.6.11 and a single system page prior to that (for example, 4096 bytes on 32-bit x86 systems). The buffer size of each pipe can be changed using the fcntl() system call, up to the maximum set in /proc/sys/fs/pipe-max-size.

log_format

The log_format directive describes the format of a log entry. You can use general variables in the format as well as variables that exist only at the moment of writing into the log. An example of log_format is as follows:

log_format gzip '$msec $request $remote_addr $status $bytes_sent';

You can specify the format of a log entry by specifying what information should be logged. Some of the options you can specify are as follows:

  • $body_bytes_sent: This is the number of bytes transmitted to the client, not counting the response headers
  • $bytes_sent: This is the number of bytes transmitted to the client
  • $connection: This is the connection serial number
  • $msec: This is the time at the moment of writing the log entry, in seconds with millisecond resolution
  • $pipe: This is p if the request was pipelined
  • $request_length: This is the length of the request, including the request line, headers, and body
  • $request_time: This is the time it took Nginx to work on the request, in seconds, with millisecond precision
  • $status: This is the response status code
  • $time_iso8601: This is the time in ISO 8601 format, for example, 2011-03-21T18:52:25+03:00
  • $time_local: This is the local time in common log format
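Putting log_format and access_log together, the following sketch defines a custom format and uses it; the format name timed and the selection of fields are arbitrary choices, not defaults:

log_format  timed  '$remote_addr [$time_local] "$request" '
                   '$status $body_bytes_sent $request_time';
access_log  /var/log/nginx/access.log  timed;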

Setting response headers (HttpHeadersModule)

The HttpHeadersModule allows setting arbitrary HTTP headers.

Explaining directives

Some of the important directives of the HttpHeadersModule are the following:

add_header

The add_header directive adds a header to the header list of the response when the response code is 200, 201, 204, 206, 301, 302, 303, 304, or 307. The value can contain variables.

Note that you should not use this directive to replace or override the value of a header. The headers specified with this directive are simply appended to the header list.
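As a short sketch, the following appends a custom header to responses for a hypothetical location (the header name and value are only examples):

location /static/ {
  add_header  X-Served-By  static-pool;
}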

expires

The expires directive is used to set the Expires and Cache-Control headers in the response. You can set the value to off to leave these headers as they are. The time in this field is computed as a sum of the current time and the time specified in the directive. If the modified parameter is used, the time is computed as a sum of the file's modification time and the time specified in the directive. Two special values are also available:

  • epoch: This sets the Expires header to the absolute value of 1 January, 1970 00:00:01 GMT.
  • max: This sets the Expires header to 31 December 2037 23:59:59 GMT, and the Cache-Control header to 10 years.

You can specify a time of day using the @ prefix:

@5h40m

The contents of the Cache-Control header depend on the sign of the specified time. A negative time sets it to no-cache. A positive time sets it to max-age=t, where t is the specified time in seconds.

The following is an example configuration:

expires    12h;
expires    modified +14h;
expires    @5h;
expires    0;
expires    -1;
expires    epoch;
add_header X-Name example.org;

Rewriting requests (HttpRewriteModule)

The HttpRewriteModule is used to change request URIs using regular expressions, redirect the client, and select different configurations based on conditions and variable values. In order to use this module, you should compile Nginx with PCRE support.

The processing of the directives starts at the server level. After that, the location block matching the request is searched, and any rewrite directives there are executed. If this processing results in further rewrites, a new location block is searched for the changed URI. This cycle repeats up to 10 times, after which the server returns error 500.

Explaining directives

Some of the important directives of the HttpRewriteModule are the following:

break

The break directive stops the processing of any other rewrite block directives in the current block.

if ($slow) {
  limit_rate  10k;
  break;
}

if

The if directive checks a condition. If the condition evaluates to true, the code indicated in the curly braces is carried out and the request is processed in accordance with the configuration within the following block. The configuration inside the if block is inherited from the previous level.

Following are considered to be valid conditions.

  • The name of a variable is a condition. The condition evaluates to false if the variable contains an empty string "" or the string "0".
  • Using a comparison operator to compare the variable with another variable or a string.
  • Matching a variable against a regular expression using the ~, ~*, !~, or !~* operators. ~* is used for case-insensitive matching, while !~ and !~* are the negated forms.
  • You can check for the existence of a file using the -f or !-f operators (similar to BASH tests).
  • Checking for the existence of a directory using -d or !-d.
  • Checking for the existence of a file, directory, or symbolic link using -e or !-e.
  • Checking whether a file is executable using -x or !-x.
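For instance, a minimal sketch of a file-existence test (the fallback page here is hypothetical):

if (!-f $request_filename) {
  rewrite ^ /missing.html last;
}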

By placing part of a regular expression inside round brackets or parentheses, you can group that part of the regular expression together. This allows you to apply a quantifier to the entire group or to restrict alternation to part of the regular expression. These parts can be accessed in the $1 to $9 variables.

Example:

if ($http_user_agent ~ MSIE) {
  rewrite ^(.*)$  /msie/$1  break;
}
if ($http_cookie ~* "val=([^;]+)(?:;|$)" ) {
  set $val $1;
}
if ($request_method = POST ) {
  return 405;
}
if ($args ~ post=140){
  rewrite ^ http://acme.com/ permanent;
}

return

The return directive stops execution and returns a status code. It is possible to use any HTTP return code ranging in number from 0 to 999.

If you want to terminate the connection and don't want to send any headers in response, use the return code 444.
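A minimal sketch of this, for a hypothetical path whose clients you want to drop without a response:

location /blocked/ {
  return 444;
}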

rewrite

The rewrite directive does the actual rewrite and changes URI according to the regular expression and the replacement string. Directives are carried out in the order of definition in the configuration file. The flag parameter makes it possible to stop the rewriting process in the current block.

If the replacement string begins with http://, the client will be redirected and any further rewrite directives will be terminated.

The value of the flag parameter can be one of the following:

  • last: This completes the processing of current rewrite directives and searches for a new block that matches the rewritten URI
  • break: This stops the rewriting process in the current block
  • redirect: This returns a temporary redirect with the code 302, and is used if a replacement string does not start with http:// or https://
  • permanent: This returns a permanent redirect with code 301

Note that outside location blocks, last and break are effectively the same.

Example:

rewrite  ^(/media/.*)/video/(.*)\..*$  $1/mp3/$2.avi  last;
rewrite  ^(/media/.*)/audio/(.*)\..*$  $1/mp3/$2.ra   last;
return 403;

But if we place these directives in the location block, it is necessary to replace the flag last by break, otherwise Nginx will hit the 10-cycle limit and return error 500:

location /download/ {
  rewrite  ^(/media/.*)/video/(.*)\..*$  $1/mp3/$2.avi  break;
  rewrite  ^(/media/.*)/audio/(.*)\..*$  $1/mp3/$2.ra   break;
  return   403;
}

If there are arguments in the replacement string, the rest of the request arguments are appended to them. To avoid having them appended, place a question mark as the last character:

rewrite  ^/pages/(.*)$  /show?page=$1?  last;

Note that since curly braces ( { and } ) are used both in regexes and for block control, a regex containing curly braces must be enclosed in double (or single) quotes to avoid conflicts. For example, to rewrite URLs such as /users/123456 to /path/to/users/12/1234/123456.html, use the following (note the quotes):

rewrite  "/users/([0-9]{2})([0-9]{2})([0-9]{2})"  /path/to/users/$1/$1$2/$1$2$3.html;

If you specify a ? at the end of a rewrite, Nginx will drop the original query string. A good use case is when using $request_uri, you should specify the ? at the end of the rewrite to avoid Nginx doubling the query string.

An example of using $request_uri in a rewrite from www.acme.com to acme.com:

server {
  server_name www.acme.com;
  rewrite ^ http://acme.com$request_uri? permanent;
}

Also, rewrite operates only on paths, not on parameters. To rewrite a URL with parameters to another URL, use the following instead:

if ($args ~ post=200){
  rewrite ^ http://acme.com/new-address.html?;
}

rewrite_log

The rewrite_log directive enables the logging of information about rewrites to the error log at notice level.

set

The set directive establishes the value for the variable indicated. It is possible to use text, variables, and their combination as the value.

You can use set to define a new variable. Note that you can't set the value of a $http_xxx header variable.
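A short sketch of set, combining text and variables into a new variable (the variable name $cache_key is arbitrary):

set $cache_key "$scheme://$host$request_uri";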

uninitialized_variable_warn

The uninitialized_variable_warn directive enables or disables logging of warnings about uninitialized variables.

Interacting with FastCGI (HttpFastcgiModule)

The HttpFastcgiModule allows Nginx to interact with FastCGI processes (for example, PHP) and controls which parameters are passed to the process.

Example:

location / {
  fastcgi_pass   localhost:9090;
  fastcgi_index  index.php;
  fastcgi_param  SCRIPT_FILENAME  $document_root/php/$fastcgi_script_name;
  fastcgi_param  QUERY_STRING     $query_string;
  fastcgi_param  REQUEST_METHOD   $request_method;
  fastcgi_param  CONTENT_TYPE     $content_type;
  fastcgi_param  CONTENT_LENGTH   $content_length;
}

The name of the FastCGI server is provided in the fastcgi_pass parameter. This name can be an IP address or a domain name with a port. It can also be a UNIX domain socket.

If you want to pass a parameter to the FastCGI server, you use the fastcgi_param parameter. The value of this parameter can be a static value, a variable, or a combination of both.

Following is a minimum configuration for PHP:

fastcgi_param SCRIPT_FILENAME /php$fastcgi_script_name;
fastcgi_param QUERY_STRING    $query_string;

Simple caching (HttpMemcachedModule)

You can use this module to perform simple caching using memcached. Memcached is an in-memory, key-value store for small chunks of arbitrary data (strings, objects) from the results of database calls, API calls, or page rendering.

Example:

server {
  location / {
    set $memcached_key $uri$args;
    memcached_pass     mem-server:11211;
    default_type       text/html;
    error_page         404 502 504 @error;
  }
  location @error {
    proxy_pass http://backend;
  }
}

Explaining directives

Some of the important directives of the HttpMemcachedModule are as follows:

memcached_pass

The memcached_pass directive specifies the memcached server name as an IP address or a domain name, optionally with a port. If the domain name resolves to several addresses, all of them are tried in round-robin fashion.

memcached_connect_timeout

The memcached_connect_timeout directive sets the timeout for establishing a connection with the memcached server. It usually cannot exceed 75 seconds. The default value is 60s.

memcached_read_timeout

The memcached_read_timeout directive is the timeout for reading keys from the memcached server. This time is measured between two successive reads, and if the memcached server does not respond, the timeout occurs. The default value is 60s.

memcached_send_timeout

The memcached_send_timeout directive is the timeout for sending a request to the memcached server. The timeout applies only between two successive write operations, not to the transmission of the whole request. If the memcached server does not receive anything within this time, the connection is closed.

memcached_buffer_size

The memcached_buffer_size directive sets the size, in bytes, of the buffer used for reading the response received from the memcached server. The response is passed to the client synchronously, as soon as it is received. The default value is 4K or 8K, depending on the platform.

memcached_next_upstream

The memcached_next_upstream directive specifies the failure conditions under which a request is forwarded to another memcached server. It applies only when the value in memcached_pass is an upstream block with two or more servers.
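A hedged sketch of such a setup; the upstream name and server addresses are hypothetical:

upstream memcached_pool {
  server 10.0.0.1:11211;
  server 10.0.0.2:11211;
}

server {
  location / {
    set $memcached_key $uri;
    memcached_pass           memcached_pool;
    memcached_next_upstream  error timeout not_found;
  }
}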

Limiting requests (HttpLimitReqModule)

The HttpLimitReqModule allows limiting the request processing rate by key, in particular by the address. The limitation is done using the leaky bucket method. A counter associated with each address transmitting on a connection is incremented whenever the user sends a request and is decremented periodically. If the counter exceeds a threshold upon being incremented, Nginx delays the request.

The following is an example configuration:

http {
    limit_req_zone  $binary_remote_addr  zone=one:10m   rate=1r/s;
 
    ...
 
    server {
 
        ...
 
        location /search/ {
            limit_req   zone=one  burst=5;
        }
    }
}

Explaining directives

Some of the important directives of the HttpLimitReqModule are as follows:

limit_req

The limit_req directive sets a shared memory zone and the maximum burst size of requests. Excess requests are delayed until their number exceeds the maximum burst size, in which case the request is terminated with error 503 (Service Temporarily Unavailable). By default, the maximum burst size is equal to zero. For example:

limit_req_zone  $binary_remote_addr  zone=one:10m  rate=1r/s;

server {
    location /search/ {
        limit_req  zone=one  burst=5;
    }
}

This allows a user no more than one request per second on average, with bursts of no more than five requests.

If delaying excess requests within a burst is not necessary, you should use the option nodelay:

limit_req zone=one burst=5 nodelay;

limit_req_log_level

The limit_req_log_level directive controls the log level of the delayed or rejected requests. The log levels can be info, notice, warn, or error. The default log level is error for rejected requests. Delays are logged at the next lower level, for example when limit_req_log_level is set to "error", delayed requests are logged at "warn".

limit_req_zone

The limit_req_zone directive sets the name and parameters of a shared memory zone that keeps states for various keys. The state stores the current number of excessive requests in particular. The key is any nonempty value of the specified variable (empty values are not accounted). An example usage of this is as follows:

limit_req_zone $binary_remote_addr zone=myzone:20m rate=5r/s;

In this case, there is a 20 MB zone called myzone, and the average speed of queries for this zone is limited to 5 requests per second.

The sessions are tracked per user in this case. A 1 MB zone can hold approximately 16,000 states of 64 bytes. If the storage for a zone is exhausted, the server will return error 503 (Service Temporarily Unavailable) to all further requests.

The speed is set in requests per second or requests per minute. The rate must be an integer; so if you need to specify less than one request per second, say, one request every two seconds, you would specify it as 30r/m.
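For example, the following sketch limits a hypothetical /feed/ location to one request every two seconds (the zone name feedzone is arbitrary):

limit_req_zone  $binary_remote_addr  zone=feedzone:10m  rate=30r/m;

server {
    location /feed/ {
        limit_req  zone=feedzone;
    }
}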

Limiting connections (HttpLimitConnModule)

The HttpLimitConnModule makes it possible to limit the number of concurrent connections for a key such as an IP address.

An example configuration:

http {
    limit_conn_zone $binary_remote_addr zone=addr:10m;

    ...

    server {

        ...

        location /download/ {
            limit_conn addr 1;
        }
    }
}

Explaining directives

Some of the important directives of HttpLimitConnModule are as follows:

limit_conn

The limit_conn directive defines the connection limit per zone. When this limit is exceeded, the server returns error 503 (Service Temporarily Unavailable) in reply to the request.

Multiple limit directives for different zones can be used in the same context. For example:

limit_conn_zone $binary_remote_addr zone=addr:10m;

server {
    location /download/ {
        limit_conn addr 1;
    }
}

This allows only a single connection at a time per unique IP address.

limit_conn_zone

The limit_conn_zone directive sets the parameters for a zone that keeps the state for various keys. This state stores the current number of connections in particular. The key is the value of the specified variable. For example:

limit_conn_zone $binary_remote_addr zone=addr:10m;

Here, an IP address of the client serves as a key. If the storage for a zone is exhausted, the server will return error 503 (Service Temporarily Unavailable) to all further requests.

limit_conn_log_level

The limit_conn_log_level directive sets the error log level, which is used when a connection limit is reached. The default log level is error.

limit_conn_status

The limit_conn_status directive defines the response code when a limit is reached. The default value is 503 (Service Unavailable).
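Putting these directives together, a hedged sketch; the choice of the warn level and the 429 (Too Many Requests) status are assumptions, not defaults:

limit_conn_zone  $binary_remote_addr  zone=addr:10m;

server {
    location /download/ {
        limit_conn            addr 1;
        limit_conn_log_level  warn;
        limit_conn_status     429;
    }
}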
