Chapter 11. Multi Processing and Protocol Modules

The Evolution of Apache’s Architecture

Apache is not a monolithic server. New modules can be added to provide enhanced functionality, and existing modules can be removed to reduce the size of the server and improve performance. Apache 2 takes this modularization concept further and introduces three new ways of extending the server:

  • Multi Processing Modules (MPMs): Allow you to change the way Apache serves requests and improve the performance and scalability of the server.

  • Filtering Modules: Provide a way for modules to process the content provided by other modules.

  • Protocol Modules: The protocol layer has been abstracted, so it is possible for Apache to serve content using other protocols, such as FTP.

Selecting a Multi Processing Module

--with-mpm=worker
--with-mpm=prefork

Although MPM selection depends on many factors, including support for specific third-party modules and functionality, some MPMs perform better on certain platforms. For example, processes on AIX are “heavy,” so a threaded MPM scales better on that platform. It is not possible to change the request processing mechanism in Apache 1.3. For Apache 2, you select an MPM at build time by passing the --with-mpm option to the configure script. Currently, Windows has its own thread-based MPM, and Unix has two stable MPMs: prefork and worker. A number of additional MPMs are distributed with the server but are considered experimental. The following sections explain the features of the different MPMs and how to configure them.

Understanding Process-Based MPMs

In a process-based server, the server forks several children. Forking means that a parent process makes identical copies of itself, called children. Each one of the children can serve a request independently of the others. This approach has the advantage of improved stability; if one of the children misbehaves, for example, by leaking memory, it can be killed without affecting the rest of the server.

The increased stability comes with a performance penalty: Each one of the children occupies additional memory, and the operating system spends a certain amount of time in context switching. In addition, this approach makes inter-process communication and data sharing difficult.

Apache 1.3 is a process-based server, and Apache 2 provides a prefork MPM that allows it to perform as a process-based server. Prefork means that children are forked at startup, before any request arrives, instead of when a request comes in. Apache allows you to configure the number of children to fork at startup and the maximum number of possible children, as described in the next section.
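The forking model described above can be sketched in a few lines of Python. This is only an illustration for Unix-like systems, not Apache's implementation: the function name prefork is hypothetical, and the children exit immediately where a real server would loop accepting requests.

```python
import os


def prefork(num_children):
    """Fork num_children children up front and collect their exit codes."""
    pids = []
    for slot in range(num_children):
        pid = os.fork()
        if pid == 0:
            # Child: a real server would loop here, accepting requests.
            # This sketch exits at once, reporting its slot number.
            os._exit(slot)
        # Parent: remember the child and keep forking.
        pids.append(pid)

    exit_codes = []
    for pid in pids:
        # Wait for each child in turn; one child exiting does not
        # affect the others or the parent.
        _, status = os.waitpid(pid, 0)
        exit_codes.append(os.WEXITSTATUS(status))
    return exit_codes
```

Because each child is a separate process, the parent can reap or replace any one of them without disturbing the rest, which is the stability advantage mentioned earlier.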

Configuring the Prefork MPM

StartServers 5
MinSpareServers 5
MaxSpareServers 10
MaxClients 150
MaxRequestsPerChild 0

You can control the number of processes that will be created at startup by using the StartServers directive. It takes a single argument, indicating the number of servers to fork when the server starts. The default value is 5 and is appropriate for most websites. You should change this setting only if you run a very busy website.

MaxClients enables you to control the maximum number of processes spawned, up to the operating system limits or Apache’s maximum number of possible children. In Apache 1.3, the maximum number of possible children is hard-coded to 256. To change this value, you need to change the HARD_SERVER_LIMIT setting in httpd.h and recompile the server. In Apache 2, the limit can be changed in the configuration file using the ServerLimit directive.

The MinSpareServers directive defines the minimum number of processes that can be idle (not serving any request) at any time. If the number of idle servers goes below the setting of MinSpareServers, Apache will spawn additional children. Conversely, MaxSpareServers sets the maximum number of idle processes allowed. If the number of idle servers grows beyond this setting, some of them will be killed. The default values, shown in the example, should be enough for most servers.

Finally, you can limit the number of requests that a specific process will serve using the MaxRequestsPerChild directive. The server will kill the process and replace it with a new one after the specified number of requests. As explained earlier in the chapter, this is useful to prevent memory leaks from becoming an issue with processes that are running for a long time. Multiple requests that reuse the same connection are counted only once. You can set MaxRequestsPerChild to 0 if you do not want processes to be killed after a specific number of requests.

Understanding Threaded and Hybrid MPMs

Threads are similar to processes, but they can share memory and data with other threads. This has the advantage that switching between threads is cheaper than switching between processes (threads in the same process share one address space), and the disadvantage that poorly written code can take the whole server down with it. This can happen because a misbehaving thread is able to overwrite and corrupt data and code that belong to other threads.

The Apache MPM for the Windows platform is an example of a threaded server MPM. Both threaded and process-based servers have their own sets of advantages and disadvantages. To combine them, the Apache developers created a hybrid MPM named Worker that allows for a mixed approach: a server can spawn several processes, each one of them containing a number of threads.

Configuring the Worker MPM

StartServers 2
MaxClients 150
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
MaxRequestsPerChild 0

The Worker MPM is an Apache 2 MPM that allows you to combine processes and threads. You can specify the number of processes that will be created at startup by using the StartServers directive, as with the Prefork MPM. Each process has several threads; the ThreadsPerChild directive specifies how many. The number of threads in each process is fixed, but processes are created or destroyed to keep the total number of idle threads between specified limits. Those limits can be configured using MinSpareThreads and MaxSpareThreads. These directives are the counterparts of the MinSpareServers and MaxSpareServers directives in process-based servers.

Apache monitors the total number of threads across all processes and creates or destroys processes accordingly. In the Worker MPM, MaxClients limits the total number of threads, and therefore the number of clients that can be served simultaneously, rather than the number of processes as in Prefork. The maximum number of processes is MaxClients divided by ThreadsPerChild; with the values in the example, that is 150 / 25 = 6 processes. ThreadsPerChild can be changed between restarts, up to the value of ThreadLimit. ThreadLimit specifies an upper limit that cannot be changed between restarts. The StartServers and MaxRequestsPerChild directives work as described in the section named “Configuring the Prefork MPM.”
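The way whole processes are added or removed to keep the idle thread count within bounds can be modeled in Python. This is a rough sketch under stated assumptions, not Apache's code: the function name adjust_processes is hypothetical, and it changes the pool by exactly one process per pass, which the real server does not guarantee.

```python
def adjust_processes(processes, busy_threads, threads_per_child=25,
                     min_spare_threads=25, max_spare_threads=75):
    """Return the number of processes after one maintenance pass.

    Threads cannot be added one at a time: each process always holds
    exactly threads_per_child threads. Defaults mirror the example above.
    """
    idle = processes * threads_per_child - busy_threads
    if idle < min_spare_threads:
        # Too few idle threads: start one more process.
        return processes + 1
    if idle > max_spare_threads:
        # Too many idle threads: retire one process.
        return processes - 1
    return processes
```

With the example values, the server starts with 2 processes and 2 x 25 = 50 idle threads, which already lies between MinSpareThreads and MaxSpareThreads, so no adjustment is needed until traffic arrives.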

Using Alternate MPMs

--with-mpm=event
--with-mpm=perchild

Apache 2 includes a number of additional MPMs that are classified as experimental. Two of the most interesting ones are the Event MPM and the Perchild MPM. The Event MPM, present only in Apache 2.1, is a variant of the Worker MPM. In this MPM, a separate thread handles all listening sockets and keep-alive connections. This significantly increases scalability, as it allows the remaining threads to process requests instead of waiting for a client to close a connection or issue a new request. This MPM solves some of the issues described in Chapter 10, “Apache Proxy and Caching Support.” The Perchild MPM allows Apache to run different virtual hosts under separate user IDs. This can help improve security and provides an alternative to running separate server instances.

In addition to those, the Metux MPM provides an alternative to the Perchild MPM. It can be downloaded from http://www.metux.de/mpm.

Understanding Apache 2 Filters

<Directory /usr/local/apache/htdocs/>
SetOutputFilter INCLUDES;PHP
</Directory>
AddOutputFilter INCLUDES .inc .shtml

You can think about the filtering architecture in Apache as a factory assembly line. Filters are workers in the factory, and requests and responses are the items traveling in the line. Each filter processes the content and passes the result to the next filter. Filters can process the information in a variety of ways, and a number of Apache modules are implemented as filters, such as SSL, Server Side Includes, and compression. Filters can be automatically added by modules at runtime or set up in the configuration file. This example shows how to use SetOutputFilter to add two filters to all documents under a particular directory and AddOutputFilter to associate filters with particular file extensions. Additionally, AddOutputFilterByType can be used to associate filters with specific file types.
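The assembly line idea can be illustrated with a short Python sketch. It is a conceptual model only: real Apache filters operate on streams of data rather than whole strings, and the two filter functions and the example.com value are hypothetical stand-ins, not the behavior of any actual module.

```python
def includes_filter(content):
    """Stand-in for an INCLUDES-style filter: expand one SSI directive."""
    return content.replace('<!--#echo var="SITE" -->', "example.com")


def upper_filter(content):
    """A second, made-up filter that upper-cases whatever it receives."""
    return content.upper()


def run_chain(filters, content):
    """Pass content through each filter in order, like an assembly line."""
    for apply_filter in filters:
        # The output of one filter becomes the input of the next.
        content = apply_filter(content)
    return content
```

Order matters in such a chain. Running includes_filter first and upper_filter second expands the directive and then upper-cases the result; reversing the order upper-cases the directive itself, so the first filter no longer recognizes it.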

If several directives, such as AddOutputFilter and SetOutputFilter, apply to the same file, the filter lists from both directives will be merged. Input filters can be configured via the AddInputFilter and SetInputFilter directives, which have identical syntax to their output filter counterparts.

Apache 2.1/2.2 includes mod_filter, which provides increased flexibility in defining and manipulating filter chains. This can be done, for example, based on the existence of a particular HTTP header or environment variable.

Using Apache As an FTP Server

Listen 10.0.0.1:21
<VirtualHost 10.0.0.1:21>
FTP On
DocumentRoot /usr/local/apache/ftpdocs
ErrorLog /usr/local/apache/logs/ftp_error_log
<Location />
    AuthName "FTP"
    AuthType basic
    AuthUserFile /usr/local/apache/conf/htusers
    Require valid-user
</Location>
</VirtualHost>

As mentioned earlier in this chapter, Apache 2 is more than a web server—it is a generic server framework. By building a server on top of Apache, a developer can take advantage of a solid, portable infrastructure; an extension mechanism; and the possibility of using many other third-party modules that exist for Apache. That is the case for mod_ftp, which adds FTP capabilities to Apache. Most of the configuration settings, such as authentication directives, are shared with the rest of the server. You can enable FTP support simply by adding FTP On inside the appropriate Virtual Host section. Additional directives, such as FTPUmask, FTPTimeoutLogin, FTPBannerMessage, and FTPMaxLoginAttempts, allow you to configure features common with other FTP servers.

At the time of this writing, mod_ftp is in the process of becoming an official ASF project and can be downloaded from http://incubator.apache.org/projects/mod_ftp.html.

Using Apache As a POP3 Server

Listen 110
<VirtualHost *:110>
POP3Protocol on
POP3MailDrops /usr/local/apache/pop
<Directory /usr/local/apache/pop>
  AuthUserFile /usr/local/apache/conf/htusers
  AuthName pop3
  AuthType Basic
  Require valid-user
</Directory>
</VirtualHost>

The mod_pop3 module allows Apache 2 to act as a POP3 server. POP3 stands for Post Office Protocol, version 3, and is a commonly used protocol that allows mail clients (such as Outlook, Eudora, or Netscape Mail) to retrieve messages from a central server. Note that this module will not allow your mail reader to send messages. For that you will need an SMTP (Simple Mail Transfer Protocol) server such as Sendmail or Postfix. You enable support for POP3 by placing a POP3Protocol On directive inside the appropriate virtual host container. POP3MailDrops specifies the location of the users’ mailboxes. The user Apache is running as must be able to read and write to those mailboxes.

You can download mod_pop3 from http://svn.apache.org/viewcvs.cgi/httpd/mod_pop3/.

Compressing Content on the Fly

# Apache 2 mod_deflate
AddOutputFilterByType DEFLATE text/html text/plain text/xml
SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip dont-vary
BrowserMatch ^Mozilla/4 gzip-only-text/html
# Apache 1.3 mod_gzip
mod_gzip_static_suffix .gz
AddEncoding gzip .gz
mod_gzip_item_include file \.html$

The mod_deflate filtering module included with Apache 2 provides a filter, DEFLATE, that can compress outgoing data. Compressing can be expensive in terms of CPU, but has the advantage of minimizing the amount of data that will be transferred to the client. This is useful when clients connect to the Internet via slow links and the content can be compressed significantly, such as with HTML pages. Other content that is already compressed, such as ZIP files or JPEG images, will benefit very little (if at all) from additional compression. Of course, for content compression to work, the client must support the opposite functionality: decompression. This is true for most modern browsers.
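The difference between compressible and already compressed content is easy to observe with Python's standard gzip module. This sketch only illustrates the principle and does not involve Apache or mod_deflate; the generated table rows are made-up sample data, and the compressed copy stands in for a ZIP or JPEG file.

```python
import gzip


def compression_ratio(data):
    """Compressed size divided by original size (smaller is better)."""
    return len(gzip.compress(data)) / len(data)


# HTML-like markup: repetitive tags around varying cell contents.
html = "".join(
    "<tr><td>item %d</td><td>%d</td></tr>\n" % (i, i * i)
    for i in range(500)
).encode("ascii")

# Data that has already been compressed once.
already_compressed = gzip.compress(html)

html_ratio = compression_ratio(html)
recompressed_ratio = compression_ratio(already_compressed)
```

The markup shrinks to a fraction of its original size, whereas compressing the already compressed data a second time gains little or nothing, so the CPU time spent on it is wasted.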

If you know that a specific client has trouble processing compressed content of a certain type, you can set up the environment variable no-gzip by using the SetEnvIf or BrowserMatch directive. This will prevent mod_deflate from compressing the content delivered to the client, as shown in the example.
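The combined effect of the three mod_deflate directives in the example can be summarized as a decision function. The Python sketch below is an approximation for illustration, not mod_deflate's logic: the function name should_compress is hypothetical, the patterns are modeled on those in the example (with the dot before the file extension escaped), and other inputs the real module considers, such as the client's Accept-Encoding header, are ignored.

```python
import re

# Types listed in AddOutputFilterByType DEFLATE.
COMPRESSIBLE_TYPES = {"text/html", "text/plain", "text/xml"}
# SetEnvIfNoCase Request_URI ... no-gzip (case-insensitive match).
IMAGE_URI = re.compile(r"\.(?:gif|jpe?g|png)$", re.IGNORECASE)
# BrowserMatch ^Mozilla/4 gzip-only-text/html
MOZILLA_4 = re.compile(r"^Mozilla/4")


def should_compress(uri, content_type, user_agent):
    """Approximate whether a response would be compressed."""
    if IMAGE_URI.search(uri):
        # no-gzip is set: never compress images.
        return False
    if MOZILLA_4.search(user_agent) and content_type != "text/html":
        # gzip-only-text/html is set: this client gets compressed HTML only.
        return False
    return content_type in COMPRESSIBLE_TYPES
```

For example, a request for /notes.txt served as text/plain is compressed for a client identifying itself as Mozilla/5.0 but not for one identifying itself as Mozilla/4.7, while /logo.PNG is never compressed.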

Apache 1.3 has an equivalent module, mod_gzip, that can compress dynamic and static content: http://sourceforge.net/projects/mod-gzip/.
