This chapter provides a quick introduction to the Apache web server, its architecture, and the differences between major versions (1.3, 2.x). It explains how to download and compile Apache from source or using binary packages, how to enable or disable common modules, the layout of server files, and the structure and syntax of the server configuration files. It also covers how to start/stop/restart Apache and the minimum configuration changes required to get Apache up and running.
Apache is the most popular web server on the Internet, with around 68% of the market share, according to Netcraft (http://www.netcraft.com).
Apache is:
Portable. It runs on Linux, Windows, Mac OS X, and many other operating systems.
Flexible. It has a modular, extensible architecture and can be configured in a variety of ways.
Open source. You can download and use Apache for free. Availability of the source code means you can create custom builds of Apache.
There are two main versions of Apache in widespread use today: the 1.3 series and the 2.x series.
Apache 2.0 includes a number of improvements and new features over Apache 1.3; however, modules written for Apache 1.3 do not work with it until they have been ported. As a basic rule of thumb, use Apache 2.x if you
Run Apache 1.3 if you
Need to run in-house or third-party modules that have not yet been ported to Apache 2.x.
Need to run software such as PHP with extensions that are not thread-safe (though the same code will probably run equally well on Apache 2.0 with the prefork MPM).
Are already familiar with Apache 1.3 and have no specific need to upgrade.
If you are running a Linux system, chances are that Apache is already installed. If your distribution uses the rpm package management system, you can check whether Apache is installed by querying the package database with rpm -q. Because not all distributions use the same package name, you may need to try several names, such as httpd, apache, and apache2.
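The rpm check can be scripted as a loop over the usual package names. This is a minimal sketch: the function name check_apache_rpm and the list of candidate names are assumptions for illustration, and your distribution may use a different package name.

```shell
# Report which Apache package, if any, is installed via rpm.
# The candidate names below are assumptions; adjust them for your distribution.
check_apache_rpm() {
    if ! command -v rpm >/dev/null 2>&1; then
        echo "rpm not available"
        return 2
    fi
    for pkg in httpd apache apache2; do
        if rpm -q "$pkg" >/dev/null 2>&1; then
            rpm -q "$pkg"      # prints the installed package name and version
            return 0
        fi
    done
    echo "no Apache package found"
    return 1
}

check_apache_rpm || true
```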
In most Unix-like systems, including Mac OS X, you can also directly check whether the Apache binary is installed with one of the following commands:
httpd -v
/usr/sbin/httpd -v
If found, it should return something similar to
Server version: Apache/2.0.54
Server built: Apr 16 2005 14:25:31
You can get an even more detailed response using httpd -V.
On Windows systems, you can check whether Apache is installed in the Add/Remove Programs section of the Control Panel. The installation path is under C:\Program Files\Apache Group.
tar xvfz apache_1.3.33.tar.gz
cd apache_1.3.33
./configure --prefix=/usr/local/apache --enable-shared=max
make
make install
You can use the package management tools of your operating system to install pre-built versions of the server. This is often preferred because they integrate well with the existing file system layout and with other vendor-provided packages. It is, however, important to know how to build your own version of Apache from source code. This will allow you, for example, to build a server customized to your needs as well as to quickly apply security patches as they are released.
The first step is to visit the http://httpd.apache.org website and download the appropriate source tarball. When referring to 1.3-specific functionality, the rest of the book assumes you installed Apache 1.3.33. That is the most recent version in the 1.3 series at the time of this writing. The source tarball will be named apache_1.3.33.tar.gz.
You can now uncompress, configure, compile, and install Apache with the commands in the preceding listing.
The option --prefix indicates the path under which the server will be installed, and --enable-shared=max activates loadable module support. Loadable module support is necessary to extend or customize the functionality later on without having to recompile the server.
You may find Apache releases with a .tar.bz2 ending. This means that they were compressed with the bzip2 tool. While slower to compress and decompress, this format can reduce the size of the distribution files and is now commonly used by many open source projects. To decompress this kind of file, you can do one of the following on most modern Linux systems:
bunzip2 < apache_1.3.33.tar.bz2 | tar xvf -
tar xvfj apache_1.3.33.tar.bz2
tar xvfz apache_2.0.54.tar.gz
cd apache_2.0.54
./configure --prefix=/usr/local/apache --enable-so --enable-mods-shared=most
make
make install
The process is similar to the one described earlier for 1.3, though the options to enable support for loadable modules are different.
Installing Apache on Windows is even easier than on Unix. The installation process for both Apache 1.3 and 2.x is quite similar. You simply need to download and launch the binary installer package from http://httpd.apache.org.
The wizard will ask you where to install the server and a few other pieces of information:
The network domain name
The fully qualified domain name of the server
The administrator’s email address
The server name will be the name that your clients will use to access your server. The email address will be displayed in error messages so that visitors know how to contact you if there is a problem. You will also be offered the choice of running Apache as a service. That option is appropriate if you require Apache to always run when the server boots up, for example. Otherwise, you can always start Apache from the command line.
The following table provides the default location of the main Apache configuration file on multiple operating systems. Notice that since versions 1.3 and 2 of the server may need to coexist side by side, the name of the file may be different for each version.
Table 1.1. The Location of httpd.conf on Different Systems
Configuration File Location | Platform |
---|---|
/etc/httpd/httpd.conf, /etc/httpd/httpd2.conf | SUSE, Mandrake, older Red Hat systems |
/etc/httpd/conf/httpd.conf, /etc/httpd/conf/httpd2.conf | Newer Red Hat systems, Fedora Core |
/usr/local/apache2/conf, /usr/local/apache/conf | When compiling from source as explained earlier in this chapter |
c:\Program Files\Apache Group\Apache2\conf\httpd.conf | Windows |
/private/etc/httpd/httpd.conf | Mac OS X |
The main Apache configuration file is called httpd.conf. The location of this file varies depending on whether you are using Windows or Linux, and whether you compiled Apache from source code or used the binary provided by your distribution. Check the locations suggested in the previous table.
Apache uses plain text files for configuration. The configuration files can contain directives and containers (also known as “sections”). You can place comments inside the file by placing a hash mark (#) at the beginning of a line. Comment lines will be ignored by Apache. A directive can span several lines if you end the previous line with a backslash character (\).
Directives control every aspect of the server. You can place directives inside containers, so they only apply to content served from a certain directory or location, requests served by a particular virtual host, and so on.
When an argument to a directive is a relative path, it is assumed to be relative to the server installation path (server root). For example, if you installed Apache from source as described earlier in this chapter, the server root is /usr/local/apache or /usr/local/apache2. You can change the default with the ServerRoot directive.
It is sometimes useful to split the server configuration into multiple files. The Include directive allows you to include individual files, all of the files in a particular directory, or files matching a certain pattern. If a relative path is specified, it is considered relative to the path specified by the ServerRoot directive.
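The three forms of Include might look like the following fragment. The file and directory names are illustrative assumptions, not paths that necessarily exist on your system.

```apache
# A single file, given as an absolute path
Include /etc/httpd/conf/perl.conf
# Every file in a directory, relative to ServerRoot
Include conf.d/
# Files matching a wildcard pattern, relative to ServerRoot
Include siteconf/*.conf
```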
This is usually done by Linux distributions that distribute Apache modules as RPMs. Each one of those packages can place its own configuration file in a specific directory, and Apache will automatically pick it up.
To start, stop, or restart Apache, run the apachectl script with the start, stop, restart, or graceful argument. Depending on how you installed Apache, you may need to provide an absolute path to apachectl, such as /usr/sbin/apachectl or /usr/local/apache/bin/apachectl. Although it is possible to control Apache on Unix using the httpd binary directly, it is recommended that you use the apachectl tool. The apachectl support program is distributed as part of Apache and wraps common functionality in an easy-to-use script.
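A small wrapper can make the apachectl actions explicit and handle a nonstandard install path. This is only a sketch: the apache_ctl function and the APACHECTL and DRY_RUN variables are hypothetical names introduced here, while start, stop, restart, and graceful are the actual apachectl arguments.

```shell
# Run one of the common apachectl actions.
# APACHECTL (assumed variable) overrides the path, e.g. /usr/local/apache/bin/apachectl.
# DRY_RUN=1 (assumed variable) prints the command instead of running it.
apache_ctl() {
    action=$1
    case "$action" in
        start|stop|restart|graceful) ;;
        *) echo "unknown action: $action" >&2; return 1 ;;
    esac
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "${APACHECTL:-apachectl} $action"
    else
        "${APACHECTL:-apachectl}" "$action"
    fi
}

# Show what a graceful restart would run without touching a live server.
DRY_RUN=1
apache_ctl graceful
```

Unset DRY_RUN, and run the script as root if Apache binds to a privileged port, to actually signal the server.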
On Unix, if Apache binds to a privileged port (any port below 1024), you will need root privileges to start the server.
If you make some changes to the configuration files and you want them to take effect, it is necessary to signal Apache that the configuration has changed. You can do this by stopping and starting the server, by sending a restart signal, or by performing a graceful restart. This tells Apache to reread its configuration. To learn the difference between a regular restart and a graceful restart, please read the next section.
As an alternative to using the apachectl script, you can use the kill command directly to send signals to the parent Apache process. This is explained in detail in the “Alternate Ways of Stopping Apache” section in Chapter 2.
On Windows, you can signal Apache directly using the apache.exe executable with the -k option, for example apache.exe -k restart or apache.exe -k shutdown.
You can access shortcuts to these commands in the Start menu entries that the Apache installer created. If you installed Apache as a service, you can start or stop Apache by using the service management tools in Windows as follows: In Control Panel, select Administrative Tools, and then click on the Services icon.
Additionally, Apache 2.0 can place a program, Apache Monitor, in the system tray. It is a simple GUI that you can use to start and stop the server directly or as a service. It is either launched automatically at startup or you can launch it from the Apache entry in the Start menu.
Apache needs to know on which IP addresses and ports to listen for incoming requests. You can specify those using the Listen directive. The Listen directive takes a port to listen to and (optionally) an IP address. If no IP address is specified, Apache will use all available IP addresses. You can use multiple Listen directives to specify multiple IP addresses and ports to listen to; for instance, two Listen directives can make Apache listen for requests on port 80 at the IP address 192.168.200.4 and on port 8080 at all available addresses.
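The case of port 80 at 192.168.200.4 plus port 8080 on all addresses corresponds to a fragment like this:

```apache
# Port 80, only on the address 192.168.200.4
Listen 192.168.200.4:80
# Port 8080 on every available address
Listen 8080
```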
In Apache 1.3, you can also use the Port directive to specify the port Apache listens to, but if a Listen directive is specified, the Port directive will not have an effect. Please refer to Chapter 4 for information on how the Port directive is also used for constructing self-referential URLs.
There is more configuration involved when you need to support name-based virtual hosts. Please see Chapter 5 for details.
In addition to Listen, Apache 1.3 provides a related directive, BindAddress. It is obsolete and its use is discouraged.
You can specify the user and group Apache runs under with the User and Group directives. For security reasons, it is not a good idea to run any kind of server as root because a configuration or programming flaw can expose the whole server. When Apache is run as root, it will perform all the actions that require superuser privileges (such as binding to port 80) and then it will serve the actual requests as the user and group specified in the Apache configuration. This user ID will typically have reduced privileges and capabilities.
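A typical pair of directives might look like the following. The account name nobody is an assumption used for illustration; use whichever unprivileged user and group exist on your system.

```apache
# Serve requests as an unprivileged account (name assumed for illustration)
User nobody
Group nobody
```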
Sometimes Apache needs to construct self-referential URLs. That is, it needs to construct a URL that refers to the server itself. For example, it may need to redirect a request to a different page or print the website address at the end of a generated error page. By default, this is done using the domain specified with the ServerName directive. Please see Chapter 2 for details on how to use UseCanonicalName and Port to control this behavior.
If no server name is present, Apache will try to infer a valid server name by performing a reverse DNS lookup on the server’s IP address. If the DNS server is not properly set up, this can take a long time and the requester may have to wait through a rather long pause.
AliasMatch /favicon.ico /usr/local/apache2/icons/site.ico
Many modern browsers, such as Internet Explorer, Mozilla, and Konqueror, enable you to associate an icon with a bookmark. When you bookmark a page, the browser sends a request for a favicon.ico file to the same directory containing the bookmarked document. The favicon.ico file is an icon in the Windows icon format.
You can use the AliasMatch directive to redirect all requests for a favicon.ico to a single location containing the icon for your site, as shown in this example.
The httpd -l command lists the modules compiled into your server binary and should return something similar to the following:
Compiled in modules:
core.c
prefork.c
http_core.c
mod_so.c
If you compiled Apache with loadable module support, your modules will be built as shared libraries and placed by default in a directory named modules/ (Apache 2.x) or libexec/ (Apache 1.3). To see which shared modules are loaded into the server at runtime, you will need to look at the httpd.conf file and find the appropriate LoadModule directives. With Apache 2.1/2.2, this is not necessary, as httpd -M will list all modules including those loaded at runtime.
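On versions without httpd -M, the LoadModule lines can be pulled out of the configuration file with a short script. This is a sketch under stated assumptions: the list_loaded_modules function name and the default path are hypothetical, and it does not follow Include directives or evaluate conditional containers.

```shell
# Print the identifier of every uncommented LoadModule line in a config file.
# The default path is an assumption; pass the real location of your httpd.conf.
list_loaded_modules() {
    conf=${1:-/usr/local/apache2/conf/httpd.conf}
    # Commented lines start with "#", so their first field never equals "LoadModule".
    awk '$1 == "LoadModule" { print $2 }' "$conf"
}

# Demonstration on a small sample file rather than a live configuration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
LoadModule status_module modules/mod_status.so
#LoadModule info_module modules/mod_info.so
LoadModule ssl_module modules/mod_ssl.so
EOF
list_loaded_modules "$sample"
rm -f "$sample"
```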
You can enable or disable individual modules at compile time using the --enable-module and --disable-module options of the configure command, where module is the name of the module. In Apache 2.x, for example, --enable-status and --disable-status do so for the mod_status module distributed as part of Apache.
If your server has been compiled with loadable module support, you can disable a module by simply commenting out the line that loads the module in the server configuration file:
#LoadModule status_module modules/mod_status.so
In Apache 1.3, you can clear the list of active modules, including those compiled-in, using a ClearModuleList directive. In that case, you will need to use an AddModule directive for each module you want to use. The functionality provided by ClearModuleList is not available in Apache 2.x.
If you disable a module, make sure you remove from your httpd.conf file any directives provided by that module, or enclose them in an <IfModule> section as shown. Otherwise, the server may fail to start.
<IfModule mod_status.c>
ExtendedStatus On
</IfModule>
Yes, you can add modules to Apache without recompiling, but only if mod_so is already compiled into your server. To find out whether mod_so is compiled into your Apache installation, please read the earlier section, “Discovering the Modules Available on the Server.”
You can build a module from sources using apxs, which is a tool for building and installing extension modules that is included by default with Apache.
To compile and install a module with apxs, you just need to change your current directory to the one containing the module and type the following:
# apxs -c mod_usertrack.c
This will compile the module. You will now need to copy the module to the Apache modules directory and edit the configuration file. You can let apxs automatically handle all this with
# apxs -cia mod_usertrack.c
This approach will work for simple modules, such as those included with the Apache distribution. For complex third-party modules, such as PHP or mod_python, there is usually a --with-apxs or --with-apxs2 switch to pass to the configure script.
If you have a binary version of the module available, you don’t need to do any of these apxs-related steps.
This may be the case if you already compiled many of the optional modules when building the server or the module is already provided as part of your Linux distribution or Windows installation package.
If you are using Apache 1.3, you can add the new module to the server by editing your httpd.conf file and adding the following lines:
LoadModule usertrack_module libexec/mod_usertrack.so
AddModule mod_usertrack.c
If you are using Apache 2.x, you will only need to add the LoadModule directive, in this case using modules/ instead of libexec/ as the directory where the loadable modules are installed by default.
By default, Apache serves content from the htdocs/ directory (which historically stands for HTML documents) in the installation directory. You can place documents there and they will automatically appear in the URL space of the server. For example, if you create a directory inside htdocs named foo and place the file bar.html inside it, it will be accessible from the outside as
http://www.example.com/foo/bar.html
You can change the location for the documents directory with the DocumentRoot directive. If a relative path is specified, it is considered relative to the path specified by the ServerRoot directive.
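Changing the documents directory takes a single directive. The path below is an illustrative assumption; substitute the directory that holds your content.

```apache
# Serve documents from /var/www/html instead of the default htdocs/ (path assumed)
DocumentRoot /var/www/html
```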
You don’t necessarily need to place your content under the document root directory. One of the strengths of Apache is that it provides a number of powerful and flexible mechanisms for mapping URLs requested by clients into files on disk or resources provided by modules. Please see Chapter 4 for details.
The following directive containers are the default containers used in Apache configuration files.
<VirtualHost>: A VirtualHost container specifies a virtual server. Apache enables you to host different websites with a single Apache installation, as described in Chapter 5.
<Directory> and <DirectoryMatch>: These containers apply directives to a certain directory or group of directories in the file system. The DirectoryMatch container allows regular expression patterns to be used as arguments.
<Location> and <LocationMatch>: These containers apply directives to certain requested URLs or URL patterns. They are similar to Directory containers.
<Files> and <FilesMatch>: Similar to Directory and Location containers, Files sections apply directives to certain files or file patterns.
These are not the only directive containers available. Modules, such as mod_proxy, may provide their own containers, as explained in Chapter 10. See also Chapter 8 for details on containers that limit access based on HTTP methods.
Directory, Files, and Location sections can also take regular expression arguments by preceding them with a ~. Regular expressions are strings that describe or match a set of strings, according to certain syntax rules. For example, the following regular expression will match all requests asking for an image file with a .jpg or .gif extension: <Files ~ "\.(gif|jpg)$">. However, the DirectoryMatch, LocationMatch, and FilesMatch directives are preferred for clarity. You can learn more about regular expressions at http://en.wikipedia.org/wiki/Regular_expression.
Apache provides support for conditional containers. Directives enclosed in these containers will be processed only if certain conditions are met.
<IfDefine>: Directives in this container will be processed if a specific command-line switch is passed to the Apache executable. In the following example, the command-line switch should be -DSSL. Similarly, you can negate the argument with a “!”, as in <IfDefine !SSL>, if you want the directives to apply if the switch was not passed.
<IfModule>: Directives in an IfModule section will be processed only if the module passed as an argument is present in the web server. The default Apache configuration file includes such examples for different MPM modules.
For example, in the httpd.conf file, you would see
<IfDefine SSL>
LoadModule ssl_module modules/mod_ssl.so
</IfDefine>
And you would enable it at the command line like this:
# httpd -DSSL