In terms of the main flow of this book, the Apache access, authentication, and authorization (AAA) framework falls broadly within the scope of metadata modules (Chapter 6). However, it has historically been an extremely popular area for module developers. Furthermore, it has changed significantly in Apache 2.1/2.2 compared to earlier versions. Given that this is the most substantial change since the original framework inherited from the NCSA HTTPD in 1995, it is of sufficient interest to merit its own chapter.
Before we dig into the details of the security phase in Apache’s request processing, we should perhaps take a broader look at the issue of security. Since a comprehensive discussion of security belongs in a book for system administrators—which this is not—we’ll be very brief, but we should at least set the scene for what this chapter does and does not cover.
This chapter deals specifically with determining who is permitted to access a resource or perform an operation over the Web, including the concept of login. It discusses different methods of dealing with this issue, and widely different levels of security.
This chapter explicitly does not deal with other important aspects of web security, including these issues:
Transport encryption (SSL/TLS), as provided by mod_ssl or mod_gnutls.[1]
Application-level request filtering, as provided by mod_security.[2]
HTTP offers two levels of security for web authentication.
HTTP basic authentication is a simple, low-security method. The username:password combination is base-64 encoded and passed over the Web. It is secure to the extent that the tokens passed are obscure and unmemorable to human readers, and will appear as gibberish to a non-computer-person such as the boss. But these tokens are trivial for a programmer to decode using, for example, Apache’s apr_base64, and they can be reused verbatim to impersonate a user.
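To make the point about triviality concrete, here is a minimal sketch of decoding a basic authentication token in plain C. It deliberately avoids the APR library so that it stands alone; the function names b64_value, b64_decode, and split_credentials are hypothetical and are not part of any Apache API.

```c
#include <stddef.h>
#include <string.h>

/* Map one base-64 character to its 6-bit value, or -1 if it is not in the alphabet. */
static int b64_value(char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/*
 * Decode a base-64 token into out, which is always NUL-terminated on success.
 * Decoding stops at the first '=' padding character or at the end of input.
 * Returns the decoded length, or -1 on malformed input or if out is too small.
 */
static int b64_decode(const char *in, char *out, size_t outsize)
{
    unsigned int acc = 0;   /* bit accumulator */
    int bits = 0;           /* number of valid bits currently in acc */
    size_t n = 0;

    if (outsize == 0) return -1;
    for (; *in != '\0' && *in != '='; ++in) {
        int v = b64_value(*in);
        if (v < 0) return -1;
        acc = (acc << 6) | (unsigned int)v;
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            if (n + 1 >= outsize) return -1;   /* keep room for the NUL */
            out[n++] = (char)((acc >> bits) & 0xFFu);
            acc &= (1u << bits) - 1u;          /* keep only the leftover bits */
        }
    }
    out[n] = '\0';
    return (int)n;
}

/*
 * Split "user:password" in place at the first colon.
 * Returns a pointer to the password, or NULL if there is no colon.
 */
static char *split_credentials(char *decoded)
{
    char *colon = strchr(decoded, ':');
    if (colon == NULL) return NULL;
    *colon = '\0';
    return colon + 1;
}
```

Given the token dXNlcjpwYXNz, b64_decode yields user:pass, and split_credentials separates the two parts. Nothing secret is needed to do this, which is why the token can be read or replayed by anyone who observes it.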
Digest authentication uses MD5 one-way hashing to protect passwords. This is cryptographically secure: A password cannot be reconstructed from an MD5 token, at least not without considerable resources. It is also secure against replay attacks, because the tokens passed from the client to the server are constructed using a private token that is regenerated every few minutes. The downside (which is mostly historical and of little relevance today) is that digest authentication is harder to work with and carries a higher system overhead than basic authentication; also, browser support is not universal.
These methods of authentication can be supplemented by other measures, such as limited sessions with expiry times enforced by the server.
Both basic and digest authentication are associated with authentication dialog popup boxes presented by browsers when challenged (Figure 7-1). These dialogs are firmly outside the scope of a site designer.
Figure 7-1. Authentication dialog pop-up box
Sometimes we may wish to avoid this dialog-based scheme and implement alternative authentication methods. We’ll look at alternatives later in the chapter, but bear in mind that it is not possible to reproduce the level of cryptographic security of digest authentication, except by relying on nonstandard (and inevitably far less well-supported) client capabilities or by resorting to SSL/client certificates.
The term login is sometimes used interchangeably with the term authentication on the Web. Strictly speaking, this is a misnomer: Login implies a session, but HTTP is a stateless protocol and so doesn’t support sessions. Session management can be built on top of HTTP, but this requires that a session token is passed not just once at login time, but with every request. There is no way to avoid this duplication of effort.
We’ll avoid confusing authentication as such with login, but at the end of the chapter we’ll discuss session management under the title of login.
The basic premise of access control and authentication is that we may wish to permit certain operations to some users, but deny them to others. Determining who a user is and whether that entity is permitted the current operation is the business of these aaa modules. Apache provides a number of standard modules for this purpose in the modules/aaa directory, and a wide range of third-party modules are also available. The number of third-party modules is likely to be reduced in Apache 2.2 compared to earlier versions, because the new AAA framework reduces the amount of duplication of very similar functionality required between the various modules.
Access control was originally determined in two ways:
By the client host (REMOTE_HOST or REMOTE_ADDR).
By the authenticated user (REMOTE_USER).
These two fundamental control methods still lie at the heart of Apache AAA, but the scope of the tasks has been greatly broadened. In particular, user-based control has been generalized to concepts such as sessions managed by a cookie or URL hash.
There are three request processing hooks concerned with access control:
ap_hook_access_checker
ap_hook_check_user_id
ap_hook_auth_checker
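These hooks are, respectively, responsible for three tasks:
Access control based on the client host
Authentication, that is, identifying the user
Authorization, that is, determining whether the identified user is permitted the attempted operation
This underlying structure is common to all Apache versions to date, and reflects the forms of configuration available to AAA modules: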
Host Access
Specifying an Authentication Protocol
Identifying the User
AuthUserFile /etc/apache/users
Determining Whether the Remote User Has Access
Determining Whether to Require Both or Just One of Host and User
Satisfy Any
Apache has traditionally supported two basic forms of access control:
Host-based access control, implemented by mod_access.
User-based authentication and authorization, implemented by mod_auth, or any of numerous equivalents from the standard mod_auth_dbm to a wide range of third-party options. These modules are responsible for both the check_user_id and auth_checker phases.
mod_access is configured using the Order, Allow, and Deny directives, which specify allowed or disallowed IP addresses. mod_auth is controlled primarily by the Require directive, which specifies users or groups permitted access. The two modules are linked by the Satisfy directive, which determines whether a request needs to satisfy both forms of access control or whether either one alone is sufficient.
In this framework, mod_access works cleanly and well, but the mod_auth family is less well specified. The basic problem with authentication is that each module has to perform several distinct tasks that would be better factored out into common functions.
HTTP distinguishes between basic and digest authentication by specifying different methods of encoding the user identification data. An authentication module has to decode the data according to the encoding used. This has left us with mod_auth_digest as separate from mod_auth, and other modules such as mod_auth_dbm not supporting digest authentication at all because they don’t reimplement that code.
A module has first to identify the user using one of the above schemes or its own method (which could be something completely different, such as a cookie or a directory service) and then to determine whether the user is authorized for the attempted operation. That’s two separate functions—indeed, two separate request processing hooks—in a single module.
Access, authentication, and authorization in Apache 2.1/2.2 have been refactored into a four-part process, as shown in Figure 7-2.
Figure 7-2. AAA: access control, authentication, and authorization
mod_access has been renamed to mod_authz_host (“host-based authorization”), but is otherwise not substantially changed. It is the only standard module to use the access_checker hook. Other modules implementing access control based on network or hardware information, such as a module implementing ARP lookup and permitting access by MAC address, would also use this hook.
Authentication is the process of reading a token from the client and converting it from the external representation sent over the wire to Apache’s internal representation, in particular setting the user field of the request_rec object. For example, mod_auth_basic implements HTTP basic authentication by extracting a username/password pair from a base-64-encoded token sent from the client. The process of verifying a password is now handled by a separate authentication (authn) module. The advantage of this approach is that it decouples password lookup from protocol support. Now, for example, mod_authn_dbd has only to look up passwords in an SQL database, and it automatically supports both basic and digest authentication.
Two standard modules implement the check_user_id hook. These are known as auth modules:
mod_auth_basic implements HTTP basic authentication.
mod_auth_digest implements HTTP digest authentication.
These two modules deal with implementing their respective HTTP protocols, as before, but differ from earlier versions in that they delegate the password lookup.
Authentication (authn) modules are helpers for the auth (user-checking) modules. The authn API is an ap_provider, as introduced in Chapter 10. The standard Apache distribution includes the following authn modules:
mod_authn_alias: support complex configuration options by delegating to other providers
mod_authn_anon: permit arbitrary user-supplied passwords or variants such as anon-ftp style e-mail addresses
mod_authn_dbd: look up passwords in an SQL database
mod_authn_dbm: look up passwords in a DBM database
mod_authn_default: a fallback to reject users if no other authn module deals with them
mod_authn_file: look up passwords in a flat file (the old htpasswd/htdigest)
mod_authnz_ldap: look up passwords in an LDAP directory
Authorization (authz) is the decision of whether the user is authorized to carry out the attempted operation. The old mod_access module has become mod_authz_host, as it makes that decision based on the client host and Allow/Deny From directives. User-based authorization uses the auth_checker hook and grants or denies access based on the username, as set in the authentication phase.
Standard authorization modules are listed here.
mod_authz_dbd: look up the user’s groups in an SQL database ("Require dbd-group") and provide hooks for login/logout
mod_authz_dbm: look up the user’s groups in a DBM database
mod_authz_default: a fallback to reject users if no other authz module takes any decision
mod_authnz_ldap: look up the user’s groups in an LDAP directory
mod_authz_owner: authorization based on the system user and group of a resource requested
mod_authz_user: implements “Require valid-user” (allow anyone authenticated), as well as “Require user” and “Require group” (list of permitted users or groups, respectively)
The logic of the security phase in the Apache core is shown here, in pseudocode form:
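A minimal sketch of this logic follows, written as a self-contained C simulation rather than as the core’s actual code. The sim_ names, the struct, and the idea of recording each hook’s result in advance are assumptions made for illustration; in the server the hooks are run on demand, and a hook is not run at all if an earlier step has already decided the outcome.

```c
#include <stdbool.h>

/* Status values modelled on the HTTP codes used in this phase. */
enum { SIM_OK = 0, SIM_UNAUTHORIZED = 401, SIM_FORBIDDEN = 403 };

/* Hypothetical summary of one request: configuration plus what each hook would return. */
struct sim_request {
    bool satisfy_any;       /* "Satisfy Any" is in effect (otherwise both checks are required) */
    bool auth_configured;   /* some Require directive is in scope */
    int access_status;      /* result of the access_checker hooks (host-based) */
    int user_id_status;     /* result of the check_user_id hooks (authentication) */
    int authz_status;       /* result of the auth_checker hooks (authorization) */
};

/* User-based control: authentication first, then authorization. */
static int sim_user_checks(const struct sim_request *r)
{
    if (r->user_id_status != SIM_OK) return r->user_id_status;
    return r->authz_status;
}

/* The security phase: host-based control always runs; user-based control may be skipped. */
static int sim_security_phase(const struct sim_request *r)
{
    if (r->satisfy_any) {
        /* Passing the host check alone is sufficient. */
        if (r->access_status == SIM_OK) return SIM_OK;
        /* Otherwise the user checks get a chance, if any are configured. */
        if (r->auth_configured) return sim_user_checks(r);
        return r->access_status;
    }

    /* Both forms of control must be satisfied. */
    if (r->access_status != SIM_OK) return r->access_status;
    if (r->auth_configured) return sim_user_checks(r);
    return SIM_OK;
}
```

For example, a request that fails the host check under Satisfy Any is still admitted if a Require directive is in scope and both user checks succeed; without Satisfy Any the same request is refused before authentication is attempted.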
This scheme is, of course, simplified, in that any hook can divert the processing into an internal error if a hook fails or if authentication is misconfigured. Nevertheless, the fundamental logic is sound: Host-based access control always runs, but user-based control may be skipped according to the configuration. At this level, the logic is unchanged in Apache 2.2 from earlier versions.
One bit of logic needs further explanation. What does it mean to be “configured for authentication”?
This is entirely predicated on the Require directive. If there is any Require directive in scope in httpd.conf or an applicable .htaccess, then some authentication is required. Require alone is not sufficient to configure authentication, but it is the arbiter of whether authentication is required. Require is implemented by the server core, which exports API methods for modules to use. The function that determines whether we are configured for authentication is ap_some_auth_required.
Apache uses three HTTP response codes to deny access in this phase:
401 (Unauthorized)
403 (Forbidden)
407 (Proxy Authentication Required)
Response code 403 is an unconditional denial of access: There is nothing the client can do to get in. This response is what will be returned when mod_access (now mod_authz_host) denies access based on a Deny From directive.
Response codes 401 and 407 tell the client that access was denied, but would be allowed if the client had sent the appropriate credentials (typically a username and password). The HTTP protocol requires that a 401 or 407 response must include an authentication challenge, which tells the client the authentication protocol to use.
This challenge, in turn, causes the client to display a username/password dialog when that client is a browser. Here is a typical response:
The crucial header here is the challenge WWW-Authenticate. It invites the browser to try again, using HTTP basic authentication. The realm is displayed by most browsers in a login dialog box, which varies a little between browsers but is basically the same.
A 407 response replaces WWW-Authenticate with Proxy-Authenticate, but is otherwise exactly the same.
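The relationship between the status code and the challenge header can be sketched in a few lines of plain C. This is an illustration only: the function names are hypothetical, the realm is assumed to contain no double quotes or control characters, and a real module would set the header through the request’s error headers rather than format a string by hand.

```c
#include <stddef.h>
#include <stdio.h>

/* Return the challenge header that accompanies a status code, or NULL if none applies. */
static const char *challenge_header_name(int status)
{
    if (status == 401) return "WWW-Authenticate";
    if (status == 407) return "Proxy-Authenticate";
    return NULL;   /* e.g. 403 carries no challenge */
}

/*
 * Format the value of a basic authentication challenge for the given realm.
 * Returns the length written, or -1 if the buffer is too small.
 * Assumption: realm needs no escaping.
 */
static int format_basic_challenge(char *buf, size_t size, const char *realm)
{
    int n = snprintf(buf, size, "Basic realm=\"%s\"", realm);
    return (n >= 0 && (size_t)n < size) ? n : -1;
}
```

The header value is the same for 401 and 407; only the header name differs.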
The authentication method is part of the client/server communication protocol and is, therefore, constrained to be a method supported by browsers. On the Web, that means we have two options, basic and digest authentication, which are implemented by mod_auth_basic and mod_auth_digest, respectively. Although we could implement a different method in Apache, it won’t be useful (except perhaps within a specialist private network) because it will generate an authentication challenge that browsers won’t understand and respond to.
If we are determined to implement a different “login” scheme, we can either “fake” HTTP basic authentication or avoid it altogether, provided we avoid sending a 401 or 407 response to the client.
Let’s look at a nonstandard authentication task. Suppose we wish to develop a module that permits anonymous access on selected days specified by a server administrator, while requiring normal username/passwords to access the system on other days. Setting aside other possible implementations of this scheme, let’s develop it using the authn/authz framework, which will integrate fully with standard authenticated access. Our goal is to create an authentication dialog appropriate for all users, so that users having normal username/password credentials can freely use either those data or anonymous access (using the name of the day as the username) on open days. We’ll use the common convention for weekdays, and accept but ignore anything beyond the first three characters.
The pivotal control is the Require directive. We’ll need a new keyword for our method. Let’s use “day”. Thus our configuration takes the following form:
Require day saturday sunday
Because we’re integrating the new framework with normal authentication, we need to piggyback onto either basic or digest authentication. That means we want an authn provider to “verify” a “password” for the day. We’ll allow a server administrator to configure the system to ignore passwords altogether or require today’s date as a password. This approach is simpler than the normal authn function of looking up a password for the user.
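The decision such a provider has to make can be sketched in plain C, independently of the Apache API. Everything here is an assumption made for illustration: the function and enum names are hypothetical, the provider is assumed to accept only today’s weekday as the username, and the date “password” is assumed to be written as YYYY-MM-DD, a format the text does not specify.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>
#include <time.h>

/* Possible outcomes of the check (hypothetical names). */
enum day_result {
    DAY_GRANTED,    /* username is today's weekday and the password is acceptable */
    DAY_DENIED,     /* username is today's weekday but the password is wrong */
    DAY_NO_MATCH    /* username is not today's weekday: leave it to other providers */
};

/* Three-letter weekday abbreviations, indexed as in struct tm's tm_wday. */
static const char *const day_names[7] = {
    "sun", "mon", "tue", "wed", "thu", "fri", "sat"
};

/* Return 0..6 if name starts with a weekday abbreviation (case-insensitive), else -1. */
static int day_index(const char *name)
{
    int d;
    size_t i;

    if (strlen(name) < 3) return -1;
    for (d = 0; d < 7; ++d) {
        for (i = 0; i < 3; ++i) {
            if (tolower((unsigned char)name[i]) != day_names[d][i]) break;
        }
        if (i == 3) return d;   /* anything beyond three characters is ignored */
    }
    return -1;
}

/* "Verify" a password for a weekday username. */
static enum day_result check_day_password(const char *user, const char *password,
                                          const struct tm *today, bool require_date)
{
    char date[16];
    int d = day_index(user);

    if (d < 0 || d != today->tm_wday) return DAY_NO_MATCH;
    if (!require_date) return DAY_GRANTED;   /* administrator chose to ignore passwords */

    /* Assumed date format: YYYY-MM-DD. */
    if (strftime(date, sizeof date, "%Y-%m-%d", today) == 0) return DAY_DENIED;
    return strcmp(password, date) == 0 ? DAY_GRANTED : DAY_DENIED;
}
```

In a real provider, the three outcomes would be mapped onto the status values defined for authn providers, and today would come from the request time rather than being passed in.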
This function needs to be wrapped in an ap_provider:
We register this in our module’s register_hooks function:
This provider will work with the standard mod_auth_basic module to implement the check_user_id hook and set r->user. We’ll leave digest authentication for the time being.
mod_auth_basic, together with the authn provider developed in Section 7.6.1, will set the day’s name as r->user and mark a “password” as accepted. But it won’t check whether the day is, in fact, one for which access is permitted. To perform this task, we’ll need an authorization (authz) handler. This is what actually implements our “Require day” directive:
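The core of such a handler is a comparison between the authenticated username and the days listed on the Require line. The following is a minimal sketch of that comparison in plain C, outside the Apache API. The names are hypothetical, the Require lines are assumed to be available as plain strings, and the day_authenticated flag stands in for whatever mechanism the authn provider uses to tell the handler that it accepted this user.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Possible outcomes (hypothetical names). */
enum authz_result {
    AUTHZ_OK,         /* a "Require day" line lists the user's day */
    AUTHZ_DECLINED,   /* not our user: pass on to other authz modules */
    AUTHZ_DENIED      /* our user, but no "Require day" line permits the day */
};

/* Case-insensitive comparison of the first three characters of two names. */
static bool same_day(const char *a, const char *b)
{
    int i;
    for (i = 0; i < 3; ++i) {
        if (a[i] == '\0' || b[i] == '\0') return false;
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i])) return false;
    }
    return true;
}

/*
 * Decide access for a user against the Require lines in scope.
 * A user accepted by the day provider is never passed on to other modules,
 * so a "valid-user" line cannot admit such a user by accident.
 */
static enum authz_result check_day_access(const char *const *require_lines, size_t nlines,
                                          const char *user, bool day_authenticated)
{
    size_t i;

    if (!day_authenticated) return AUTHZ_DECLINED;

    for (i = 0; i < nlines; ++i) {
        char copy[256];
        char *word;

        if (strlen(require_lines[i]) >= sizeof copy) continue;   /* too long to parse here */
        strcpy(copy, require_lines[i]);

        /* Only lines whose keyword is "day" concern us. */
        word = strtok(copy, " \t");
        if (word == NULL || strcmp(word, "day") != 0) continue;

        while ((word = strtok(NULL, " \t")) != NULL) {
            if (same_day(word, user)) return AUTHZ_OK;
        }
    }
    return AUTHZ_DENIED;
}
```

With the lines valid-user and day saturday sunday in scope, a day-authenticated user named sat is admitted, one named monday is refused, and an ordinary user is left for the other authorization modules to handle.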
We need to register this handler as an auth_checker. We also need to be careful here: We want to go before mod_authz_user, so that a “Require valid-user” directive doesn’t just automatically pass us. We do so by explicitly declaring that mod_authz_user comes after us, whenever both modules are active. When put together with our authn provider, our register_hooks function becomes:
The configuration of this module is extremely simple; all we have to manage is the administrator choice of whether to require the date as the password. The remainder of the module is trivial:
Note that our module could (and normally would) have been two separate modules, as is the usual practice with the standard authentication and authorization modules. Of course, then we would have had to use a different mechanism for our authn provider to set a flag for the authz handler, or we would have had to implement an alternative logic.
The configuration of our little module itself is trivial. But the point of it was to integrate our scheme with standard authentication. So how does that work?
First, let’s configure for day-based anonymous authentication alone:
Now suppose we have a large number of users having standard username/password access seven days a week, with their passwords being held in a DBM database. We want to combine this access method with our scheme allowing anonymous day-based authentication at weekends. This process is almost as simple, but raises some subtleties:
Only one AuthName appears in the challenge, so for our normal users it would be misleading to call it “Weekend Access.” We can, of course, call it anything we like, ideally something that describes the service being accessed.
The first interesting line here is AuthBasicProvider. This line can list multiple providers, which will run in order. We put dbm ahead of day, so our provider doesn’t risk catching normal users (as noted in the comments).
The second point is the two Require lines. Their order is immaterial, as our authorization handler (rather than anything in the core) specifies the order in which these schemes run. Our handler runs first and deals with anonymous users, but passes any other users through to the module that implements the other Require directive.
In the preceding example, we were able to fake basic authentication. This is a reasonably tried-and-tested approach: For example, cookie authentication modules and mod_auth_anon have used similar techniques since the 1990s. Digest authentication is more complex, and can be faked only if we know the actual password sent by the client.
Recall our authentication provider from mod_authnz_day:
This is an instance of struct authn_provider, defined in mod_auth.h:
Whereas the first function check_password serves to verify a supplied password for the username, the second serves only to look up an MD5 hash and return it for mod_auth_digest to process. This approach works well when we are performing a simple lookup, and we can even fake it for mod_authnz_day (provided we drop the option of ignoring the password altogether). Of course, we can’t just look up a password, because it’s one-way encrypted and we can’t extract it.
Of course, this manufactured example is not typical. The usual function of an authn provider is to look up a password or hash from an authentication source such as a password file or directory, and most authorization providers implement group lookup for a user. Readers interested in examples of this functionality should look at the Apache source in /modules/aaa/ (this author recommends mod_authn_dbd and mod_authz_dbd, which he wrote, or mod_authn_file and mod_authz_user, which are the direct successors to the mod_auth of older Apache versions).
The authentication dialog presented to the user by a typical browser is strongly reminiscent of logging in. However, this is an illusion: Login implies a session, but authentication doesn’t give us one. In particular, there is no logout or relogin, unless we build it ourselves. Because HTTP is stateless, we cannot simply log a client out by unsetting or expiring a cookie or application-level token; a user can easily forge that data to access the system after logout. Neither should we just expire sessions on the server and invalidate a client’s credentials. Although this approach secures the server, it is deeply unfriendly and confusing to deny access that the user legitimately believes to be authorized. We need to manage sessions twice over: once on the server, once on the client. The general Apache framework presented earlier in this chapter supports neither of those concepts, so we need to implement it ourselves.
Although the general framework doesn’t support sessions and login, one module that does support it is mod_authz_dbd, when used in conjunction with mod_authn_dbd for password lookup. The basis for this is that the users table in the authentication database should contain an additional “logged in” field, which is updated whenever a user logs in or out. Then mod_authn_dbd can use a query of the form
SELECT password FROM users WHERE username = %s AND login = 1
to allow access only when the user is logged in.
mod_authz_dbd supports this scheme by implementing custom Require variants, dbd-login and dbd-logout, which cause it to execute SQL queries that set and clear the “logged in” field, respectively.
This provides us with a basis for session management, but we’re not there yet. Because authentication precedes authorization, the user is authenticated when the query runs, and the scheme basically works. However, for precisely the same reason, it’s not secure. If a user has logged out but the browser still has the credentials, then hitting the login URL (e.g., by unwinding a browser history stack and using force-refresh) will automatically log the user in again!
If login is to be secure, we need an alternative method to check the user’s credentials. For example, we could use an HTML form for login, with a handler in the content generator phase checking a one-time token (to prevent replay) together with the username and password entered before setting the login flag in the database. This can be coupled with setting the ErrorDocument for 401 responses to the login form.
The other part of the task is managing the client session. For this purpose, mod_authz_dbd exports an optional hook that is run whenever a user successfully logs in or out (i.e., executes dbd-login or dbd-logout), as described in Chapter 10. This hook can be used to perform client-side session management such as setting and unsetting a login cookie.
Sometimes we may wish to avoid browser built-in authentication dialogs altogether. Since the dialog is automatically triggered by an HTTP 401 or 407 response, we must avoid sending these codes to the client. It is no longer sufficient even to send a login form as ErrorDocument. Instead, we must either (a) present the unauthenticated user with a login form immediately, or (b) redirect the unauthenticated user to a login form with an HTTP 302 response.
In either case, we should embed the URL that the user originally tried into the challenge response, so that we can send the user back to the original resource after successful authentication.
The handler for the login form is then responsible for verifying the credentials entered, setting the client’s credentials and (if a session is required) server-side session information, and redirecting the user back to the resource that was originally requested.
Once we have set the client-side credentials, we need to note that they are not in a standard HTTP form (only the authentication dialog can give us that). To use the token, we need to check for it ahead of the authentication phase, and set up “faked” basic authentication from it. The header_parser hook is the appropriate place for this operation. Let’s see an example implementing it with a login cookie. The basic logic is
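The step of turning a validated cookie into faked basic authentication amounts to building the header value a browser would have sent. Here is a minimal sketch of that step in plain C. It assumes the cookie has already been checked and mapped to a username and password by some lookup not shown here, the function names are hypothetical, and a real header_parser hook would store the result as the request’s Authorization header instead of returning it in a buffer.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

static const char b64_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/*
 * Base-64 encode len bytes of in into out (NUL-terminated, with '=' padding).
 * Returns the encoded length, or -1 if out is too small.
 */
static int b64_encode(const unsigned char *in, size_t len, char *out, size_t outsize)
{
    size_t i, n = 0;

    if (outsize < 4 * ((len + 2) / 3) + 1) return -1;
    for (i = 0; i < len; i += 3) {
        /* Pack up to three bytes into a 24-bit group. */
        unsigned int v = (unsigned int)in[i] << 16;
        if (i + 1 < len) v |= (unsigned int)in[i + 1] << 8;
        if (i + 2 < len) v |= (unsigned int)in[i + 2];

        out[n++] = b64_alphabet[(v >> 18) & 0x3F];
        out[n++] = b64_alphabet[(v >> 12) & 0x3F];
        out[n++] = (i + 1 < len) ? b64_alphabet[(v >> 6) & 0x3F] : '=';
        out[n++] = (i + 2 < len) ? b64_alphabet[v & 0x3F] : '=';
    }
    out[n] = '\0';
    return (int)n;
}

/*
 * Build the value of a faked Authorization header: "Basic " + base64(user:password).
 * Returns 0 on success, -1 if either buffer is too small.
 */
static int fake_basic_auth(const char *user, const char *password,
                           char *out, size_t outsize)
{
    char plain[256];
    int n = snprintf(plain, sizeof plain, "%s:%s", user, password);

    if (n < 0 || (size_t)n >= sizeof plain) return -1;
    if (outsize < 7) return -1;
    memcpy(out, "Basic ", 6);
    if (b64_encode((const unsigned char *)plain, (size_t)n, out + 6, outsize - 6) < 0)
        return -1;
    return 0;
}
```

Once a value of this form is in place ahead of the authentication phase, mod_auth_basic and the configured authn providers process the request exactly as if the browser had answered a challenge.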
When using cookies for authentication (or anything else), take care to deal with users who have cookies disabled, either in the browser or in other privacy/security software (which the end user may not even be aware of). A surprisingly common, but serious, error is to send such users into a loop that sets a cookie, then on receiving a request without the cookie, redirects the user back to the set-cookie handler, and repeats ad infinitum. For general-purpose Web use, you should provide a cookie-free alternative. Failing that, send the cookie-free user to a page that explains why the user can’t log in and what he or she may be able to do about it.
This chapter introduced the Apache 2.2 AAA framework and demonstrated the basics of writing authn and authz modules. The following topics were covered:
Now that you’ve seen this trivial case, you are equipped to read and understand the more complex authn/authz modules in the Apache distribution (/modules/aaa/) and to write your own. However, in view of the more modular framework, it is likely that fewer new authentication modules should be required for Apache 2.2 than for earlier versions. For example, the existence of the DBD authentication modules mod_authn_dbd/mod_authz_dbd obsoletes all existing modules for authenticating against an SQL database such as MySQL, PostgreSQL, or Oracle.