In the Web Application flow (also known as the Authorization Code
flow), the resource owner is first redirected by the application to the
OAuth authorization server at the API provider. The authorization server
checks to see if the user has an active session. If she does, the
authorization server prompts her for access to the requested data. After she
grants access, she is redirected back to the web application and an
authorization code is included in the URL as the code
query parameter:
http://www.example.com/oauth_callback?code=ABC1234
Because the code
is passed as a
query parameter, the web browser sends it along to the web server that is
acting as the OAuth client. This authorization code is then exchanged for an
access token using a server-to-server call from the application to the
authorization server. This access token is used by the client to make API
calls.
Sound confusing? Figure 2-1 shows the flow step-by-step, based on a diagram from the specification.
The Authorization Code flow should be used when:
Long-lived access is required.
The OAuth client is a web application server.
Accountability for API calls is very important and the OAuth token shouldn’t be leaked to the browser, where the user may have access to it.
The Authorization Code flow does not expose the access token to the
resource owner’s browser. Instead, authorization is accomplished using an
intermediary “authorization code” that is passed through the browser. This
code
must be exchanged for an access
token before calls can be made to protected APIs. The exchange process
only succeeds if a correct client_secret
is passed with the request,
ensuring confidentiality of the access token as long as client security is
maintained. Unlike with the Implicit flow described in Chapter 3, this confidentiality also extends to the
resource owner, meaning API requests made with the access token are
directly attributable to the client and its developers. Perhaps most
importantly—because the access token is never sent through the browser—
there is less risk that the access token will be leaked to malicious code
through browser history, referer headers, JavaScript, and the like.
Although there is less chance of the access token leaking because it’s not exposed to the browser, many applications using this flow will store long-lived refresh tokens in the application’s database or key store to enable “offline” access to data. There is additional risk when an application requires long-lived offline access to data, as this creates a single point of compromise for accessing data belonging to many users. This doesn’t exist with other flows, such as the flow for client-side web applications (see Chapter 3). Even with this additional risk, many websites will choose to use “offline” data access because their application architecture makes it difficult to interact with the user’s browser to obtain new access tokens.
Let’s take an example of a payroll application. The payroll application wants access to update a manager’s task list to remind the manager to approve timesheets. By placing these reminders in the manager’s task list, which the manager uses every day, it’s much more likely that employees will get paid on time, reducing the number of angry employees and time-consuming calls to the HR department.
The user experience in the most common case is very simple:
Payroll application lets the manager know that it’s asking for access to modify her tasks, and redirects her over to the task list app’s OAuth authorization server (see Figure 2-2).
The OAuth authorization server used by the task list app’s API prompts the user to grant permission for the payroll application to update her tasks (see Figure 2-3).
After the user has approved, she is redirected back to the payroll application, which now has access to the tasks (see Figure 2-4).
After registering your app (see Developer and Application Registration) with the API provider and obtaining an OAuth client ID and client secret, it’s time to start writing code! Let’s go through each step of the flow and show how the protocol works. We’ll use PHP as the example programming language and the Google Tasks API along with Google’s OAuth 2.0 authorization server.
Although we’ll write the PHP code using the raw OAuth protocol, many API providers distribute client libraries for accessing their services. These libraries abstract away some of the details of implementing OAuth 2.0 and make it easier for developers. You can find information on Google’s PHP library, which works with Google Tasks, Google+, and many other Google APIs, at code.google.com.
Since the OAuth flow involves directing your users to the website of the API provider to obtain authorization, it’s a best practice to let them know in advance what will happen. You can do this by displaying a message, along with a link (the “Add tasks to your Google Tasks” link in Figure 2-2).
After the user initiates the flow, your application will need to send the user’s browser to the OAuth authorization page (as seen in Figure 2-3). This can be done either by sending the main browser window directly to the authorization endpoint or by creating a pop up. On this page, the API provider will present the user with a request to approve the application’s ability to access the user’s data. Of course, the user needs to already be signed in to the API provider, or they will be prompted to authenticate before being asked to grant access to their data.
You can find the URL for the OAuth authorization endpoint in the API provider’s documentation. For Google Tasks (and all other Google APIs using OAuth 2.0), the authorization endpoint is at
https://accounts.google.com/o/oauth2/auth
You will need to specify a few query parameters with this link:
client_id
The value provided to you when you registered your application.
redirect_uri
The location the user should be returned to after they approve access for your app. For this example, the application will use https://payroll.saasyapp.com/oauth2callback. The value used for the redirect_uri typically needs to be registered in advance with the provider.
scope
The data your application is requesting access to. This is typically specified as a list of space-delimited strings, though Facebook uses comma-delimited strings. Valid values for the scope should be included in the API provider documentation. For Google Tasks, the scope is https://www.googleapis.com/auth/tasks. If an application also needed access to Google Docs, it would specify a scope value of https://www.googleapis.com/auth/tasks https://docs.google.com/feeds.
response_type
Use code for the server-side Web Application flow, indicating that an authorization code will be returned to the application after the user approves the authorization request.
state
A unique value used by your application in order to prevent cross-site request forgery (CSRF) attacks on your implementation. The value should be a random unique string for this particular request, unguessable and kept secret in the client (perhaps in a server-side session).
Here’s what the PHP code may look like:
<?php
session_start();

// Generate random value for use as the 'state'. Mitigates
// risk of CSRF attacks when this value is verified against the
// value returned from the OAuth provider with the authorization
// code. random_int() (PHP 7+) draws from a cryptographically
// secure source, unlike rand().
$_SESSION['state'] = random_int(0, 999999999);

$authorizationUrlBase = 'https://accounts.google.com/o/oauth2/auth';
$redirectUriPath = '/oauth2callback.php';

// For example only. A valid value for client_id needs to be obtained
// for your environment from the Google APIs Console at
// http://code.google.com/apis/console.
$queryParams = array(
  'client_id' => '240195362.apps.googleusercontent.com',
  'redirect_uri' => (isset($_SERVER['HTTPS'])?'https://':'http://') .
                    $_SERVER['HTTP_HOST'] . $redirectUriPath,
  'scope' => 'https://www.googleapis.com/auth/tasks',
  'response_type' => 'code',
  'state' => $_SESSION['state'],
  'approval_prompt' => 'force', // always request user consent
  'access_type' => 'offline' // obtain a refresh token
);

$goToUrl = $authorizationUrlBase . '?' . http_build_query($queryParams);

// Output a webpage directing users to the $goToUrl after
// they click a "Let's Go" button
include 'access_request_template.php';
?>
In addition to the standard OAuth query parameters, you’ll notice we’ve included a few which are specific to Google’s implementation:
approval_prompt
Use force to indicate that we want the user prompted for approval each time the user visits the application. You can also use auto to indicate that the user will only see the approval request the first time this application requires it.
access_type
Use offline to indicate that the application needs access to user data while the user is not at the keyboard. This results in a refresh token being issued when the user explicitly approves granting access to this app. If online is used, no refresh token will be issued.
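Returning to the standard parameters, the following is a minimal sketch of how an authorization URL with more than one scope might be assembled. The helper name buildAuthorizationUrl() is hypothetical and not part of any Google library; it only illustrates joining scopes with a space and letting http_build_query() handle the URL encoding.

```php
<?php
// Sketch only: buildAuthorizationUrl() is a hypothetical helper, not part
// of any provider's client library.
function buildAuthorizationUrl($endpoint, $clientId, $redirectUri, array $scopes, $state)
{
    $queryParams = array(
        'client_id'     => $clientId,
        'redirect_uri'  => $redirectUri,
        // Multiple scopes are joined with a single space.
        'scope'         => implode(' ', $scopes),
        'response_type' => 'code',
        'state'         => $state,
    );

    // http_build_query() URL-encodes every value (a space becomes "+").
    return $endpoint . '?' . http_build_query($queryParams);
}

$url = buildAuthorizationUrl(
    'https://accounts.google.com/o/oauth2/auth',
    '240195362.apps.googleusercontent.com', // example client_id from the text
    'https://payroll.saasyapp.com/oauth2callback',
    array(
        'https://www.googleapis.com/auth/tasks',
        'https://docs.google.com/feeds',
    ),
    'example-state-value' // placeholder; use a random per-request value
);
```

Provider-specific parameters such as approval_prompt and access_type could be appended to the same array before the query string is built.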
Some enterprise API providers have special provisions to handle auto-approval of OAuth 2.0 grants for an individual user if an IT administrator of the user’s organization has previously approved access for an application. In this scenario, the application will redirect the user’s browser to the authorization server, but the user will never be prompted to approve access. Instead, the user will be immediately redirected back to the application with an authorization code, as described below in Step 2: Exchange authorization code for an access token. Salesforce provides this option as “no user approval required” on their control panel page to define Remote Access Applications.
If all request parameters are valid and the user approves the data access request, the user will be redirected back to the application at the URL specified as the redirect_uri.
However, if one of the request parameters is invalid, an error condition exists. If there is an issue with the redirect_uri, client_id, or other request information, the authorization server should present an error message to the user and not redirect the user back to the application.
In the case that the user (or authorization server) denies the access request, an error response will be generated, and the user will be redirected to the redirect_uri with a query parameter called error indicating the type of error as access_denied. Additionally, the server can include an error_description message and/or an error_uri indicating the URL of a web page containing more information about the error.
While access_denied is the most likely error response your application will need to handle, there are other error types defined in the OAuth 2.0 specification as well:
invalid_request
The request is missing a required parameter, includes an unsupported parameter value, or is otherwise malformed.
unauthorized_client
The client is not authorized to request an authorization code using this method.
unsupported_response_type
The authorization server does not support obtaining an authorization code using this method.
invalid_scope
The requested scope is invalid, unknown, or malformed.
server_error
The authorization server encountered an unexpected condition that prevented it from fulfilling the request.
temporarily_unavailable
The authorization server is currently unable to handle the request because of a temporary overloading or maintenance of the server.
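As a rough illustration of handling these error redirects, the sketch below summarizes the error, error_description, and error_uri parameters when they are present. The function name describeAuthorizationError() is an assumption made for this example, and a real callback script would pass $_GET rather than the sample array.

```php
<?php
// Hypothetical helper: summarizes an error redirect from the authorization
// server. Returns null when the query string carries no "error" parameter.
function describeAuthorizationError(array $query)
{
    if (!isset($query['error'])) {
        return null;
    }

    $message = 'Authorization failed: ' . $query['error'];
    if (isset($query['error_description'])) {
        $message .= ' (' . $query['error_description'] . ')';
    }
    if (isset($query['error_uri'])) {
        $message .= ' See ' . $query['error_uri'];
    }

    // The values come from the request, so escape the message (for example
    // with htmlspecialchars()) before writing it into an HTML page.
    return $message;
}

// Sample input standing in for $_GET after the user denies access.
$sample  = array('error' => 'access_denied', 'state' => '987d43e51a262f');
$message = describeAuthorizationError($sample);
```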
In the case that no error occurs during the approval process, the authorization server will redirect the user back to the application at the URL specified as the redirect_uri. In this example, the user will be redirected back to https://payroll.saasyapp.com/oauth2callback.
When the user has granted access, two query parameters will be included by the authorization server in the redirect back to the web application:
code
The authorization code, indicating that the user has approved the request for access
state
The value of the state parameter passed in the initial request to the authorization server
The state value should be compared against the value generated in Step 1 above. If the values do not match, it’s possible a malicious user is attempting to perform a cross-site request forgery attack on the application, so the OAuth flow should not be continued.
Take, for example,
https://payroll.saasyapp.com/oauth2callback?code=AB231DEF2134123kj89&state=987d43e51a262f
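A minimal sketch of checking such a callback follows. It assumes a hypothetical helper, extractAuthorizationCode(), that receives the full callback URL and the state value saved at the start of the flow; in a live script the values would come from $_GET and $_SESSION instead. The comparison uses PHP's hash_equals(), which compares strings in constant time.

```php
<?php
// Hypothetical helper: returns the authorization code from a callback URL
// after checking the returned state against the value saved in Step 1.
function extractAuthorizationCode($callbackUrl, $expectedState)
{
    $params = array();
    parse_str((string) parse_url($callbackUrl, PHP_URL_QUERY), $params);

    if (!isset($params['code'], $params['state'])
        || !is_string($params['code'])
        || !is_string($params['state'])
        || !hash_equals((string) $expectedState, $params['state'])) {
        // Stop the flow: the response may not belong to our request.
        throw new Exception('Error validating state. Possible CSRF.');
    }

    return $params['code'];
}

$code = extractAuthorizationCode(
    'https://payroll.saasyapp.com/oauth2callback?code=AB231DEF2134123kj89&state=987d43e51a262f',
    '987d43e51a262f' // the value the application stored before redirecting
);
```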
The application needs to exchange the code for an OAuth access token to make API requests. If you’re using a client library for OAuth, the library will typically perform this exchange behind the scenes. However, if you’re not using a library, you’ll need to make a HTTP POST request to the token endpoint. The following parameters need to be passed in the request:
code
The authorization code passed to the application
redirect_uri
The location registered and used in the initial request to the authorization endpoint
grant_type
The value authorization_code, indicating that you’re exchanging an authorization code for an access token
This HTTP POST needs to be authenticated using the client_id and client_secret obtained during application registration. The specification defines two primary ways to authenticate the request: include a HTTP Basic Authorization header (with the client_id as the username and the client_secret as the password), or include the client_id and client_secret as additional HTTP POST parameters.
A typical Authorization header looks like this:
Authorization: Basic MDAwMDAwMDA0NzU1REU0MzpVRWhrTDRzTmVOOFlhbG50UHhnUjhaTWtpVU1nWWlJNg==
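For illustration, a header of this shape could be produced as sketched below. The helper name and the credentials are placeholders invented for the example, and the sketch assumes the values contain no colon or other characters that a provider might expect to be escaped first.

```php
<?php
// Hypothetical helper: builds an HTTP Basic Authorization header with the
// client_id as the username and the client_secret as the password.
function buildBasicAuthorizationHeader($clientId, $clientSecret)
{
    return 'Authorization: Basic ' . base64_encode($clientId . ':' . $clientSecret);
}

// Placeholder credentials for illustration only.
$header = buildBasicAuthorizationHeader('example-client-id', 'example-client-secret');
```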
Because HTTP Basic access authentication was a later addition to the OAuth 2.0 specification, many providers do not yet support it. With those providers, the HTTP POST parameter mechanism must be used instead. The following additional POST parameters must be passed alongside the code, redirect_uri, and grant_type parameters:
client_id
The value provided to you when you registered your application
client_secret
The confidential secret provided to you when you registered your application
If the request is properly authenticated and the other parameters are valid, the authorization server will issue and return an OAuth access token in a JSON-encoded response:
access_token
A token that can be used to authorize API requests
token_type
The type of access token issued, often “bearer,” but the set of potential values is extensible
The access token may be time-limited, in which case some additional information may be returned:
expires_in
The remaining lifetime of the access token, in seconds
refresh_token
A token that can be used to acquire a new access token after the current one expires
The JSON-encoded response looks like this:
{
  "access_token" : "ya29.AHES6ZSzX",
  "token_type" : "Bearer",
  "expires_in" : 3600,
  "refresh_token" : "1/iQI98wWFfJNFWIzs5EDDrSiYewe3dFqt5vIV-9ibT9k"
}
Because the OAuth specification is still in development, some API providers who haven’t caught up with the latest specification may format their responses differently. Facebook, for instance, returns a form-encoded (& delimited) response.
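One way to cope with both response styles is sketched below: try JSON first and fall back to form decoding. The function name parseTokenResponse() and the form-encoded sample string are assumptions made for this example, not output captured from any provider.

```php
<?php
// Hypothetical helper: decodes a token endpoint response that is either
// JSON-encoded or form-encoded ("&"-delimited key=value pairs).
function parseTokenResponse($body)
{
    $decoded = json_decode($body, true);
    if (is_array($decoded)) {
        return $decoded;
    }

    // Not valid JSON, so treat the body as form-encoded.
    $params = array();
    parse_str($body, $params);
    return $params;
}

$fromJson = parseTokenResponse(
    '{"access_token":"ya29.AHES6ZSzX","token_type":"Bearer","expires_in":3600}'
);
// Invented sample of a form-encoded body, for illustration only.
$fromForm = parseTokenResponse('access_token=EXAMPLE_TOKEN&expires=5108');
```

Note that form decoding yields string values, so numeric fields such as a lifetime need an explicit cast before arithmetic.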
Here’s example code for exchanging the authorization code for an access token in PHP:
<?php
session_start();
include 'http_client.inc';

$code = $_GET['code'];
$state = $_GET['state'];

// Verify the 'state' value is the same random value we created
// when initiating the authorization request.
if ((! is_numeric($state)) || ($state != $_SESSION['state'])) {
  throw new Exception('Error validating state. Possible CSRF.');
}

$accessTokenExchangeUrl = 'https://accounts.google.com/o/oauth2/token';
$redirectUriPath = '/oauth2callback.php';

// For example only. Valid values for client_id and client_secret
// need to be obtained for your environment from the Google APIs
// Console at http://code.google.com/apis/console.
// Also, these values should not be hard-coded in a production application.
// Instead, they should be loaded in from a configuration file or secure keystore.
$accessTokenExchangeParams = array(
  'client_id' => '240195362.apps.googleusercontent.com',
  'client_secret' => 'hBMLD98Zi4wiqmiwmqDq',
  'grant_type' => 'authorization_code',
  'code' => $code,
  'redirect_uri' => (isset($_SERVER['HTTPS'])?'https://':'http://') .
                    $_SERVER['HTTP_HOST'] . $redirectUriPath
);

$httpClient = new HttpClient();
$responseJson = $httpClient->postData(
  $accessTokenExchangeUrl,
  $accessTokenExchangeParams);

$responseArray = json_decode($responseJson, TRUE);
$accessToken = $responseArray['access_token'];
$expiresIn = $responseArray['expires_in'];
$refreshToken = $responseArray['refresh_token'];

$_SESSION['access_token'] = $accessToken;

// Storing refresh token in the session, and using approval_prompt=force for
// simplicity. Typically the refresh token would be stored in a server-side database
// and associated with the user's account. This would eliminate the need for
// prompting the user for approval each time.
$_SESSION['refresh_token'] = $refreshToken;

header('Location: /get_data.php');
?>
Now that the app has an access token, the application can respond to the user to thank them for granting authorization, and remind them what features the access will enable. The application can now access the APIs directly through the lifetime of the access token or until the access is revoked. In the case a refresh token is provided, the application can continue to access the APIs offline without user interaction.
The access token and the refresh token should be kept secret at all times and they should not be exposed to any user, including the resource owner. Typically the refresh token is stored securely in a server-side database, associated with the user account. Access tokens can also be stored in a database, but they may also be cached in a server-side session to improve performance.
Some developers don’t understand the need for both short-lived access tokens and long-lived refresh tokens. Having both token types improves security and performance, especially for large-scale API providers with many APIs and a central OAuth authorization service.
OAuth 2.0 typically uses bearer tokens (without signatures in API requests), so the compromise of a protected API service could allow an attacker to see the access tokens received from clients. An OAuth grant may provide an application access to multiple different APIs (scopes) for a user, such as the user’s contacts and the user’s calendars. This could allow an attacker access to not only the compromised service, but other services as well. Having only time-limited access tokens accessible to API services (and not long-lived refresh tokens) reduces the potential impact of an attack.
When an API service receives an access token from a client, it needs to ensure that it’s valid for accessing the requested data. If the token is an opaque string, the API service determines its validity by making an internal request to the provider’s OAuth authorization service or by performing a database lookup. This can introduce latency to API requests, so some API providers instead use access tokens that are signed or encrypted strings, which can be verified less expensively.
One of the key benefits of an authorization protocol like OAuth is the ability for users to revoke access they previously granted to applications. At large-scale providers, this revocation typically is handled by a central OAuth authorization service that handles requests for many APIs. If the API services are independently verifying the access tokens using cryptography without database lookups or calls to the central service, the services won’t know when access for a client has been revoked. Thus it is important to keep the lifespan of the access tokens short so they do not remain valid for too long after the client’s access is revoked.
The next step is retrieving and updating the user’s tasks. Many API providers implementing OAuth 2.0 use bearer tokens. This means that the application can authorize API requests simply by including the OAuth access token in the requests, without the need for cryptographic signatures.
The preferred way of authorizing requests is by sending the access token in a HTTP Authorization header, as discussed in Chapter 1.
Here’s an example of using the Authorization header method of making an authorized API call to retrieve a user’s tasks in Google Tasks. Note that this code is again using a custom HttpClient class to implement the underlying calls to the curl library:
<?php
session_start();
require_once 'http_client.inc';

$tasksUrl = 'https://www.googleapis.com/tasks/v1/lists/@default/tasks';

// The value for $accessToken would typically be stored in a
// server-side PHP session bound to the active user. The value of the
// access token can be any string. Google uses values similar to:
// 'ya29.AHES6ZS_2G4-VuL041L0GpFJqH0wGfGSR'.
$accessToken = $_SESSION['access_token'];

// Recommended approach for an OAuth 2 authorized request is to
// use a HTTP Authorization header
$httpClient = new HttpClient();
$headers = array(
  'Authorization: Bearer ' . $accessToken);

// Alternative to using the Authorization header would be appending
// the OAuth token to the URL as a query parameter
// $tasksUrl .= '?access_token=' . urlencode($accessToken);

$response = $httpClient->getData($tasksUrl, $headers);
$responseArray = json_decode($response, TRUE);

foreach ($responseArray["items"] as $item) {
  // Escape the title so task text cannot inject markup into the page.
  echo '<li>' . htmlspecialchars($item['title']) . "</li>\n";
}
?>
While this sample code specifically demonstrates calling the Google Tasks API, similar code could be used to authorize requests of any API supporting recent versions of the draft specification. Simply replace the values of $tasksUrl and $accessToken.
When making API calls using the OAuth 2.0 access token, you may encounter errors if the access token is no longer valid because the token expired or was revoked. In this case, you should get a HTTP 4xx error. Depending on the individual API, the detailed error description will be communicated differently.
In addition to the 4xx error code, the latest version of the
OAuth bearer token specification also requires that the HTTP WWW-Authenticate
response header be included
when credentials are not included in the request or the access token
provided does not enable access to the requested API resource. This
header may include additional details on the error encountered.
Here’s an example response from the specification, indicating that an expired OAuth access token was passed to the app:
HTTP/1.1 401 Unauthorized
WWW-Authenticate: Bearer realm="example",
                  error="invalid_token",
                  error_description="The access token expired"
Valid error codes include: invalid_request, invalid_token, and insufficient_scope.
Because the use of the WWW-Authenticate
header was a late addition
to the spec, it may not be implemented by all of your favorite API
providers.
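Where a provider does send this header, its parameters could be pulled apart roughly as sketched here. The helper name parseBearerChallenge() is hypothetical, and the pattern assumes quoted values that contain no escaped quotation marks, which is simpler than a full HTTP header parser.

```php
<?php
// Hypothetical helper: extracts name="value" pairs from the value of a
// WWW-Authenticate header that uses the Bearer scheme.
function parseBearerChallenge($headerValue)
{
    $params = array();
    if (stripos($headerValue, 'Bearer') !== 0) {
        return $params; // a different authentication scheme
    }

    // Assumes quoted values without escaped quotes inside them.
    preg_match_all('/(\w+)="([^"]*)"/', $headerValue, $matches, PREG_SET_ORDER);
    foreach ($matches as $match) {
        $params[$match[1]] = $match[2];
    }
    return $params;
}

$challenge = parseBearerChallenge(
    'Bearer realm="example", error="invalid_token", '
    . 'error_description="The access token expired"'
);
```

An application might react to invalid_token by refreshing the access token, and to insufficient_scope by sending the user back through the authorization flow with a wider scope.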
When Facebook encounters an error with the token, it returns a HTTP 400 status code and includes the following JSON object in the body of the response:
{
  "error": {
    "type": "OAuthException",
    "message": "Error validating access token."
  }
}
Here’s an example response resulting from the use of an expired access token with one of Google’s newer APIs:
{
  "error": {
    "errors": [
      {
        "domain": "global",
        "reason": "authError",
        "message": "Invalid Credentials",
        "locationType": "header",
        "location": "Authorization"
      }
    ],
    "code": 401,
    "message": "Invalid Credentials"
  }
}
When an authorization code is exchanged for an access token, many API providers will issue short-lived access tokens even if they support long-lived “offline” access to their APIs. Although these access tokens have a limited lifespan, two additional parameters may be included in the response to enable long-lived access: expires_in and refresh_token.
If included in the response, expires_in indicates the remaining lifetime of the access_token, specified in seconds. When the access token expires, the refresh_token parameter can be used to obtain a new access token.
If trying to optimize for latency in your application, it’s best to store the access token along with the time when the access token expires. When making an API call, first check to see if the current time is greater than the expiration time. If so, refresh the access token first, instead of waiting for the API server to reject your request because of an invalid access token. This will result in reduced latency because of fewer HTTP requests being made when the token expires.
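The check described above might look like the following sketch. Both helper names are hypothetical, and the 60-second safety margin is an assumption added here to allow for clock skew and request time; it does not come from the specification.

```php
<?php
// Hypothetical helper: converts the expires_in value from the token
// response into an absolute Unix timestamp that can be stored.
function computeExpirationTime($expiresIn, $issuedAt)
{
    return $issuedAt + (int) $expiresIn;
}

// Hypothetical helper: true when the token has expired or is about to.
// The default 60-second margin is an assumption, not a spec requirement.
function shouldRefreshAccessToken($expiresAt, $now, $safetyMargin = 60)
{
    return $now >= ($expiresAt - $safetyMargin);
}

$issuedAt  = 1000000; // fixed example timestamp; use time() in practice
$expiresAt = computeExpirationTime(3600, $issuedAt);

// Before each API call, decide whether to refresh first.
$refreshNow   = shouldRefreshAccessToken($expiresAt, $issuedAt);        // just issued
$refreshLater = shouldRefreshAccessToken($expiresAt, $issuedAt + 3590); // near expiry
```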
Refreshing the access token is accomplished by making a HTTP POST
to the token endpoint, specifying the grant_type
as refresh_token
and including the refresh_token
. The request must also be
authenticated.
Here’s an example in PHP:
<?php
include 'http_client.inc';

function getNewAccessToken($refreshToken) {
  $refreshTokenUrl = 'https://accounts.google.com/o/oauth2/token';

  // For example only. Valid values for client_id and client_secret
  // need to be obtained for your environment from the Google APIs
  // Console at http://code.google.com/apis/console.
  $refreshTokenParams = array(
    'client_id' => '240195362.apps.googleusercontent.com',
    'client_secret' => 'hBMLD98Zi4wiqmiwmqDq',
    'grant_type' => 'refresh_token',
    'refresh_token' => $refreshToken
  );

  $httpClient = new HttpClient();
  $responseJson = $httpClient->postData(
    $refreshTokenUrl,
    $refreshTokenParams);

  $responseArray = json_decode($responseJson, TRUE);
  return $responseArray;
}

$responseArray = getNewAccessToken('adbadsfa12345');
$accessToken = $responseArray['access_token'];
$refreshToken = $responseArray['refresh_token'];
$expiresIn = $responseArray['expires_in'];
?>
This example authenticates the request by including the client_id and client_secret as HTTP POST parameters. Some OAuth providers may also support authenticating the request using the HTTP Basic access authentication method described in Step 2.
When requesting a new access token, a new refresh token may be issued as well. In this case, store the new refresh token and discard the previous one.
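A small sketch of that bookkeeping follows. The helper applyRefreshResponse() and the shape of the stored token array are assumptions made for illustration; a real application would persist the result in its database rather than keep it in a variable.

```php
<?php
// Hypothetical helper: merges a refresh response into the stored tokens.
// The existing refresh token is kept unless the server issued a new one.
function applyRefreshResponse(array $storedTokens, array $response)
{
    $storedTokens['access_token'] = $response['access_token'];

    if (!empty($response['refresh_token'])) {
        // A replacement was issued, so the previous one is discarded.
        $storedTokens['refresh_token'] = $response['refresh_token'];
    }
    return $storedTokens;
}

$stored = array('access_token' => 'old-access', 'refresh_token' => 'old-refresh');

// Response without a new refresh token: the stored one stays in place.
$kept = applyRefreshResponse($stored, array('access_token' => 'new-access'));

// Response with a new refresh token: the stored one is replaced.
$rotated = applyRefreshResponse($stored, array(
    'access_token'  => 'new-access',
    'refresh_token' => 'new-refresh',
));
```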
Regardless of whether API calls are being made direct from a user’s browser or server-to-server, some applications only need access to a user’s data while the user is “at the keyboard.” In this case, the application may be able to request “online” access that results in no refresh token being issued and the access token having a limited lifespan. In this case, obtaining a new access token is done by sending the user through the authorization flow, starting at Step 1 again. Some API providers will not reprompt the user for access if the application has previously been granted access to the same set of data by the user and will instead redirect immediately back to the application with an authorization code.
Here are some specific implementations:
Google defaults to “online”
access and does not hand out refresh tokens unless explicitly
requested by passing access_type=offline
to the authorization
endpoint at the time an authorization code is requested (see Step
1). In this case, the user is warned that they are granting
permission for the application to “Perform these operations when I’m
not using the application.” If an application with only “online”
access needs a new authorization code, it is automatically issued to
the client without user interaction, and then exchanged by the
application in a server-to-server call (see Step 2).
Facebook defaults to
“online” access: it issues access tokens with limited lifespan and
does not issue refresh tokens. If an application needs offline
access, it can request offline_access
by specifying this
permission as one of the values in the scope
string. This will result in an
access token being issued with an infinite expiration time, though
the token will still be subject to potential revocation by the
user.
Different authorization servers have different policies as to when access tokens are revoked. Most typically enable the user to explicitly revoke access through an account management interface, though these interfaces can be difficult for users to find. Additionally, some API providers (such as Facebook) revoke outstanding access tokens when a user changes their password.
Applications are not usually informed when a user revokes access, and the specification does not define any way to implement a notification—the app will simply see an error the next time it attempts to use an access token or refresh the token stored for that user.
Facebook, however, does have a definable “Deauthorize callback URL” which performs a HTTP POST to your application when a user revokes access in the style of a WebHook. More information is available in Facebook’s developer documentation.
While users can revoke their access manually, some OAuth 2.0 authorization servers also allow tokens to be revoked programmatically. This enables an application to clean up after itself and remove access it no longer needs if, for instance, the user uninstalls the app.
Programmatic revocation is defined in a draft extension to the OAuth 2.0 specification and is implemented by popular OAuth providers such as Salesforce and Google. Salesforce allows for revocation of both refresh tokens and access tokens, while Google only enables revocation of refresh tokens. Here’s an example revocation request:
curl "https://accounts.google.com/o/oauth2/revoke?token=ya29.AHES6ZSzF"
The extension also defines a JSONP “callback” query parameter that OAuth providers can optionally support. Both Salesforce and Google support this parameter.
A 200 response code indicates successful revocation.
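In keeping with the PHP examples used earlier, the request URL and the success check could be sketched as follows. Both helper names are hypothetical, and the HTTP call itself is left out so the sketch does not depend on any particular client class.

```php
<?php
// Hypothetical helper: builds a revocation URL with the token passed as a
// URL-encoded query parameter.
function buildRevocationUrl($revocationEndpoint, $token)
{
    return $revocationEndpoint . '?' . http_build_query(array('token' => $token));
}

// Hypothetical helper: a 200 status code indicates successful revocation.
function isRevocationSuccessful($httpStatusCode)
{
    return (int) $httpStatusCode === 200;
}

$revokeUrl = buildRevocationUrl(
    'https://accounts.google.com/o/oauth2/revoke',
    'ya29.AHES6ZSzF' // sample token value from the curl example
);
```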