The possession of great power necessarily implies great responsibility.
—William Lamb, British Member of Parliament, Home Secretary, and Prime Minister. From a speech in the House of Commons, 1817
Modern applications are often designed around APIs. APIs enable applications to reuse logic and take advantage of innovative services. Because APIs provide access to valuable data or services, API owners typically need to restrict access to authorized parties. Applications therefore need authorization to call APIs. If an application wants to call an API on a user’s behalf to access resources owned by the user, it needs the user’s consent. In the past, a user often had to share their credentials with the application to enable such an API call on their behalf. This gave the application an unnecessary amount of access, not to mention the responsibility of safeguarding the credential! In this chapter, we will cover how the OAuth 2.0 framework provides a better solution for authorizing applications to call APIs.
API Authorization
In this scenario, the application, WriteAPaper.com, is a specialized editor that helps users write and edit research papers. It calls two APIs, each owned by a different organization. The first is at famousquotes.com and provides validated quotes for use in papers. The second is at documents.com and provides a document storage service. A second application, a mobile application, also calls the documents.com API to provide access to documents from a user’s mobile device.
When the WriteAPaper application calls the API at famousquotes.com, it does so on its own behalf. The quotes content is not owned by the user, so the user’s consent isn’t needed for this access. The application only needs to be a registered client authorized to call the quotes API. When the application calls the API at documents.com, however, to obtain a user’s documents, the request must be made on behalf of the user. In this case, the content accessed belongs to the user, and the application must obtain the user’s consent to retrieve the user’s documents. The client application has no right by itself to access the user’s data at another site.
The mobile application provides read-only access to a user’s documents and doesn’t offer access to the quotes service. It requires authorization from a user to call the documents API and retrieve the user’s documents. We’ve included the mobile application in the example because we’ll show in the following sections how OAuth 2.0 could enforce different privileges for the two applications.
OAuth 2.0
The OAuth 2.0 Authorization Framework,[i] published in 2012, was designed to enable an application to obtain authorization to call third-party APIs. With OAuth 2.0, an application can obtain a user’s consent to call an API on their behalf without needing their credentials for the API site. An application can also obtain authorization to call an API on its own behalf if it owns the content to be accessed.
The primary use case involves a user, called a resource owner, who wishes to allow an application to access a protected resource, owned by the resource owner, at a logically separate site, known as the resource server. Using our example from Figure 5-1, the resource owner (the user) has stored documents at a resource server (documents.com). The resource owner is using the WriteAPaper application to write a paper based on content they’ve uploaded to documents.com. The resource owner wants to grant the WriteAPaper application access to their content at documents.com so it can retrieve the content for use in their research paper.
OAuth 2.0 was designed to provide a better solution. It enables a user to explicitly authorize an application to call an API on the user’s behalf, without giving their credentials for the API site to the application and in a way that limits what the application can do. With OAuth 2.0, when an application needs to call an API on behalf of a user, it sends an authorization request to an authorization server for the API. An authorization server handles access requests for an API and returns a security token that can be used by the application to access the API. In the authorization request, the application gives an indication (known as the “scope”) of what it wants to request from the API. The authorization server evaluates the request and, if authorized, returns a token to the application.
To recap, the OAuth 2.0 protocol provides an authorization solution, not an authentication solution. It enables an application to call an API on its own behalf or a user’s behalf, with the call constrained to the scope of an authorized request. The authentication step in OAuth 2.0 validates that the user is entitled to give consent to authorize an access request for a particular resource. The OAuth 2.0 access token is only intended for API access and not to convey information about the authentication event or the user. The use of OAuth 2.0 is therefore appropriate for authorizing API calls but not as an authentication solution (at least in the absence of any proprietary additions to the base protocol, which some providers have implemented). OIDC, described in the next chapter, can be used to authenticate a user to an application, but this chapter focuses on describing how OAuth 2.0 works for the purpose of API authorization.
Terminology
To describe OAuth 2.0 in more detail, we first need to define a few of its terms.
Roles
Resource Server – A service (with an API) storing protected resources to be accessed by an application.
Resource Owner – A user or other entity that owns protected resources at the resource server.
Client – An application which needs to access resources at the resource server, on the resource owner’s behalf or on its own behalf. We’ll generally use the term application instead of client, for consistency across chapters.
Authorization Server – A service trusted by the resource server to authorize applications to call the resource server. It authenticates the application or resource owner and requests consent from the resource owner if the application will make requests on the resource owner’s behalf. With OAuth 2.0, the resource server (API) is a relying party to the authorization server. The authorization server and resource server may be operated by the same entity.
Confidential vs. Public Clients
Confidential Client – An application that runs on a protected server and can securely store confidential secrets to authenticate itself to an authorization server or use another secure authentication mechanism for that purpose.
Public Client – An application that executes primarily on the user’s client device (native application) or in the client browser and cannot securely store a secret or use other means to authenticate itself to an authorization server.
Client Profiles
Web Application – A confidential client with code executing on a protected, back-end server. The server can securely store any secrets needed for the client to authenticate itself as well as any tokens it receives from the authorization server.
User Agent-Based App – Assumed to be a public client with code executing in the user’s browser. Example: A JavaScript-based single-page application running in the browser.
Native Application – Assumed to be a public client that is installed and executed on the user’s device, such as a mobile application or desktop application.
In practice, these definitions may overlap because a web application may serve up HTML pages that contain some JavaScript, and single-page applications may have a small back end. For further discussion on this, see the description in Chapter 6 of the OIDC Hybrid flow.
Tokens and Authorization Code
Authorization Code – An intermediary, opaque code returned to an application and used to obtain an access token and optionally a refresh token. Each authorization code is used once.
Access Token – A token used by an application to access an API. It represents the application’s authorization to call an API and has an expiration.
Refresh Token – An optional token that can be used by an application to request a new access token when a prior access token has expired.
How It Works
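The original OAuth 2.0 specification defined four grant types, each a different method for an application to obtain authorization to call an API:
Authorization code grant
Implicit grant
Resource owner password credentials grant
Client credentials grant
The following sections describe how each of these works.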
Authorization Code Grant
1. User (resource owner) accesses the application.
2. Application redirects browser to the authorization server’s authorize endpoint with an authorization request.
3. Authorization server prompts the user for authentication and consent.
4. User authenticates and provides consent for the request.
5. Authorization server redirects the user’s browser back to the application’s callback URL with an authorization code.
6. Application calls authorization server’s token endpoint, passing the authorization code.
7. The authorization server responds with an access token (and optionally a refresh token).
8. Application calls the resource server (API), using the access token.
The authorization code grant type was originally optimized for confidential clients. The first (authorization) request redirects the user to the authorization server so it can interact with the user. The second request can be made by the application’s back end directly to the authorization server’s token endpoint. This enables an application back end, which is assumed to be capable of securely managing an authentication secret, to authenticate itself to the authorization server when exchanging the authorization code for the access token. It also means that the response with the access token can be delivered to the application back end, which will make the subsequent API calls. An added side benefit is that the tokens are returned via a secure back-channel response. Although the grant type was originally optimized for confidential clients, the addition of PKCE enables public clients to use it as well.
Authorization Code Grant Type + PKCE
The authorization code grant type diagram shows the use of Proof Key for Code Exchange (PKCE).[ii] PKCE is a mechanism that can be used with authorization and token requests to ensure that the application that requested an authorization code is the same application that uses the authorization code to obtain an access token. PKCE protects against a malicious process, especially on mobile devices and with public clients, that could intercept an authorization code and use it to get an access token.
To use PKCE, the application creates a cryptographically random string, called a code verifier, that is long enough to provide sufficient protection against guessing. The application then computes a derived value, called a code challenge, from the code verifier. When the application sends an authorization request in step 2 in the diagram, it includes the code challenge, along with the method used to derive it.
When the application sends the authorization code to the authorization server’s token endpoint to get the access token in step 6, it includes the code verifier. The authorization server transforms the code verifier value using the transformation method received in the authorization request and checks that the result matches the code challenge sent with the authorization request. This enables an authorization server to detect if a malicious application is trying to use a stolen authorization code. Only the legitimate application will know the code verifier to pass in Figure 5-4’s step 6 that will match the code challenge passed in step 2.
The PKCE specification lists two transform methods that can be used to derive the code challenge from the code verifier, namely, “plain” and “S256.” With the “plain” method, the code challenge and verifier are identical, so anyone who observes the code challenge in the authorization request also learns the code verifier. Applications using the authorization code grant with PKCE should use the S256 transform method, which uses a base64url-encoded SHA-256 hash of the code verifier to protect it.
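The S256 derivation described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a definitive implementation; the function names are hypothetical and chosen for this example.

```python
import base64
import hashlib
import secrets


def make_code_verifier(num_bytes: int = 32) -> str:
    """Return a cryptographically random code verifier.

    32 random bytes encode to 43 URL-safe characters, the minimum
    length the PKCE specification allows.
    """
    return secrets.token_urlsafe(num_bytes)


def make_code_challenge(code_verifier: str) -> str:
    """Derive the S256 challenge: base64url(SHA-256(verifier)), unpadded."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def challenge_matches(code_verifier: str, code_challenge: str) -> bool:
    """The check an authorization server performs at the token endpoint."""
    expected = make_code_challenge(code_verifier)
    return secrets.compare_digest(expected, code_challenge)
```

The application keeps the verifier private until the token request and sends only the challenge with the authorization request, so an intercepted authorization code is of no use without the verifier.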
The Authorization Request
Authorization Request Parameters
Parameter | Meaning |
---|---|
response_type | Indicates the OAuth 2.0 grant type. “code” is used for the authorization code grant type. |
client_id | Identifier for the application, assigned when it registered with the authorization server. |
state | A non-guessable string, unique for each call, opaque to the authorization server, and used by the client to track state between a corresponding request and response to mitigate the risk of CSRF attacks. It should contain a value that associates the request with the user’s session. This could be done by including a hash of the session cookie or other session identifier concatenated with an additional unique-per-request component. When a response is received, the client should ensure the state parameter in the response matches the state parameter for a request it sent from the same browser. |
scope | Indicates the scope of access privileges for which authorization is requested. For example: “get:documents” |
redirect_uri | The authorization server sends its response with the authorization code to this callback URL at the application. For example: https%3A%2F%2Fclient%2Eapplication%2Ecom%2Fcallback |
resource | Identifier for a specific API registered at the authorization server for which the access token is requested. This parameter is defined in the Resource Indicators for OAuth 2.0 extension.[iii] Some implementations may use other names, such as “audience.” Primarily used in deployments with custom APIs. This parameter isn’t needed unless there are multiple possible APIs. |
code_challenge | PKCE code challenge derived from the PKCE code verifier using the code challenge method specified in the code_challenge_method parameter, as described in Section 4.2 of the PKCE specification.[iv] |
code_challenge_method | “S256” or “plain.” Applications capable of using S256 must use it. |
The scope parameter is used by an application to request a scope of access privileges. Using our WriteAPaper application example from the beginning of the chapter, the primary, single-page application would request a scope of “get:documents update:documents,” whereas if the mobile client only needed read access to documents, it would only request “get:documents.”
The resource parameter was not in the original OAuth 2.0 specification. Since that time, authorization servers have been written to handle requests for multiple APIs and, in such cases, may support an additional parameter to indicate a specific API for an authorization request. This parameter may be called the “resource” or “audience.”
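As an illustration of how the parameters in the table fit together, the following sketch assembles an authorization request URL. The endpoint URL, client ID, and function names are hypothetical placeholders; the real values come from your authorization server and application registration.

```python
import secrets
from urllib.parse import urlencode

# Hypothetical endpoint; substitute your authorization server's authorize endpoint.
AUTHORIZE_ENDPOINT = "https://authserver.example.com/authorize"


def new_state() -> str:
    """Return a non-guessable, per-request value to keep in the user's session."""
    return secrets.token_urlsafe(16)


def build_authorization_url(client_id: str, redirect_uri: str, scope: str,
                            code_challenge: str, state: str) -> str:
    """Build the URL the application redirects the browser to (step 2)."""
    params = {
        "response_type": "code",  # authorization code grant
        "client_id": client_id,
        "state": state,
        "scope": scope,  # space-separated list of requested privileges
        "redirect_uri": redirect_uri,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
    }
    # urlencode percent-encodes each value, including the redirect URI.
    return AUTHORIZE_ENDPOINT + "?" + urlencode(params)
```

The application would store the state value (and the PKCE code verifier) in the user's session before redirecting, so both are available when the response arrives.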
Response
Authorization Response Parameters
Parameter | Meaning |
---|---|
code | The authorization code to be used by the application to request an access token. |
state | The state value, unmodified, sent in the authorization request. The application must validate that the state value in the response matches the state value sent with the initial request. |
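The state check on the response can be sketched as follows. This is a simplified example that assumes the application stored the expected state in the user's session when it sent the request; the function name and the callback URL in the usage line are hypothetical.

```python
import secrets
from urllib.parse import parse_qs, urlsplit


def extract_authorization_code(callback_url: str, expected_state: str) -> str:
    """Return the authorization code from the callback, after checking state."""
    query = parse_qs(urlsplit(callback_url).query)
    returned_state = query.get("state", [""])[0]
    # Constant-time comparison of the returned state with the stored one.
    if not secrets.compare_digest(returned_state.encode("utf-8"),
                                  expected_state.encode("utf-8")):
        raise ValueError("state mismatch: discard the response")
    codes = query.get("code")
    if not codes:
        raise ValueError("no authorization code in the response")
    return codes[0]
```

For example, `extract_authorization_code("https://client.application.com/callback?code=abc&state=xyz", "xyz")` returns `"abc"`, while any other expected state raises `ValueError`.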
Calling the Token Endpoint
Token Request Parameters
Parameter | Meaning |
---|---|
grant_type | Must be “authorization_code” for the authorization code grant. |
code | The authorization code received in response to the authorization call. |
client_id | Identifier for the application, assigned when it registered with the authorization server. |
code_verifier | The PKCE code verifier value from which the code challenge was derived. It should be an unguessable, cryptographically random string between 43 and 128 characters in length, inclusive, using the characters A–Z, a–z, 0–9, “-”, “.”, “_”, and “~” and formed as described in Section 4.1 of the PKCE specification.[v] |
redirect_uri | The callback URI for the authorization server’s response. Should match the redirect_uri value passed in the authorization request to the authorize endpoint. |
Token Endpoint Response Parameters
Parameter | Meaning |
---|---|
access_token | The access token to use in calling the API. Different authorization servers may use different formats for access tokens. |
token_type | Type of token issued. “Bearer,” for example. |
expires_in | The lifetime of the access token, in seconds, counted from the time the response was generated. |
refresh_token | A refresh token is optional. It is up to an authorization server's discretion whether to return a refresh token or not. See the Refresh Token section later in this chapter for further information. |
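The token request and response in the tables above might look like this in code. This sketch assumes a public client using PKCE (so no client secret is sent) and a hypothetical token endpoint URL; it builds the request without sending it, and the function names are illustrative only.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; substitute your authorization server's token endpoint.
TOKEN_ENDPOINT = "https://authserver.example.com/token"


def build_token_request(code: str, client_id: str, code_verifier: str,
                        redirect_uri: str) -> Request:
    """Build the POST that exchanges an authorization code for tokens (step 6)."""
    body = urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "client_id": client_id,
        "code_verifier": code_verifier,
        "redirect_uri": redirect_uri,  # same value as in the authorization request
    }).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(TOKEN_ENDPOINT, data=body, headers=headers, method="POST")


def parse_token_response(raw_json: str) -> dict:
    """Pick out the fields described in the token response table."""
    payload = json.loads(raw_json)
    return {
        "access_token": payload["access_token"],
        "token_type": payload["token_type"],
        "expires_in": payload.get("expires_in"),
        "refresh_token": payload.get("refresh_token"),  # optional
    }
```

The request object could be sent with `urllib.request.urlopen` and the decoded response body passed to `parse_token_response`. A confidential client would additionally authenticate itself, for example with a client secret.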
Implicit Grant
1. User (resource owner) accesses the application.
2. Application redirects browser to the authorization server’s authorize endpoint with an authorization request.
3. Authorization server prompts the user to authenticate and provide consent.
4. The user authenticates and provides consent for the authorization request.
5. Authorization server redirects back to the application’s callback URL with an access token.
6. The application uses the access token to call the resource server (API).
Since the OAuth 2.0 specification was originally published, Cross-Origin Resource Sharing (CORS) has become supported by most browsers. Consequently, the implicit grant type isn’t needed anymore for its original purpose. Furthermore, returning an access token in a URL hash fragment exposes the access token to potential leakage via browser history or referer headers. The implicit grant type with the access token returned in a URL hash fragment is no longer recommended for single-page applications needing an access token. The authorization code grant type (with PKCE) should be used instead.
After the release of the original OAuth 2.0 specification, the OAuth 2.0 Multiple Response Type Encoding Practices specification[vii] defined a new “response_mode” parameter for authorization requests that enables applications to request that authorization server responses be returned in new ways. Subsequent specifications defined new response mechanisms. The OAuth 2.0 Form Post Response Mode[viii] encodes response parameters into an HTML form that is sent via HTTP POST to the application. At the time of writing, a draft specification exists for an OAuth 2.0 Web Message Response Mode[ix] which leverages HTML 5 Web Messaging to return an authorization response to an application. The implicit grant type with alternate response modes provides new options to applications that can mitigate issues related to the default response mode.
The Authorization Request
A successful implicit grant type authorization request using the default response mode will result in a redirect back to the application’s redirect URI with the access token, token type, token expiration, and state values in a URL fragment which can be exposed via referer headers and browser history. A request using the form_post response mode will result in the response encoded in an HTML form posted to the redirect_uri, avoiding the URL fragment exposure.
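To make the shape of the request and of the default fragment response concrete, here is a small sketch. Python is used only for illustration; in a real single-page application the fragment would be read by JavaScript in the browser. The endpoint URL and function names are hypothetical.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint; substitute your authorization server's authorize endpoint.
AUTHORIZE_ENDPOINT = "https://authserver.example.com/authorize"


def build_implicit_request_url(client_id: str, redirect_uri: str, scope: str,
                               state: str, response_mode: Optional[str] = None) -> str:
    """Build an implicit grant authorization request URL."""
    params = {
        "response_type": "token",  # implicit grant: token returned directly
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    }
    if response_mode is not None:
        params["response_mode"] = response_mode  # for example "form_post"
    return AUTHORIZE_ENDPOINT + "?" + urlencode(params)


def parse_fragment_response(redirect_url: str) -> dict:
    """Read the response values from the URL fragment (default response mode)."""
    fragment = urlsplit(redirect_url).fragment
    return {name: values[0] for name, values in parse_qs(fragment).items()}
```

The second function shows why the default mode is risky: everything needed to call the API sits in the URL itself.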
Resource Owner Password Credentials Grant
The resource owner password credentials grant type supports situations where an application is trusted to handle end-user credentials and no other grant type is possible. For this grant type, the application collects the user’s credentials directly instead of redirecting the user to the authorization server. The application passes the collected credentials to the authorization server for validation as part of its request to get an access token.
This grant type is discouraged because it exposes the user’s credentials to the application. It has been used for legacy embedded login pages and user migration scenarios. In either case, a vulnerability in the application can compromise the credentials. In addition, this grant type does not involve a user consent step, so an application can request any access it wishes using the user’s credentials. The user has no way to prevent abuse of their credentials.
Consequently, this grant type is primarily recommended for user migration use cases. If users need to be migrated from one identity repository to another with incompatible password hashes, the new system can prompt a user for their credentials, use the resource owner password grant to validate them against the old system, and if valid, retrieve the user profile from the old system and store it and the credentials in the new system. This can avoid the necessity for large-scale forced password resets when migrating identity information. If this grant type is used, the client should throw away the user credentials as soon as it has obtained the access token, to reduce the possibility of compromised credentials.
1. User (resource owner) accesses the application.
2. Application prompts the user for their credentials.
3. The user provides their credentials to the application.
4. Application sends token request to the authorization server's token endpoint, with the user’s credentials.
5. Authorization server responds with an access token (and optionally a refresh token).
6. Application calls the resource server (API), using the access token.
This grant type has also been used in the past with mobile applications calling first-party APIs. This was often done because login flows that redirected via browsers on mobile devices were perceived as cumbersome. This has improved, and RFC 8252, OAuth 2.0 for Native Apps,[x] now recommends the use of the authorization code grant, combined with PKCE, for native applications using the system browser.
The Authorization Request
This sample has the application authenticate to the authorization server with HTTP Basic authentication scheme and a client ID and secret, obtained from the authorization server. A successful request will result in a response from the token endpoint similar to that described in the previous section for the authorization code grant.
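A request of this kind can be sketched as below. The token endpoint URL and function names are hypothetical, and the sketch assumes the client ID and secret contain only characters that need no additional encoding before being placed in the Basic authentication header. The request is built but not sent.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; substitute your authorization server's token endpoint.
TOKEN_ENDPOINT = "https://authserver.example.com/token"


def basic_auth_header(client_id: str, client_secret: str) -> str:
    """Encode client credentials for the HTTP Basic authentication scheme."""
    raw = f"{client_id}:{client_secret}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def build_password_grant_request(username: str, password: str, scope: str,
                                 client_id: str, client_secret: str) -> Request:
    """Build a resource owner password credentials token request (step 4)."""
    body = urlencode({
        "grant_type": "password",
        "username": username,
        "password": password,
        "scope": scope,
    }).encode("ascii")
    headers = {
        "Authorization": basic_auth_header(client_id, client_secret),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return Request(TOKEN_ENDPOINT, data=body, headers=headers, method="POST")
```

Note that the user's password travels through the application here, which is exactly the exposure that makes this grant type discouraged.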
Client Credentials Grant
1. Application sends authorization request including the application's credentials to the authorization server.
2. Authorization server validates the credentials and responds with an access token.
3. Application calls resource server (API) using the access token.
4–6. The steps repeat if the access token has expired by the next time the application calls the API.
No end-user interaction with the authorization server is required for this flow. The application credentials serve as the authorization for the application and are used to request an access token from the token endpoint. Our sample uses a client ID and client secret obtained when the application registered with the authorization server.
The Authorization Request
A successful client credentials grant request will result in a response from the token endpoint similar to that described in the previous section for the authorization code grant.
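A client credentials request might be built as in the following sketch, which sends the client ID and secret via HTTP Basic authentication. The endpoint URL and function name are hypothetical, and some authorization servers instead expect the credentials in the request body, so check your server's documentation.

```python
import base64
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; substitute your authorization server's token endpoint.
TOKEN_ENDPOINT = "https://authserver.example.com/token"


def build_client_credentials_request(client_id: str, client_secret: str,
                                     scope: Optional[str] = None) -> Request:
    """Build a client credentials token request (step 1)."""
    params = {"grant_type": "client_credentials"}
    if scope is not None:
        params["scope"] = scope
    # The application authenticates as itself; no user is involved.
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    headers = {
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    body = urlencode(params).encode("ascii")
    return Request(TOKEN_ENDPOINT, data=body, headers=headers, method="POST")
```

In the chapter's example, this is how the WriteAPaper back end could obtain a token for the quotes API, since that content is not owned by the user.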
Calling an API
The access token has an expiration, so it can only be used for a limited time, but it is not a one-time-use token. As a performance optimization, an access token can be cached by an application and reused until it expires, to avoid making a call to the authorization server for every API call. The access token must have been granted the appropriate scope of privileges for the API calls. This should not, however, encourage the use of overly broad scopes!
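The caching approach can be sketched as follows, assuming a bearer token passed in the Authorization header. The `TokenCache` class, its `leeway` margin, and the `fetch_token` callable are hypothetical names for this example; `fetch_token` stands in for whatever code obtains a token response from the authorization server.

```python
import time
from urllib.request import Request


class TokenCache:
    """Reuse an access token until shortly before it expires."""

    def __init__(self, fetch_token, clock=time.monotonic, leeway=30):
        self._fetch_token = fetch_token  # returns a parsed token response dict
        self._clock = clock
        self._leeway = leeway  # seconds of safety margin before expiry
        self._token = None
        self._expires_at = 0.0

    def get(self) -> str:
        # Fetch lazily: only when a token is needed and none is usable.
        if self._token is None or self._clock() >= self._expires_at:
            response = self._fetch_token()
            self._token = response["access_token"]
            self._expires_at = self._clock() + response["expires_in"] - self._leeway
        return self._token


def build_api_request(url: str, access_token: str) -> Request:
    """Attach the access token to an API call as a bearer token."""
    return Request(url, headers={"Authorization": f"Bearer {access_token}"})
```

Fetching only on demand, rather than renewing in the background, keeps the application from holding a valid token it does not currently need.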
Refresh Token
OAuth 2.0 access tokens have an expiration. When an access token expires, an application could make a new authorization request, but OAuth 2.0 defined an alternative approach for traditional web applications and native clients that involves a refresh token. A refresh token can be obtained from an authorization server and used to obtain a new access token when a previous access token expires. A refresh token can be used to enable ongoing API access from native mobile applications, for example.
Refresh tokens are not used in all scenarios. There is no need for a refresh token with the client credentials grant because an application can simply request an access token programmatically at any time, without a need for user interaction. Static refresh tokens are not used with public clients because they are sensitive tokens and public clients are not capable of securely storing them. The OAuth 2.0 Threat Model and Security Considerations[xi] document proposed the notion of refresh token rotation to detect if a refresh token has been stolen and is being used by two or more clients. This scheme has the authorization server return a new refresh token with each access token renewal request. The OAuth 2.0 Security Best Current Practice document[xii] specifies that authorization servers must use either refresh token rotation or sender-constrained refresh tokens (bound to a particular client) with public clients to mitigate the risk of compromised refresh tokens.
Refresh tokens provide a convenient way for traditional web applications and native applications to obtain new access tokens. This facilitates use of access tokens with a short duration, which minimizes the risk if an access token is compromised. It may be tempting to automatically refresh an access token as soon as it expires, but in keeping with the principle of least privilege, it is better to only refresh an access token when it is needed, rather than always keeping a current access token on hand. In the same vein, an application must store a refresh token securely as it is a sensitive credential.
The OAuth 2.0 specification did not include a mechanism for applications to request refresh tokens, leaving the issuance at the discretion of authorization servers. The handling of refresh tokens may therefore vary across individual authorization servers. Some issue refresh tokens automatically, and others expect an application to explicitly request a refresh token. (The OIDC specification, covered in the next chapter, includes a mechanism for an application to request a refresh token for one specific use case.) The ability to revoke access tokens is not a mandatory feature in the OAuth 2.0 specification, so some authorization servers may not support it. The documentation for your chosen authorization server should explain the implementation-specific details.
When an application uses a refresh token to request a new access token, the access token will be returned in a response similar to that described in previous sections. The scope parameter is optional and, if used, must be equal to or narrower than the scope in the original authorization request. The client credentials passed must be those of the application which made the original authorization request.
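A refresh request for a confidential client can be sketched as below, along with a helper that handles refresh token rotation. The endpoint URL and function names are hypothetical, and the request is built but not sent.

```python
import base64
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; substitute your authorization server's token endpoint.
TOKEN_ENDPOINT = "https://authserver.example.com/token"


def build_refresh_request(refresh_token: str, client_id: str, client_secret: str,
                          scope: Optional[str] = None) -> Request:
    """Build a token request that trades a refresh token for a new access token."""
    params = {"grant_type": "refresh_token", "refresh_token": refresh_token}
    if scope is not None:
        params["scope"] = scope  # must not exceed the originally granted scope
    # Same client credentials as used for the original authorization request.
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    headers = {
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return Request(TOKEN_ENDPOINT, data=urlencode(params).encode("ascii"),
                   headers=headers, method="POST")


def next_refresh_token(current_refresh_token: str, token_response: dict) -> str:
    """With rotation, the server returns a new refresh token; otherwise keep the old one."""
    return token_response.get("refresh_token", current_refresh_token)
```

When the authorization server rotates refresh tokens, the application must replace its stored refresh token with the one from each response, since the previous one will no longer be accepted.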
Guidance
The preceding sections covered an introduction to how an application requests API authorization via OAuth 2.0. An SDK may abstract and simplify some of this interaction or use different parameter names. You’ll need to check the documentation for your authorization server for implementation-specific details. Even if you use an SDK, however, it is valuable to know the form of the underlying calls for troubleshooting. In addition, be sure to check the OAuth 2.0 specification as there are several additional request parameters useful for more advanced use cases.
An access token is meant to be consumed by an API. The format of an access token may vary, but an application should not depend on using data in the access token (in the absence of proprietary extensions). An API that receives an access token must validate it before processing the request it accompanies. The process for validating a token may vary by authorization server implementation.
In general, it is recommended that access token duration be short-lived and a new access token obtained when needed if the previous access token has expired. The exact duration should be determined based on the sensitivity of the resources to be accessed. Access tokens can be cached, for a period of time less than or equal to their expiration, as a performance optimization and/or to avoid hitting rate limits with excessive calls to an authorization server. It is important to note that access tokens and refresh tokens must be stored securely as they are sensitive credentials. You should utilize the secure storage options for your platform when storing these tokens.
Summary
The OAuth 2.0 protocol enables an application to obtain authorization to call an API either on a user’s behalf or on its own behalf. This eliminates the requirement for users to share their credentials with the application. It also provides the user greater control over what the application can do and a limit on the duration of API access. The user can revoke API access for an individual application without impacting the ability of other applications to call the API on their behalf. Once you have an application authorized to call an API, you’ll want to authenticate users to that application, which is covered in the next chapter.
Key Points
OAuth 2.0 enables applications to request authorization and obtain an access token to call APIs.
With OAuth 2.0, a user has control over API authorizations for applications.
Scopes are used to control the access an application has when calling an API.
The original OAuth 2.0 specification defined four grant types.
The authorization code grant type with PKCE can be used by traditional web applications as well as by public clients such as single-page applications and native applications.
The OAuth 2.0 implicit grant type is not recommended to obtain an access token with the default response mode as it exposes the access token to potential compromise.
The OAuth 2.0 resource owner password grant type is best restricted to legacy user migration cases as it exposes user credentials to an application.
The client credentials grant type is for API calls where the application owns the requested resource.
A refresh token is used to obtain a new access token when the old access token expires.
Notes
- i.
- ii.
- iii.
- iv.
- v.
- vi.
- vii.
- viii.
- ix.
- x.
- xi.
- xii.