After discussing API design patterns, I would like to dedicate a complete chapter to this topic due to its importance. All APIs need to know who they are being used by. The answer is provided via authentication and authorization mechanisms. Whatever gets implemented, always remember this:
Authentication and authorization keep data private and shared between authorized entities only!
Authentication vs. Authorization
In any system, almost all relevant APIs require users, or at least clients, to authenticate. And at some point in time, an API will require authorizations, too. It is very important to consider this fact in the first design sessions. The whole API project will be very different without these two attributes.
There are two important questions you must ask. The first question is What is the difference between authentication and authorization? The answer is quite simple:
Authentication answers who I am whereas authorization answers what I can do!
In this statement, “I” could be a user (a person) or a client (an application). I hope that you agree. That’s all there is to say about it.
The second question is What should happen when during message flows? The answer to this question is very different and more complex. The (hopefully) obvious part of the answer is the following:
Only authenticated entities can be authorized!
If an entity is unknown, it is not possible to authorize access to any resources for it. This means whatever flow is executed, authentication happens first!
Username and password as parameters
HTTP basic authentication
x.509 certificates
HTTP cookies
SAML
JWT
Other token-based systems
The chosen option needs to be respected in the API interface design. It makes a difference if HTTP basic authentication is required rather than, let’s say, username and password as parameters. The API definitions would include the following:
HTTP basic authentication:
HTTP methods: GET, POST, PUT, DELETE, PATCH, and more
HTTP header: Authorization: Basic bXk6Ym9vaw==
Content-Type: Any
Username and password as parameters:
HTTP methods: POST, PUT, PATCH, and more. However, the payload (message body) needs to include username and password. If that is not possible in your use case, this method is not an option!
Content-Type: application/x-www-form-urlencoded. Others are possible, but they do not comply with HTTP Form POST mechanisms and require manual extraction of the values on incoming requests.
Note: Methods like GET or DELETE may include username and password as URL query parameters, but I do not consider them viable options (e.g., GET /authenticate?username=bob&password=secret).
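To make the difference between the two options concrete, here is a minimal Python sketch of how a client could build each kind of request. The helper names are my own, and the credentials (username my, password book) are assumptions chosen so that the result matches the header value shown above.

```python
import base64
from urllib.parse import urlencode


def basic_auth_header(username: str, password: str) -> dict:
    """Build the header used with HTTP basic authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def form_credentials(username: str, password: str) -> tuple:
    """Build header and message body for credentials sent as parameters."""
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    body = urlencode({"username": username, "password": password})
    return headers, body


# Hypothetical credentials, for illustration only.
header = basic_auth_header("my", "book")
form_headers, form_body = form_credentials("my", "book")
```

With basic authentication the credentials travel in a header, so any HTTP method works. With parameters they travel in the message body, which is why only methods with a payload are an option.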
With any method certain advantages and disadvantages will be introduced! In any case, the API uses the incoming credentials and validates them. Most likely that involves an identity provider (IDP). That, again, influences the overall system due to the required type of IDP that has to be made available.
Once the authentication is successful, authorization follows. Here is the most common question on that topic:
Should the authorization decision be made in conjunction with the authentication step or only when it actually is needed?
If that is not clear, here is an example. Many apps on mobile devices need to access certain resources such as contacts. Some apps request authorization for that right after or during the installation. These apps preemptively request authorization. Other apps prompt for authorization just in time when they need it.
Preemptive Authorizations
Let’s say a JavaScript app needs access to the database and to the queue via two different APIs. The JavaScript app first has a user authenticated against the IDP and then requests authorizations for those two APIs at the same moment. <ApiProxy> has a mechanism to grant those authorizations and issue a JWT that contains these authorization statements. The JavaScript app now uses that JWT on both APIs exposed on <ApiProxy>. Both APIs validate the JWT, check for a required authorization statement, and grant or deny the request. <ApiProxy> forwards the request to the matching APIs on the backend. Here’s the catch: Both APIs know that the JavaScript app has other authorizations! That may be acceptable, but it may not be! In these situations, always be pessimistic and act in favor of privacy!
It would be better to send a request to those APIs at <ApiProxy> with a credential only. <ApiProxy> would validate the credential and then, at this point in time and for this API only, create an authorization statement that is forwarded to the matching API at the backend.
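The idea of creating an authorization statement for one API only can be sketched as follows. This is not how any particular proxy product works; the credential store, the permission table, and all names are hypothetical and only illustrate the principle.

```python
from dataclasses import dataclass

# Hypothetical data, for illustration only.
KNOWN_CREDENTIALS = {"credential-abc": "js-app"}
ALLOWED_APIS = {"js-app": {"database", "queue"}}


@dataclass(frozen=True)
class AuthorizationStatement:
    client: str
    api: str  # the statement names exactly one API


def authorize_for_api(credential: str, api: str) -> AuthorizationStatement:
    """Validate the credential, then authorize for the requested API only."""
    client = KNOWN_CREDENTIALS.get(credential)
    if client is None:
        # Authentication happens first; unknown entities are never authorized.
        raise PermissionError("unknown credential")
    if api not in ALLOWED_APIS.get(client, set()):
        raise PermissionError(f"{client} may not access {api}")
    return AuthorizationStatement(client=client, api=api)


statement = authorize_for_api("credential-abc", "queue")
```

The backend API behind queue receives a statement that mentions queue and nothing else, so it learns nothing about the app's access to the database.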
Just-in-Time Authorizations
If you imagine an API-based system that never uses preemptive authorizations but just-in-time authorizations only, you can easily imagine that the network traffic could grow considerably. All that noise would decrease the performance of the overall system. Therefore, a compromise between both approaches has to be found.
My recommendation is to grant authorization statements within APIs that serve the same application. For example, the fictional HandlePackages application is based on five APIs; the FindLostParcels application is built on top of three others. An application on top of them, named MyPackageAndParcelBuddy, requires access to all eight APIs.
The single app would request and receive its own authorization statements and would not share them. But MyPackageAndParcelBuddy would now need two different ones: one authorization statement for each feature and with that one per group of APIs. Although this may sound more complicated, it removes the privacy issues.
The next section will talk about OAuth and JWT in more detail and should help you make decisions in your API project. OAuth is an authorization framework that helps with both types of authorizations.
Of all available technologies that could be chosen for this task I will concentrate on OAuth and OpenID Connect. These are practically the default standards of our time, and everyone should have a good understanding of what they are.
OAuth
In today’s API-based systems, OAuth is a technology that is found almost everywhere. I get many questions about OAuth, I have written many blog posts about it, and I have created a web site that has oauth in its name (www.oauth.blog). However, even though this technology has been around for quite some time, it still seems to be challenging.
Here is something you may have heard before. And if not, please pay attention:
OAuth is an authorization framework!
OAuth is not made for authentication.
OAuth is not a replacement of known authentication schemes.
OAuth is not a fixed protocol.
OAuth is not a list of well-defined features or use cases.
If you are not quite sure yet, do not worry. Here is a question I have read on Twitter that emphasizes that many people have trouble understanding it:
Are you using LDAP or OAuth?
If that question is not confusing to you, just keep on reading.
Whoever asked this question did not understand the idea of OAuth. I wrote a blog post about this topic and explained the difference between LDAP and OAuth. The post¹ still gets new views every day, even after more than two years. It seems to be a hot topic!
Resource owner (RO): A person, a user, someone who uses an application
Client: An application (app)
Access token (AT): A short-lived token used by clients to access APIs that require such token as credential. These APIs are referenced as protected resources.
Authorization server (AS): A server that issues the different OAuth tokens
Resource server (RS): A server that provides protected APIs
Protected resource (PR): An API that serves information about or for the resource owner
SCOPE: Permissions a client is requesting (more details further down)
The username and password were only shared between resource owner and authorization server. Neither the client nor the resource server saw those credentials.
The resource owner was asked to provide his consent! This means that the resource owner was in the position to decide whether the client could access his calendar or not!
The client received an access_token, which it used with its API request GET /calendar?access_token to the resource server. This was good enough for the resource server to accept the request and return the calendar details {"calendar":"details"}. No user credentials required!
A few years ago, the resource owner would have configured the client with his username and password and the client would have accessed protected resources impersonating the resource owner. With OAuth, the client accesses the protected resources on behalf of the resource owner!
This was the first flow example, but since OAuth is a framework, it supports other flows too. There are also terms that must be discussed. If you are not interested in knowing the bits and pieces, then at least remember that OAuth is a mechanism for authorizations! If you want to know more, keep on reading.
OAuth, the Details
OAuth supports different flows. They are called grant_types.
- A grant_type can be one of the following:
authorization_code (CODE)
Resource owner password credentials (ROPC)
refresh_token (RT)
client_credentials (CC)
Implicit
- OAuth specifies two types of clients:
Public (no, I cannot keep a secret to myself)
Confidential (yes, I can keep a secret to myself)
- OAuth specifies two APIs:
/authorize (web-based)
/token (API-based)
- OAuth matches different flows to different types of clients (applications):
JavaScript clients
Mobile clients (native implementations)
Web applications
OAuth requires an explicit or implicit consent of resource owners for a client.
- OAuth supports flows that do not involve a resource owner.
client_credentials
- OAuth specifies three different types of tokens:
access_token
refresh_token
authorization_code
- Upper left and right corner:
The types of applications relate to client types.
- Lower left and right corner:
resource_owners (users) provide an explicit consent, requested during an authorization flow, or implicitly by just using a client.
With the client_credentials (cc) flow no user is involved and therefore no consent is required.
- /authorize, /token
The two APIs that are specified in OAuth
/authorize is used with browser based flows and displays a login and consent screen.
/token is used as plain data API; no website is involved.
- Public, Confidential
The distinction between clients that are secure and able to keep a secret (confidential) or not (public)
- Implicit
A flow that results in a client receiving an access_token
- CODE, ROPC, RT, CC
Flows that result in a client receiving an access_token and optionally a refresh_token
- Dotted rectangle surrounding Implicit and CODE
Both flows begin with a request to /authorize and involve a browser.
Both flows include an initial parameter named response_type (more about that below).
In comparison to implicit, CODE receives a temporary token (authorization_code) instead of an access_token. The temporary token has to be exchanged for an access_token in a second step.
- Dotted rectangle surrounding CODE, ROPC, RT, and CC
All these flows are API-based with no browser involved.
Resource_owners are not required to provide explicit consent. Or they have given it previously.
All flows include an initial parameter named grant_type (more about that below).
1. An application needs to authenticate, but users do not.
2. Users should grant applications explicitly when using the mobile app.
Use case 1: No user, but the client needs to authenticate ➤ cc (client_credentials). From that, you can see that the client type must be confidential and should be implemented as a web application (or at least on a server). The client will use the /token endpoint; no consent is required.
Use case 2: Start off in the explicit consent corner. Via /authorize you get to choose the implicit or the CODE flow. Since the client is mobile, it is also public.
Now, let’s begin discussing flows and all their details! Along the way I will introduce all parameters and hopefully everything that needs to be known about them.
OAuth flows (grant_types)
OAuth supports different flows that clients can choose to obtain authorizations. All flows have a few attributes in common and some specific ones. The common ones are explained in the “General rules” bullet points and specifics are explained within their own section. Whenever anyone starts working with OAuth, they always ask, Which flow shall I use? The following sections will explain which one to use and why.
The /authorize API accepts requests using HTTP GET or POST and always responds with a redirect (HTTP status 302) unless a redirect_uri is not available.
The /token API only accepts requests using HTTP POST and always responds with content-type application/json.
HTTP POST requests are always used with content-type application/x-www-form-urlencoded.
HTTPS is a must!
For any flow that involves a browser, web-based vulnerabilities have to be addressed.²
Wherever redirect_uris are used, only accept registered ones! Never accept open redirects!
Submitted parameters must be URLEncoded. A typical error is to URLEncode a complete URL instead of just the parameters. It should be done like this:
https://example.com/authorize?
key1=urlEncode(value1)
&key2=urlEncode(value2)
instead of
https://example.com/authorize?
urlEncode(key1=value1&key2=value2)
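In Python, the difference between the two variants can be shown with the standard library. The parameter values are made up for this sketch; urlencode encodes each value on its own, whereas quote applied to the complete query string reproduces the typical error.

```python
from urllib.parse import quote, urlencode

# Hypothetical parameters, for illustration only.
params = {
    "redirect_uri": "https://client.example.com/cb?x=1",
    "scope": "read write",
}

# Correct: encode each value and keep '?', '=' and '&' as delimiters.
correct = "https://example.com/authorize?" + urlencode(params)

# Typical error: encode the complete query string as if it were one value.
raw_query = "&".join(f"{key}={value}" for key, value in params.items())
wrong = "https://example.com/authorize?" + quote(raw_query, safe="")
```

In the wrong variant the delimiters = and & are encoded as well, so the receiving server can no longer tell where one parameter ends and the next one begins.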
The examples following here show flows (grant_types) with example configurations. As you get into OAuth, you will discover that any of the following can be used with different parameter values. Nevertheless, to get started, try it as shown for the next five flows, even if the whole section is very technology heavy.
Implicit Grant
Description: A client is requesting an access_token using the response_type token. This response_type requires a browser or a web view on a mobile device and prevents the client from accessing the resource owner’s credentials. Implicit flows are not secure when it comes to the visibility of issued tokens. This flow should only be considered if an exposed access_token is not a risk.
{client_id}: This is a unique identifier that is known at the authorization server and identifies one specific client. It has to be preregistered before it can be used.
{response_type}: For implicit flows the value is token, which advises the authorization server to include an access_token in its response.
{requested_scope}: A client optionally requests scope values. Scope values are specific per environment and are practically permissions. Multiple values may be provided as a space-separated list of values (but URLEncoded!).
{redirect_uri}: The authorization server will return any error messages or issued token attached to this URL as a URL fragment. The fragment is indicated by the number sign (#). A fragment is only available to the browser! The {redirect_uri} value used in the request must match a pre-registered value. The authorization server will not accept a request if there is a mismatch.
{state}: An optional state can be included in the request. It is opaque to the authorization server and is meant for the client only. It can be used to prevent CSRF³ attacks. The authorization server will attach the value as-is to the given redirect_uri in its response.
{granted_scope}: The authorization server may not grant the requested scope. Therefore, the response includes granted scope.
{access_token}: The token that can be used by the client to access protected APIs.
Access Token displayed in browser: #access_token={access_token}
On mobile devices, a redirect_uri of a third-party app may be invoked. With that, the token is received by the wrong app!
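The request and response of the implicit flow can be sketched in a few lines of Python. The endpoint URL, the client_id, and the token values are assumptions made up for this example; the parameter names are the ones described above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_implicit_request(authorize_url, client_id, redirect_uri, scope, state):
    """Build the URL the browser is sent to; each value is URLEncoded."""
    query = urlencode({
        "client_id": client_id,
        "response_type": "token",  # asks for an access_token directly
        "scope": scope,            # space-separated list of scope values
        "redirect_uri": redirect_uri,
        "state": state,
    })
    return f"{authorize_url}?{query}"


def parse_implicit_response(location, expected_state):
    """Read the values the authorization server attached as URL fragment."""
    fragment = urlsplit(location).fragment
    values = {key: vals[0] for key, vals in parse_qs(fragment).items()}
    if values.get("state") != expected_state:
        raise ValueError("state mismatch, possible CSRF attempt")
    if "error" in values:
        raise ValueError(f"authorization failed: {values['error']}")
    return values


# Hypothetical values, for illustration only.
request_url = build_implicit_request(
    "https://as.example.com/authorize", "client-123",
    "https://client.example.com/cb", "read_calendar", "xyz",
)
response = parse_implicit_response(
    "https://client.example.com/cb#access_token=abc123&state=xyz&scope=read_calendar",
    expected_state="xyz",
)
```

Comparing the returned state with the one that was sent is what makes the parameter useful against CSRF; a client that ignores the returned value gains nothing from it.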
Authorization_code Grant, Step 1
Description: A client is requesting an access_token using the response_type code. This response_type requires a browser or a web view on a mobile device and prevents the client from accessing the resource owner’s credentials. This is the most secure response_type when it comes to the visibility of issued tokens. The result is a temporary token, which has to be exchanged for an access_token afterwards (step 2).
Note
This is also the flow used for social logins!
HTTP status=302
HTTP header ‘Location={redirect_uri}
&state={state}
&code={authorization_code} // difference compared to ‘implicit’
{response_type}: For the code flow the value is code, which advises the authorization server to include an authorization_code in its response.
{authorization_code}: A temporary token
On mobile devices a redirect_uri of a third-party app may be invoked. With that, the authorization_code is received by the wrong app! To mitigate this risk, apply RFC 7636, Proof Key for Code Exchange.⁴
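As a rough illustration of RFC 7636, the client creates a random code_verifier, sends its SHA-256 hash as code_challenge (with code_challenge_method=S256) in Step 1, and presents the code_verifier itself in Step 2. A third-party app that intercepts the authorization_code cannot exchange it without the verifier. The function names below are my own; this is a sketch, not a complete client.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    """Create a random verifier; 32 random bytes give 43 URL-safe characters."""
    random_bytes = secrets.token_bytes(32)
    return base64.urlsafe_b64encode(random_bytes).rstrip(b"=").decode("ascii")


def make_code_challenge(verifier: str) -> str:
    """Derive the S256 challenge: base64url(SHA-256(verifier)), no padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


verifier = make_code_verifier()          # kept by the client, sent in Step 2
challenge = make_code_challenge(verifier)  # sent with the request in Step 1
```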
Authorization_code Grant, Step 2
Description: After receiving an authorization_code in Step 1, the client now needs to exchange the code for an access_token.
{client_secret}: Just like a password for users, clients have a client_secret.
{grant_type}: For this flow, the value is authorization_code. It advises the authorization server to use the value of code as grant. The authorization server will validate the code and find the associated resource_owner who has granted the client in Step 1.
{refresh_token}: A second token that can be used by the client to request a new access_token when the first one expires.
{redirect_uri}: This value has to match the value used in Step 1!
One of the few risks is the mix-up problem. This occurs when a client receives an authorization_code from one server but tries to exchange it for an access_token with a fraudulent server.⁵
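The request for Step 2 can be sketched like this. It only builds the header and message body for a POST to /token using the parameters described above; the client_id, client_secret, and code values are hypothetical, and sending the request is left to whatever HTTP library you use.

```python
from urllib.parse import urlencode


def build_code_exchange(client_id, client_secret, code, redirect_uri):
    """Build header and body for exchanging an authorization_code."""
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    body = urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,  # has to match the value of Step 1
        "client_id": client_id,
        "client_secret": client_secret,
    })
    return headers, body


# Hypothetical values, for illustration only.
exchange_headers, exchange_body = build_code_exchange(
    "client-123", "s3cret", "code-789", "https://client.example.com/cb"
)
```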
Resource Owner Password Credentials (ROPC) Grant
Description: This flow is considered only for trusted clients. The client receives the resource_owner credentials directly. This may be chosen only if the owner of the user credentials (such as an enterprise business) is also the owner of the client (client for employees).
{grant_type}: For this flow the value is password. It advises the authorization server to use the provided username and password to authenticate the resource_owner.
{username}: The username of the resource_owner who uses the client
{password}: The resource_owner’s password
To be used with caution since the client receives the user credentials.
Refresh Token Grant
Description: A client uses a refresh_token to request a new access_token, optionally a new refresh_token. By design, this token is valid until the resource_owner revokes it. However, many implementations do support an expiration date.
{grant_type}: For this flow the value is refresh_token. It advises the authorization server to issue a new token based on the provided refresh_token.
{refresh_token}: An existing refresh token
{requested_scope}: The requested scope cannot include any value that was not requested in the initial authorization request through which this refresh_token was received!
Potentially this is a long-lived token. With that, it may be necessary to have resource_owners prove from time to time that they are still in possession of the client that received this token.
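The scope rule for this flow is easy to get wrong, so here is a small sketch of how an authorization server might enforce it. The function name is my own, and the assumption that an omitted scope means "same scope as before" follows RFC 6749.

```python
def narrow_scope(original_scope: str, requested_scope: str = "") -> str:
    """Return the scope for the new access_token or reject the request."""
    original = original_scope.split()
    if not requested_scope:
        # No scope requested: keep what was granted originally.
        return " ".join(original)
    requested = requested_scope.split()
    not_granted = [value for value in requested if value not in original]
    if not_granted:
        raise ValueError(
            "scope was not part of the initial request: " + " ".join(not_granted)
        )
    return " ".join(requested)
```

A refresh request may therefore narrow the scope, but it can never widen it.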
Client Credentials Grant
Description: A client requests authorization on its own behalf. No resource_owner is involved.
{grant_type}: For this flow the value is client_credentials. It advises the authorization server to grant authorization on behalf of the client. The client is also the resource_owner.
Only confidential clients are supported by this grant type.
These are all flows as specified by RFC 6749. If you are a hardcore OAuth expert, you will notice that I have neglected available options for some flows. For example, client credentials can alternatively be provided as an HTTP header ‘Authorization: Basic base64(client_id:client_secret)’ and not as parameters. Nevertheless, I believe the provided descriptions are sufficient in this context.
Tip
You may have observed that these flows often referenced username, password as parameters in order to authenticate a resource_owner. Needing to reference username, password is actually only required when the ROPC flow is used! It is not the case for the implicit and CODE flow. Username and password are only used in the RFC and in this chapter because it is the most common way to authenticate users.
I encourage you to choose the best way for your environment to authenticate resource_owners! It may be by cookie, by SAML, by JWT, or a combination of a phone number and an OTP. Whatever it is, do not limit yourself to anything that does not work for your environment. For example, a product I work on issues an authorization_code after resource_owners go through a social login flow with a social platform. No username or password is ever visible in our product, only the code!
OAuth SCOPE
Scope is specified in RFC 6749, but more or less along the lines of “scope exists, and it can be used however you want.” The RFC does not specify any values, nor does it provide a good guideline for it. Many questions I get around scope are caused by this openness. But, before you complain and say, Yes, I have noticed that and it annoys me, please remember that OAuth is a framework! Frameworks usually do not provide details such as specific values. Instead, a framework lets you build whatever you like but within a given and well-known environment. Look at it as a good thing!
A client requests authorization, including scope=read_calendar.
An access_token gets issued, associated with scope=read_calendar.
The client uses the access_token at a protected API, which requires the access_token to be associated with that scope.
The client can read the calendar.
If the same protected API also supports updating a calendar, it may require a second scope for that such as scope=update_calendar. The client would have to request that scope additionally, like scope=read_calendar update_calendar. If it tries to update a calendar without having an access_token associated with scope=update_calendar, the request will fail!
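At the protected API, the check described here could be as small as the following sketch. The function name is my own; the scope values are the ones used in the example.

```python
def check_scope(token_scope: str, required_scope: str) -> None:
    """Fail unless the access_token is associated with the required scope."""
    # Compare whole values; a substring test would accept look-alike scopes.
    granted = token_scope.split()
    if required_scope not in granted:
        raise PermissionError(f"access_token lacks scope {required_scope}")


# A token issued for both scopes passes the check for updates.
check_scope("read_calendar update_calendar", "update_calendar")
```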
It is important to remember that scope should be used as permission for clients but not for resource owners! I have often been asked how scope can be issued based on authenticated users that have granted the client. In most cases, the ask is to do it based on certain attributes such as role (e.g., administrator, writer, developer). To be blunt, that is a bad idea!
Employees: Employees are managed in an LDAP server.
Employee attributes: Attributes are managed in an LDAP server.
OAuth clients: Clients are managed in a database or an LDAP server.
SCOPE: Scope values are managed in a database or an LDAP server.
APIs: APIs are managed in an API Portal system.
API requires scope; scope is granted to clients.
API requires attributes; attributes are assigned to resource owners.
- During the authorization request:
Grant scope based on client.
- When a protected API is accessed:
API checks for scope.
API checks for attributes of resource_owner.
Using this approach does not tie together scope and resource_owner attributes.
- During the authorization request:
Grant scope based on client.
Grant scope based on resource_owner.
- When a protected API is accessed:
API checks for scope.
Enabling this does not only imply that all scopes are assigned to clients and resource owners. It also implies that the authorization server is able to know which APIs will be accessed by the client. That is often not the case! A client may be able to access the API /calendar but also /email. Both APIs may use the same scopes: read write update.
Unfortunately, a typical authorization request does not include the information of which API will be accessed. The only parameter that could be used is scope. But now scope values cannot be reused for different APIs! It will cause a huge maintenance challenge! The two APIs would now need their own scopes such as read_email write_email update_email. And if you assume that those APIs have multiple versions it introduces another level of scope complexity.
The application CalendarClient is used by owners of a calendar but also by administrators.
- The protected API to access a calendar supports these features:
Read a calendar: scope=read
Update a calendar: scope=update
Delete a calendar: scope=delete
- Update other calendar: scope=write_other
This scope enables a client to update a calendar of other resource_owners.
The client CalendarClient is used by any employee and always requests the same scope: scope=read update delete write_other.
- The authorization server authenticates the client and the resource_owner, and issues those scopes. This means the authorization server only checks these conditions:
Valid client requesting valid scope?
Valid user?
Both validations successful? → issue access_token
- The calendar API, however, implements this logic:
For all operations, it will check if the required scope is associated with the given access_token.
For any non-read operation, it will also check if the associated resource_owner is also the owner of the accessed calendar! This is not based on scope but is based on attributes. No other user than the owner should be able to modify the calendar.
In addition, the API has implemented support for writing onto other calendars if the associated resource_owner is an administrator. This is also based on attributes.
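The logic of the calendar API can be sketched as follows. The TokenInfo type and its field names are hypothetical; they stand for whatever the API learns after validating an access_token. Note how scope and attributes are checked separately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenInfo:
    """What the API knows about a validated access_token (hypothetical shape)."""
    scope: frozenset
    resource_owner: str
    is_administrator: bool = False  # an attribute, not a scope


def is_allowed(token: TokenInfo, operation: str, calendar_owner: str) -> bool:
    if operation not in ("read", "update", "delete"):
        return False
    if operation == "read":
        return "read" in token.scope     # scope check only
    if token.resource_owner == calendar_owner:
        return operation in token.scope  # scope plus ownership
    # Someone else's calendar: updates only, administrators only,
    # and only if the client was granted write_other.
    return (operation == "update"
            and "write_other" in token.scope
            and token.is_administrator)


ALL_SCOPES = frozenset({"read", "update", "delete", "write_other"})
employee = TokenInfo(ALL_SCOPES, "alice")
admin = TokenInfo(ALL_SCOPES, "carol", is_administrator=True)
```

Both tokens carry exactly the same scope, because the client always requests the same values. What differs is the attribute of the resource_owner, and only the API evaluates it.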
To decide how OAuth clients, scopes, resource_owners, and APIs are related to each other, do not hesitate to take the team and simulate different approaches. Make sure team members of different groups within the organization are involved!
On a big side note, be conscious about naming conventions and remember that most resource_owners do not know what scope is. And they should not have to know. If your team implements a Consent page that displays requested scope values (permissions), make sure to not display the scope value by itself! In most cases, that will be perceived as completely useless and confusing.
Client xyz requests SCOPE: read update delete to manage your calendar.
Client xyz would like to manage your calendar.
Scope should always be represented as a human-readable message!
OAuth Consent
One reason why OAuth became popular is the fact that resource_owners are put in control of who can access their data. Most likely anybody reading this book has been in front of a screen that displayed something like this: “Application xyz would like to access your email address. Do you grant this request?” This is the point in time where a click on Grant or Deny shows the power any user has. Clicking Deny simply rejects the wish of an application. No administrator or any other entity can overrule the decision.
Although this is very good, there is something that has not been supported so far, at least not on a larger scale. Whenever a user clicks Grant, there has been no specified location where this decision could have been viewed. Sure, some applications have a section within a user profile saying “Associated applications” or similar. But there is no standardized way of supporting this kind of feature.
In recent months the term “consent receipt” has been brought up often, especially during the introduction of GDPR⁶ in Europe. It’s exactly what it is called: a receipt for any given consent. This came up first (as far as I know) at the workshop “Internet Identity Workshop (IIW)” in Mountain View, California in October 2015.⁷ The concept is similar to a receipt you get after purchasing an item in a store. It states clearly what has been purchased when and where. It can be used to prove that this event happened.
Consent receipt | |
---|---|
Application: | API Book Consent Receipt App |
Date: | 10. June 2018, 13:10:00 PST |
Permissions: | read write update |
Domain: | example.com |
Expiration: | unlimited, revocation required |
URL: | |
Reason: | Required as an example |
Status: | Active |
It is more important than ever to enable any resource_owner to find an overview of receipts. And, as a vital feature, let resource_owners revoke a consent but without removing the history of such events!
The receipt above could change its state from Active to Revoked when the resource_owner decides to revoke access for the associated client.
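One possible way to model such a receipt is shown below. There is no standardized data model behind this sketch; the class and its fields are assumptions that mirror the example receipt. The point is that a revocation changes the status but appends to the history instead of deleting anything.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ConsentReceipt:
    application: str
    permissions: list
    domain: str
    reason: str
    status: str = "Active"
    history: list = field(default_factory=list)

    def __post_init__(self):
        self._record("Active")

    def _record(self, status: str) -> None:
        # Every status change is appended; nothing is ever removed.
        self.status = status
        self.history.append((datetime.now(timezone.utc), status))

    def revoke(self) -> None:
        if self.status != "Revoked":
            self._record("Revoked")


receipt = ConsentReceipt(
    application="API Book Consent Receipt App",
    permissions=["read", "write", "update"],
    domain="example.com",
    reason="Required as an example",
)
receipt.revoke()
```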
OAuth and Step-Up Authentication
Let me answer this question first:
What is step-up authentication?
In general, it means requiring a stronger credential than has been provided in the past. If a user has been authenticated by username and password, step-up may mean providing a one-time password or answering questions x, y, and z. Step-up is usually defined within a specific domain.
Despite the fact that OAuth by itself has no such concept as step-up authentication, I have been in many meetings about this topic. Most meetings asked the question of when to require step-up authentication: during the initial authentication (when granting a client) or at the point in time when a specific API gets accessed?
I always look at it this way: If you want to know whether a resource_owner is who he claims to be when it comes to transferring one million dollars, you want the step-up authentication to happen at the moment the money is transferred!
Here is an example.
- API: /transfer
Moves funds from one account to another
- API: /stepup
Authenticates resource_owners
1. Client request:
POST /transfer
Authorization: Bearer {access_token}
Content-Type: application/x-www-form-urlencoded

amount=1000000&from_account=111&to_account=222
2. API /transfer: The API validates the incoming request. It realizes that the original authentication statement of the resource_owner, who is associated with the given access_token, is more than 15 minutes old and has an authentication context class reference (acr)⁸ value of 1, but it requires 3! It returns this response, requiring a new, stronger authentication:
HTTP status: 401 (authentication required)
3. The client receives the response and redirects the resource_owner to /stepup.
4. API /stepup: It asks the resource_owner to provide a username, password, and an OTP (one-time password), which has been sent to his mobile device. Once the resource_owner confirms the OTP, the client redirects him back to /transfer, using the same values as before.
5. API /transfer: The validation of the incoming request now succeeds, and the amount can be transferred from one account to another.
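The validation that /transfer performs in this example can be sketched as follows. The threshold of 15 minutes and the required acr value of 3 are taken from the example; treating either condition alone as sufficient to demand step-up is my assumption, as are the function and constant names.

```python
import time

MAX_AUTH_AGE_SECONDS = 15 * 60  # authentication must not be older than this
REQUIRED_ACR = 3                # minimum authentication strength for /transfer


def transfer_status(auth_time: float, acr: int, now: float = None) -> int:
    """Return the HTTP status /transfer would answer with."""
    now = time.time() if now is None else now
    too_old = now - auth_time > MAX_AUTH_AGE_SECONDS
    too_weak = acr < REQUIRED_ACR
    if too_old or too_weak:
        return 401  # client has to send the resource_owner to /stepup
    return 200
```

After a successful step-up, the authentication is fresh and has the required strength, so the repeated request passes the same check.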
If the same step-up authentication had been required during the initial authorization flow, there would be no guarantee that the authenticated user is still the same when the amount of $1,000,000 is transferred.
As a hint, keep this in mind:
Require step-up authentication as close to the requiring event as possible!
Although OAuth by itself has nothing to do with step-up authentication, it may still be related to it!
JWT (JSON Web Token)
The book early on referenced JWT but did not explain what it is. The next section introduces id_token. Before I continue, I would like to explain how JWT and id_token look and how they relate to each other. That should make it easier to follow the next few pages.
- JWT header ({from zero to first dot}.)
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9
base64 decoded: {"alg":"HS256","typ":"JWT"}
- JWT payload (.{between the two dots}.)
eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ
base64 decoded: {"sub":"1234567890","name":"John Doe","iat":1516239022}
- JWT signature (.{after the last dot to the end of the string})
SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5
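The decoding of the first two parts can be reproduced with Python's standard library. JWT uses the URL-safe base64 variant without padding, so the sketch adds the padding back before decoding. The helper name is my own.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment that has its '=' padding stripped."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


header = json.loads(b64url_decode("eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9"))
payload = json.loads(b64url_decode(
    "eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ"
))
```

Keep in mind that decoding is not validating: anybody can read header and payload, and only the signature tells you whether they can be trusted.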
This simple format enables JWTs to be exchanged as an HTTP parameter or header, although they are not bound to HTTP! Generally, JWTs may be used in any context. JWTs are also not bound to protocols or frameworks such as OpenID Connect or OAuth. On the other hand, the usage of JWT in OAuth and OpenID Connect is one reason for its wide adoption.
- Message integrity:
A message is communicated between party A and C via B. Party B should not be able to manipulate the message. Therefore, party A creates a JWS using a shared secret. Party C can validate the integrity.
Example: An application supports financial transactions that include a currency, an amount, and a recipient. It is important that none of those values can be manipulated.
- Message confidentiality:
A message is communicated between party A and C via B. Party B should not be able to read the message. Therefore, party A creates a JWE using party C’s public key. Party C can decrypt and read the message, but party B cannot.
Example: An application communicates health data between different parties. Only authorized ones should be able to read the messages.
JWS and JWE both support shared secrets and public/private keys. Shared secrets have to be exchanged via a secure method which, unfortunately, is not specified in the RFCs. Nevertheless, in OAuth and OpenID Connect the OAuth client_secret is usually used for this purpose. For public/private keys, the JWT header may contain complete certificate chains or references to the keys that were used. Recipients can use this information to determine which key to use for validation purposes. For a list of all header values, refer to RFC 7515, section 4.
Next, I will explain what id_token is and after that how JWT, JWS, and id_token work together.
id_token
Overview of id_token keys
Key | Example | Required | Short description |
---|---|---|---|
iss | https://server.example.com | true | Issuer: The issuing entity. Usually a valid URL. |
sub | 24400320 | true | Subject: Either a username or a ppid (pairwise pseudonymous identifier) |
aud | s6BhdRkqt3 | true | Audience: The audience this id_token is intended for. The client_id of the requesting client. Optionally other audiences. |
exp | 1530774920 | true | Expiration: The 10-digit Unix timestamp (seconds since 01-01-1970) when this token expires |
iat | 1530772920 | true | Issued at: The 10-digit Unix timestamp when this token was issued |
auth_time | 1530772120 | false | Authentication time: The 10-digit Unix timestamp when the resource_owner was authenticated |
nonce | a-ranD8m-4alue | false | A client-side value, opaque to the server. It is available only if the client included it in its authorization request. |
acr | http://fo.example.com/loa-1 | false | Authentication Context Class Reference, specifying the LoA (Level of Assurance) of the authentication |
amr | otp pwd | false | Authentication Methods Reference: A reference to the method of authentication |
azp | s6BhdRkqt3 | false | Authorized Party: The client_id of the requesting client |
The highlighted keys (Issuer, Audience, Expiration) are the ones that are always relevant when validating id_token. Others may be neglected in simple use cases.
Since id_tokens are also JWTs, they are expressed as JWS. With that, they are URL friendly and integrity protected! Because of that, the terms id_token and JWT are often used interchangeably. But keep this in mind:
id_tokens are just one type of JWT!
Creating an id_token (JWT)
- Create the JWT header:
{"typ":"jwt", "alg":"HS256"}: Indicates the usage of a shared secret using the algorithm HMAC-SHA256. The receiving party has to be informed which shared secret to use for the signature validation.
{"typ":"jwt", "alg":"RS256", "kid":"d273113ad205"}: Indicates the usage of a private key using the algorithm RSASSA-PKCS1-v1_5 SHA-256. For validations the receiving party has to use the public key referenced as d273113ad205.
- Create the payload:
This is the id_token
- Create the signature:
- Create the input:
input = base64urlEncode(jwt-header) + "." + base64urlEncode(jwt-payload)
- Sign the input:
JWT-signature = base64urlEncode(sign(alg, input))
- Serialize the output (referred to as JWS Compact Serialization):
jwt.compact = input + "." + JWT-signature
The string jwt.compact can now be returned to a requesting client. The process of validating the JWT will be discussed later.
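The steps above can be sketched in a few lines of Python for the shared-secret case (HS256), using only the standard library. This is a minimal illustration, not a production implementation: the function names are my own, the secret is a placeholder, and for RS256 a cryptography library would be needed instead of hmac.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data: bytes) -> str:
    """base64url without padding, as used by JWS Compact Serialization."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _encode_json(obj: dict) -> str:
    return b64url_encode(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


def create_jwt_hs256(payload: dict, shared_secret: bytes) -> str:
    """Create header, build the input, sign it, and serialize the result."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = f"{_encode_json(header)}.{_encode_json(payload)}"
    signature = hmac.new(
        shared_secret, signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{b64url_encode(signature)}"


def verify_jwt_hs256(jwt: str, shared_secret: bytes) -> bool:
    """Recreate the signature over the input and compare in constant time."""
    signing_input, _, signature_b64 = jwt.rpartition(".")
    expected = hmac.new(
        shared_secret, signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return hmac.compare_digest(b64url_encode(expected), signature_b64)


claims = {"sub": "1234567890", "name": "John Doe", "iat": 1516239022}
token = create_jwt_hs256(claims, b"a-placeholder-shared-secret")
print(token)
```

The verification function deliberately ignores the alg value in the header and always uses HMAC-SHA256. The reason for that becomes clear later in this chapter.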
OpenID Connect
OpenID Connect is described as an identity layer on top of OAuth 2.0. It adds the missing link between an OAuth application and resource_owners. In particular, it enables developers to implement applications that are aware of the current resource_owner. It also supports identity federation between different parties.
Why OpenID Connect?
In cases where OAuth is used with a response_type (requests sent to the OAuth /authorize API), clients are generally not able to retrieve details of the resource_owner. They cannot display a message such as Hello Sascha!, even though that is often desired. To work around this limitation (or rather, this part of OAuth’s privacy model), applications have implemented proprietary OAuth-protected APIs that simply return those details. To access those details, resource_owners must grant permissions (SCOPE) that are also proprietary.
This situation did not make developers’ lives easier! For example, a developer who wanted to build an application that retrieved user details from two different platforms had to use different SCOPE values and different APIs that produced different responses. In one case, it could have been SCOPEs such as wl.basic, wl.emails; in the other case, user_about_me, email. In one case, the API would have been /user; in the other case /me. And with that, responses were different, too.
After some time, the OpenID Foundation took on the task of creating a specification to align all those different efforts. OpenID Connect, as an identity layer on top of OAuth, was born!
How Does It Work?
- 1. Request an access_token granted for specific SCOPEs.
- 2. Send an OAuth request to the resource_server and receive the resource_owner’s details.
That’s it, on a high level! On a lower level, there are many details around it. But first things first.
- Formalized OAuth SCOPE
openid, email, profile, address, phone
- Formalized userinfo API that returns details about the resource_owner
/userinfo, request, and response
- Introduced a token, identifying an authenticated resource_owner
id_token (JSON message with well-defined structure)
- Introduced and extended OAuth response_types
response_type=token id_token
response_type=code // this exists in OAuth, but in combination with SCOPE=openid the token response includes an id_token
Additional response_types were added too, but not right from the beginning
For each SCOPE, OpenID Connect has specified a list of claims that may be returned. This enables a developer to implement an application that can handle responses of different platforms with one code base.
The way to invoke the /userinfo API is always the same. The response is always the same: a JSON message with a well-defined structure.
The id_token is a JWT expressed as a JWS. The client can validate it without having to send a validation request to the issuing server.
The different response_types allow clients to choose the desired flow, depending on their use case.
An authorization request always starts off at the OAuth /authorize API. Here is a simple example:
GET /authorize?client_id=...&redirect_uri=...&state=astatevalue&...
...scope=openid+email+profile&response_type=token+id_token
SCOPE openid: The client indicates to the server that it is requesting an OpenID Connect flow. Look at this value as a kind of switch, as in OpenID Connect on/off. If it is not included, any of the other SCOPE values will be treated as non-OpenID Connect values. Some server implementations may even fail the request. The response will include the claim sub, which contains either the username as plain text or a ppid, which is expressed as an opaque string.
SCOPE profile: The client is requesting general information about the resource_owner such as name, family_name, given_name, preferred_username.
SCOPE email: The client is requesting the email address of the resource_owner. The response will also include the claim email_verified. This indicates that the responding platform can confirm that this email address is a valid one.
Response_type token id_token: The value token is known from OAuth and indicates an implicit flow. The server will respond with an OAuth access_token. In addition, an id_token will be issued. The id_token cannot be used at any protected API. Instead, it represents an authenticated user.
- Response from /authorize would include this in the redirect_uri:
...#access_token=... &id_token=eyJh...ssw5c&...
- Response from the /userinfo API could look like this:
{"sub": "12ab34cd56ef","preferred_username": "saspr","name": "Sascha Preibisch","email": "[email protected]","email_verified": true}
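A client could assemble the authorization request and read the response from the redirect roughly as follows. This is a sketch under assumptions: the endpoint, client_id, and redirect_uri values in the usage lines are made up, and the function names are hypothetical. Real deployments usually add further parameters, such as the nonce listed in the id_token table.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_authorize_url(
    authorize_endpoint: str, client_id: str, redirect_uri: str, state: str
) -> str:
    """Build an OpenID Connect request for the implicit flow shown above."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "state": state,
        "scope": "openid email profile",    # urlencode turns spaces into '+'
        "response_type": "token id_token",
    }
    return f"{authorize_endpoint}?{urlencode(params)}"


def parse_fragment_response(redirect_url: str) -> dict:
    """Extract the parameters the server placed behind '#' in the redirect."""
    fragment = urlsplit(redirect_url).fragment
    return {key: values[0] for key, values in parse_qs(fragment).items()}


# Hypothetical values, for illustration only.
url = build_authorize_url(
    "https://op.example.com/authorize",
    "s6BhdRkqt3",
    "https://rp.example.com/callback",
    "astatevalue",
)
response = parse_fragment_response(
    "https://rp.example.com/callback#access_token=abc&id_token=x.y.z&state=astatevalue"
)
print(url)
print(response["id_token"])
```

The client should compare the returned state with the value it sent before it touches any of the tokens.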
Although the early version of the Core specification already simplified life for application developers, many more features were added over time. Nowadays the OpenID Connect ecosystem is a very comprehensive list of specifications including a self-service testing system. The next section explains how to find the way through the specifications, with a focus on authentication and authorization.
How to Leverage OpenID Connect
Within API ecosystems OAuth is a common participant of authorization flows. In addition, OpenID Connect is the de facto standard for the authentication part. For example, wherever a web site provides a button like “Log in with Google” or “Log in with Facebook”, an OpenID Connect flow gets initiated. This not only makes it easier for applications to design the onboarding process for new users, it also reduces the number of times login and consent screens are displayed.
OP: OpenID Provider (server)
RP: Relying Party (client)
An OP is an OAuth server that also supports OpenID Connect features. Clients may connect to the server and use extended OAuth response_types such as token id_token. An RP registers itself as an OAuth client at the OP and uses an OpenID Connect-enabled OAuth flow to authenticate resource_owners. Any system may take on both roles, too.
- 1. Take resource_owners through an initial login and consent flow.
- 2. During consecutive authorization flows, display the login screen only if the resource_owner has no session and do not display the consent screen again.
- 3. Accept an id_token issued by a third party as resource_owner credentials.
OpenID Connect has many more features, but these three seem to be of the biggest interest. Therefore, I will explain how they are used.
Use Case 1: Take resource_owners Through an Initial Login and Consent Flow
This is straightforward. A resource_owner uses a client to access a protected resource. The client’s implementation requires the resource_owner to be logged in. The client initiates an authorization flow using response_type=code. The flow redirects the resource_owner to the OP, which provides a login and consent screen. Once the resource_owner has been authenticated and has authorized the client, an authorization_code gets issued. All of this is standard OAuth.
...&scope=openid+email+profile&...
IF SCOPE contains (openid) THEN persist the consent decision and issue an id_token in addition to other tokens such as access_token and refresh_token.
This task is emphasized because it is important for the three use cases listed above. The OP may receive other parameters, but they are not relevant for this discussion. As a final outcome, the client will not only receive the default token response but also the issued id_token. With that, the resource_owner is logged in. The client now may send a request to the OP’s /userinfo API to receive resource_owner details.
Use Case 2: During Consecutive Authorization Flows Display the Login Screen Only If the resource_owner Has No Session and Do Not Display the Consent Screen Again
This use case has several aspects to it. For one, the login screen should be displayed only if no session exists. A session is identified by an active id_token. Furthermore, the consent screen should not be displayed again! “Not again” means that the consent decision is independent of an existing session and has to be managed as its own entity!
So, how do these requirements work together?
- prompt: This may contain one or multiple values.
none: Do not display a login and consent screen.
login: Prompt for login.
consent: Prompt for consent.
select_account: Enable the resource_owner to select an account. This is useful for users with multiple accounts.
- id_token_hint: This contains a single value.
id_token: The id_token that was issued earlier.
- 1. The client uses an existing access_token to access a protected resource. The OP validates the token and returns the requested resource.
- 2. The client’s access_token has expired and therefore it uses its refresh_token to request new tokens. The OP validates the refresh_token and issues a new access_token and refresh_token. The client uses the new access_token and retrieves the resource.
- 3. Both tokens, access_token and refresh_token, have expired. This is the case that differs from default OAuth. Without OpenID Connect, the client would now need to request new tokens by taking the resource_owner through a new authorization flow, which would prompt for login and consent. Instead, the client leverages the additional parameters prompt and id_token_hint. By setting prompt=none the client tells the OP: do not display any screens to my user! Needless to say, the OP still has to validate the request:
  - a. To skip the login screen:
    - i. Is the id_token still valid?
    - ii. Fail otherwise
  - b. To skip the consent screen:
    - i. Does the requested SCOPE match the previously issued SCOPE, or a subset?
    - ii. Did the resource_owner provide consent previously for this client?
    - iii. Fail otherwise
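The checks an OP performs for prompt=none in the third case can be sketched as follows. This is an illustration under assumptions: StoredConsent and the function names are hypothetical, the id_token passed as id_token_hint is assumed to have passed signature validation already, and the returned error codes login_required and consent_required are the ones OpenID Connect Core defines for this situation.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StoredConsent:
    """Hypothetical record of a consent decision the OP persisted earlier."""
    client_id: str
    granted_scope: frozenset


def can_skip_login(id_token_claims: dict, now: float) -> bool:
    # a. The id_token from id_token_hint must not have expired.
    return id_token_claims.get("exp", 0) > now


def can_skip_consent(
    client_id: str, requested_scope: str, consent: Optional[StoredConsent]
) -> bool:
    # b. Consent must exist for this client and cover the requested SCOPE.
    if consent is None or consent.client_id != client_id:
        return False
    return set(requested_scope.split()) <= consent.granted_scope


def handle_prompt_none(
    id_token_claims: dict,
    client_id: str,
    requested_scope: str,
    consent: Optional[StoredConsent],
    now: Optional[float] = None,
) -> str:
    """Return 'issue_tokens' or the error code to send back to the client."""
    now = time.time() if now is None else now
    if not can_skip_login(id_token_claims, now):
        return "login_required"
    if not can_skip_consent(client_id, requested_scope, consent):
        return "consent_required"
    return "issue_tokens"


consent = StoredConsent("s6BhdRkqt3", frozenset({"openid", "email", "profile"}))
print(handle_prompt_none({"exp": 2000}, "s6BhdRkqt3", "openid email", consent, now=1000))
```

Because no screen may be displayed, every failed check has to end in an error response to the client, never in a prompt.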
Using this feature reduces the number of times a user gets confronted with login and/or consent screens. This not only improves the user experience but also reduces the number of times a resource_owner has to use his password! Each time the password does not need to be used is a step towards password-less systems.
Use Case 3: Accept an id_token Issued by a Third Party as resource_owner Credentials
Federation is one of the biggest features in OpenID Connect! There even is a new, dedicated specification for it: OpenID Connect Federation 1.0 – draft 05, currently in a draft status (October 2018). The specification will evolve over the next few months. But even without that specification, federation can be supported.
- 1. Verify the issuer as an accepted third party.
- 2. Verify the expiration date.
- 3. Verify the signature algorithm.
- 4. Verify the signature.
Important
Bullet point 3 is extremely important! Never validate a JWT by using the alg value of the JWT header. It could have been replaced with any other algorithm by a third party and therefore the message integrity cannot be assumed!
Validating id_token in Detail
As mentioned, there are several signature algorithms available. In the case of HS256, the OP and RP usually agree on using the client_secret for creating and validating the signature. There is hardly a question on how to distribute that value.
- OpenID Connect Discovery
A specification describing a discovery document (JSON) that lists features that are supported.
It lists APIs, supported response_types, SCOPEs, and other details.
- /.well-known/openid-configuration
The API returning the discovery document
- /jwks.json
The API containing a list of JSON Web Keys (more or less the public certificates required for RS and ES-based signature algorithms)
OpenID Provider
- 1. iss (issuer)
  - a. The OP publishes its iss value. This can be a URL.
  - b. By specification, this URL does not need to be resolvable, but in my experience, it usually is.
  - c. iss itself has to appear in the OpenID Connect Discovery document (issuer).
  - d. Ideally this value is the only one an RP needs to configure!
- 2. /.well-known/openid-configuration
  - a. The OP configures all details of its system that should be publicly available.
  - b. This URL is standardized. An RP should be able to use it like this:
    - i. {iss}/.well-known/openid-configuration
- 3. /jwks.json
  - a. The OP configures this API to return a list of public keys that are used for JWT signatures.
  - b. The keys are expressed as a JSON Web Key Set (JWK / JWKS).
  - c. Each key is identified by a key ID (kid).
  - d. When the OP issues an id_token (JWT) the JWT header needs to include the matching kid!
Here are example documents.
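As a rough illustration, the two documents could look like the following, written here as Python dictionaries. The field names (issuer, jwks_uri, and so on) are the ones defined by OpenID Connect Discovery and the JWK specification. The host name, the paths, and the omitted key material are assumptions made for this sketch.

```python
ISS = "https://example.com/op/server"  # hypothetical issuer


def discovery_url(iss: str) -> str:
    """Derive the standardized discovery URL from the iss value."""
    return iss.rstrip("/") + "/.well-known/openid-configuration"


# What GET {iss}/.well-known/openid-configuration might return (shortened).
discovery_document = {
    "issuer": ISS,
    "authorization_endpoint": ISS + "/authorize",
    "token_endpoint": ISS + "/token",
    "userinfo_endpoint": ISS + "/userinfo",
    "jwks_uri": ISS + "/jwks.json",
    "response_types_supported": ["code", "token id_token"],
    "scopes_supported": ["openid", "email", "profile", "address", "phone"],
    "id_token_signing_alg_values_supported": ["RS256"],
}

# What GET {jwks_uri} might return: one RSA public key, identified by its kid.
jwks_document = {
    "keys": [
        {
            "kty": "RSA",
            "use": "sig",
            "alg": "RS256",
            "kid": "d273113ad205",
            "n": "<modulus, base64url encoded, omitted here>",
            "e": "AQAB",
        }
    ]
}

print(discovery_url(ISS))
```

The kid value is the link between both worlds: it appears in the JWT header of each issued id_token and in exactly one entry of the key set.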
After the OP has prepared its environment, it can start issuing id_tokens (JWT).
Relying Party
- 1. Configure accepted iss.
  - a. The RP configures its application to accept only JWT issued by one or multiple configured parties, such as https://example.com/op/server or https://anotherone.com/op/server.
  - b. Only accept the HTTPS scheme. Fail otherwise!
- 2. Configure expected alg.
  - a. As mentioned before, NEVER trust the alg found in the JWT header!
That’s it!
The next step is to implement the validation flow that starts after receiving the id_token (JWT). Many steps are required, but once implemented the flow is actually straightforward. It should execute CPU-heavy operations (calculating the signature) and latency-heavy operations (network calls) late in the process:
- 1. Base64url decode the JWT-payload (the part between the two dots).
- 2. Extract iss and compare the value against a configured, acceptable one.
- 3. Extract exp and check that it has not expired.
- 4. Extract aud and check if the client_id is included.
  - a. This may be skipped for federation cases.
- 5. Base64url decode the JWT-header and check if at least kid, alg, and typ are included.
  - a. alg has to match the expected value.
  - b. Fail otherwise!
- 6. Retrieve the discovery document:
  - a. GET {iss}/.well-known/openid-configuration
- 7. Extract the JWKS URL (jwks_uri) as found in the discovery document.
- 8. Retrieve the JWKS.
  - a. GET {jwks_uri}
  - b. Only accept the HTTPS scheme. Fail otherwise!
- 9. Find a kid that matches the one found in the JWT-header.
  - a. Fail if there is none!
- 10. Extract the associated JWK and use it to validate the JWT signature.
  - a. Recreate the signature and compare it to the given one.
  - b. Fail if it does not match!
Any other validation is most likely application specific.
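The cheap, local part of this flow (steps 1 to 5) and the key lookup (step 9) can be sketched with the Python standard library. The function and exception names are my own. The network calls of steps 6 to 8 and the RSA signature check of step 10 are deliberately left out; for the latter a vetted cryptography library should be used rather than hand-written code.

```python
import base64
import json
import time
from typing import Optional


class ValidationError(Exception):
    """Raised as soon as any check fails."""


def _decode_part(part: str) -> dict:
    return json.loads(base64.urlsafe_b64decode(part + "=" * (-len(part) % 4)))


def prevalidate_id_token(
    jwt: str,
    accepted_issuers: set,
    client_id: str,
    expected_alg: str,
    now: Optional[float] = None,
) -> tuple:
    """Run steps 1-5; return (header, claims) if all local checks pass."""
    now = time.time() if now is None else now
    try:
        header_b64, payload_b64, _signature_b64 = jwt.split(".")
    except ValueError:
        raise ValidationError("not a compact JWS")
    claims = _decode_part(payload_b64)                       # step 1
    iss = claims.get("iss")
    if iss not in accepted_issuers or not iss.startswith("https://"):
        raise ValidationError("issuer not accepted")         # step 2
    if claims.get("exp", 0) <= now:
        raise ValidationError("token expired")               # step 3
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else (aud or [])
    if client_id not in audiences:
        raise ValidationError("client_id not in aud")        # step 4
    header = _decode_part(header_b64)                        # step 5
    if not {"kid", "alg", "typ"} <= header.keys():
        raise ValidationError("incomplete JWT header")
    if header["alg"] != expected_alg:                        # configured, not trusted
        raise ValidationError("unexpected alg")
    return header, claims


def select_jwk(jwks: dict, kid: str) -> dict:
    """Step 9: find the key whose kid matches the JWT header."""
    for key in jwks.get("keys", []):
        if key.get("kid") == kid:
            return key
    raise ValidationError("no key with matching kid")
```

Note that the alg value from the header is only compared against the configured one. It is never used to select the verification algorithm.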
OAuth vs. OpenID Connect vs. LDAP
This content is based on one of my blog posts. I decided to include it in this book and within this chapter because this topic causes a lot of confusion according to questions I have received in the past. It relates to API design and can be seen as an add-on to the last section.
If OAuth is a set of characters, OpenID Connect creates words and a language using them.
OpenID Connect is a profile on top of OAuth just like HTTP is on top of TCP.
OAuth knows about apps; OpenID Connect knows about users.
Let’s get started!
LDAP (Lightweight Directory Access Protocol)
To authenticate a user: Compare the given username and password against values found in the LDAP.
To retrieve attributes: Retrieve firstname, lastname, role for a given username.
To authorize users: Retrieve access rights for directories for a given username.
I believe that most developers at some point in time have to deal with an LDAP server. I also believe that most developers will agree with what I just described.
OAuth
OAuth is a framework that enables applications (clients) to gain access to resources without receiving any details of the users they are being used by. To make it a little more visual I will introduce an example.
The very cool app named FancyEMailClient
For each email provider, the user provides details such as smtp server, pop3 server, username, password on a configuration page within FancyEMailClient.
FancyEMailClient now accesses all configured email accounts on behalf of the user. More precisely, FancyEMailClient is acting AS the user!
The user has shared all details with FancyEMailClient. I must say, it feels a little fishy; don't you agree?
FancyEMailClient is an OAuth client and gets registered at each email provider that should be supported.
FancyEMailClient does not ask users for any email provider details whatsoever.
FancyEMailClient delegates authentication and authorization to the selected email provider via a redirect_uri.
FancyEMailClient retrieves an access_token and uses this token at an API such as /provider/email to retrieve the user’s emails. The access_token may be granted for scope=email_api.
FancyEMailClient has no clue who the user is and has not seen any details such as username or password.
This is perfect in regard to the user’s privacy needs. However, FancyEMailClient would like to display a message such as “Hello Sascha” if Sascha is the user, but it can’t.
OpenID Connect
As I explained above, a client does not get any details about the resource_owner. But, since most applications would at least like to display a friendly message such as “Hello Sascha” there needs to be something to help them.
To stick to the email provider example, before OpenID Connect (OIDC) was born, these providers simply created OAuth-protected APIs (resources) that would return details about the resource_owner. Users would first give their consent and afterwards the client would get the username or firstname and would display “Hello Sascha.”
Since this became a requirement for almost any OAuth client, we now have a common way of doing that, specified in OpenID Connect. OIDC has specified SCOPE values, a /userinfo API, and an id_token that represents an authenticated user.
- 1. When requesting access to emails, also request access to user details. The request would now have to include something like ...&scope=openid+profile+email+email_api&... (scope == permissions like access control).
- 2. During the authentication and authorization flow, the user would not only grant access to his emails but also to his personal details.
- 3. FancyEMailClient would now receive an access_token that could not only be used at /provider/email but also at /provider/userinfo.
- 4. FancyEMailClient can now display “Hello Sascha!”
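The last two steps could look like this in Python. It is a sketch only: the provider URL and the token value are invented, the function names are hypothetical, and the request object is built but not sent. The claim names given_name, name, and preferred_username are standard claims of the profile SCOPE.

```python
import json
from urllib.request import Request


def build_userinfo_request(userinfo_url: str, access_token: str) -> Request:
    """Prepare the call to /provider/userinfo using the access_token."""
    return Request(userinfo_url, headers={"Authorization": f"Bearer {access_token}"})


def greeting(userinfo_json: str) -> str:
    """Pick the friendliest name available; sub is always present."""
    claims = json.loads(userinfo_json)
    display_name = (
        claims.get("given_name")
        or claims.get("name")
        or claims.get("preferred_username")
        or claims["sub"]
    )
    return f"Hello {display_name}!"


request = build_userinfo_request(
    "https://emailprovider.example.com/provider/userinfo", "an-access-token"
)
print(greeting('{"sub": "12ab34cd56ef", "given_name": "Sascha"}'))
```

Which claims actually come back depends on the SCOPE values the user granted, so the fallback chain is worth having.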
Now the big question: How does it all come together?
LDAP servers are the only component that exists without OAuth and OpenID Connect. LDAP servers are always the source of users (and maybe also clients and other entities). LDAP servers have always been used to authenticate users and have been leveraged to authorize them for accessing resources. OAuth and OpenID Connect can’t be supported if no LDAP server is available. OAuth and OpenID Connect are protocols only, not systems to manage users.
Here is how FancyEMailClient works using the different technologies.
Case: OAuth
- a. When a user selects an email provider within FancyEMailClient, his browser gets redirected to that provider. It is an OAuth authorization request and includes OAuth SCOPE values. To access the API /provider/email, a SCOPE value such as email_api may be included. I say “may” because there is no standard SCOPE for that. To also gain access to the user details, other SCOPE values need to be included. This is more straightforward since they have been specified within OpenID Connect. The value openid profile email would be sufficient and is supported by practically all OIDC providers. At the end of the flow, FancyEMailClient gets back an OAuth authorization_code.
- b. The user only shares his credentials with EMailProvider. He types them into the EMailProvider’s login page and EMailProvider will validate them against its LDAP server. (The LDAP server may be a database or any other system that maintains user details.)
- c. After receiving the OAuth authorization_code FancyEMailClient exchanges this short-lived token for an OAuth access_token. That access_token provides access to resource APIs. I hope it is obvious that this exchange request is a backchannel request; no browser is involved!
- d. FancyEMailClient accesses /provider/email and /provider/userinfo by providing the OAuth access_token it received earlier. Although both APIs require an access_token, there is one difference. /provider/userinfo is an OpenID Connect API whereas /provider/email is an API proprietary to the EMailProvider. Let’s call it a plain OAuth-protected API.
- e. In this area I want to emphasize the role of the LDAP server. As you can see, it is involved during almost all requests.
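The backchannel exchange of step c can be sketched as follows. The parameter names are the ones OAuth 2.0 defines for the authorization_code grant. The endpoint, the credentials, and the function name are assumptions, and the request is only built here, not sent. Sending it would be a single call to urllib.request.urlopen from FancyEMailClient’s server side.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request


def build_token_request(
    token_endpoint: str,
    code: str,
    redirect_uri: str,
    client_id: str,
    client_secret: str,
) -> Request:
    """Prepare the POST that trades the authorization_code for tokens."""
    body = urlencode(
        {
            "grant_type": "authorization_code",
            "code": code,
            "redirect_uri": redirect_uri,  # must match the one used at /authorize
            "client_id": client_id,
            "client_secret": client_secret,
        }
    ).encode("ascii")
    return Request(
        token_endpoint,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical values, for illustration only.
token_request = build_token_request(
    "https://emailprovider.example.com/provider/token",
    "the-authorization-code",
    "https://fancyemailclient.example.com/callback",
    "fancy-email-client",
    "a-placeholder-client-secret",
)
print(token_request.get_method(), token_request.full_url)
```

Since the client_secret travels in this request, it has to be sent over HTTPS and must never be issued from code running in the user’s browser.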
Case: The Old Days
A user would share his credentials with FancyEMailClient. And he would do this for each single provider he had an account with. FancyEMailClient would probably also ask for other details so that an API such as /provider/userinfo would not even be necessary. FancyEMailClient would now collect all this sensitive data and could do whatever it wants with it. That is a big disadvantage!
Another disadvantage is the fact that the user’s credentials are now used for each single request. This increases the chances for them being exposed.
OAuth, OpenID Connect, and LDAP are connected with each other. But I hope it becomes obvious which component plays which role and that one cannot replace the other. You may say that my explanation is very black and white, but I hope that it clarifies the overall situation.
Summary
This chapter discussed questions that come up in the context of API-based authentication and authorization and gave an introduction to patterns that should be avoided or explicitly addressed.
Authentication and authorization were discussed and distinguished from each other. You should now be able to decide at which point in message flows authentication and authorization should be handled.