Introduction to OAuth

The Internet has become so diverse with tools and applications that we often run into the problem of connecting them together. Applications now collaborate and share information like never before, giving users a better experience and a more powerful online presence. This chapter discusses how OAuth connects these services together without putting your personal information at risk.

Challenges of Authorization and Authentication

Authorization and authentication are difficult problems to solve in software, and they are among the most fundamental expectations users have when using an application. It’s reasonable to expect other users will not be able to see private portions of your account, create unauthorized content with your account, or take any other sort of action without your knowledge. As software developers, we try to guide our users through simple steps to protect their accounts from other users, but as we know, it doesn’t always work. Bear in mind, right now we’re discussing a single user in a single system and haven’t even scratched the surface of how one user is going to safely integrate multiple accounts. As you can see, this is a very important challenge to overcome.

Before we go any further, we need to define a couple of key terms used throughout the book. The two main concerns of a site that has user accounts are to ensure users can log in, and to ensure users can only operate in ways allowed by the application. We’re going to look at these two concepts in greater detail right now.

Authentication

If we really think about the act of a user logging into a system, we expect the user will identify themselves using key pieces of information only they (hopefully) know. Commonly this is a password. Authentication is the process by which a user verifies their identity to gain access to a system. Though logging in and having access to all the appropriate information appear seamless to the user, they are really two completely separate steps. Authentication allows the application to verify the user exists in the system and is who they claim to be, nothing more. The authentication process allows access to the application, but doesn’t specify what resources the user is able to see or modify. Only after the user’s identity is confirmed can we determine what the user is allowed to access.

Consider a business that requires the use of badges to gain admittance to the building. If an employee shows up and swipes their badge, they are identified as an employee of the company and are able to gain access. In the same way, a valid username and password identifies a user in an online application. There are other resources within the building which certain individuals may or may not have access to, like the supply closet or a server room. Simply being an employee of the company doesn’t grant them access to these protected resources; it merely identifies the user as someone who works for the company.

Authorization

Authorization determines whether or not the identified user has access to a certain resource or set of resources. If we operate under the assumption users can only post content from their own user account, authorization is the step which prevents user “joesmith” from posting a status or an update as “janedoe”. With authorization, we already have an authenticated user and know who we are dealing with before we make any determination as to whether the action being attempted is allowed.

If we go back to the previous real world example, using our badge to gain access to the building doesn’t necessarily give us access to all the resources in the building. If an employee belongs to a group of employees which has access to a supply closet or server room, the same identification badge can be used to gain access. The important difference is, of course, that the specific user has permission to enter the room, whereas others do not. This concept is sometimes known as Access Control, but it is the same process of specifically restricting people from accessing resources or rooms.
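The split between the two steps can be sketched in a few lines of Python. This is a minimal illustration, not a real login system: the `PASSWORDS` table, the `User` class, and both function names are hypothetical, and the passwords are kept in plain text only to keep the example short.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical in-memory credentials, for illustration only.
# A real system would store salted password hashes, never plain text.
PASSWORDS = {"joesmith": "correct horse", "janedoe": "battery staple"}


@dataclass(frozen=True)
class User:
    username: str


def authenticate(username: str, password: str) -> Optional[User]:
    """Authentication: confirm the caller is who they claim to be."""
    if PASSWORDS.get(username) == password:
        return User(username)
    return None


def can_post_as(user: User, account: str) -> bool:
    """Authorization: decide what an already identified user may do."""
    return user.username == account


joe = authenticate("joesmith", "correct horse")
if joe is not None:
    print(can_post_as(joe, "joesmith"))  # posting to his own account
    print(can_post_as(joe, "janedoe"))   # posting as someone else
```

Note that `can_post_as` never sees a password. It only runs once authentication has produced a `User`, which mirrors the badge example: the badge identifies the employee, and a separate check decides which doors open.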

More Challenges

In a web application, we often notice we are only required to log into the system once, until our identification expires and we must re-identify. This is because web applications have the ability to maintain a specific application state. An application requiring a user to log in every time they wanted to go to a different resource or page within the application would be quickly discarded as unusable. As applications grow and require more integrations from outside systems, this process becomes more complicated.

If you are like me, you don’t want to provide your username and password for one application to a completely different application. If there were a security breach on another service, the attackers would quickly be able to gain access to other accounts you own, and the potential damage they could cause is much greater. We also know we don’t want one service to log in to another service for us, and we trust there is a secure method of retrieving information that doesn’t expose our secret identification. Because of this, the use and production of APIs has grown tremendously in order to integrate two completely different systems.

The other large problem we run into is that when we are accessing information from another service, we’re doing so in a stateless manner. This means when we allow a service, for example Facebook, to access our Twitter account, neither Facebook nor Twitter is keeping track of the fact that at some point we logged into the other system. As a result, every time we make a request for a resource, we need some way to authenticate ourselves and allow the service to determine if we are authorized to access that resource.

This section provides more questions than it does answers and that’s alright; we’re going to get into how OAuth solves these problems a little later in the chapter. For now, it’s important you understand the difference between authorization and authentication and how they are used in the context of an application.

Differences Between OAuth 1 and 2

As with any piece of software, when enough enhancements become obvious, new versions spring to life, adding useful features and removing features which are no longer needed. In this regard, OAuth is no different; while you can still find services supporting OAuth version 1, version 2 was created to correct some of the issues presented in version 1. This section will detail what some of these changes are without diving into all the technical details. It is important to understand the differences between the two, especially if you are considering an OAuth implementation for your business or service.

Signatures

OAuth signatures are used, as you might imagine, to sign the requests you’re making to the API. This signature is one of the main components necessary to identify the user making the request. OAuth version 1 had very stringent requirements when it came to the signature, including sorting the request parameters alphabetically and encoding the signature before sending it. OAuth version 2 does not rely on generated signatures to identify the user; instead, the client passes along an access token in place of the signature. The simplified method of authenticating removes the need to parse, sort, and generate individual signature components.
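To give a feel for the work OAuth 1 asks of a client, here is a simplified sketch of the HMAC-SHA1 signing steps using only the Python standard library. The function names, URL, and secrets are placeholders chosen for this example. The sketch also skips parts of the full procedure, such as normalizing the URL and handling repeated parameter names, so treat it as an outline rather than a complete implementation.

```python
import base64
import hashlib
import hmac
from typing import Mapping
from urllib.parse import quote


def percent_encode(value: str) -> str:
    # Only letters, digits, and - . _ ~ are left unencoded.
    return quote(value, safe="-._~")


def signature_base_string(method: str, url: str, params: Mapping[str, str]) -> str:
    # Encode every name and value, then sort the encoded pairs.
    pairs = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    normalized = "&".join(f"{name}={value}" for name, value in pairs)
    # Join the method, the encoded URL, and the encoded parameter string.
    return "&".join([method.upper(), percent_encode(url), percent_encode(normalized)])


def sign_hmac_sha1(base_string: str, consumer_secret: str, token_secret: str = "") -> str:
    # The signing key is the two encoded secrets joined by "&".
    key = f"{percent_encode(consumer_secret)}&{percent_encode(token_secret)}"
    digest = hmac.new(key.encode("utf-8"), base_string.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


# Placeholder request values, for illustration only.
params = {"status": "Hello world", "oauth_nonce": "abc"}
base = signature_base_string("POST", "https://api.example.com/statuses", params)
signature = sign_hmac_sha1(base, "consumer-secret", "token-secret")
print(base)
print(signature)
```

Because the parameters are sorted before signing, the client and the server arrive at the same base string no matter what order the parameters were supplied in. An OAuth 2 client sending a bearer token performs none of these steps.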

Short-lived Tokens

Tokens are used in OAuth to identify a user and are generated when the user grants permission for another application to use his or her data. Tokens serve as a substitute for passing usernames and passwords in the request and unlike usernames and passwords can be revoked and regenerated at any time. OAuth 1 provides tokens which are available for a long period of time to ensure the user doesn’t have to continually regenerate new tokens. As with username and password combinations, tokens that exist for a long time are less secure because they are tied to the same entity for a long time. OAuth 2 provides the ability to generate short-lived access tokens which shorten the amount of time an individual token is tied to a specific user. In addition to the short-lived token, there is also a long-lived refresh token which allows the tokens to be regenerated after they expire, and doesn’t require the user to regenerate the tokens themselves.
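The relationship between a short-lived access token and a long-lived refresh token can be sketched with a small in-memory issuer. Everything here is hypothetical: the `TokenIssuer` class, its method names, and the one-hour lifetime are assumptions made for illustration, and a real OAuth 2 server involves considerably more (client credentials, scopes, persistent storage).

```python
import secrets
import time

ACCESS_TOKEN_LIFETIME = 3600  # seconds; an assumed value for this sketch


class TokenIssuer:
    """Hypothetical in-memory token issuer, for illustration only."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._access = {}   # access token -> (user, expiry time)
        self._refresh = {}  # refresh token -> user

    def grant(self, user):
        """Called once, when the user approves the client application."""
        refresh_token = secrets.token_urlsafe(32)
        self._refresh[refresh_token] = user
        return self._new_access_token(user), refresh_token

    def refresh(self, refresh_token):
        """Swap a valid refresh token for a fresh access token."""
        user = self._refresh.get(refresh_token)
        if user is None:
            raise PermissionError("unknown or revoked refresh token")
        return self._new_access_token(user)

    def revoke(self, refresh_token):
        """Let the user withdraw the permission they granted."""
        self._refresh.pop(refresh_token, None)

    def user_for(self, access_token):
        """Return the user for an unexpired access token, else None."""
        user, expires_at = self._access.get(access_token, (None, 0))
        if user is None or self._clock() >= expires_at:
            return None
        return user

    def _new_access_token(self, user):
        token = secrets.token_urlsafe(32)
        self._access[token] = (user, self._clock() + ACCESS_TOKEN_LIFETIME)
        return token
```

In this sketch the user is involved only in `grant`. When an access token expires, the client calls `refresh` on its own, so a stolen access token is useful for at most an hour while the user never has to approve the application again.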

Bearer Tokens

OAuth 2 provides the ability to generate Bearer Tokens, which are able to be used without cryptography. They are sent over HTTPS and serve to identify a user without the need to generate a signature. This tends to be a commonly used method of authentication in OAuth 2 and was not available in OAuth 1. Bearer Tokens do not provide any additional security beyond that of the protocol they are transported over, which should always be HTTPS (SSL/TLS).
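In practice, a bearer token travels in the `Authorization` request header. The helper below is a small sketch of that idea; the function name and the example URL are made up for illustration, and the HTTPS check is a defensive habit this sketch adopts rather than something a particular library does for you.

```python
from urllib.parse import urlsplit


def bearer_headers(url: str, access_token: str) -> dict:
    """Build request headers carrying a bearer token, refusing plain HTTP."""
    if urlsplit(url).scheme != "https":
        # Anyone who can read the request can reuse the token,
        # so it must never cross the network unencrypted.
        raise ValueError("bearer tokens must only be sent over HTTPS")
    return {"Authorization": f"Bearer {access_token}"}


# Placeholder URL and token, for illustration only.
headers = bearer_headers("https://api.example.com/me", "example-access-token")
print(headers)
```

There is no signature to compute here, which is the appeal of bearer tokens. The cost is that whoever holds the token can use it, so the transport is the only thing protecting it.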

This is a quick look at some of the changes and differences between OAuth versions 1 and 2. These concepts will be covered in greater detail in the corresponding chapters.

When Do I Need a Client/Server

As we start to get a deeper understanding of OAuth, authentication, and authorization in general, we need to decide which OAuth components we need to implement in our application. OAuth has both client and server specifications and they serve very different purposes. It is important to understand the roles and functions of both components when designing an application or API.

Client

The term Client indicates this component will be making requests to obtain or alter information on another system. Typically, a client application will read and manipulate data from another service. The sole responsibility of the client is to create the requests and provide the proper identification of the user that wishes to make them, to ensure the user exists, and the action is permissible. If your application is going to be consuming or altering data on a third party system using OAuth for authentication, you will need to provide an OAuth Client capable of creating and signing the requests in a way the server will understand. That is to say, if the third party service is using an OAuth 2 server, the requests must be formatted to fit the OAuth 2 specifications. In that case, you will need an OAuth 2 client. In a system interacting with multiple third-party sources, it is possible you will need to provide a client for each version and idiosyncratic implementations of OAuth so the properly formatted requests can be made to each source.

For example, when Twitter first became popular, it allowed many different client applications to fetch a user’s stream, post updates, and send messages.

Server

The term Server indicates this component will be receiving the requests from a client. The server is the application which has resources clients want to fetch and interact with. The server must verify any request is signed properly and handle the authorization component of the request as well. The server will attempt to identify the user making the request, ensure that they are authorized to make such a request, and verify the resource being requested actually exists.

In the event the user is identified and authorized for the action they are attempting to make, the server will then allow the action to be carried out and return a response. The response is broken up into two different components: the response headers and the response body. If the request was successful, the body is the content being requested; otherwise, it generally contains a specific error message explaining why the request failed. The headers also give a good deal of information about the request, including the content type, status and length of the content. Status codes are universally understood and eliminate the need for each API to develop its own error code system. By sticking with HTTP status codes, you can clearly communicate the status of the requests with almost any user in the world.
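The checks a server performs, and the status codes it answers with, can be sketched as a single function. The `TOKENS` and `RESOURCES` tables and the `handle_request` name are hypothetical stand-ins for a real token store and data layer. The sketch looks up the resource before the ownership check only because it needs to know the owner; the three questions are the same ones described above.

```python
from http import HTTPStatus

# Hypothetical data, for illustration only.
TOKENS = {"token-joe": "joesmith", "token-jane": "janedoe"}
RESOURCES = {"/users/joesmith/drafts": ("joesmith", "Draft: hello world")}


def handle_request(path, token):
    """Return a (status, body) pair for a request to read a resource."""
    # Authentication: who is making the request?
    user = TOKENS.get(token)
    if user is None:
        return HTTPStatus.UNAUTHORIZED, "Token missing or not recognized"

    # Does the requested resource exist?
    resource = RESOURCES.get(path)
    if resource is None:
        return HTTPStatus.NOT_FOUND, "No such resource"

    # Authorization: may this user access this resource?
    owner, body = resource
    if owner != user:
        return HTTPStatus.FORBIDDEN, "Not allowed to access this resource"

    return HTTPStatus.OK, body


status, body = handle_request("/users/joesmith/drafts", "token-jane")
print(int(status), body)
```

A client does not need to parse the error text to know what went wrong. A 401 means it was not identified, a 403 means it was identified but not permitted, and a 404 means there was nothing to fetch.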

In Chapter 1, we covered HTTP Status codes. These are used to communicate to the client whether the request was successful. If your application will be providing an API and you intend to use OAuth to handle authorization and authentication, you will need to provide an OAuth Server to handle the requests that are being sent by the client.

Conclusion

As you can see, it is possible for an application to require both the use of an OAuth server and an OAuth client in order to meet the needs of the application. The chapters covering the client and server implementations for each version of OAuth will provide far more technical detail. As you start to design your application, it is important to understand the differences between the client and server, and how to determine whether you will need either or both to meet your needs.

Solving Auth Challenges with OAuth

We’ve covered the difficulties of authentication and authorization with APIs in some detail. It makes sense to discuss a few ways OAuth helps solve some of these challenges. The most difficult balance we have to achieve is between security and usability. OAuth, while not perfect, helps us bridge the gap and provide a secure authentication process that’s easy for the end user to consume.

Repeated Auth

We’ve discussed briefly that when consuming APIs we expect the environment to be stateless, because communication takes place over HTTP. This means the API is not going to remember that a specific client previously authenticated. By definition, this requires a user to be authenticated and authorized for every single request. You already know how quickly users would flee your application if they had to provide a username and password for every request made (keep in mind some pages make multiple requests).

OAuth handles this repeated authorization process with tokens and signatures and—with the exception of Bearer Tokens in OAuth 2—requires every request be signed appropriately. A token identifies a user as an authenticated account on a third party service without exposing the user’s login credentials. When authorizing access from a third party, a user will generally log into the third party website and access tokens are generated to identify the user on other systems. With these tokens, the application can build the proper signature (which will be discussed in greater detail in later chapters) and request information from the third party service. The configuration of the service usually requires these tokens be provided once so each request can be signed without the need to continually provide the tokens. The tokens can be regenerated or revoked on the third party website at any time, which is often less time consuming than resetting a password and adds an additional layer of security.

Security

We know sending our credentials in a raw, or unencrypted, format is dangerous because it increases the probability they can be stolen from a request by anyone who can read the contents of the request. The opportunity for having our username and password stolen increases with every request we send. Regardless of how we authorize and authenticate, it’s important to use SSL to transport credentials so the data is not sent across the Internet in a human readable format. It’s also important to keep the SSL certificates current and updated. With that disclaimer out of the way, let’s take a look at how OAuth provides a higher level of security than other methods.

If we were to use our username and password to authenticate every request, we’d be required to send them with every single request. Since the basic expectation of the user would be to provide the credentials once per session, we would also expect the client application we were using to store those credentials. As we’ve seen from some very public and embarrassing revelations in the recent past, even large reputable companies don’t always take care of our credentials the way they ought to. For example, passwords ought to be stored using a one-way hash, and that doesn’t always happen. Obviously, providing a username and password to any other entity opens us up to our credentials being misused. Or worse yet, the potential situation also exists where the entity is hacked and credentials for various external applications are obtained with malicious intent. At this point, we’re left to hope and trust our information is stored in a secure manner. If companies don’t secure credentials to their own applications, it’s silly to expect them to secure the credentials for applications which integrate with theirs. It only takes one hack or one disclosure for sensitive data to be obtained and for people’s lives to be potentially ruined.

The use of tokens is very important to OAuth as it allows a secure way for users to be identified. If an application were to be hacked and this data exposed, it would take a matter of minutes for the tokens to be regenerated, rendering any unauthorized attempts fruitless. Tokens are usually generated in pairs, and there are tokens which identify the application as well as the user. Each pair has a public token that can be viewed by looking through the HTTP headers, while the private token is used to build the signature; without it, the requests cannot be properly signed. This is the very reason users are told not to share their secret tokens with anyone else when the tokens are generated.

For signed requests, OAuth also requires the use of a nonce to protect against replay attacks. In simple terms, a replay attack is when an attacker eavesdrops on a client-server transaction and attempts to resend the transaction. A nonce is an arbitrarily generated number that can only be used once. A request using the same nonce twice in a predetermined amount of time will be rejected. This prevents the same request from being made repeatedly and generally relies on the authentication method to create a new nonce with every request.
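A server-side nonce check can be sketched as follows. The `NonceChecker` class, the five-minute window, and the injectable clock are assumptions made for this example. A production server would also track nonces per client and token rather than globally, and would keep them in shared storage instead of a dictionary.

```python
import secrets
import time

NONCE_WINDOW = 300  # seconds; an assumed value for this sketch


class NonceChecker:
    """Hypothetical replay guard: each nonce is accepted at most once."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._seen = {}  # nonce -> timestamp of the request that used it

    def accept(self, nonce, timestamp):
        now = self._clock()
        # Forget nonces whose timestamps have fallen out of the window.
        self._seen = {n: t for n, t in self._seen.items() if now - t <= NONCE_WINDOW}
        if abs(now - timestamp) > NONCE_WINDOW:
            return False  # request is too old, or dated too far ahead
        if nonce in self._seen:
            return False  # this exact request has been seen before
        self._seen[nonce] = timestamp
        return True


checker = NonceChecker()
nonce = secrets.token_hex(16)  # the client generates a new one per request
sent_at = time.time()
print(checker.accept(nonce, sent_at))  # first delivery
print(checker.accept(nonce, sent_at))  # an eavesdropper resending it
```

The timestamp is what keeps the server from having to remember every nonce forever. Once a request is older than the window it is rejected on age alone, so its nonce can safely be forgotten.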

As you can see, OAuth was designed with these concerns in mind and strikes a good balance between security and user experience. While OAuth certainly helps us make good, safe decisions, it is still very important to utilize defensive programming to limit the vulnerability of your application. Your application can still be vulnerable to many other attacks, even if you use OAuth. Don’t be lulled into a false sense of security. Work to ensure every aspect of your application or API is reasonably protected against all types of attacks.

Removing the Magic

The term magic is generally used when a software component works and the developers working with it can’t easily understand how. Magic is never a good thing; when decisions are made and improperly abstracted, debugging becomes nearly impossible. For a lot of people OAuth seems like magic, in large part because it’s a completely different authentication model than people are accustomed to using. While there is a learning curve, it is still very important to understand how the components of an OAuth request are generated.

Some of the concepts are unfamiliar. To ensure we are not relying on software that isn’t understood by the developers, we have to really break down what the code is doing and how it’s producing these results. Removing the magic isn’t an attempt to rewrite or re-engineer; it’s an attempt to understand the core of the code to ensure we don’t get burned in the future.

Using Existing Libraries is Good

The best code is the code you never had to write, and in the PHP world it’s becoming easier and easier to find packages and libraries capable of providing much of the functionality you would have to write otherwise. The same is true for OAuth. The purpose of this book is not to encourage you to write your own OAuth implementation library, but to help you more fully understand what the libraries you are using are doing for you. Widely used libraries have already gone through a vetting process where major bugs can be found and remedied. Essentially, all the work has been done for you. Good libraries have also been used by many other developers and have the reputation of being reliable. By avoiding Not Invented Here (NIH) syndrome, it’s possible to understand what the library is doing in a very granular way and redirect the use of your time to writing the components and features your application requires. While I’m not suggesting people write their own OAuth implementations, understanding how it works also provides opportunities to contribute to existing libraries.

Nothing is perfect, but what is important is the code we are writing solves the problem it was intended to solve. Your users won’t care if you wrote your own OAuth implementation, but they will care considerably more if your implementation misses the mark and exposes them to vulnerabilities and annoying breaks in functionality. Technology moves and changes quickly; having an understanding of the underlying technology can put you in a position to contribute heavily when things change. If you really have an interest in writing an implementation, consider contributing to Open Source libraries to improve their code and documentation.

Decoupling Auth

Hopefully we all strive to write small modules of code which can be used together and reused. How many auth implementations are tied directly to the application using it? Significant changes in the data store or how the application operates shouldn’t require significant changes in other areas of unrelated code. The term coupling refers to how closely connected different components in our application are relative to each other. Tightly coupled applications generally exhibit a behavior of cascading changes. Or more simply put, when you make a change to a module of code many other changes must be made to separate modules for the application to function correctly.

By using OAuth, we get started down the right path of loose coupling. We know our Authentication process is going to be singularly focused and we’ll be able to easily drop this protocol in place in our application. The key is to be able to ensure the Auth process knows as little about the application as possible to do its job. By using tried and tested libraries, we can ensure our Auth process stays out of the way of the rest of our application, which will allow us to maintain functionality without a major headache or support burden.
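One way to picture loose coupling is to have the application depend on a small interface rather than on a concrete auth implementation. The sketch below is illustrative only: `Authenticator`, `StaticTokenAuthenticator`, and `TimelineService` are hypothetical names, and the static token table stands in for whatever OAuth library you would actually plug in.

```python
from typing import Optional, Protocol


class Authenticator(Protocol):
    """The only thing the application needs to know about auth."""

    def user_for(self, token: str) -> Optional[str]:
        ...


class StaticTokenAuthenticator:
    """Stand-in implementation; an OAuth library could be wrapped the same way."""

    def __init__(self, tokens):
        self._tokens = dict(tokens)

    def user_for(self, token):
        return self._tokens.get(token)


class TimelineService:
    """Application code: it holds no knowledge of how tokens are checked."""

    def __init__(self, auth: Authenticator):
        self._auth = auth

    def post(self, token, message):
        user = self._auth.user_for(token)
        if user is None:
            raise PermissionError("not authenticated")
        return f"{user}: {message}"


service = TimelineService(StaticTokenAuthenticator({"token-joe": "joesmith"}))
print(service.post("token-joe", "Hello"))
```

Swapping the auth mechanism now means writing another class with a `user_for` method. `TimelineService` does not change, which is exactly the absence of cascading changes that loose coupling is meant to deliver.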
