© Yvonne Wilson, Abhishek Hingnikar 2019
Y. Wilson, A. Hingnikar, Solving Identity Management in Modern Applications, https://doi.org/10.1007/978-1-4842-5095-2_16

16. Troubleshooting

Yvonne Wilson (1) and Abhishek Hingnikar (2)
(1) San Francisco, CA, USA
(2) London, UK

When you have eliminated the impossible, whatever remains, however improbable, must be the truth.

—Sir Arthur Conan Doyle, British author, from The Sign of the Four (1890)

You’ve created your application, fired off your first authentication, and your hard work blows up in a roaring flame of error messages… or worse, nothing happens, no error messages are displayed, and you don’t have a clue where to start looking! Fear not, there is a methodical approach to debugging authentication and authorization issues. We’ll share an approach, tools, and techniques, and soon you’ll be solving authentication and authorization issues with the mastery of Sherlock Holmes!

Get Familiar with the Protocols

A working knowledge of the identity protocol(s) you are using for authentication and API authorization is helpful. These protocols involve browser redirects and/or HTTP requests/responses between several components. Troubleshooting will be easier if you are familiar with the expected sequence of interaction for a particular scenario. You can capture an HTTP or network trace for a situation and compare it to the expected interaction as described in a protocol specification and/or identity provider documentation, to identify where things are going wrong. It is particularly helpful to know
  • The sequence of interaction for different scenarios

  • The parameters expected by each protocol endpoint

  • The responses and error codes returned by each endpoint

In addition to the protocol specifications, you should know the identity provider APIs and SDKs you are using. Vendors may extend a specification when they implement a protocol. Using an API testing tool to try out calls with various parameters and observing the results can give you a better understanding of the identity provider and APIs you use, which can help when debugging issues.

Prepare Your Tools

The following tools will help you debug an issue:
  • An environment where you can duplicate a problem and test

  • Two independent browser windows

  • Tools to capture and view HTTP traces

  • A tool with which to test API calls

  • Tools to capture and view network traces of back-end API calls

  • Tools for viewing and creating JWT and SAML 2.0 tokens

The next few sections will explain why each tool is necessary in more detail.

Test Environment

It is often helpful to have an environment in which to duplicate a problem. For some issues, you may be able to use your production environment to collect all the data you need. For others, you may need an environment where you can experiment and change settings as part of your investigation. You’ll need a test environment with an instance of the identity provider used and with an account for a test user. Ideally, you’ll have administrative access so you can alter configuration settings or create users with different profiles if needed. Having a test environment in which to test and debug an issue avoids any impact to your production system from debugging activity.

Independent Browser Windows

In addition to a test environment, it is helpful to bring up two independent browser windows. You can use two different browsers or two windows for the same browser with “Private” or “Incognito” browsing mode so they do not share cookies between them. One browser window is for testing the broken login issue as an end user. The other browser window is for accessing the administrative interface of your identity provider or application to make configuration changes. Using independent browser windows ensures that activity in one window doesn’t affect the other and produce confusing results.

It is also necessary to note how each browser handles sessions, particularly whether it will save or reconstitute a previous session. When testing, it is best to start with a clean browser session, unless the issue you are debugging does not occur with a new session. A new browser session ensures that there are no cookies or state from a previous session to confuse results. Browsers now offer the ability to restore state from a previous session so it is best to use a new “Private” or “Incognito” browser session and start with a clean slate each time.

Capture HTTP Traces

You’ll need a browser with the ability to capture an HTTP trace. Google Chrome and Firefox both offer dev tools features that provide a built-in ability to capture an HTTP trace in the “Network” tab. Internet Explorer also has a built-in HTTP trace capability, accessed by pressing F12. Safari’s Web Inspector, accessible via the “Develop” menu option, enables you to capture network activity in the Network tab. Learn how to capture an HTTP trace in every browser that your service officially supports so you are prepared to debug issues on each browser.

If you are collaborating with others, it is convenient to be able to dump the HTTP trace to an HTTP Archive format file (.har file). A .har file will capture everything, including the cleartext value of any secret (client secret, password, API key, etc.) entered or transmitted during the capture. If you can’t avoid capturing a secret by limiting a trace to only a part of the interaction, you should reset the secrets after capture so you don’t expose valid secret(s).
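As a rough illustration of reducing that exposure before sharing a trace, the following sketch masks some well-known sensitive fields in a parsed .har file. It assumes the standard HAR layout (`log.entries[].request` and `.response`, each holding lists of name/value pairs). The set of sensitive names is a hypothetical starting point rather than a complete list, and response bodies and redirect `Location` headers are not touched, so resetting any captured secrets remains the safer course.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Names treated as sensitive here are an illustrative assumption, not a complete list.
SENSITIVE_NAMES = {
    "authorization", "cookie", "set-cookie", "password", "client_secret",
    "code", "access_token", "id_token", "refresh_token",
}

def redact_pairs(pairs):
    """Mask the value of every {"name", "value"} pair with a sensitive name."""
    for pair in pairs or []:
        if pair.get("name", "").lower() in SENSITIVE_NAMES:
            pair["value"] = "REDACTED"

def redact_url(url):
    """Mask sensitive query parameters inside a full request URL."""
    parts = urlsplit(url)
    query = [(key, "REDACTED" if key.lower() in SENSITIVE_NAMES else value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)]
    return urlunsplit(parts._replace(query=urlencode(query)))

def redact_har(har):
    """Mask sensitive headers, query parameters, and form fields in place."""
    for entry in har.get("log", {}).get("entries", []):
        request = entry.get("request", {})
        if "url" in request:
            request["url"] = redact_url(request["url"])
        redact_pairs(request.get("headers"))
        redact_pairs(request.get("queryString"))
        post_data = request.get("postData", {})
        redact_pairs(post_data.get("params"))
        if "text" in post_data:
            post_data["text"] = "REDACTED"  # the raw body may repeat form secrets
        redact_pairs(entry.get("response", {}).get("headers"))
    return har
```

A .har file is JSON, so it can be loaded with `json.load`, passed to `redact_har`, and written back with `json.dump` before it is shared.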

View HTTP Traces

If you receive an HTTP Archive (.har) file from someone else, you’ll need a tool to view it. A list of HTTP trace and .har file viewers as well as other useful debugging tools is included in Appendix E. The ability to view traces sent by others is useful if you cannot duplicate a problem yourself.

Make API Calls

Another valuable tool is an API client explorer that allows you to create and send API calls. This provides a convenient interface for learning, debugging, and testing API calls for identity providers as well as your own APIs. Vendors of identity services (or authors of other APIs) may provide ready-built packages with calls for their APIs that you can import into such tools. Appendix E lists some current tools in this category. You can use such tools to test and debug individual API calls which can facilitate finding the source of problems.

View API Calls

If you make API calls from a back-end application component or native application, you will need a different mechanism to capture the calls as they do not go through a browser. You can use a network web debugging proxy tool or a debugger. Appendix E lists a few tools for this purpose.

View JWT and SAML 2.0 Tokens

A tool to decode and view the security tokens received by your application is essential. Appendix E lists a few sites which are useful for viewing JWTs and SAML 2.0 requests/responses. These tools will allow you to inspect the contents of the tokens. They may also provide you with a way to create test tokens for sending to APIs for tests. With these tools in place, you’re ready to start debugging.
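If you would rather not paste a token into a web site, a JWT in the common three-part compact form can be inspected locally. The sketch below is a minimal illustration, not a validation routine: it only base64url-decodes the header and payload, and it does not check the signature. The function names are our own.

```python
import base64
import json

def b64url_decode(segment):
    """Decode a base64url segment, restoring the padding that JWTs omit."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)

def decode_jwt_unverified(token):
    """Return (header, payload) as dicts. The signature is NOT checked."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated segments")
    header = json.loads(b64url_decode(parts[0]))
    payload = json.loads(b64url_decode(parts[1]))
    return header, payload
```

Because nothing is verified, output from a helper like this is suitable for debugging only and should never be used to make an authorization decision.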

Check the Simple Things

You may save yourself some time by checking a few simple things before diving into a detailed analysis:
  • Check the identity provider is accessible and not experiencing an outage.

  • Check the credentials supplied are correct for the environment (test vs. production).

  • Check the login account and credential are not disabled or expired by logging in to the identity provider through another application.

  • Check the application is using the correct URL for the identity provider.

  • Check the client ID in the application matches that registered in the identity provider.

  • Check the redirect/callback URL for the application matches the URL registered in the identity provider.

  • Find any error messages to see if they provide valuable clues.

Once you’ve checked the simple things, if the issue is reported by someone else, ask questions to understand the problem so you can replicate it or focus your debugging on the most likely spot. Start by asking for a general description of what happens, followed by questions to elicit more details on what the user did so you can replicate the situation. Be sure to ask about any error messages displayed on the screen or in any log files. Also ask what the user expected, because sometimes users can have an incorrect expectation of how a system is supposed to work!

Gather Information

Troubleshooting is facilitated by knowing what questions to ask. Identity solutions involve many components, including your application, the user’s browser, and an identity provider. There may also be APIs or an authentication hub in the mix. Any of these components could potentially contribute to a problem. The following questions will give you useful information to replicate the problem and/or narrow down the possible source of an issue.

How Many Users Impacted?

Is the issue experienced by all users or just a few? If only a few users, the issue is most likely caused by something unique to those users’ profiles or their environments, such as browser configuration settings. On the other hand, if all users experience an issue, it is probably caused by something in the components common to all users, such as the application or the identity provider.

Contributing Environmental Factors?

Does the issue occur with all browsers, devices, locations, or platforms or just one? Testing with different browsers, devices, locations, and platforms can identify if there are any environmental factors contributing to the issue. If an issue occurs in multiple environments, it is probably not caused by an environmental factor, and debugging should focus on other components such as the application or identity provider. If, however, an issue occurs on only one browser or type of device, your inquiry should focus on whether the browser or device could cause the issue.

Which Applications Impacted?

How many applications does the issue affect? If there are multiple applications involved in a scenario, it can be helpful to test each to see if the problem occurs in all of them or just some applications. If all applications experience the issue, the problem may be caused by an issue at the identity provider. If only one or some applications experience the issue, it is probably caused by the application code/configuration or the configuration for the application(s) at the identity provider.

Consistent or Intermittent Issue?

Does the problem happen consistently or only intermittently? An intermittent problem is easier to debug if you can reduce it to one you can reliably reproduce. Check whether one instance of a component out of several is misconfigured, such as one application server or one firewall instance. Shut all the instances down and start them one at a time to see if the issue occurs consistently with one.

Worked Previously?

Does the issue occur in an application that worked previously but suddenly stopped working? If so, check for recent changes, such as the following:
  • Identity provider outage

  • Change to identity provider API or API used by the failing application

  • Network connectivity issue

  • User password expired

  • Recent software upgrades

  • Recent browser or device configuration changes

  • Certificate expiration or key rotation

  • Servers with incorrect time due to NTP not running

These are common causes of failures of previously working systems.
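Incorrect server time matters because security tokens carry timestamps, such as an expiration time, that are compared against the local clock. One quick way to look for clock skew is to compare the `Date` header on an HTTP response from the identity provider with the clock on the machine running your application. The sketch below assumes you have copied that header value from a trace. The function name is our own, and how much skew is tolerable depends on the leeway your token validation allows.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def clock_skew_seconds(date_header, local_now=None):
    """Return local time minus the server time in an HTTP Date header, in seconds."""
    server_time = parsedate_to_datetime(date_header)
    if local_now is None:
        local_now = datetime.now(timezone.utc)
    return (local_now - server_time).total_seconds()
```

A result of several minutes in either direction suggests that one of the two clocks is not synchronized. A difference of a second or two is normal, since the header has one-second resolution and the response takes time to arrive.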

Where Does Failure Occur?

How much of the authentication and authorization sequence of interaction completes, as observed during a login transaction or in an HTTP/network trace? Noting where the interaction stopped often suggests which component to investigate first.

Replicate the Problem

If the issue is reported by someone else, it is valuable to replicate the problem in your own environment. This can determine if the other person’s environment contributes to the issue. It also provides a test environment in which to try different things to gather more information about what causes the problem to appear. This is particularly useful if the person reporting the problem is unable or unwilling to test different scenarios to aid debugging.

Analyzing an HTTP/Network Trace

An HTTP or network trace of a broken scenario is invaluable for debugging. In this section, we’ll describe what to look for in a trace.

Capture a Trace

A trace of HTTP and API calls will be one of the most valuable debugging aids. Using a debugger or other tracing tool, perform the failing authentication, authorization, or logout transaction starting from the beginning and going as far as you can through the sequence. When done, stop the trace to minimize the capture of irrelevant data. If you receive a trace captured by someone else, use a suitable tool to view it.

Check Sequence of Interaction

The first thing to check is the sequence of redirects or API calls to see how much of the expected interaction succeeded. The sequence diagrams in earlier chapters may be helpful for this. For OIDC or OAuth 2.0, look first for a call to an “authorize” endpoint on the authorization server. For SAML 2.0, look for a “SAMLRequest” message to the SSO URL of the identity provider. Then look for the requests to prompt the user to log in and for a redirect or response back to the application after the user has authenticated. For OIDC/OAuth 2.0, this will be to one of the callback URLs configured in the authorization server. For SAML 2.0, this will be a SAMLResponse message to the ACS (Assertion Consumer Service) URL configured in the identity provider. If you do not see the complete sequence of expected calls and responses, the place where the interaction started to deviate from normal is a clue for where to start looking for issues. Table 16-1 provides some symptoms and possible causes.
Table 16-1. Symptoms and Issues

Symptom: User never redirected to identity provider.
Possible causes:
  • Application has incorrect URL for identity provider.

Symptom: User redirected to identity provider but no login prompt.
Possible causes:
  • Application sent malformed request.
  • Incorrect client ID or client secret.
  • Error in identity provider login page configuration.

Symptom: User prompted to log in but receives error.
Possible causes:
  • User error. Test with a different account.
  • User password has expired.
  • Wrong password for environment.
  • User account does not exist.
  • Identity provider lost connection to data store.

Symptom: User logs in without error, but not redirected back to application.
Possible causes:
  • Incorrect or invalid callback URL for application at authorization server (OAuth 2.0/OIDC).
  • Incorrect Assertion Consumer Service URL for application at identity provider (SAML 2.0).

Symptom: User redirected back to application but receives authorization error, or application content doesn’t display.
Possible causes:
  • Tokens or assertions returned to application are malformed or do not contain information expected by application.
  • Exchange of authorization code for token fails.
  • Application not granted necessary scopes.
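Finding where a sequence stopped can be partly automated once the request URLs have been pulled out of a trace in order. The following sketch is one possible approach for an OIDC/OAuth 2.0 authorization code flow. The three endpoint URLs are parameters because their paths differ between providers, and the step names are our own labels. A token request made by a back-end component will not appear in a browser trace, so not seeing that step in a browser capture is not necessarily a failure.

```python
def last_step_reached(urls, authorize_url, callback_url, token_url):
    """Return the name of the last expected step found in an ordered URL list."""
    steps = [
        ("authorization request", authorize_url),
        ("redirect back to application", callback_url),
        ("token request", token_url),
    ]
    reached = "none"
    position = 0
    for name, prefix in steps:
        for index in range(position, len(urls)):
            if urls[index].startswith(prefix):
                reached = name
                position = index + 1  # later steps must come after this one
                break
        else:
            break  # this step never appeared, so stop looking for later ones
    return reached
```

If the result is "authorization request", for example, the user reached the authorization server but was never sent back to the application, which points toward the login and callback URL causes.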

Check Parameters in Requests

Check the parameters in a request. For OAuth 2.0 or OIDC, check the following:
  • Request is sent to the correct endpoint at the authorization server.

  • Correct response_type used for the desired grant type or flow.

  • Scope parameter value is adequate for the requested action.

  • Callback URL matches what is registered in the authorization server.

  • A state parameter value is specified, if required by authorization server.
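The OAuth 2.0/OIDC checks above can be sketched as a small helper that inspects an authorization request URL copied from a trace. This is an illustrative sketch with a hypothetical function name. It treats `redirect_uri` as required, which OIDC does, although plain OAuth 2.0 allows it to be omitted in some cases. It always reports a missing `state` parameter, even though not every authorization server requires one, and it does not check `response_type` against the intended flow.

```python
from urllib.parse import urlsplit, parse_qs

def check_authorization_request(url, registered_redirect_uris, required_scopes):
    """Return a list of problems found in an authorization request URL."""
    params = parse_qs(urlsplit(url).query)
    problems = []
    for name in ("response_type", "client_id", "redirect_uri"):
        if name not in params:
            problems.append(f"missing parameter: {name}")
    redirect_uri = params.get("redirect_uri", [None])[0]
    if redirect_uri is not None and redirect_uri not in registered_redirect_uris:
        problems.append(f"redirect_uri not registered: {redirect_uri}")
    # The scope parameter is a space-delimited list of scope names.
    requested = set(params.get("scope", [""])[0].split())
    missing = set(required_scopes) - requested
    if missing:
        problems.append("missing scopes: " + " ".join(sorted(missing)))
    if "state" not in params:
        problems.append("no state parameter")
    return problems
```

An empty list means none of these particular checks failed. It does not mean the request is otherwise valid.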

For SAML 2.0 requests, check the following:
  • Request is sent to the correct URL at the identity provider.

  • Request specifies the binding for a response, if required.

  • The correct certificates and public keys have been configured.
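To inspect a SAML 2.0 request captured from a trace, it first has to be decoded. With the HTTP-Redirect binding, the `SAMLRequest` query parameter is URL-encoded, base64-encoded, and compressed with raw DEFLATE. The sketch below assumes the value has already been URL-decoded, and the function names are our own. With the HTTP-POST binding the value is base64-encoded only, so the decompression step would be skipped. Parse only traces you trust, since this uses the standard library XML parser without extra hardening.

```python
import base64
import zlib
import xml.etree.ElementTree as ET

def decode_saml_redirect_request(saml_request):
    """Decode a URL-decoded SAMLRequest value from the HTTP-Redirect binding."""
    compressed = base64.b64decode(saml_request)
    xml_bytes = zlib.decompress(compressed, -15)  # -15 selects raw DEFLATE
    return ET.fromstring(xml_bytes)

def summarize_authn_request(root):
    """Pick out the AuthnRequest attributes most often involved in misconfiguration."""
    names = ("Destination", "AssertionConsumerServiceURL", "ProtocolBinding")
    return {name: root.get(name) for name in names}
```

The `Destination` value can then be compared with the identity provider's SSO URL, and `AssertionConsumerServiceURL` and `ProtocolBinding` with what is configured for the application. An attribute that is absent from the request is reported as `None`.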

Check HTTP Status Codes

The next step is to check the HTTP status code on the response from the authorization server or identity provider. Table 16-2 lists some common HTTP status codes for error scenarios and some possible causes.
Table 16-2. HTTP Status Codes and Possible Causes

HTTP Status Code: Possible Causes

  • 400: Malformed request. Check your request has the correct parameters and valid values for them.

  • 401: Unauthorized. Check the request includes valid credentials, such as a valid, unexpired access token.

  • 403: Forbidden. Check the application or user has the necessary privileges for the request.

  • 500: Internal Server Error. Check the configuration at the authorization server or identity provider.

  • 503: Service Unavailable. Check if the authorization server or identity provider service is running and reachable.
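When a trace is long, it can help to list only the failing responses. The sketch below scans a parsed .har file, assuming the standard `log.entries[].response.status` layout, and pairs each 4xx or 5xx response with a short hint adapted from the status codes discussed here. The hint wording and the function name are our own.

```python
# Short hints for common error codes; the wording is illustrative.
STATUS_HINTS = {
    400: "malformed request: check parameters and their values",
    401: "unauthorized: check that valid credentials or a valid token were sent",
    403: "forbidden: check the privileges of the application or user",
    500: "internal server error: check configuration at the provider",
    503: "service unavailable: check the provider is running and reachable",
}

def failing_responses(har):
    """Return (status, url, hint) for every HAR entry with a 4xx or 5xx status."""
    findings = []
    for entry in har.get("log", {}).get("entries", []):
        status = entry.get("response", {}).get("status", 0)
        if status >= 400:
            url = entry.get("request", {}).get("url", "")
            hint = STATUS_HINTS.get(status, "no hint: consult provider documentation")
            findings.append((status, url, hint))
    return findings
```

Note that some OAuth 2.0 errors are returned as a redirect carrying an `error` parameter rather than as an HTTP error status, so an empty result does not rule out a protocol-level error.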

Check Security Token Contents

If the HTTP status code does not indicate there is an error, check the security token(s) returned. Appendix E lists tools for viewing the contents of these security tokens. Check the relevant security tokens to see if they are formatted correctly and they contain the requisite information.

For ID Tokens, check
  • ID Token contains the correct user information in the “sub” claim.

  • ID Token contains any other claims expected by the application.

For Access Tokens that can be viewed, check
  • Scopes granted to the application are adequate for the request.

  • Access token contains any claims needed by an API.

  • Audience for the token is correct for the intended recipient API.

  • The access token is valid and has not expired.
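For access tokens whose claims can be read, some of these checks can be sketched as follows. The function takes a dictionary of already decoded claims. It assumes scopes appear as a space-delimited string in a `scope` claim, which is common but not universal, since some providers use a different claim name or an array. The function name and problem messages are our own, and nothing here verifies the token's signature.

```python
import time

def check_access_token_claims(claims, expected_audience, required_scopes, now=None):
    """Return a list of problems found in decoded access token claims."""
    now = time.time() if now is None else now
    problems = []
    exp = claims.get("exp")
    if exp is None:
        problems.append("no exp claim")
    elif exp <= now:
        problems.append("token expired")
    # The aud claim may be a single string or a list of strings.
    aud = claims.get("aud", [])
    audiences = [aud] if isinstance(aud, str) else list(aud)
    if expected_audience not in audiences:
        problems.append(f"audience mismatch: {audiences}")
    granted = set(claims.get("scope", "").split())
    missing = set(required_scopes) - granted
    if missing:
        problems.append("missing scopes: " + " ".join(sorted(missing)))
    return problems
```

Passing `now` explicitly makes it possible to ask whether a token captured in an old trace was valid at the time of the failure.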

For SAML 2.0 SAMLResponse messages, check
  • The NameID element within the Subject contains a user identifier expected by the application.

  • Additional attribute statements expected by the application exist.

An application may need information for authorization conveyed in custom claims. If such authorization data is missing from an ID Token or SAML 2.0 assertion, the user may get an “unauthorized” message or possibly a blank screen. If an API you call requires custom claims in an access token, your program may get an error status from the API. You should check the contents of the access token if possible. If the access tokens are in JWT format, they can be viewed in a JWT viewer. If they are opaque strings, however, you may need to use an introspection endpoint on the authorization server to get information about the token. If the contents of the security tokens are correct, another possible cause of issues is a problem validating a security token.
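For opaque access tokens, a call to a token introspection endpoint might look like the following sketch, which follows the general shape of OAuth 2.0 Token Introspection (RFC 7662): a form-encoded POST carrying the token, authenticated by the caller. The endpoint URL, the use of HTTP Basic authentication with a client ID and secret, and the function name are assumptions, so check your authorization server's documentation for what it actually supports. The function only builds the request, leaving the network call to you.

```python
import base64
import urllib.parse
import urllib.request

def build_introspection_request(endpoint, token, client_id, client_secret):
    """Build (but do not send) a token introspection request."""
    body = urllib.parse.urlencode(
        {"token": token, "token_type_hint": "access_token"}
    ).encode()
    # HTTP Basic client authentication is an assumption; providers vary.
    credentials = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    headers = {
        "Authorization": "Basic " + credentials,
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/json",
    }
    return urllib.request.Request(endpoint, data=body, headers=headers, method="POST")
```

Sending the request with `urllib.request.urlopen` and parsing the body as JSON should yield an object with an `active` field. When `active` is true, the response may also include details such as `scope`, `exp`, and `aud`, which can be checked in the same way as the claims in a JWT.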

Check for Security Token Validation Errors

After an application receives a security token, it must validate it. The security tokens returned by OIDC and SAML 2.0 are digitally signed. They may also be encrypted. If an application cannot validate the signature on a security token (or decrypt it if encrypted), it should log an error. Checking application logs for such errors can help identify if this type of issue exists.
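One cause of signature validation failures that is easy to check is a mismatch between the key identifier (`kid`) in a JWT header and the keys the provider currently publishes, for example after a key rotation. The sketch below is a diagnostic aid only and performs no cryptographic verification. It assumes you have the token and a copy of the provider's JSON Web Key Set as a dictionary with a `keys` list, and the function names and messages are our own. Actual signature validation should be left to a maintained library.

```python
import base64
import json

def jwt_header(token):
    """Decode the header segment of a compact JWT without verifying anything."""
    segment = token.split(".")[0]
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

def diagnose_signing_key(token, jwks):
    """Report whether the token's kid and alg match a key in the JWKS."""
    header = jwt_header(token)
    kid, alg = header.get("kid"), header.get("alg")
    if alg == "none":
        return "token is unsigned (alg is none)"
    matches = [key for key in jwks.get("keys", []) if key.get("kid") == kid]
    if not matches:
        return f"no key with kid {kid!r} in JWKS; keys may have been rotated"
    key_alg = matches[0].get("alg")  # alg is optional on a published key
    if key_alg is not None and key_alg != alg:
        return f"alg mismatch: token uses {alg}, key is for {key_alg}"
    return "matching key found"
```

If no matching key is found, refreshing any cached copy of the key set in the application is a reasonable next step.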

Errors with security tokens can also occur at identity providers. One identity provider may delegate authentication for a user to another identity provider. If the first identity provider does not receive a valid authentication token from the remote provider, it should log the authentication failure. Identity provider logs should be consulted if errors seem to originate at the identity provider as these logs will often have the most useful information.

The previous sections describe a series of troubleshooting steps that will help you solve many common causes of authentication and authorization issues. A frequent complication with troubleshooting is that you may not own all the pieces. In such cases, you need to collaborate with others.

Collaborating with Others

If you are not able to test the application personally, or can’t replicate the problem, you will need to ask someone who can replicate the problem to capture a trace of the issue. A .har file or network trace can show interactions between an application and an identity provider as well as an API if used. This can include the requests made, the parameters, the timing of such interactions, and the responses received. Such traces are extremely useful for debugging issues with authentication, SSO, and authorization. When you receive a trace file, you’ll need a viewer suitable for the type of trace captured. Appendix E includes a few such tools.

You should remember that a trace may capture sensitive data, including a username and password typed by a user or sensitive security tokens returned to applications. If someone sends you a trace file, you may wish to warn them about this so they can reset a captured password or invalidate any sensitive tokens. This can reduce your liability. Furthermore, invalidating any long-lived tokens captured and deleting trace files when you are done troubleshooting is another good practice.

Summary

This chapter described tools and approaches useful for troubleshooting many common issues. It helps to know the protocols you are working with and to have debugging tools that give you sufficient visibility into the authentication and authorization interactions of your program. Collecting data about where and when the problem occurs can narrow down the possible source of an issue. An HTTP trace, network trace, or debugger can help you analyze the flow of traffic between components as well as the parameters in the requests and responses. By obtaining the right tools, and asking the right questions, you can speed up the process of debugging an issue. This completes the set of chapters on building and debugging the code for your application. The next chapter covers some things which can go wrong beyond the code and for which you should prepare.

Key Points

  • Develop a working knowledge of the specifications for the identity protocols you use.

  • Prepare a suite of debugging tools.

  • Check the simple things first.

  • Gather relevant information about the problem.

  • Replicate the problem in your own environment if reported by someone else.

  • Use an HTTP or network trace to help identify where a problem occurs.

  • Check the list of symptoms and causes in this chapter.

  • Check for error responses from identity providers.

  • Check application and identity provider log files, if possible, for clues.
