Organizations must apply an integrated approach to API security or else leave the door open to further threats.
D. Keith Casey
API design doesn’t stop at HTTP methods, paths, resources, and media types. Protecting APIs from malicious attackers is an essential part of API design. If left unprotected, an API will become an open door that can do irreparable damage to an organization and its customers. An API protection strategy involves implementing the right components, selecting an API gateway solution, and integrating an identity and access management solution to tie it all together.
This chapter outlines some foundational principles and provides guidance on common practices along with anti-patterns to avoid when approaching an API protection strategy. Resources are provided for further reading and research on the journey.
Some API providers may choose to implement no API security or only basic API security measures using passwords or API keys. Mischievous attackers prefer to seek out poorly secured APIs and exploit them as the means to gain access to data and internal systems.
Recent API security breaches show some of the key vulnerabilities that can occur when using APIs and the impact they can produce:
■ Gaining access to a user database via an unsecured API, allowing the bad guy to confirm the identities of 15 million accounts on Telegram while remaining undetected
■ Exploiting a password reset API that returns the reset token, allowing the email confirmation email to be bypassed and accounts to be taken over, exposing sensitive health and personal details
■ Combining large data sets from previous hacks to confirm authorization of users resulting in the ability to pass security screening and download tax returns from the United States Internal Revenue Service
■ Reverse engineering undocumented APIs intended for internal, private use by a company’s mobile apps, then using them to access data easily because minimal or no protective measures were implemented. This is common for the many API vendors that consider an undocumented API to be secure, such as Snapchat
■ Exposing the exact location, by latitude and longitude, of users because a previously private Tinder API was opened for end-users. A thorough security review prior to opening the API to developers would have identified that the mobile app, not the API, was responsible for hiding the actual physical location of their users
These recent breaches span from low-reward results, such as disclosing business intelligence as a competitive advantage, to high-reward results that can disclose extremely sensitive data. One even jeopardized the safety of individuals by disclosing their exact location!
Unfortunately, some API providers may take shortcuts in securing their internal APIs. Perhaps they mistakenly think that if they do not document the potential access to the API, no one will go looking for it. This is naïve at best and risks exposing the organization to various attack vectors that it could otherwise avoid.
Whether the API is available for use by public developers or hidden for private use, protecting the API is important. API protection requires a variety of practices that are essential to an overall API security strategy:
■ Authentication (authn): Used to determine and verify the identity of the caller. Using a username and password is most common for web apps but not recommended for API use since passwords may change often. Instead, use OpenID Connect or a similar solution to ensure the identity of the caller is verified before allowing API requests to be processed
■ Authorization (authz): Prevents unauthorized access to individual or groups of API operations based on the caller’s assigned scopes. API keys, API tokens, and/or OAuth 2 are commonly used authorization techniques for APIs
■ Claims: Assigns access controls at a finer-grained level than authorization allows, ensuring that API resource instances are protected
■ Rate Limiting (throttling): Restricts API request thresholds to prevent traffic spikes from negatively impacting API performance across consumers. Also prevents denial-of-service attacks, either malicious or perhaps unintentional due to developer error. Rate limits are typically based on an IP address, API token, or a combination of factors and are limited to a specific number of requests over a period of time
■ Quotas: Limits an application or device from using the API more than permitted within a specific time frame. Quotas typically have a monthly limit and may be established based on the assigned subscription level or through formal agreements between organizations
■ Session hijack prevention: Enforces proper cross-origin resource sharing (CORS) to allow or deny API access based on the originating client. Also prevents cross-site request forgery (CSRF), which is often used to hijack authorized sessions
■ Cryptography: Applies encryption in motion and at rest to prevent unauthorized access to data. Keep in mind that this requires additional precautions to protect private keys used to encrypt data elements, otherwise attackers will easily decrypt the data from API responses using a compromised private key
■ Mutual TLS (mTLS): Mutual TLS, or mTLS, is used when a guarantee of client identity is required. This is common when communicating between services, or when HTTP-based callbacks using webhooks are used, preventing malicious parties from attempting to forge their identity
■ Protocol filtering and protection: Filters requests from API clients that may be used for malicious purposes. This includes rejecting invalid combinations of the HTTP method and path, enforcing the use of secure HTTP via TLS for encrypted communications, and blocking known malicious clients
■ Message validation: Performs input validation to prevent submitting invalid data or overriding protected fields. This may also include parser attack prevention such as XML entity parser exploits, SQL injection, and JavaScript injection attacks sent via requests to gain access to unauthorized data
■ Data scraping and botnet protection: Detects intentional data scraping via APIs, online fraud, spam, and distributed denial-of-service (DDoS) attacks from malicious botnets. These attacks tend to be sophisticated and require specialized detection and remediation
■ Review and scanning: Manual and/or automated review and testing of API security vulnerabilities within source code (static reviews) and network traffic patterns (real-time reviews)
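To make one of these practices concrete, the rate limiting item above can be sketched as a fixed-window counter keyed by caller. This is a minimal illustration only: the class name, the in-memory dictionary, and the injectable clock are assumptions made for the example, and a real API gateway or APIM would typically provide this capability with shared state across instances.

```python
import time


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per `window_seconds` for each caller key.

    The key is whatever identifies the caller: an IP address, an API token,
    or a combination of factors.
    """

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self._windows = {}  # key -> (window_start, request_count)

    def allow(self, key):
        """Return True if the request identified by `key` may proceed."""
        now = self.clock()
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.window_seconds:
            # The previous window has expired, so start a fresh one.
            start, count = now, 0
        if count >= self.limit:
            # Over the threshold: a gateway would typically respond with
            # HTTP 429 Too Many Requests at this point.
            self._windows[key] = (start, count)
            return False
        self._windows[key] = (start, count + 1)
        return True
```

A quota works the same way in principle, only with a much longer window (for example, a month) and a limit tied to the subscription level.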
Not all of these practices are included in a single solution. Instead, there are several components that are a necessary part of an API protection strategy that must be considered.
There are several components that may be used to protect APIs. When combined, these components will form the foundation of a security strategy for APIs.
API gateway is both a pattern and a classification of middleware. The API gateway pattern involves the addition of a network hop that the client must traverse to access the API server.
API gateway middleware is responsible for externalizing APIs across network boundaries. They may act as a pass-through or perform protocol transformation as part of the process. The API gateway becomes a central gatekeeper for all traffic in and out of the API.
API gateway middleware may be standalone products or a component within a larger product offering, such as an API management layer. While API gateways may be built from the ground-up, some gateways are composed from building blocks such as a reverse proxy and plug-ins to realize the features needed. API gateways rarely address more advanced features needed to manage APIs as products. These concerns are offered by API management layers.
API management layers, or APIMs, include an API gateway but also extend their capabilities to include a complete API lifecycle management solution. This includes publishing, monitoring, protecting, analyzing, and monetizing APIs. It may also include community engagement features.
Subscription level support involves defining the API operations to be included or excluded at each level. It also allows for more advanced rate limiting and quota support based on the assigned subscription level for a registered application.
APIMs may also offer extended security measures not found in most API gateways. As a result, they may overlap with the duties of web application firewalls (WAFs).
Service meshes shift the needs of network reliability, observability, security, routing, and error handling away from each process to separate out-of-process infrastructure. This new infrastructure is portable because it is independent of any specific programming languages and frameworks selected by each process. Service meshes have grown in popularity due to the introduction of microservices but may be used for any architecture or combination of architectural styles.
Service meshes replace the direct communication of processes with a series of proxies that direct the communication and error handling on behalf of the process. Each proxy is deployed alongside a running process to eliminate any central point of failure. Deployment is often to a single VM or alongside each container as a sidecar. A centralized management control plane is used to configure the proxies, communicate outages, and oversee the network health. The controller, however, does not involve itself with network data communications.
The components of a service mesh are shown in Figure 15.1.
Figure 15.1 The components of a service mesh including the proxy instances, each connected to a central control plane for oversight and configuration.
Service meshes may be seen as a competitor to API gateways and API management layers. However, this is not the case. While service meshes operate at OSI layer 4 (TCP/IP) and OSI layer 7 (HTTP), they are often paired with an API gateway or APIM. This offers the best of both worlds: resilient, observable network communications using a service mesh, with API product and lifecycle management offered by an APIM or API gateway.
Service meshes introduce additional network hops and therefore may have a negative impact on network performance. However, the capabilities offered by a service mesh may offset the negative impact and may produce a net gain when factoring in the many separate network management elements that have to be coordinated when a service mesh is not present.
Finally, bear in mind that smaller organizations may not see the need for the added complexity of a service mesh. However, larger organizations managing many developer teams producing a multitude of services across one or more cloud environments may benefit from the use of a service mesh.
Web application firewalls (WAFs) protect APIs from network threats, including common scripting and injection attacks. Unlike API gateways, they will monitor OSI layer 3 and layer 4 network activity, allowing for deeper packet inspection than what is possible with API gateways that focus on the HTTP protocol only. As such, they can detect more attack vectors and prevent common ones before request traffic reaches backend API servers.
WAFs offer an additional layer of protection against distributed denial of service attacks (DDoS attacks) that may be sourced from a variety of locations and IP addresses.
It is important to note that while the capabilities offered by WAFs are important, they are sometimes merged into APIMs, CDNs, and other layers that may remove the need for installing an explicit WAF.
Content delivery networks (CDNs) distribute cacheable content to servers spread across the world, reducing load on API servers. They improve application performance by responding with cached data to API clients quicker than waiting for API servers to handle the request.
Some CDN vendors are taking on many of the aspects of a WAF by acting as a reverse proxy for dynamic content alongside caching static content. This reduces unwanted traffic on APIs and web applications. Some CDNs also offer an additional layer of protection against Distributed Denial of Service attacks (DDoS attacks), before they ever reach cloud infrastructure.
Even with one or more of these components in place, API providers are still vulnerable to automated attack vectors, sometimes referred to as botnet attacks. These attacks are often coordinated across multiple hosts and even multiple IP ranges, resulting in attacks that may go undetected. This is because most components evaluate an incoming request in isolation. They aren’t designed to evaluate incoming traffic across multiple clients spread across the internet.
Data scraping is also a risk for APIs that surface an entire catalog of data at once. API quotas and rate limits might be large enough to support someone scraping all data at once, even if the related API operations are protected by an API gateway, APIM, and WAF.
Therefore, it is becoming more essential to have advanced detection techniques in place to analyze API traffic across multiple originating IP addresses. This capability is delivered through more advanced versions of the components described above, or by dedicated components that monitor and assess traffic for more complex attack vectors. This protection goes beyond traditional WAFs in that they extend beyond single IP address rules to a more comprehensive traffic assessment that includes multiple IP addresses.
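The difference between evaluating requests in isolation and evaluating traffic across clients can be illustrated with a small sketch. The heuristic below is an assumption made purely for illustration (it is not how any particular product works): requests seen within one time window are grouped by a shared fingerprint, such as the operation being called, and a fingerprint is flagged when its total volume is high even though every individual IP address stays under its own limit.

```python
from collections import defaultdict


def find_distributed_spikes(requests, per_ip_limit, aggregate_limit):
    """Flag fingerprints whose combined traffic is suspicious.

    `requests` is an iterable of (ip_address, fingerprint) tuples observed
    within a single time window. A fingerprint is flagged when its total
    request count exceeds `aggregate_limit` while no single IP address
    exceeds `per_ip_limit`, which is the case per-IP rules would miss.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for ip_address, fingerprint in requests:
        counts[fingerprint][ip_address] += 1

    flagged = []
    for fingerprint, ip_counts in counts.items():
        total = sum(ip_counts.values())
        busiest_ip = max(ip_counts.values())
        if total > aggregate_limit and busiest_ip <= per_ip_limit:
            flagged.append(fingerprint)
    return sorted(flagged)
```

Dedicated detection components use far richer signals than this, but the shape of the problem is the same: the decision depends on the aggregate, not on any single request.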
Every organization will require a specific API topology that will include one or more API gateway or APIM instances. The topology should seek to make the API platform or product easily managed and flexible to handle the various functional and non-functional requirements demanded by the marketplace, regulatory requirements, and business goals.
As such, this section outlines some considerations and common topologies from the field. Keep in mind that not all organizations may fit one of these specific scenarios. When a deviation is identified, seek to verify that the business and operational aspects of the intended scenario merit the need for an uncommon approach.
There are three primary options for hosting an API gateway or API management layer: hosted, on-premises, and hybrid. Each one offers advantages and disadvantages to the organization.
Hosted APIMs are offered as a SaaS-based solution by vendors. Some vendors may offer a hosted solution up to a maximum number of requests per second before they recommend self-hosting. Other vendors may support a large number of API requests, with a variety of subscription levels and SLAs offered to customize the solution. A hosted APIM is a great option for smaller organizations or for organizations beginning to embark on the API journey. However, hosted APIMs may become costly and are often replaced by on-premises installations as the API program matures.
Figure 15.2 demonstrates the hosted APIM option.
Figure 15.2 The hosted API management option.
On-premises APIMs are installed within a data center or cloud infrastructure. They place more burden upon the operations teams to ensure proper reliability and availability than hosted APIMs, but also offer more customization options. In addition, on-premises installations allow organizations to install multiple instances of the gateway to isolate APIs involved in regulatory audits or to isolate the impact of API usage across partners and customers. It is also useful when API gateways are desired to manage internal-facing APIs that are not externalized to the public internet. Figure 15.3 demonstrates the on-premises APIM option.
Figure 15.3 The on-premises API management option.
The last APIM hosting option is hybrid. Hybrid installations use a hosted dashboard and reporting infrastructure offered by the vendor, while allowing API gateway instances to be deployed using an on-premises model. This is the option that is seen least in the field. The primary advantage is to reduce the burden of supporting the various processes involved in analysis and reporting systems, particularly if the organization lacks in-house expertise for some of the related components or database vendors. Figure 15.4 demonstrates the hybrid APIM model.
Figure 15.4 The hybrid API management option.
Keep in mind that some cloud infrastructure providers offer their own API gateway or APIMs. While this may be useful in the short term, some organizations may find the customization effort required to be too great. Organizations that are required to take a multi-cloud approach may opt to select a third-party APIM vendor rather than trying to support multiple cloud-provided API gateways. Whatever the case, select the best fit for the current stage of the API program, re-evaluating to ensure the best option continues to be in use.
Multi-Cloud API Management Retail Case Study
Multi-cloud strategies aren’t new. In fact, anyone delivering solutions in the retail space may have encountered challenges when using a competitor’s cloud. One example is Walmart, who prefer that hosted SaaS offerings not use AWS. The initial assumption about this demand may be that it stems from concerns about placing data on a competitor’s cloud. However, the real reason is simpler than that: they don’t want operational revenue to go towards their competitor. As such, those using AWS as their primary cloud provider may be required by retail companies to use another cloud provider, such as Azure.
This had a considerable impact on the organization’s choice for API management deployment. It also forced the organization to consider an independent APIM vendor to avoid supporting two separate API gateways, one for each cloud vendor.
Be sure to factor these considerations into an API management strategy architecture to avoid vendor lock-in and losing potentially lucrative business.
It is important to include network communication considerations as part of establishing an API security strategy. The traffic entering and exiting a data center requires different treatment than traffic moving within the data center. This will have an impact on how organizations manage their API network traffic.
To better understand the decisions involved in API network traffic protection, it is important to review network topology concepts. When in doubt, consult a network engineer to establish a secure and efficient network topology for an on-premises or cloud-based infrastructure.
North-south traffic describes the flow of data in and out of the data center. Northbound traffic is data exiting the data center. Southbound traffic is data entering the data center. East-west traffic denotes the flow of data within a data center.
In the case of request/response API styles, all API requests from applications outside the data center are considered southbound traffic and API responses are northbound. The traffic between an API and a database, or service-to-service communications, is east-west traffic.
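These definitions can be expressed as a small classification function. The sketch below assumes, purely for illustration, that the data center occupies the 10.0.0.0/8 address range; the function name and the range are hypothetical.

```python
import ipaddress

# Assumed address range for the data center (illustrative only).
DATA_CENTER = ipaddress.ip_network("10.0.0.0/8")


def classify_traffic(source_ip, destination_ip):
    """Classify a flow using the north-south and east-west terminology."""
    source_inside = ipaddress.ip_address(source_ip) in DATA_CENTER
    destination_inside = ipaddress.ip_address(destination_ip) in DATA_CENTER

    if source_inside and destination_inside:
        return "east-west"   # service-to-service or API-to-database
    if destination_inside:
        return "southbound"  # entering the data center, e.g. an API request
    if source_inside:
        return "northbound"  # exiting the data center, e.g. an API response
    return "external"        # never touches the data center


print(classify_traffic("203.0.113.7", "10.1.2.3"))  # southbound
print(classify_traffic("10.1.2.3", "10.4.5.6"))     # east-west
```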
Note that with the introduction of zero trust architecture (ZTA), the differentiation between north-south and east-west traffic is decreasing. In ZTA, all public traffic, corporate network traffic, and VPN traffic is viewed with no initial trust factors. Instead, all devices and services are required to establish their trust through per-request access decisions. This places even greater emphasis on establishing well-architected access policies that incorporate identity and access management, authentication, and authorization services combined with a comprehensive access control policy for every API, service, and application. More details on zero trust architecture may be found in the NIST special publication on Zero Trust Architecture.
The most common topology for standalone API products is the direct routing of incoming requests through the API gateway to the API backend. The API backend is often a cluster composed of a load balancer and multiple API server instances. In this scenario there is no need for a service mesh. Figure 15.5 demonstrates this traditional approach to API management.
Figure 15.5 API topology #1 showing an API gateway routing to a monolith.
Another option is to compose an API of multiple backend services. The API gateway uses the path of the request to determine which service is responsible for handling the request. Services may be managed behind a load balancer or may be part of a service mesh, allowing the API gateway to leverage the service mesh to communicate with an available instance. Figure 15.6 demonstrates how an API gateway is used to route incoming requests to multiple backend services.
Figure 15.6 API topology #2 showing an API gateway routing to multiple backend services based on the base path of the incoming API request.
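The routing decision in topology 2 can be sketched as a lookup from base path to backend service. The route table, service hostnames, and function name below are hypothetical; an actual API gateway would express the same idea through its own configuration.

```python
# Hypothetical route table: base path -> backend service base URL.
ROUTES = {
    "/projects": "http://projects-service.internal",
    "/tasks": "http://tasks-service.internal",
}


def resolve_backend(path, routes=ROUTES):
    """Return the backend for the longest matching base path, or None.

    A base path matches only on whole path segments, so "/projects/42"
    matches "/projects" but "/projectsx" does not.
    """
    matches = [
        base for base in routes
        if path == base or path.startswith(base + "/")
    ]
    if not matches:
        return None  # a gateway would typically respond with 404 Not Found
    return routes[max(matches, key=len)]
```

Matching on whole segments and preferring the longest base path keeps the routing predictable when one base path is a prefix of another.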
For organizations that have regulatory requirements with frequent audits, or for those that must handle a variety of customer, partner, and web/mobile app deployments, multiple API gateway instances may be required. Each gateway instance may route to a single monolith, as demonstrated in topology 1, or to multiple backend services as shown in topology 2. Alternatively, API gateway instances may be dedicated to one or several tenants of a multi-tenant SaaS. Issues with availability of one gateway instance should not negatively impact the other gateway instances, limiting the impact during peak usage scenarios. This is shown in Figure 15.7.
Figure 15.7 API topology #3 showing multiple API gateway instances that support various internal and external API clients, including the isolation of payment processing for PCI compliance and auditing.
So far, the assumption has been that there is an API client, an API server, and now an API gateway and perhaps other middleware that helps to prevent malicious attack vectors. There is one more important ingredient to protecting an API product or platform: identity and access management (IAM). IAM provides authentication and authorization services, often through the integration with other vendors using industry standards. It also includes the generation of API tokens that take the place of passwords when representing a user and their assigned access controls. IAM is the glue that ties together all other API protection components.
Some APIs have chosen to allow API clients to provide their username and password credentials that are used to login to the web or mobile application. While this is an easy way to get started, it is highly discouraged for several reasons:
■ Passwords are fragile as they change often, which would render any code unable to use the API until it is updated with the new password
■ Delegating access to some or all data to a third-party requires sharing the password with them
■ It does not support multi-factor authentication
To avoid these challenges, the use of API keys or API tokens is preferred for most situations. These two concepts are often used interchangeably but are quite different:
API keys are simple replacements for a password and have no expiration date. They are often found within a user profile page or settings page for a web application. They may be a long alphanumeric value, e.g., l5vza8ua896maxhm. Since API keys have no expiration date assigned, anyone that obtains the key maliciously may be able to use the API to access data and backend systems for an indefinite period of time. Resetting an API key usually requires a manual step within the same user profile or settings page, assuming that the API provider offers API key reset capabilities at all.
API tokens are a robust alternative to API keys. They represent a session where a user is authorized to interact with an API. While they may be alphanumeric and look similar to an API key, they are not the same. An API token may represent a user or a third-party that has been given limited or full access to the API on the user’s behalf. API tokens also have an associated expiration time.
An API token’s expiration time may vary from a few seconds to a few days depending on various configuration elements. An API token is often accompanied by a refresh token, which allows the API client to request a new API token when the previous one has expired or is no longer valid.
API tokens may have one or more access controls associated with them. These controls are often referred to as scopes. Multiple API tokens may be generated on behalf of a user, including one with an assigned scope for read-only access of a specific API resource, another with assigned scopes for read/write access to all resources, and yet another that offers a single scope assignment for limited API resource access by a delegated third-party application. This is demonstrated in Figure 15.8.
Figure 15.8 An example of three separate API tokens, only one of which is valid and allowed to pass to the API server by the API gateway.
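The combination of scopes and expiration can be sketched as a simple check performed before a request is allowed through. The class, the scope names such as projects:read, and the function name are assumptions for illustration; real scope naming is defined by each API provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiToken:
    subject: str          # the user, or the third party acting for the user
    scopes: frozenset     # access controls granted to this token
    expires_at: float     # expiration time in seconds since the epoch


def is_request_allowed(token, required_scope, now):
    """Allow the request only if the token is unexpired and has the scope."""
    if now >= token.expires_at:
        return False  # expired: the client must obtain a new token
    return required_scope in token.scopes


# Three tokens issued on behalf of the same user (hypothetical scopes).
read_only = ApiToken("user-123", frozenset({"projects:read"}), 2000.0)
read_write = ApiToken(
    "user-123", frozenset({"projects:read", "projects:write"}), 2000.0
)
delegated = ApiToken("third-party-app", frozenset({"tasks:read"}), 1500.0)
```

With these tokens, a write request at time 1000.0 passes only for read_write, and every request made with delegated is rejected once time 1500.0 has passed.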
APIs often use a variety of methods for passing an API token to the server including as a query argument on the URL, as a POST parameter, and through an HTTP header. Avoid using query arguments in the URL as the API token will be logged by web servers and reverse proxy servers. It also allows JavaScript code to access the API token easily. POST parameters tend to be more secure, but the location of the token will vary across APIs.
Therefore, it is recommended to use the standardized HTTP Authorization header. Access to HTTP headers can be limited through the use of CORS response headers and headers are less likely to be logged by intermediary servers.
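On the receiving side, reading a bearer token from the Authorization header might look like the following sketch. The function name is hypothetical and the headers are assumed to arrive as a plain dictionary; web frameworks usually provide their own header accessors.

```python
def extract_bearer_token(headers):
    """Return the bearer token from the Authorization header, or None.

    Header names are case-insensitive in HTTP, so the lookup ignores case.
    """
    value = None
    for name, header_value in headers.items():
        if name.lower() == "authorization":
            value = header_value
            break
    if value is None:
        return None

    scheme, _, credentials = value.strip().partition(" ")
    credentials = credentials.strip()
    if scheme.lower() != "bearer" or not credentials:
        return None  # missing token or a different scheme, such as Basic
    return credentials


print(extract_bearer_token({"Authorization": "Bearer a717d415b4f1"}))
# prints: a717d415b4f1
```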
Pass by reference API tokens do not contain any content or state, only a unique identifier for de-referencing on the server-side. For example:
GET https://api.example.com/projects HTTP/1.1
Accept: application/json
Authorization: Bearer a717d415b4f1
It is the responsibility of the API server to de-reference the API token to determine the specific user making the API call, along with any other details.
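A minimal sketch of this de-referencing step follows. The in-memory dictionary stands in for whatever database or cache a real server would use, and the class and method names are hypothetical.

```python
import secrets


class TokenStore:
    """Server-side store for pass by reference API tokens."""

    def __init__(self):
        self._tokens = {}  # token -> details known only to the server

    def issue(self, user_id, scopes):
        """Create an opaque token that carries no content of its own."""
        token = secrets.token_hex(16)  # 32 random hexadecimal characters
        self._tokens[token] = {"user_id": user_id, "scopes": tuple(scopes)}
        return token

    def dereference(self, token):
        """Look up the user and details behind a token, or return None."""
        return self._tokens.get(token)

    def revoke(self, token):
        """Invalidate a token immediately by removing it from the store."""
        self._tokens.pop(token, None)
```

Because all state lives on the server, a token can be revoked at any time, at the cost of a lookup on every request.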
Pass by value API tokens contain name/value pairs included within the token. This reduces the number of lookups required to dereference a token to its associated values by the API server.
API tokens that use pass by value typically allow the API client to access the same name/value pairs that are available to the API server. Therefore, pass by value API tokens should not embed feature flags or other sensitive data that could be used to compromise a system. Instead, use them to convey minimal details, such as opaque identifiers.
A popular pass by value API token standard is JSON Web Token (JWT), typically pronounced “jot”. JWTs are composed of three elements: a header, payload, and signature. Each element is Base64url encoded and dot-separated to compose a compact token that may be used as an Authorization bearer token between client and API. JWTs are signed to ensure they haven’t been tampered with by the client before being sent to the server. Using a private key signature provides further protection against tampering and verifies the identity of the party that signed the token. The JWT.io website is an excellent resource for learning more about JWTs.
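To show the three-part structure and why tampering is detectable, here is a sketch using only the Python standard library. It assumes the HS256 algorithm, which signs with a shared secret rather than the private key approach mentioned above, and it deliberately omits checks such as expiration. It is meant for understanding the format; in practice, rely on a vetted library or an IAM vendor rather than code like this.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data):
    """Base64url encode bytes without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text):
    """Decode Base64url text, restoring the padding that was stripped."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(payload, secret):
    """Build header.payload.signature using HMAC-SHA256 (HS256)."""
    header = {"alg": "HS256", "typ": "JWT"}
    segments = [
        b64url_encode(json.dumps(header, separators=(",", ":")).encode()),
        b64url_encode(json.dumps(payload, separators=(",", ":")).encode()),
    ]
    signing_input = ".".join(segments).encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return ".".join(segments + [b64url_encode(signature)])


def verify_jwt(token, secret):
    """Return the payload if the signature is valid, otherwise None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        if not isinstance(header, dict) or header.get("alg") != "HS256":
            return None  # never trust an unexpected algorithm
        signing_input = (header_b64 + "." + payload_b64).encode("ascii")
        expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
        # Constant-time comparison avoids leaking signature details.
        if not hmac.compare_digest(b64url_decode(signature_b64), expected):
            return None
        return json.loads(b64url_decode(payload_b64))
    except ValueError:
        return None  # malformed token, bad encoding, or invalid JSON
```

Note that the payload is only encoded, not encrypted: anyone holding the token can decode and read it, which is why sensitive data does not belong there.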
JWTs tend to be more popular for communicating authorization details for east-west traffic, while pass by reference API tokens are used for north-south traffic.
Authenticating a user, generating an API token, and supporting delegated access to third-party applications requires a complex workflow between the data owner, the API server, an authorization server, and the third party. OAuth 2.0 is an industry-standard framework designed to prevent every API server from implementing a different form of this workflow. It offers specific authorization flows for web applications, desktop applications, mobile phones, and devices. These flows support multiple grant types, integrated or third-party authorization servers, a variety of token formats, authorization scopes, and support for extensions.
This complex workflow is commonly seen with websites that support logging in with a Google, Twitter, Facebook, or other kind of supported account. While the website itself isn’t owned or managed by any of these vendors, they do provide the login screen for authenticating with a user account on their system. The website implements a specific flow to send the user to the login page of the chosen vendor (e.g., Google). Once the login is successful, the website user is returned to the website and is now authenticated. Behind the scenes, the website and the authentication provider exchange sufficient details to verify that the user is who they claim. The core components of an OAuth 2.0 interaction are shown in Figure 15.9.
Figure 15.9 The core components and basic interaction of OAuth 2.0.
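The first step of the redirect-based login described above, using the authorization code grant, can be sketched as follows. The endpoint URL, client identifier, and function names are hypothetical; the query parameters (response_type, client_id, redirect_uri, scope, state) are the standard ones defined by OAuth 2.0. The later step of exchanging the code for a token is omitted.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_url(authorize_endpoint, client_id, redirect_uri, scopes):
    """Build the URL that sends the user to the authorization server.

    Returns the URL and the random `state` value, which the website must
    remember so it can recognize the matching callback.
    """
    state = secrets.token_urlsafe(16)
    query = urlencode({
        "response_type": "code",       # request an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),     # scopes are space-separated
        "state": state,
    })
    return authorize_endpoint + "?" + query, state


def extract_authorization_code(callback_url, expected_state):
    """Return the code from the callback, or None if the state differs."""
    params = parse_qs(urlparse(callback_url).query)
    if params.get("state", [None])[0] != expected_state:
        return None  # possible forged callback; reject it
    return params.get("code", [None])[0]
```

The website would then send the returned code to the authorization server's token endpoint, along with its own credentials, to receive the API token.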
OAuth 2.0 is a complex framework but one that can be understood given sufficient time and effort. As with other API security topics, it merits a dedicated book. For now, more information on OAuth 2.0, including links to resources, can be obtained by visiting Aaron Parecki’s excellent OAuth Community website.
As mentioned earlier, OAuth 2.0 is focused on the authorization workflow. OpenID Connect is an identity layer on top of the OAuth 2.0 protocol that offers a standard way of verifying and obtaining identity details. It allows web and mobile clients to verify the identity of the end-user, as well as to obtain basic profile information using a REST-like API. Without this protocol, custom integration is required to bridge identity and profile details between the authorization server and the API. The specification details, along with an updated list of OpenID Connect compliant servers, are available on the OpenID Connect website.
Enterprises that require federated identity management across multiple internal and third-party vendors lean heavily on single sign-on, or SSO, for their web applications. Security Assertion Markup Language (SAML) is a standard used to bridge APIs into an existing SSO solution within the enterprise, making the transition smoother for enterprise users accessing an API through an application. More details are available on the OASIS SAML website.
Teams often consider building their own API gateway or using a helper library to implement their own authentication and authorization support. While some organizations had to take this on early in their API journey, there is no longer a need to build an API gateway in-house. In fact, building a custom API gateway is highly discouraged for three reasons.
Want to make it easier for an attacker to find and exploit a security hole in an API? Build a custom API gateway. Ask any company that has experienced a breach through their API – security is hard, even with the right components in place.
Applying proper security requires focused attention to detail at every aspect of the organization. Unless the organization has a staff of security experts on hand, building an in-house, secure API gateway will take much longer than it takes to build a proof-of-concept version. And it will require continued resources to keep it up to date with the latest vulnerabilities.
Building a custom API gateway often starts as a romantic notion. Rationalization begins with “It shouldn’t take too long” and continues with “It will do exactly what I need it to do – no more – and it will be much faster as a result”, only to end with the dreaded rhetorical question, “How hard could it be?”
The reality is that building and maintaining a production-worthy API gateway isn’t trivial. There is a reason why API gateway and APIM vendors are able to charge for their product. Beyond the baseline set of features, deviations by non-standard clients and proxy servers will force all sorts of troubleshooting throughout the life of the API gateway. Additionally, implementing OAuth 2.0, OpenID Connect, SAML, and other specifications is complex and will take considerable time to build, test, and support.
It is important to first ask if the time spent building a custom API gateway is time well spent by the organization. Count the full cost of building and maintaining the API gateway, including patches and improvements to handle new and emerging attack vectors that are not currently handled. Many organizations have gone down this path, only to never deliver their intended solution to market.
In software, there are three recommended phases of development: make it work, make it right, and make it fast. Often, developers are good at the first step – make it work. They experiment with code to see if something is possible, or perhaps to see what the result might look like before proceeding.
The effort required to go from making it work to making it right for production is vast. The edge cases are numerous and unforeseen. It takes time to make it right. To make it fast requires even more investment. Is the organization ready to dedicate staff to building a solution that already exists?
Perhaps a team is considering that the features of an API gateway could be included right inside the source code. Maybe an existing helper library offers API token generation and some basic security features. That might work for today. However, will this be sustainable in the long term?
Additionally, many developers assume that the library was written by security experts, designed to address the needs of the organization, and will be maintained in the future against all forms of exploits, bugs, and language/framework major version releases. Unless it is a library offered by a commercial company, at least one of these assumptions will be wrong. Is that a risk the organization is willing to take?
Use helper libraries to integrate with third-party IAM solutions that offer authentication and authorization services. Avoid implementing authentication and authorization using helper libraries as it will expose the API to malicious attacks that take advantage of weak or abandoned code.
In summary, manage identity, authentication, and authorization through a third-party vendor, integrating it with a preferred API gateway or APIM that supports the selected IAM vendor.
API design requires considering how an API will be protected from malicious attackers. Unprotected APIs are an open door that welcomes attackers to damage an organization and its customers. An API protection strategy involves implementing the right components, selecting an API gateway solution, and integrating an identity and access management solution to tie it all together.
Don’t leave API protection to someone’s side project or to a well-intentioned team within the organization. Select the right approach with vendor-supported components that ensure that the front door of the organization’s APIs is barred shut rather than left unlocked.