This chapter covers
We started this book by identifying three new approaches—message-level security, security as a service, and policy-driven security—that address the challenges SOA introduces in security.
Part II (chapters 4-7) described all the technologies and standards related to message-level security. Making and verifying identity claims, protecting data confidentiality, and verifying data integrity were described there. Chapter 8 explored the idea of offering security as a service. Chapter 9 focused on declarative, policy-based security.
By now you know enough about SOA security, both technology- and standards-wise. Still, if you want to design SOA security solutions for real-world enterprises, you need to know more. In particular, each enterprise has unique requirements that influence the overall SOA security solution. One enterprise may be concerned about high availability and disaster recovery capabilities, perhaps due to the high cost of application unavailability. Another may be focused on return on investment (ROI) and demand a low-cost solution. As an architect, you should translate these unique needs into technical guidelines.
In this chapter, we are going to provide insight into the process of developing security solutions that respect the constraints in an enterprise. While the other chapters in this book discussed a concrete technology that solves a specific problem, this chapter identifies the standard problems that you might face when designing security solutions in an enterprise and discusses the approaches typically used to solve these problems. In other words, this chapter will not provide silver-bullet answers; instead, it provides enough information for you to come up with your own answers.
We will begin by classifying the real-world challenges for SOA security solutions into different categories:
Let’s start with the enterprise software requirements and see what kind of challenges they pose for a security solution.
Most software developers fail to appreciate the complexity of enterprise software. They cannot understand why a simple software solution costs so much to deploy. The IT processes seem downright archaic, unsuitable for developing quality software. It is easy to write software that works; so why is it so complex to put together, say, a simple database-backed website?
In reality, there are several steps before any solution can be used by an enterprise. The following questions are asked before rolling out a database-backed website: How is the database managed? How is the application deployed? How is it updated? How do we train the users? How do we take care of hardware failures? Similar questions need to be answered before rolling out any enterprise SOA security solution.
Instead of answering these questions one by one, we will provide the general concerns that enterprises have for their application, so that you can answer them effectively. Typically the enterprises quantify these concerns using TCO (total cost of ownership). The aspects of the solution that impact the TCO:
We will first describe the need for catering to a large and diverse user base and explain how it impacts the design of enterprise SOA security solutions.
Typically, enterprise software is used by people with different technical backgrounds. For example, senior managers in a logistics company—people with deep knowledge of logistics (their business domain) but less familiar with application security—may need to use several applications. The process managers who use the computer frequently may need in-depth knowledge of the applications and yet may have only a limited understanding of the security solutions. Warehouse operators and truck drivers are likely to use less software.
In short, we have a large number employees with differing job descriptions using software for their specific duties at different locations. Naturally, this imposes several constraints on the way we develop a security solution.
To start with, the diverse user base makes it difficult to train anyone in security awareness. The security solution will have to be simple enough to be used by everybody. As the saying goes, “assume stupidity on part of the internal users, and maliciousness on part of the external users.” That means, if any mistake can be made by users, assume that they will make it. If you make the password schemes complex, assume that they will write down the passwords. Also, be aware that you cannot prevent social engineering-based[1] security compromises without proper training.
1 Social engineering attacks are ones where hackers trick people into providing access control information. We cannot address them using technology alone. A well-designed security solution can take care of these attacks in multiple ways; for, example, by making password sharing difficult or alerting users of possible security violations, and so on.
How can we make a simple security solution really secure? One way is to have strong audit measures by monitoring for unauthorized requests or unusual number of requests, unusual times of requests, or unusual patterns of requests. This monitoring can lead to proactive security management, such as temporary revocation of privileges or manual verification of the user activities.
The other way we can increase security is by multifactor authentication where we require users to present multiple pieces of evidence to identify themselves. For example, as shown in figure 10.1, we may require users to present a key generated by a hardware token along with their username and password.
Theoretically, multifactor authentication will improve security only if the multiple pieces of identification users present are qualitatively different from each other. For example, if the user simply presents two different passwords, that does not enhance the quality of security. A password is something the user knows. If the user can in addition present evidence that he is in possession of something—a hardware token that generates a different key every minute or a smart card that has an embedded digital certificate—the quality of security is certainly higher than what we get with a simple password-based scheme.
We need to be wary of any solution that is complex, as it will make logistics, training, and support difficult. For example, if we go in for two-factor authentication, we have the additional burden of distributing and managing hardware tokens or smart cards. That is why we should try to stick to mechanisms that are familiar to our users. As always, security and convenience impose opposing constraints on the solution, and the existing corporate culture will provide guidance on how to strike the right balance.
The next challenge in enterprise-class solution development is the need to design for long life cycles.
Most of us are surprised to find how old some applications in enterprises are. Some enterprises have back-end applications from the ’60s. Others are just now migrating from Windows 98. This is in sharp contrast to most developers, who always want to use the latest tools and technologies. Why is that? And, what lesson is there for our security solution?
The complexity of IT in enterprises is such that they need to manage systems carefully. Any introduction of new software will have to be accompanied by extensive testing, training, and support provisioning. Since the cost of introduction and upgrade is so high, it may take six months, whereas a developer can upgrade his systems within six hours.
What this means to us solution developers is that we need to design our upgrade processes carefully. In this regard, separating the security concerns into declarative information can make upgrades easy. Be aware that we still have to go through the full cycle of testing, as we mentioned earlier. As a part of this upgrade, users should not require new training. Some of the techniques we have shown earlier, such as JAAS, can help isolate the changes to new libraries and new configuration files.
The long life of applications also requires us to make our implementation choices carefully. For example, we should choose technical standards, libraries, and vendors with a long-term perspective of viability, availability of source code, and support. We have to make sure that we can support our solution for a long time.
Another important requirement for enterprise-class solutions is robustness. Let’s discuss this requirement next.
For an individual, an application crashing may result in loss of data and time, causing inconvenience. For enterprises, this is not mere inconvenience; it may cause serious process disruption, resulting in irate customers, a warehouse full of unshipped goods, idling flights, and unhappy shareholders.
While any process disruption is bad, a security process disruption can be even worse. For instance, a solution that depends on a crucial resource such as a credential store can render every application inaccessible when it crashes. With such serious consequences, we must make sure that our solution is robust; that is, it works even in the event of unexpected failures. Here are some ways we can make our security solution robust.
First, we must follow standard application development guidelines to make the applications robust. Usual causes of crashes can be unstated and unreasonable limits on the resources. For example, if you program assuming only 10 people are going to access the system, what if an emergency forces 100 people to access the system? This can cause the system to repeatedly crash, making the situation even worse. One way we can make sure this will not occur is by gracefully dropping the additional users without application thrashing.
Second, we need to provide fault-tolerant systems. When a primary instance of an application goes down, a secondary instance should quickly come up and take over. There are a few standard ways of doing this regardless of what the application is. The approach to use will depend on details such as the amount of time it takes to start the application, the amount of state that the application keeps, how the state is kept, whether the application can gracefully recover from loss of state, and so on.
Enterprise SOA security solutions should also gracefully degrade and employ fault-tolerance mechanisms in order to be robust enough for use in real-world enterprises.
Next we will discuss the manageability requirements that enterprise solutions must satisfy.
Manageability refers to the ease with which applications can be managed. Important aspects of enterprise application manageability include the ease with which administrators can provision applications for use by a large number of users, monitor the health of applications, and do maintenance activities such as creating backups and installing patches.
An enterprise-class SOA security solution needs to pay attention to all of these issues. For example:
Another challenge that enterprise-class solutions often have to meet is the need to integrate with diverse legacy applications. Let us discuss this requirement next.
As we described, due to the long life cycle of applications, enterprises accumulate applications built on different technical platforms. A typical enterprise will have a mix of applications running on mainframes, Unix servers, and Windows desktops. Any security solution we build must be able to work with all these systems. That is:
In section 10.3.2, we will describe different strategies you can use to secure legacy services.
In this section, we have discussed challenges and requirements all enterprise solutions must address. We have also discussed the impact of these challenges on the design of an enterprise SOA security solution. In the next section, we will focus on the strategies you can use when securing different kinds of services in an enterprise.
Although we have been assuming that a service is just that—a service—up to this point in this book, different kinds of services may need different security strategies. For example, the strategy we would use to secure a new service that is being developed from scratch on a web services platform may be different from the strategy we would use to secure a service that wraps a legacy application running on a mainframe. It is important for enterprise SOA security architects to understand the kinds of services that need to be secured and the strategies that work well for each kind of service.
In this section, we will discuss three broad kinds of services:
For each kind of service, we will explore the strategies we can employ when securing them. Let’s start with a discussion of the strategies we can use for services developed from scratch.
Typically, most enterprises build services on top of their existing applications as they start moving to a SOA-driven enterprise. In new projects, it is likely that they develop services entirely from scratch. Securing such services is simpler than securing the legacy services.
Remember security is a part of the overall SOA strategy. High-level SOA security strategy should be coming from the SOA governance team in your enterprise. If you are part of the team assigned the responsibility of coming up with a security strategy, here are some of the guidelines you can use when developing a security strategy for new services:
3.1 | Securing a new service should be done without adding more code. It should be possible to use a declarative facility for that. |
3.2 | Managing security for a service should be easy. We should be able to change the role required to use a service without having to understand the code. |
3.3 | The framework should support auditing and logging as well. This helps us fine-tune the security implementation in practice. Often these services are already a part of the security requirements. |
3.4 | Whether you deploy security as a separate service or as a part of each service depends on your particular requirement. It is important to centrally maintain the code that manages security so it can be audited and tested thoroughly. |
These guidelines help you in figuring out a specific security strategy for services you are developing from scratch. Next, let’s explore the strategies you can use for securing services that wrap legacy applications.
Most of the services in the current generation of SOA implementations come from wrapping existing legacy applications. In fact, SOA is seen as a way for these applications to participate in modern workflows. Without SOA, these applications cannot provide the useful business logic they contain in a readily usable form.
There are many ways we can wrap legacy applications to offer services, each constraining the security strategy differently. Let’s look at what technical choices we have in developing the services before discussing the possible security strategies for each of those choices:
Figure 10.2 depicts how wrapper services work, abstracting away the details of strategies used to wrap a service interface on top of a legacy application.
As you can see, there are three entities in the picture. The first is the client. The second is the wrapper or the proxy for the service. The third is the actual application implementing the service. As we discussed earlier, sometimes the second and third entities may lie in the same application. In any case, we can assume that the communication between the wrapper service and the legacy app is secure. Thus, we only have to see:
The material you have seen in the book so far describes how to tackle the first challenge. The second challenge involves reconciling the security model of the service with that of the back-end application. By security model, we mean the notion of users, roles, authentication, authorization, encryption, and nonrepudiation.
Security models in different applications can have different views on users and roles. In the security model of one application, each user may have only one role, and in the security model of another application, each user may have multiple roles. The user and role names themselves may be different. Or, roles with the same name may semantically differ. For example, an expert role could mean different sets of permissions in different applications.
When we reconcile the security models of two applications, we map the users or roles of one model to those of the other. If traceability is important, each username in the first application must map to one and only one username in the other. If traceability is less of a concern than other factors such as simplicity and licensing costs, it may be acceptable to map roles defined by the security model of the first application to roles defined by the security model of the second application. When mapping roles, we want to achieve a semantically meaningful mapping, where roles in one application get equivalent roles in another application.
In reconciling the security models of the wrapper service and the back-end application, there are three choices:
To illustrate these choices in security models, we will examine some sample scenarios next.
In the simplest case, the underlying application may not have any security model. It may be relying on the network or the machine infrastructure to enforce security. For example, several Unix applications use file permissions to handle security traditionally. As shown in figure 10.3, in this scenario, we can completely manage the security in the service itself.
The wrapper service will have to address all aspects of security including authentication and authorization mechanisms per the overall SOA security guidelines.
Let’s now consider a sample scenario of a legacy app wrapped by a service interface and see what security strategies make sense.
Here is a common scenario you might encounter: A department, to satisfy its own need, develops a small web application. Over the years, the application evolves, adding new functionality that the users require. Since it is built on early-generation web technologies, it might not support the security model that the corporation approves. It may have a simple model of authenticated, superuser, and guest roles, with every user mapped into one or more of those roles. It may have its own user management system that lets the admin create users and grant roles. As shown in figure 10.4, to reconcile security models of the wrapper service and the legacy app in this scenario, we can map users in the wrapper service to users in the application.
The corporate standards typically dictate the use of a more fine-grained set of users and roles backed by a user directory. Since the application has the built-in logic for users and roles, we must map the corporate standard into the application model. One popular choice is to create three internal users in the service called authenticated-user, superuser, and guest-user, each one registered with the back-end application with the appropriate role. After suitable authentication and authorization, each incoming user is mapped into one of the internal users. The back-end application is invoked using these internal users.
The mapping of the users to these limited roles depends on the security framework we are using. If we are using LDAP-based authentication, we can create the roles in the directory itself. If we are using a security service, the SAML assertion can provide the role in the header as we discussed in earlier chapters.
There is a precedent for this kind of security model management. Most database-backed web sites map web users to a single generic database user. They do not depend on the database user model for authenticating and authorizing the users. This technique lets them pool database connections in the web application.
There are drawbacks to this approach. If the back-end legacy application maintains its own auditing, your user mapping will render it useless. For example, the back-end application will show a large number of accesses from authenticated-user, which in reality are from several different users. If we want to analyze user accesses, we must now rebuild auditing facilities at the wrapper service access point. In addition, the diagnostic messages or the informational logging messages from the back end may not be useful under this gross mapping scheme.
If the response of the back-end application depends on the actual username, we should provide the username as is. In that case, we cannot use this mapping technique. Instead, we need to provision all the users in the application, then use one-to-one mapping. Or, if the back-end application uses a directory, we need to synchronize it with the corporate directory.
Let’s consider yet another sample scenario of a legacy app wrapped by a service interface and see what security strategies make sense.
Several applications being developed these days support pluggable authentication systems. Some of the application servers even support completely externalized declarative support for authentication systems.
We can envision two different ways to use the flexibility of the security model in the legacy application.
Next we will discuss the security strategies you can use in another commonly found scenario: a wrapper providing a service interface to capabilities found in an enterprise resource planning (ERP) application.
Large ERP applications come with sophisticated user models which support user hierarchies, groups, and good management tools. An ERP application can offer desktop applications, web applications, and command-line interfaces and its security model must support each type of application. In addition, it provides standard interfaces via libraries, network protocols, and message buses for getting the data in and out of the application and invoking the functionality provided by the ERP.
Since ERPs offer such complete security models, it is preferable to use one of them. If ERP packages need to be integrated, especially using services, we face the problem of mapping the security model of one application to another. If both applications use one source for authentication and authorization, this is a simple task. Typically we need to reconcile the user information between two different directories and our choices range from synchronizing the two directories using tools to reconciling user information as we discussed for other legacy applications. Vendor solutions for solving this problem are typically referred to as identity management solutions.
So far, we have discussed the security strategies you can use for services written from scratch and services that wrap legacy applications. Let’s now look at the security strategy for services that are composed of other services.
When we are wrapping applications in services, we may be doing more than simple proxying. We may be orchestrating multiple applications to produce a response to a service request. What we are creating in such cases is a composite service—a high-level service that brings together the capabilities of several lower-level services.
Composition of services can happen in different ways. A simple composition may merely transform and collate the data from multiple services. More sophisticated composition can selectively invoke services based on the outcome of other services. In more complex scenarios, composition can involve additional business logic and invocation of alternate services to elicit the best possible response. Figure 10.6 illustrates a sample composite service.
Securing composite services can be complex. For instance, in figure 10.6, the composite service has to invoke services A, B, C, and D to provide a response. Who can invoke the composite service is determined by who can invoke A, B, C, and D. One choice is to make the composite service simply relay the user credentials (for example, username and password) to services A, B, C, and D. If a client is allowed access to all the required services he is allowed access to the composite service. This model does not preserve the atomicity of the composite service. That means that after service A is invoked, if service B rejects the credentials, we would have to roll back the effect of invoking A.
An alternative is to define the security policy of the composite service conservatively as an intersection of services A, B, C, and D. After authenticating the user, we can carry the context to each of these services so that the access is granted to the client. This approach takes more effort to build security for the composite service. The benefit is that the atomicity of the composite service is not broken due to security violations.
As you can see, in every scenario, there are multiple security strategies you can choose from. You will have to evaluate the trade-offs associated with each possible strategy and pick one that you are comfortable with.
So far in this chapter, we have discussed two of the topics enterprise SOA security architects must be familiar with. We discussed the demands of enterprise IT environments and their impact on SOA security design. We also discussed a number of security design strategies for different kinds of services commonly encountered in enterprises.
Next, we will discuss the challenge of coming up with the right deployment architectures for SOA security solutions.
Until now, we have dealt with services, service consumers, credential stores, and security services as if their relative locations did not matter. This abstraction was deliberate. Focusing on logical characteristics of entities and ignoring their relative locations helped us reduce the complexity often seen in the real world, for the purpose of teaching. Now, we are going to address the location question—where in the network does each service belong?
Location has a significant bearing on the design of a SOA security solution for a real-world enterprise. This is generally true for any enterprise solution. Designers of enterprise solutions see the impact of location in multiple ways:
In addition to these challenges, security, performance and availability requirements need to be taken into account when answering the “what goes where” question. The solution to what goes where is commonly referred to as the deployment architecture.
In the rest of this section, we will discuss deployment architectures for SOA security solutions. Figure 10.7 depicts the challenge of what goes where in the context of a SOA security solution.
In the figure, on the left side are the entities in our problem domain:
On the right side are the possible locations for these entities. At a high level, possible locations are:
As you can notice, these locations are not atomic. Some can and should be divided if we are to succeed in coming up with the right deployment architecture. To do this division, we need to identify locations that have unique needs/capabilities or pose unique challenges. The enterprise may need to be further divided into regions, with each region consisting of a headquarters, branch offices, and data centers. A data center may in turn need to be divided into multiple locations for security reasons.
Some of this division may be omitted if the basis on which further division can happen is inconsequential when discussing the deployment choices for a use case (or a class of use cases). This is what we will do as we consider three important classes of use cases.
The three classes of use cases we consider here are
We will start our discussion with the intranet use cases.
One would think that the deployment architecture for securing services within an enterprise intranet would be quite simple. Firewalls will not get in the way and one can assume that all entities live in a single large network. This simplistic view turns out to be misleading.
The distributed and federated nature of twenty-first century enterprises makes intranet network topologies quite complex. Administrative and legal boundaries, the multiplicity of data centers, and the lack of IT support staff at small branch offices, can all have a significant impact on the deployment architecture for a security solution. We will discuss the impact of each of these factors, starting with the simplest possible case of an enterprise with just one data center and one office.
Assume that the enterprise intranet consists of just two locations, a data center and an office, as shown in figure 10.8. All the services are hosted out of the data center. So are the security service and the credentials repository. Service consumers can exist both in the office (user applications acting as service consumers) and in the data center (one service depending on another).
The exact form of security service can vary. It can be a server-based solution or a network device (such as an AON device described in appendix E). Interaction between services/service consumers and the security service can take any of the five forms described in sections 8.2.1 through 8.2.5.
It is common for enterprises to have a large number of branches. For example, you see banks opening branches in supermarkets and oil companies owning gas stations in remote areas. It is difficult to place qualified IT people in these branches. Any task that requires complex configuration and maintenance is an operational challenge in these offices.
Let’s see what kind of support these branch offices need. Some desktop applications used in a small/remote branch office might be consumers of services offered within the enterprise intranet. That means these service consumers need to be configured with at least the location of a registry that houses metadata for services offered within the enterprise. This configuration needs to be updated if and when the registry location changes. In addition, if some of the consumers require security policies to be configured beforehand, a mechanism is needed to keep the preconfigured policies up-to-date. This is clearly a challenge if IT support staff is not available in the branch office.
A possible solution to this problem is shown in figure 10.9.
If we place a remotely configurable security device (either hardware or software) at each small/remote branch office, we need not maintain the security policy configuration in each of the desktop applications acting as service consumers. The security device can be periodically updated with the latest security policies, based on which it can appropriately secure each service invocation emanating out of the branch office.
There are several choices for implementing remotely manageable client proxies. We can use a remotely managed server running client proxy software. We can use tools from middleware vendors to develop and manage this server. We can also use an AON. We can expect these tools to support, at the minimum, remote configuration and monitoring.
An enterprise may rely on more than one data center for multiple reasons. Every office can be served from the closest data center to provide fast responses. Enterprise operations can be kept up and running even if disaster strikes in one of the data center locations. Adding a second data center brings additional complexity.
Each data center may house its own credential stores and security service. In such a case, we need to set up mechanisms for replicating credentials and security policies between the data centers as shown in figure 10.10. All LDAP directories support replication between multiple instances of the same make. If you use directories from different vendors in different locations, you will need to build your own replication mechanisms unless all vendors support LDAP Duplication Protocol (LDUP). How you replicate security policies will depend on where you store them. If you are using a service registry to store your policies, you will need to check with your registry vendor to understand how you can replicate registry contents between multiple instances.
Multinational companies have autonomous units in different countries or regions to satisfy legal requirements in each location. The strength of relationships between autonomous units can vary from enterprise to enterprise. In most cases, there is a trust established between autonomous units so that they can use some of each other’s services over a private network. Trust between these enterprises—established by accepting each other’s digital certificates, Kerberos tickets, or other mechanisms—needs to be maintained and managed as both enterprises evolve as shown in figure 10.11.
There is no one specific technique for trust management, as requirements will vary widely between enterprises. Here, technology plays only a limited role; there is a need for defining and managing security processes to establish and manage this trust.
We have so far discussed possible deployment architectures for securing services in the intranet. Let’s now discuss the deployment architectures for securing services offered to the public.
We have seen in the previous section how, contrary to our initial expectations, deployment of a security solution for intranet services can get quite complex. The situation is the opposite for deployment of services offered to the public. Thanks to the popularity of web applications, most techniques needed for securely deploying services offered to the public, for instance firewalls and the demilitarized zone (DMZ), are well understood. There are a few new questions to be answered before we can decide on specific deployment architecture for publicly offered services. We will cover these later in this chapter.
Before we proceed we need to introduce two common techniques—one from network security and another from web application deployment. A good understanding of these techniques is needed to follow the rest of the discussion in this section.
A firewall is a network device that filters traffic passing through it based on rules set up by the firewall administrator. To understand firewalls, you need to understand how a network works.
An application message, such as a SOAP request addressed by one endpoint to another, gets split into one or more packets by the network. Each packet carries a part of the data in the application message, plus addressing information and other control fields (such as a checksum). The addressing information within a packet is different from what we understand as that at an application level. The endpoint URI for a service call is not part of the addressing information in packets that make up the service call. Instead, all that is available at the network layer is
As a network firewall filters one packet at a time, the filtering rules you provide to a firewall can only be based on information available within a packet. The kinds of rules firewalls can use for filtering
Figure 10.12 illustrates the combined effect of rules 1, 2, and 4. As you can see, firewall rules can be used to block all packets except those that we specifically allow to come in or go out of our internal network.
It is no good stating that we will firewall every application. There are always a few applications that need to be exposed to the outside world. For example, a mail server is always needed to receive emails from customers, partners, and the general public. As no piece of code is ever perfect, we need to prepare for the eventuality of an attacker gaining complete control over the machines hosting publicly accessible applications. A security hole such as a buffer overflow in an application can allow an attacker to execute arbitrary code on the server and obtain, in lots of cases, privileged access. A common technique used to defend against compromise of publicly exposed applications is to place them in a separate subnet called the DMZ.
Applications in a DMZ can receive packets from anywhere—from within the enterprise intranet or from a public network. Applications in a DMZ cannot, in general, initiate connections to nodes within the enterprise intranet. The damage an attacker can cause is limited even if he assumes complete control over a machine within the DMZ.
There are multiple ways to implement a DMZ. Figure 10.13 shows a popular implementation strategy. Two firewalls are used in this strategy:
One or both firewalls may additionally do Network Address Translation (NAT) from publicly known IP addresses to private IP addresses. This further enhances security.
In practice, limited exceptions are made to allow applications within the DMZ to access specific applications within the intranet that they depend upon. It is common to punch a hole in the inner firewall to let a web server access a database within the internal network, as shown in figure 10.14.
Although this seems, at first sight, to be a serious dilution of the concept of DMZ, this is a better choice than placing all the needed applications into the DMZ. We could have placed the database server as well in the DMZ, but this is not a good alternative, as the database server would get opened up for direct attack.[2]
2 This is true even if we do not allow direct access to the database server in the DMZ across the outer firewall. An attacker who gains control of another machine in the DMZ can launch a DoS attack on the database server by flooding it with TCP SYNs.
In some cases, punching holes in the inner firewall can be avoided if the direction of connection establishment can be reversed. If the application in the internal network can be made to establish a connection to the application in the DMZ no hole would be needed in the inner firewall. The DMZ application can then use the connection established by the internal application for all subsequent communication.
Now that you understand the concepts of firewalls and DMZ, it is easy to see how a security solution should be deployed for publicly exposed services.
As the public web services will themselves be hosted in the DMZ, a security service protecting them will also have to be in the DMZ. This service should protect the public services by validating incoming requests against a schema, scanning the requests for commonly used XML attacks, then authenticating and authorizing the request. This kind of security service is a web services security gateway.
Where should the credential store (directory server) for authentication be deployed? The directory server should be within the internal network, as placing it in the DMZ will expose it to direct attacks. A specific hole needs to be punched into the inner firewall to let the security service in the DMZ contact the directory service in the internal network as shown in figure 10.15.
There is another important and interesting question to answer here. Where should the enterprise’s private key be stored? If the inbound messages received by the security service are encrypted, the security service will need access to the enterprise’s private key in order to decrypt incoming messages. It is not a good idea to permanently store the enterprise private key in the DMZ, as it could be stolen more easily. The key needs to be protected by placing it in the internal network.
When the security service requires the private key for decryption, it can fetch the key from the credential repository in the internal network. The security service should not remember the private key for too long, as the key can be stolen if the security service itself is compromised.
A more secure option is to rely on a service offered from within the internal network to decrypt incoming requests. That way, the enterprise private key never leaves the internal network. Another hole needs to be punched into the inner firewall to let the security service in the DMZ contact the decryption service in the internal network. Or, the decryption service can initiate a connection to the security service in the DMZ and the latter can use the established connection to invoke decryption services.
You now have a good idea about possible deployment architectures when securing services in the intranet and when securing services offered to the public. Let’s look into the possibilities when securing services offered to/by partners.
Partners are different from the general public or internal members. They get controlled and limited exposure to the enterprise. Not all partners are alike. We will consider three kinds of partners for discussion in this section.
Not all partners of every enterprise will fit into one of these three classes perfectly. We just picked these three to illustrate the deployment choices available when integrating with partners. Figure 10.16 shows a deployment option each for these three classes of partners.
Small partners with little IT infrastructure of their own may not have the ability to satisfy the security policies of the enterprise. For example, a small partner may not have a security solution in place that can encrypt parts of a message that need to be kept confidential. In these cases, the enterprise can require its small partners to place a security agent that will help them comply with the enterprise security policies in their networks. The security agent can take the form of a device (such as an AON) that can be remotely managed. This will keep the burden on the small partner to the least amount necessary.
Requests secured by a security agent at the partner’s site can be trusted and routed directly to web services offered for partners. Requests emanating from large partners who do not use the enterprise’s security agent will have to be first screened by a security gateway in the enterprise DMZ.
Requests to partner-provided services or services hosted by a managed service provider will also have to adhere to the enterprise’s security policy. When we say security policy here, we are referring to the policy for outgoing messages. In figure 10.16, we illustrate a logical separation between a security gateway for inbound messages and a security gateway for outbound messages.
In this section, we described the possible deployment architectures for securing services in the intranet, services offered to the public, and services offered to/by partners. We will next look into another important challenge enterprise SOA security practitioners face: making the security solution industrial-strength.
SOA is becoming an important part of IT strategy. One of the prerequisites of widescale deployment of SOA is that it should be industrial-strength. That is, corporations can rely on it to support business needs. Security solutions for SOA should support the same goals. In this section, we will examine what it means to make a SOA security solution industrial-strength and how we can achieve that goal. In particular, we will describe how you can address performance, scalability, and availability requirements. We will start with a discussion of performance issues.
Everybody likes their systems to perform well. Unfortunately, compared to the tightly coupled systems that are typically replaced by SOA, performance of SOA systems lags behind. This is a big concern for enterprises embracing SOA. Fortunately, there are several ways to increase the performance of systems based on SOA. In this section, we are going to confine ourselves to only those that increase the performance of the security solutions.
One clear guideline about improving the performance is this: unless we know where the performance bottleneck is, we cannot develop an efficient solution. Speeding up a task that takes 50 percent of overall time by 10 percent has more impact than speeding up a task that takes 10 percent of overall time by 50 percent. Therefore, we should start optimizing once we have good metrics on where time is being spent in the total solution.
Next, let’s consider what you can do when you encounter a performance bottleneck.
There are several tasks that are part of a security solution that can effectively be done in hardware. Hardware offers increased performance in multiple ways, at a cost of increased complexity. Special-purpose hardware can be faster than general-purpose processors, since it can be optimized specifically to do the task it is designed for. At a minimum, it offloads the burden from the main processor, allowing it to do other tasks. It introduces additional complexity—the program or the processor needs to know which task to delegate to the hardware. If the program needs to do the delegation, it needs to be explicitly coded. In some specific cases, the processor can invoke the hardware device without explicitly being asked.
The best candidates for hardware acceleration are ones with very well-understood requirements. For example, point-to-point encryption is well understood and is expensive enough to benefit from hardware acceleration. Indeed, several special-purpose hardware chips offer SSL in hardware. These have been in use since even before SOA—and are applicable in SOA security as well—to provide network-level encryption.
Not all time-consuming activity can be done in hardware. Typically the difficulty lies in coming up with a problem that is well defined and can be done in hardware more efficiently so that the overall solution can be faster. There are only so many tasks that fit this description. One recent approach is to expand the range of tasks that the hardware can do by making hardware programmable.
One such example, as discussed in appendix E, is AON. AON solutions can be programmed to do well-defined horizontal tasks (that is, those that do not require deep domain knowledge): security, auditing, content-based routing, load balancing, and so on. We can only expect the increased use of such special-purpose hardware to provide well-understood generic services that are customizable for a situation. The complexity these days lies in customizing and maintaining the total SOA solution. With enough attention to these problems, hardware solutions may become more and more usable.
Hardware-based solutions for improving performance are not always applicable. Improving the design and implementation of the solution in software can often improve performance to the required levels. Here are some of popular techniques:
This is not an exhaustive list of possible optimizations. As mentioned, any optimization must start with a performance profile of the application. Most of the standard optimization techniques work well for security solutions, too. In particular, XPath evaluations cause serious performance degradation in applications. We can handle this in two different ways: by reducing the complexity of XPath expressions that need evaluation and by using libraries that reduce XPath evaluation cost.
Let’s take a look at what scalability means for a SOA security solution, and how we can achieve it.
As the performance and load requirements on the application keep increasing, we need to make sure the application can handle it by adding more resources. If we configure a system to handle 1000 users and if it had to handle 10,000 users, we need to make it work by adding more hardware resources—additional memory, additional disks, additional machines, and additional network infrastructure.
Normally scalability is achieved through adding more machines. We can make use of more machines effectively only if we can partition the problem in such a way that each machine can be engaged in solving a part of the problem. There are three ways we can engage multiple machines:
If we separate security concerns into a service, we can easily make it scalable. Even if we build security into each service, we can replicate essential parts to provide the scalability. This scalability needs synchronization among the replicated resources, which is not difficult to do.
We have so far discussed how to address performance and scalability concerns of an enterprise when securing its services. Let’s now discuss another important concern any enterprise will have when deploying a SOA security solution: availability.
If we make every service implement security, we must make sure that security works all the time. If not, applications stop when the security solution does not work. Thus, availability of security solutions is important in an enterprise.
There are standard ways to ensure high availability. If there is a critical resource such as a directory service or a token-granting service that is the central piece in a security resource, it should be configured for high availability. For instance, if we are depending on some files, we can use highly available storage solutions such as storage area network (SAN). If we are depending on a critical data source, we can set up a hot spare, which needs to be kept in sync with the main server. If we have a critical service, we can run it in a cluster that manages one machine as a backup for the other.
This concludes our description of possible approaches for performance, scalability, and availability requirements for a SOA security solution. These are very important concerns in real-world deployments, and any enterprise SOA security architect should have experience dealing with them.
Despite all our effort, we know that no security solution can be perfect. We can proactively develop defensive practices that reduce the damage that attackers can cause. In the next section, we will address under this issue by looking at how we manage vulnerabilities in our solutions.
As indicated at the beginning of the book, we have not been discussing security holes introduced by poorly written code. For example, we have not focused on the possibility of a buffer overflow in a service implementation. In the real world, you need to worry about such vulnerabilities in your services, service consumers, intermediaries, the security service itself, and any other services/libraries you depend on. Here we will briefly discuss the kinds of vulnerabilities you need to guard against and common techniques to do so. In particular, we will discuss:
Let’s start with a discussion of common vulnerabilities in any software that hackers usually seek to attack.
Code in SOA implementations, like any other code, is vulnerable to certain common forms of attack. Thanks to the focus on security brought about by numerous virus attacks in recent years, common vulnerabilities in code are well understood by the security community. Techniques and processes to defend against common vulnerabilities are also well understood. Here is a representative sample of common vulnerabilities and techniques to defend against them.
Buffer overflow is probably the most commonly exploited security hole. This problem occurs when a program preallocates fixed buffers to deal with inputs and processing. Under certain circumstances, by carefully crafting the input, the program can be made to write and access beyond those preallocated buffers and trick the system into violating security constraints. The references listed at the end of this chapter will help you understand this topic in greater detail. What we will briefly mention here are the techniques to guard against buffer overflow vulnerabilities.
A first line of defense against buffer overflow attacks is provided by some of the modern programming platforms. Java, for example, internally checks all array access for bounds violations. Array bounds checking also happens internally when managed code is executed in .NET.
All this checking certainly causes performance degradation. But most applications can afford to pay this cost. For high-performance applications that cannot afford the cost of automatic bounds checking, you will have to rely on the programmer’s skill to make sure that buffer overflows do not occur. You can choose to have a second line of defense, though.
Assume that your application does indeed have a buffer overflow vulnerability and try to minimize what an attacker can do after successfully exploiting the vulnerability. Never give an application more privileges than it really needs. Configure vulnerable applications to be run using accounts that do not have administrative privileges. This way, the amount of damage an attacker can cause is limited.
On Unix you can further limit the damage by running an application in a chroot jail. See the references listed at the end of this chapter for information on what a chroot jail is and how you can set it up.
Most business services rely on a database. If a service composes an SQL query using data provided by the caller without vetting the data first, it runs the risk of an SQL injection attack. The following query is composed by a BankAccount service to implement a balance query operation.
"select balance from account where accountId = '" + accountId + "'";
The implementer of this service expects that the caller will provide an accountId, as shown here:
<queryBalance> <accountId>account123</accountId> </queryBalance>
A malicious caller may guess the details of this implementation and try submitting the following instead as accountId.
<queryBalance> <accountId> account123'; update table account set (balance) values (1000000) where accountId='account123'; select balance from account where accountId='account123' </accountId> </queryBalance>
If the service implementer is naïve and uses the caller-provided accountId as is when composing the SQL query, the composed query will now consist of three SQL statements:
select balance from account where accountId = 'account123'; update table account set (balance) values (1000000) where accountId='account123'; select balance from account where accountId='account123';
As you can see, the malicious caller is able to update the balance in an account to a million dollars! These are known as SQL injection attacks and are representative of a large class of attacks that take advantage of applications that do not sanitize input before using it.
The only effective antidote to SQL injection attacks is in the hands of programmers; security administrators can only ensure that no more privileges than necessary are granted to an application in the database security configuration.
Almost every database access API provides a safe way of binding user input to placeholders in an SQL query. Quotes and special characters are escaped or handled appropriately if a programmer uses such APIs. Java programmers should use a PreparedStatement with placeholders instead of composing a Statement using string concatenation. Figure 10.17 shows the architecture for general input validation.
Programming languages such as Perl also provide facilities to mark all input as tainted and disallow direct use of suspect data. Programmers can only remove the taint attached to an input data item by extracting a specific pattern of characters from it. In the balance query code shown previously, the accountId provided by the user can be sanitized by picking only the first substring of alphanumeric characters using a regular expression such as [a-zA-Z0-9]+.
There is not much most applications can do when hit with a flood of concurrent requests from multiple points in the network. Almost all software crumbles under such an attack, resulting in DoS to legitimate users.
To cope with DDoS attacks, applications need to be able to quickly distinguish between legitimate requests and others. The faster this check can be made, the more an application can stand up to a DDoS attack. Unfortunately the rate at which an application can screen requests can be exceeded by the rate at which bad requests are coming in. Filtering needs to happen early and at multiple access points to the network, as shown in figure 10.18. Every access point into the network hosting the application needs to be equipped with firewalls that can filter good traffic from bad. Several network vendors provide products that can do this job effectively.
So far, we’ve discussed common vulnerabilities that are found in any software. Let’s next discuss specific vulnerabilities that XML-based web services may possess.
In addition to the vulnerabilities that SOA software shares with every other piece of code, there are a few more that are likely to be specific to SOA. Almost all SOA implementations involve a lot of XML processing, and XML brings its own set of new vulnerabilities.
The biggest source of vulnerabilities in XML lies in the facility for defining internal and external entities. Entities are declared within a Document Type Definition (DTD). An internal entity in a DTD is like a named constant in a program.
<!ENTITY constant1 "value1">
An external entity in a DTD is like an include/import statement; it allows the DTD to get a part of its contents from external sources.
<!ENTITY importFromFile SYSTEM "/some/file/path"> <!ENTITY importViaHttp SYSTEM "http://... ">
Once an entity is defined, it can be referenced using the syntax &entityName;. Two types of potential attacks are reported in the use of entities: entity recursion attacks and external entity attacks. We will describe each separately.
It is possible to define one entity’s value by referring to another.
<!ENTITY constant1 "value1"> <!ENTITY constant2 "&constant1; and value2">
Any reference to constant2 now expands to "value1 and value2". It is possible to quickly build up large amount of text:
<!ENTITY constant1 "value1"> <!ENTITY constant2 "&constant1;&constant1"> <!ENTITY constant3 "&constant2;&constant2"> ... <!ENTITY constantN "&constantNminus1;&constantNminus1">
By defining the value of constantN as a concatenation of two constantNminus1 values, we have employed recursion to build 2N-1 repetitions of constant1. This constitutes a vulnerability in the following sense.
An attacker can submit an XML document with an internal DTD subset that employs the recursive definition trick shown here to exhaust the memory resources available to the recipient’s program. Note that recursion allows one to generate exponentially growing output with input that only needs to grow linearly. In our example, an attacker can generate 232 repetitions of constant1 simply by defining 32+1 constants recursively. The attacker’s program does not require much memory to generate input that will exhaust the attacked program’s memory resources.
Almost every XML parser allows you to plug in an external entity resolver that can resolve external entity references. A naïve implementation of a resolver that simply resolves any external reference is bound to introduce vulnerabilities, such as the following:
A security-conscious implementation of an external entity resolver will limit resolution to a limited set of known locations.
To rule out the possibility of entity attacks, SOAP explicitly prohibits the presence of DTD (and processing instructions) within a SOAP message. If your SOAP-processing engine is compliant with the SOAP specification, it should not be vulnerable to entity attacks.
If your SOA software accepts any XML input other than SOAP messages, you may need to turn off entity expansion altogether to completely secure yourself against entity attacks.
In security enforcement, it is always a good idea to ban everything first, then allow a few trusted entities. This principle should be applied, where possible, when accepting XML inputs. If schema definitions exist for the XML inputs your services accept, validate incoming XML against those schemas. Schema-based validation is expensive, and you may see a performance hit when you introduce validation into your processing. If the performance hit you are seeing is unacceptable, evaluate the use of high-performance validators that are available in the market.
Despite that, two important limitations of schema-based validation need to be kept in mind.
As you can see, validation of XML messages is turning out to be a specialized area. XML security appliances in the market address this area well. For a list of vendors, see appendix E.
Thanks to the focus on security in the last few years, enterprises have a much better understanding of how to stay on top of vulnerabilities that are common in software. Next we’ll discuss a workflow process enterprises use to fix vulnerabilities promptly and regularly.
Security experts recognize that there will always be many more vulnerabilities in our code than we recognize at any point. Guarding against vulnerabilities is not a one-time activity. It obviously requires constant vigilance. It is important to institute a process for patching up vulnerabilities when they are reported. Vulnerability remediation is an important workflow within an IT organization. Figure 10.19 depicts this workflow.
Activities that are part of this workflow are
Figure 10.19 identifies the owner of each step in a typical enterprise. In a large enterprise, this workflow needs to be managed by tools. Manual management can be cumbersome, as the number of resources and management processes is too great.
Developing a SOA security solution for an enterprise is not easy. One needs to combine all of the ideas introduced in the previous chapters with the right implementation and deployment choices to create a solution that fits the needs of the enterprise. In this chapter, we tried to give you a good idea of the challenges you are likely to face and the choices you have when designing a SOA security solution for real-world enterprises.
We first explained the demands enterprise IT environments place on software solutions. An enterprise SOA security solution must be usable by a large and diverse user base, must remain usable over a long time, must be robust, and must be easy to manage. In addition, as most enterprises are likely to have a mix of services—some developed from scratch, some that wrap legacy applications, and some high-level services that compose lower-level services—we explained the variations in strategies you can use for securing these different kinds of services.
We also discussed myriad considerations that can go into designing the deployment strategy for a SOA security solution. We discussed methods of deployment in an enterprise—with branch offices, with partners, and to the general public. The deployment scenarios differ in how elements of the solution can be placed in the network.
While the issue of DoS attacks is not the focus of this book, we discussed a few issues that can affect security—in particular, we showed how XML-based attacks work and how to protect applications from them.
One warning: This chapter merely touched on issues that come up when developing an enterprise-class SOA security solution. Unlike other chapters, we cannot offer a fully working concrete example here. The best way to use this chapter is to understand your particular problem in the terms explained here. With enough practice, you will be able to extend the lessons in this chapter to solve your security problems.
This chapter completes our book on SOA security. By reading this book, you have come to understand that SOA security is quite different from typical application security. Fortunately, SOA standards help you build good secure solutions.
If you are with us thus far, you have learned the standards through simple examples, along with the underlying technologies. You also have seen the different options for full-fledged frameworks and several practical challenges. Depending on your responsibilities, you can use this knowledge directly in building a security solution for your application; or you can use it in formulating SOA security strategy for your enterprise based on vendor tools and products. In either case, we hope the information you’ve learned from this book—the standards, code explanations, and underlying technologies—will help you on your way to mastering SOA security.
18.217.147.193