Chapter 3. Web Services Technologies

Web services offer many valuable benefits. They support heterogeneous application integration at a fraction of the cost of other methods. They are simpler to build and easier to maintain than traditional middleware, and they can make your IT systems more flexible and adaptable. They also allow you to reuse application functionality, reducing development and administration costs.

Web services derive these benefits from their underlying technologies and standards: the Web, XML, and the service-oriented architecture (SOA). In this chapter we will dig into these technologies and examine why they produce these benefits.

First, let's get a bit more specific with our definition: A Web service is a service-oriented application that communicates over the Web using XML messages. Looking at this definition, we can see that Web services represent the convergence of three core technologies:

  • The Web. A Web service is a Web resource. It communicates over the Internet using standard Web protocols. Leveraging the Web yields universal connectivity and pervasiveness.

  • XML. XML is a language for electronic documents and messages. It provides a universal, self-describing data format that can be interpreted, processed, and transformed by any application running on any platform. Leveraging XML yields heterogeneous platform support and tremendous application flexibility.

  • SOA. The SOA describes a set of common practices for service-based applications. Most RPC-based middleware technologies rely on the SOA. The SOA defines mechanisms for describing services, advertising and discovering services, and communicating with services. Leveraging the SOA yields developer productivity and runtime efficiencies.

The Web

The Web offers universal connectivity. It is a vast, interconnected information system that relies on the Internet to support global communications. The Internet is a completely pervasive, highly reliable communications infrastructure. The Internet relies on two critical technologies: TCP/IP (Transmission Control Protocol/Internet Protocol) and DNS (Domain Name System). TCP/IP is a pervasive network protocol that is platform- and language-neutral. DNS maps an abstract network address, such as www.bowlight.net, to a physical TCP/IP network location, such as 192.168.1.1. Together, these two technologies support the largest and most scalable network on earth.

The Web consists of resources and links. A resource is any type of named information object, such as a word processing document, a digital picture, a Web page, an e-mail account, or an application. The most fundamental concept that defines the Web is the Uniform Resource Identifier (URI). A URI is a compact, formatted name that identifies a resource. We use this name to identify, reference, access, and share a Web resource.

A URI can take many forms. Some URIs are used simply as a name, and others can be resolved to a specific application running on a computer at a particular TCP/IP network address. A URI that is simply a name is called a Uniform Resource Name (URN). It doesn't point anywhere. For example, I could create a URI to represent myself, such as urn:AnneThomasManes. This URI doesn't resolve to a TCP/IP address. It is simply a name. A URI that does resolve to a physical network address is called a Uniform Resource Locator (URL). At runtime the Internet can use DNS to map a URL to a physical network address. When one resource refers to another via its URL, it forms a link. These links create the interconnections that constitute the Web.

Users interact with a Web resource using some type of Web application protocol. The Web can support many types of applications, such as Web browsing, e-mail, file transfer, terminal emulation, and network management. There are a number of Web application protocols, such as HTTP, Simple Mail Transfer Protocol (SMTP), and File Transfer Protocol (FTP), that support these different types of applications. Each application protocol supports different semantics and behaviors. For example, HTTP supports a request/response behavior, and SMTP supports a one-way messaging behavior. Table 3-1 shows the application protocols used for some popular Internet applications.

Table 3-1. Internet Application Protocols

Internet Application    Application Protocol
--------------------    ------------------------------------
Web browsing            Hypertext Transfer Protocol (HTTP)
E-mail                  Simple Mail Transfer Protocol (SMTP)
File transfer           File Transfer Protocol (FTP)

The first part of a URL, called a URI scheme, often indicates the application protocol used to access the resource. Table 3-2 shows three URLs, each supporting a different protocol.

Table 3-2. Three URI Schemes

URI Scheme    URL                                   Resource
----------    ----------------------------------    -----------------
http          http://www.bowlight.net/index.html    A Web page
mailto        mailto:[email protected]              An e-mail mailbox
ftp           ftp://ftp.bowlight.net/atmbio.pdf     A file
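
As a small illustration of how the scheme is read off the front of a URI, the following Python sketch applies the standard library's urllib.parse.urlsplit to two of the URLs from Table 3-2 and to the URN example given earlier. The helper name describe is introduced here purely for illustration.

```python
from urllib.parse import urlsplit


def describe(uri):
    """Return (scheme, host, path); host is '' when the URI is only a name."""
    parts = urlsplit(uri)
    return parts.scheme, parts.netloc, parts.path


for uri in (
    "http://www.bowlight.net/index.html",
    "ftp://ftp.bowlight.net/atmbio.pdf",
    "urn:AnneThomasManes",
):
    scheme, host, path = describe(uri)
    print(f"{uri}: scheme={scheme!r} host={host!r} path={path!r}")
```

The two URLs carry a host name that DNS can resolve, whereas the URN has a scheme and a name but no host.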

The Web Versus Other Networks

Web services use the Web as a communications infrastructure. The Web supports universal connectivity. A Web service is a Web resource, so it is identified, referenced, and accessed by its URL. Web services rely on the Internet to map the URL to a physical network address at runtime.

Traditional middleware technologies were designed before the introduction of the Web, so they don't rely on the Internet for communications. In fact, these technologies were designed to work over a variety of network technologies, such as TCP/IP, DECnet, and System Network Architecture (SNA). At the time these technologies were developed, network independence was considered a feature. More important, these technologies were designed to support communications between applications within a single corporation. These technologies are not particularly suited for intercorporate communications. Most companies use a firewall to restrict external access to corporate systems. This means that unless instructed otherwise, a firewall blocks traditional middleware traffic.

Unlike traditional middleware systems, Web services generally communicate using standard Web protocols, such as HTTP and SMTP. These protocols are designed to work over the Internet, and they work very well with standard firewalls. Moreover, these protocols are ubiquitous. Most systems include support for these protocols, and you don't need to pay license fees to use them.

XML

A Web service can be developed using any programming language and can be deployed on any platform. In addition, a Web service can be accessed by an application written in any programming language running on any platform. Although the Web supports universal connectivity, the Web by itself doesn't resolve the issue of heterogeneous communication. Different programming languages use different formats to represent data. Web services support heterogeneous communication because they all use the same data format: XML.

XML is the secret sauce that gives Web services their power and flexibility. When you exchange information between two heterogeneous systems, you need to convey this information using a data format that both applications can understand. Web services communicate by passing XML messages, and XML is the Web's universal language. Every programming language and every platform can understand XML.

XML is a language for electronic documents. An XML document is a structured way to represent information. An XML document is a hierarchy of XML elements. An element represents a distinct piece of information in the document. For example, an element can represent someone's name. Furthermore, each element can be structured into subelements. For example, a name element can have three subelements that represent first name, middle initial, and last name. As mentioned in Chapter 2, applications must work with unambiguous structured data, such as XML.

XML is a markup language. You structure your document using tags, which provide a label and a container for each element. A tag is a word or set of characters encased in pointy brackets—for example, <name>. An element has a start tag and an end tag; the end tag begins with a forward slash. The element content sits between the start and end tags. For example, a name element would look like this: <name>Anne</name>.
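
To make the idea of elements and subelements concrete, here is a minimal Python sketch that parses a name element using the standard library's xml.etree.ElementTree module. The subelement names (first, middle, last) and their values are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative document: the subelement names and values are assumptions.
doc = "<name><first>Anne</first><middle>T</middle><last>Manes</last></name>"

root = ET.fromstring(doc)

# Each child element is a labeled field: the tag is the label, the text is the content.
fields = {child.tag: child.text for child in root}
print(root.tag, fields)
```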

You can think of an XML document as an electronic form containing a set of labeled fields. Each element represents a field. The tags provide the label for each field. Figure 3-1 compares a printed form and the XML that represents the form electronically.

Figure 3-1. In an XML document, each element sits within a set of tags. The elements are structured in a hierarchy.

Some markup languages, such as HTML, use the markup tags to specify formatting information. These tags indicate that certain pieces of text should be formatted with a particular font, at a particular size, in a particular color. They indicate that text should be formatted into paragraphs, bulleted lists, tables, and so on. A browser uses these tags to determine how to display the HTML page. The set of tags defined in the HTML standard is called the HTML vocabulary. A vocabulary defines the set of tags that can be used in a specific type of electronic document, such as an HTML page.

Although HTML is an excellent vocabulary for Web browsers, it's not particularly useful for application-to-application communication. An application doesn't care about fonts and colors. Instead it needs unambiguous information. For example, it needs to know that one particular bit of information represents a name, and another bit of information represents an e-mail address.

Unlike HTML, XML doesn't define formatting information. Instead it uses tags to define each bit of information in the document. An application can navigate its way through the tags and find the information it needs. Another important feature of XML is that it doesn't define a limited vocabulary. The “X” in XML stands for “extensible.” XML is what's known as a meta-markup language: a language that you can use to define your own vocabulary. You can create your own tags to describe the contents of your specific documents. For example, you can create a tag called <Contactinfo> and use it to represent contact information in an electronic document. HTML doesn't give you the ability to create your own tags. You can use only the tags defined in the HTML vocabulary.

XML Schema

You can define the structure, semantics, and constraints of a particular type of XML document using the W3C standard known as XML Schema. For example, you might create a schema to define the structure of a purchase order document. This schema would specify which elements are required or permitted in a purchase order document, the order and hierarchy of the elements, and any constraints on the number of occurrences of each element.

A schema makes it easier for an application to process the contents of a document. First of all, the application needs to know what structure to expect. In many cases it also needs to know what kind of data it's working with. For example, it needs to know whether a particular element contains simple text, an integer, a decimal number, or a value representing a date. XML Schema allows you to specify each element's datatype.

Applications can be picky about data structures. If an application receives a document that doesn't match the expected structure, it may cause an error. You can avoid these errors by validating the document against its schema at runtime. The validation process compares the document structure with the rules defined by the schema and identifies any discrepancies. If an application receives a document that it determines to be invalid, it can reject the document, or it can try to modify the document to make it valid. In this way, it may be possible to fix an error in transit and continue execution.
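
The validation step can be sketched in a few lines of Python. Note that this is not XML Schema: the Python standard library does not include a schema validator, so the sketch below substitutes a small hand-written rule table that checks the same kinds of things a schema would (required elements, permitted elements, and datatypes). The purchase order element names and the rules are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical rules: element name -> (datatype parser, required?)
RULES = {
    "orderDate": (date.fromisoformat, True),
    "sku": (str, True),
    "quantity": (int, True),
    "comment": (str, False),
}


def validate(xml_text):
    """Return a list of discrepancies; an empty list means the document passed."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "purchaseOrder":
        problems.append(f"unexpected root element <{root.tag}>")
    seen = {child.tag: child for child in root}
    for tag in seen:
        if tag not in RULES:
            problems.append(f"unexpected element <{tag}>")
    for tag, (parse, required) in RULES.items():
        if tag not in seen:
            if required:
                problems.append(f"missing required element <{tag}>")
            continue
        try:
            parse(seen[tag].text or "")
        except ValueError:
            problems.append(f"<{tag}> has the wrong datatype")
    return problems


good = ("<purchaseOrder><orderDate>2003-06-01</orderDate>"
        "<sku>A-100</sku><quantity>3</quantity></purchaseOrder>")
bad = "<purchaseOrder><sku>A-100</sku><quantity>three</quantity></purchaseOrder>"

print(validate(good))
print(validate(bad))
```

An application could use the returned list to decide whether to reject the document or attempt a repair.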

XSLT

One of the reasons XML is such a flexible data representation format is that you can transform documents. Extensible Stylesheet Language Transformations (XSLT) is a language for transforming an XML document into a different structure. You use XSLT to create a style sheet. A style sheet provides instructions on how to modify or restructure a document. For example, you can change the name of element tags, reorder the sequence or hierarchy of the elements, and add or remove elements.

Transformability makes XML flexible and adaptable. You can use XML and XSLT to build automatic application adapters. For example, if you receive a purchase order document that doesn't quite match the structure required by your application, you can use a style sheet to transform it into the structure you need. You can use a style sheet to merge information from multiple documents into a single document or vice versa. You can also use a style sheet to transform a proprietary format used by one of your internal applications into a standard format used by the rest of your industry.
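
The kind of restructuring a style sheet performs can be approximated in Python. The sketch below is a stand-in, not XSLT: the standard library has no XSLT processor, so the renaming and reordering rules are expressed as ordinary Python data. All tag names and the mapping itself are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from a partner's tag names to the names our application expects.
RENAME = {"PO": "purchaseOrder", "item": "sku", "qty": "quantity"}
# Hypothetical element order required by the receiving application.
ORDER = ["sku", "quantity"]


def restructure(xml_text):
    """Rename tags, then rebuild the root with its children in the expected order.

    Children whose renamed tag is not listed in ORDER are dropped.
    """
    source = ET.fromstring(xml_text)
    target = ET.Element(RENAME.get(source.tag, source.tag))
    children = {RENAME.get(c.tag, c.tag): c.text for c in source}
    for tag in ORDER:
        if tag in children:
            ET.SubElement(target, tag).text = children[tag]
    return ET.tostring(target, encoding="unicode")


print(restructure("<PO><qty>3</qty><item>A-100</item></PO>"))
```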

You can also use a style sheet to transform an XML document into a different type of document, such as an HTML page, a Wireless Markup Language (WML) card deck, or a VoiceXML file. WML is an XML-based markup language that can be displayed in a Wireless Application Protocol (WAP) browser on a mobile handset. VoiceXML is an XML-based markup language that represents spoken text. You can build an automated interactive voice response (IVR) interface using VoiceXML.

XML's transformability is one of the reasons Web services work well for building portals. As shown in Figure 3-2, backend Web services deliver content to the portal in XML format. The portal presentation logic, called a portlet, can transform the content as needed to work with the portal user interface, which might be a frame within a browser, a portable handset, or a standard telephone (using voice generation and voice recognition software).

Figure 3-2. A multichannel portal application can take XML content from a Web service and use XSLT to transform it into the appropriate format for the device being served.

XML Versus Other Data Representations

Web services communicate by exchanging XML messages. In contrast, traditional middleware technologies encode their data using a binary format, and each middleware system uses a different one. ONC uses the External Data Representation (XDR). CORBA uses the Common Data Representation (CDR). DCOM uses the Network Data Representation (NDR). Java RMI sends data as serialized Java objects. These different data representations mean that these middleware systems can't talk to each other. Web services are different. Web services use XML. Any programming language on any platform can interpret XML, and this gives Web services complete heterogeneous support.
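
A short Python sketch can illustrate why a binary representation ties both sides to the same layout. It encodes one integer as a 4-byte big-endian value (the layout XDR uses for an integer) and as an XML element. The element name quantity is an assumption for illustration.

```python
import struct
import xml.etree.ElementTree as ET

quantity = 42

# Binary encoding: a 4-byte big-endian signed integer.
wire = struct.pack(">i", quantity)

# A receiver that assumes the same layout recovers the value...
same_layout = struct.unpack(">i", wire)[0]
# ...but one that assumes little-endian byte order silently reads a different number.
other_layout = struct.unpack("<i", wire)[0]

# XML encoding: the value travels as labeled text that any XML parser can read.
xml_wire = f"<quantity>{quantity}</quantity>"
from_xml = int(ET.fromstring(xml_wire).text)

print(wire, same_layout, other_layout, from_xml)
```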

XML Schema gives you the ability to define standard XML document templates. For example, you can define the format of your corporate purchase order document using XML Schema. Or you can adopt a standard purchase order format defined by an industry group such as RosettaNet. Any XML document can be validated to determine whether it conforms to a particular schema. Traditional middleware systems don't have the capacity to define and validate document structures.

XSLT makes it easy to transform messages during processing. Validation and transformation during processing give Web services tremendous flexibility. In traditional middleware technologies, if a message doesn't exactly match the expected format, the communication process fails. With Web services, on the other hand, the client or service can validate a message, and, if it doesn't match the expected format, the application can transform the message into an acceptable format.

This feature is particularly helpful if you must deal with multiple versions of a service. Suppose that you want to upgrade a deployed service, and you have a number of clients already using the service. Your new upgrade adds support for a “frequent customer” program. To take advantage of the program, the client must add a frequent customer number to the purchase order. This means that you need to add a new element to the schema for your purchase order, and you need to change your API. In a traditional RPC-based middleware environment, this kind of change would break all your existing clients. But not so in a Web services environment. You can arrange to let your upgraded Web service support clients using both the old API and the new API. When you receive a request based on the old API, you can validate it and automatically transform it so that it matches the new API.
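
The versioning scenario might look something like the following Python sketch, which accepts purchase orders in either the old or the new structure and normalizes them to the new one. In practice the transformation could be written as an XSLT style sheet; plain Python is used here only to keep the example self-contained. The element names, including frequentCustomerNumber, are hypothetical.

```python
import xml.etree.ElementTree as ET


def upgrade(xml_text):
    """Accept an old- or new-style purchase order and return it in the new structure."""
    order = ET.fromstring(xml_text)
    if order.find("frequentCustomerNumber") is None:
        # Old-style request: add the new element, left empty for non-members.
        ET.SubElement(order, "frequentCustomerNumber")
    return order


old_request = "<purchaseOrder><sku>A-100</sku><quantity>3</quantity></purchaseOrder>"
new_request = ("<purchaseOrder><sku>A-100</sku><quantity>3</quantity>"
               "<frequentCustomerNumber>FC-7</frequentCustomerNumber></purchaseOrder>")

for request in (old_request, new_request):
    order = upgrade(request)
    print([child.tag for child in order], order.findtext("frequentCustomerNumber"))
```

After the upgrade step, the service logic only ever sees the new structure, so existing clients keep working unchanged.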

XML resolves the Traditional Middleware Blues associated with tightly coupled connections. Traditional RPC-based middleware, such as RPC, RMI, CORBA, and DCOM, relies on tightly coupled connections, resulting in inflexible and brittle applications. Any change you make to the service may cause all connections to break. Message-oriented middleware (MOM) doesn't suffer from this problem. It uses loosely coupled connections, meaning that there is a greater layer of abstraction between the application and the message format. But MOM doesn't do as much work for the application as RPC-based middleware does, so MOM is harder to use.

Web services support the ease of use of RPC-based middleware, and they support loosely coupled connections as MOM does. Web services don't require an exact match between client and service. The system can transform messages in transit to keep the applications up and running. XML makes Web services flexible and adaptable.

SOA

The service-oriented architecture (SOA) describes a set of well-established patterns that help a client application connect to a service. These patterns represent mechanisms used to describe a service, to advertise and discover a service, and to communicate with a service. RPC-based middleware systems, such as RPC, CORBA, DCOM, and RMI, rely on these SOA patterns. Web services also rely on these patterns. These patterns are very familiar to most programmers.

Figure 3-3 depicts the conceptual roles, artifacts, and operations of the SOA. The three basic roles in the SOA are the service provider, the service consumer, and the service broker. The service provider supplies the service, the service consumer uses the service, and the service broker facilitates the advertising and discovery process.

Figure 3-3. SOA describes patterns used to describe a service (via a contract published by the service provider), to advertise and discover a service (via a service broker), and to communicate with a service (via protocols defined in the service contract).

The three basic artifacts in the SOA are the client, the service, and the service contract. The client is the code that the service consumer uses to access the service; the service is the code that supplies the service; and the service contract describes the API that the client uses to access the service.

The three basic operations in the SOA are register, find, and bind. When a service provider makes a service available, it describes the service by publishing a service contract. The service provider then registers the service with a service broker. A service consumer queries the service broker to find a compatible service. The service broker gives the service consumer directions on how to find the service and its service contract. The service consumer uses the contract to bind the client to the service, at which point the client and service can communicate.
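
The register, find, and bind operations can be sketched with an in-memory broker. This is a conceptual model only: the class names, the local:// location string, and the SERVICES table standing in for the network are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Contract:
    """Describes a service: its type and where to reach it."""
    service_type: str
    location: str


@dataclass
class Broker:
    """Keeps the contracts that providers register and answers consumer queries."""
    contracts: List[Contract] = field(default_factory=list)

    def register(self, contract: Contract) -> None:
        self.contracts.append(contract)

    def find(self, service_type: str) -> List[Contract]:
        return [c for c in self.contracts if c.service_type == service_type]


# Stand-in for the network: location -> callable service.
SERVICES: Dict[str, Callable[[str], str]] = {
    "local://supplier-a": lambda order: f"confirmed by supplier A: {order}",
}


def bind(contract: Contract) -> Callable[[str], str]:
    """Turn a contract into something the client can call."""
    return SERVICES[contract.location]


broker = Broker()
broker.register(Contract("procurement", "local://supplier-a"))  # provider side
matches = broker.find("procurement")                            # consumer side
client = bind(matches[0])
print(client("PO-1"))
```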

Web services rely on these SOA patterns. Although Web services can be developed using any XML-based language and communications protocol, the industry is converging on a core set of technologies to enable language and platform independence and to ensure multivendor interoperability. The standard technologies for implementing the SOA patterns with Web services are Web Services Description Language (WSDL), Universal Description, Discovery & Integration (UDDI), and Simple Object Access Protocol (SOAP).

WSDL, UDDI, and SOAP

WSDL, UDDI, and SOAP are the three core technologies most often used to implement Web services. WSDL provides a mechanism to describe a Web service. UDDI provides a mechanism to advertise and discover a Web service. And SOAP provides a mechanism for clients and services to communicate. Figure 3-4 shows these technologies mapped to the SOA.

Figure 3-4. In the Web services architecture, you use WSDL to describe a service, UDDI to advertise and discover a service, and SOAP to communicate with the service.

The primary advantage of using Web services rather than traditional SOA-compliant middleware is that WSDL, UDDI, and SOAP are much more flexible than other systems. They support any language on any platform, and they don't require you to install specialized, homogeneous software on every client and server machine. Web services are simpler, less expensive, and more pervasive than traditional SOA-based middleware.

Description (WSDL)

WSDL is an XML language that describes a Web service. You use WSDL to create a Web service contract. A WSDL document describes what functionality a Web service offers, how it communicates, and where to find it. You can separate the various parts of a Web service description into multiple documents to provide more flexibility and to increase reusability.

The what part of a WSDL document describes the abstract interface of a Web service. This interface description specifies which operations the service supports, and it defines the format of the messages that must be exchanged to perform the operation.

This interface description essentially defines an abstract Web service type. A type is an abstract representation of a thing. The concept is similar to that of a car model. I drive a Volkswagen Cabriolet. The car model, Volkswagen Cabriolet, is an abstract type of car. The car in my garage is one particular implementation of that type of car. There are many other Volkswagen Cabriolet implementations in the world. A Web service type is called a portType. As with car models, you can have multiple implementations of a particular portType. All these implementations support the same abstract interface and do essentially the same thing, but they are supplied by different service providers.

The how part of a WSDL document maps an abstract interface to a concrete set of protocols. This mapping is called a binding. The binding specifies the technical details of how to communicate with a service. It indicates how the input and output messages defined in the abstract interface should be packaged into a message. It indicates how the message should be structured and how the data should be encoded; it indicates which schema defines the message format; and it indicates which Web application protocol should be used to transfer the message.

The where part of a WSDL document describes a specific Web service implementation. A Web service implementation can support one or more portTypes, each with one or more bindings. Each portType binding is called a port. Each port specifies an endpoint, which is the URL used to access the service. A business might offer multiple endpoints to a particular service, each implementing a different binding to support multiple protocols.
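
The three parts can be seen in a heavily trimmed WSDL 1.1 document. The Python sketch below parses such a document and pulls out the what (portType), how (binding), and where (port) parts. The service, its names, and the example.com URLs are hypothetical, and the message definitions a complete WSDL file needs are omitted for brevity.

```python
import xml.etree.ElementTree as ET

NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}

# Hypothetical, heavily trimmed WSDL 1.1 document (message definitions omitted).
WSDL = """
<definitions name="Procurement"
    targetNamespace="http://example.com/procurement"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:tns="http://example.com/procurement">
  <portType name="ProcurementPortType">
    <operation name="submitOrder">
      <input message="tns:PurchaseOrder"/>
      <output message="tns:Confirmation"/>
    </operation>
  </portType>
  <binding name="ProcurementSoapBinding" type="tns:ProcurementPortType">
    <soap:binding style="document" transport="http://schemas.xmlsoap.org/soap/http"/>
  </binding>
  <service name="ProcurementService">
    <port name="ProcurementPort" binding="tns:ProcurementSoapBinding">
      <soap:address location="http://example.com/procurement/soap"/>
    </port>
  </service>
</definitions>
"""

root = ET.fromstring(WSDL)

# What: the abstract interface (portType) and its operations.
what = {
    pt.get("name"): [op.get("name") for op in pt.findall("wsdl:operation", NS)]
    for pt in root.findall("wsdl:portType", NS)
}
# How: each binding, the portType it maps, and the transport it uses.
how = {
    b.get("name"): (b.get("type"), b.find("soap:binding", NS).get("transport"))
    for b in root.findall("wsdl:binding", NS)
}
# Where: each port, the binding it implements, and its endpoint URL.
where = {
    p.get("name"): (p.get("binding"), p.find("soap:address", NS).get("location"))
    for p in root.findall("wsdl:service/wsdl:port", NS)
}
print(what, how, where, sep="\n")
```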

A WSDL document containing all three parts describes everything that you need to call a specific Web service implementation. This file can be compiled into application code, which a client application can use to invoke the Web service, as shown in Figure 3-5. This generated application code is called a client proxy. The proxy represents the Web service to the client application. The client application calls the client proxy, and the proxy constructs the messages and manages the communication on behalf of the client application.

Figure 3-5. A WSDL document can be compiled into a client proxy, which automatically manages the SOAP message exchange at runtime.
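
To give a feel for what a client proxy does, here is a hand-written Python sketch of the sort of code a WSDL compiler might generate. It is not the output of any particular tool. The class name, method name, message elements, and endpoint URL are hypothetical, and the transport is a fake function so that the example runs without a network.

```python
import xml.etree.ElementTree as ET
from typing import Callable

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


class ProcurementProxy:
    """Hand-written stand-in for generated proxy code (all names are hypothetical)."""

    def __init__(self, endpoint: str, send: Callable[[str, str], str]):
        self.endpoint = endpoint
        self.send = send  # transport function: (url, request_xml) -> response_xml

    def submit_order(self, sku: str, quantity: int) -> str:
        # Build the SOAP envelope so the client application never touches XML.
        envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
        body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
        order = ET.SubElement(body, "purchaseOrder")
        ET.SubElement(order, "sku").text = sku
        ET.SubElement(order, "quantity").text = str(quantity)
        response = self.send(self.endpoint, ET.tostring(envelope, encoding="unicode"))
        # Unwrap the reply and hand back a plain value.
        return ET.fromstring(response).findtext(f"{{{SOAP_ENV}}}Body/confirmation")


def fake_send(url: str, request_xml: str) -> str:
    """Pretend service: echoes the SKU back inside a confirmation message."""
    sku = ET.fromstring(request_xml).findtext(f"{{{SOAP_ENV}}}Body/purchaseOrder/sku")
    return (f'<e:Envelope xmlns:e="{SOAP_ENV}"><e:Body>'
            f"<confirmation>accepted {sku}</confirmation></e:Body></e:Envelope>")


proxy = ProcurementProxy("http://example.com/procurement/soap", fake_send)
print(proxy.submit_order("A-100", 3))
```

The client application sees only an ordinary method call; the message construction and parsing stay inside the proxy.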

A WSDL document can also be interpreted at runtime, permitting much more dynamic integration. At development time a developer can compile only the WSDL what part to create an abstract programming interface, which the client application can use to invoke any Web service that implements the portType. At runtime the application retrieves an implementation WSDL file and generates a dynamic proxy to bind to the specific Web service implementation. This process is known as dynamic binding.

For example, as shown in Figure 3-6, suppose you have a procurement application that can place orders with any of your contracted suppliers. Let's assume that all your suppliers support the same procurement abstract interface (as defined by a portType), which accepts your purchase order as input and returns a confirmation. Each of your suppliers uses a slightly different protocol binding, and each Web service is accessed through a different URL. At runtime, your procurement application selects a specific supplier and dynamically generates the proxy code needed to talk to that specific Web service based on the supplier's WSDL file.

Figure 3-6. A single client application can interpret WSDL at runtime to dynamically bind to multiple implementations of the same service type.
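
Dynamic binding can be sketched along the following lines. In this Python example, small dictionaries stand in for the implementation details that would be read from each supplier's WSDL file at runtime, and the sender functions merely describe the exchange instead of performing it. Supplier names, protocols, and addresses are hypothetical.

```python
from typing import Callable, Dict

# Stand-ins for each supplier's implementation description (the how and where parts).
SUPPLIER_DESCRIPTIONS: Dict[str, Dict[str, str]] = {
    "supplier-a": {"protocol": "http", "endpoint": "http://a.example.com/orders"},
    "supplier-b": {"protocol": "smtp", "endpoint": "mailto:orders@b.example.com"},
}

# One sender per protocol binding; real code would perform the network exchange.
SENDERS: Dict[str, Callable[[str, str], str]] = {
    "http": lambda endpoint, po: f"HTTP POST to {endpoint}: {po}",
    "smtp": lambda endpoint, po: f"SMTP message to {endpoint}: {po}",
}


def make_proxy(description: Dict[str, str]) -> Callable[[str], str]:
    """Build a proxy at runtime from a description rather than at compile time."""
    send = SENDERS[description["protocol"]]
    endpoint = description["endpoint"]
    return lambda purchase_order: send(endpoint, purchase_order)


def place_order(supplier: str, purchase_order: str) -> str:
    # The client is written once against the abstract interface:
    # submit a purchase order, receive a confirmation.
    proxy = make_proxy(SUPPLIER_DESCRIPTIONS[supplier])
    return proxy(purchase_order)


print(place_order("supplier-a", "PO-1"))
print(place_order("supplier-b", "PO-1"))
```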

WSDL Versus Other Description Languages

Most traditional SOA systems use an interface definition language (IDL) to describe the service contract. IDL describes the signature of a service. This signature describes the procedures or methods that the service offers, and it specifies the order of the input and output parameters required for each method. Different IDLs are used for ONC, DCE, CORBA, and DCOM. Java RMI is slightly different. Rather than use a separate IDL, Java RMI simply uses Java to describe the service signature. To some degree IDL is comparable to the WSDL what part. The major difference between IDL and a WSDL portType, though, is that portTypes are abstract, and IDL is specific and concrete.

An IDL-defined signature represents a very tightly coupled connection. The order of the parameters in the input and output messages in the signature is critical. The input and output messages must exactly match the signature, or else the communication will fail.

Unlike WSDL, RMI and IDL systems do not specify information about how to communicate with the service. Traditional middleware systems support only one set of communication protocols. It's assumed that you will use the protocols defined by the middleware, so there's no need to specify a how part. In contrast, Web services let you use different protocols based on the specific requirements of your application or network.

Unlike WSDL, RMI and IDL systems do not specify information about where to find a service. Traditional middleware systems do not use a URL to identify, reference, and access a service. Instead traditional middleware systems rely on their advertising and discovery system to keep track of the location of the service. Each middleware system uses a different advertising and discovery service. DCOM systems use the NT Registry, CORBA systems use the CORBA Naming Service, DCE systems use the Cell Directory Service, and RMI systems use either the Java Naming and Directory Interface (JNDI) or the RMI registry.

In all these traditional middleware systems, the client must query the advertising and discovery service at runtime to obtain the location of the service before the client can bind to the service. The reason for this overhead is that the service does not have an abstract address such as a URL. A client must reference the service by its physical network address, and the physical network address may change each time a service is started. Because these systems don't use a URL to identify a service, it's somewhat difficult to make these systems work across the Internet.

Web services are designed to work across the Internet. Each Web service is identified, referenced, and accessed by its URL, and this address is resolved at runtime using native Internet services (DNS). A Web service client doesn't need to query the advertising and discovery service at runtime if it already knows the URL of the service, and the URL is specified in the WSDL where part. Web services usually communicate using Web application protocols, and these protocols work well with firewalls. Even so, Web services can communicate using any type of protocol. A service provider might make the same service available via multiple protocols to support different client requirements. The WSDL how part describes the protocols that can be used to access the service.

From my perspective, the most important difference between WSDL and traditional middleware description languages is that WSDL allows you to specify the interface definition separately from the implementation definition in the service contract. Although all SOA-based middleware systems support the separation of interface from implementation, none of the traditional middleware systems lets you define that separation in the service contract. By defining the what part separately from the how and where parts, WSDL allows an application to dynamically bind to a different service each time you use it. No other middleware system supports this level of dynamic interaction.

Advertising and Discovery (UDDI)

UDDI provides a mechanism to advertise and discover Web services. UDDI is a registry for Web services, and UDDI is itself a Web service. The primary difference between a UDDI registry and other registries and directories is that UDDI provides a mechanism to categorize businesses and services using any number of taxonomies. These taxonomies help service consumers find a particular service that matches their requirements. You can use UDDI on a very large scale, such as the entire Internet, or on a smaller scale, such as within a private business community or within your enterprise. Chapter 4 talks more about these options.

A UDDI registry manages information about service types and service providers. The service type registrations represent abstract services and industry standards. The service provider registrations specify information about a business and the specific services it provides.

A service type represents an abstract service that can be offered by one or more service providers. It is comparable to a WSDL portType. Programmers, businesses, industry groups, and standards bodies can define service types, and each service type can be categorized using any number of descriptive taxonomies. Consumers can search the registry using these taxonomies to find service types that match their requirements, and they can search for service providers that support these service types.

A service type is defined by a UDDI registry entry called a tModel, which stands for “technical model.” A tModel represents an abstract and reusable resource, and it provides a pointer to the specification (for example, a WSDL file) that describes the resource. You should use a tModel to register the abstract interface (the WSDL what part) of a Web service and use additional tModels to register each of the bindings (the WSDL how part) available for the Web service. You should not use a tModel to represent a specific implementation of a service. A specific service implementation is associated with the service provider that provides it.

A service provider registers its business and all the services that it offers. Conceptually, the service provider registration can be thought of as containing white pages, yellow pages, and green pages information.

The white pages information includes basic identity information for the business, such as the name of the business, the business address, contact information, and business identifiers such as a D-U-N-S number or Thomas Register supplier identifier. These identifiers provide some assurance to the service consumer that the provider is an established business.

The yellow pages information includes categorization information for the business and its services. UDDI allows you to categorize a business or service using any number of taxonomies. UDDI provides built-in support for a number of standard international taxonomies, such as the United Nations Standard Products and Services Code (UNSPSC), the North American Industry Classification System (NAICS), and the ISO 3166 country codes standard. UDDI also allows you to define your own taxonomies to support more focused categorization. Service consumers can search for businesses or services using any combination of built-in and custom taxonomies.

The green pages information provides access to the technical specifications that describe a service implementation. The technical specification information is maintained in a UDDI registry entry called a binding template. A binding template represents an endpoint of a service implementation. The binding template references its technical specifications through a set of tModels. It points to the tModels that represent the WSDL portType and the WSDL binding that the service implements. The binding template also specifies the access point for the service implementation. A service consumer can use the binding template to retrieve all the information needed to access the Web service.

This registration model might seem complicated, but it offers tremendous power and flexibility in terms of service advertisement and discovery. In particular, it supports the concept of industry standard service types. A number of vertical industry groups, such as ACORD, OMA, OTA, and RosettaNet, are defining XML-based standards for electronic business for their particular industries. These standards can be registered as tModels in UDDI. Businesses can indicate that they support these industry standards by referencing these tModels. Potential consumers can quickly and easily find businesses that support the standards by searching for services that implement the tModels.

Figure 3-7 provides an overview of the information in a UDDI registry, and it shows the relationship between the UDDI entries and the types of WSDL descriptions. In this diagram, an industry standards group has defined a standard for an Order service. This standard is registered in the UDDI registry as a tModel. The tModel points to the WSDL portType definition (the what part) that defines the industry standard. Companies A and B provide Order services that implement the industry standard. Each company registers its business (represented by a business entity) and the service it offers (represented by a business service). Each company also registers technical specification information for its particular implementation of the Order service (represented by a binding tModel, which points to the WSDL how part). The company then uses a binding template to associate the business service with the industry standard tModel and the specific binding tModel. The binding template also points to the service implementation access point.

Figure 3-7. An overview of the information in a UDDI registry.
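The relationships in Figure 3-7 can be sketched as a small in-memory data model. This is only an illustration, not the UDDI API: the class names loosely follow the UDDI entry names, and the keys, URLs, and the find_access_points helper are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TModel:
    """An abstract, reusable resource that points to its specification."""
    key: str
    name: str
    overview_url: str  # where the specification (for example, a WSDL file) lives


@dataclass
class BindingTemplate:
    """One endpoint of a service implementation."""
    access_point: str
    tmodel_keys: List[str]  # the portType tModel plus the binding tModel


@dataclass
class BusinessService:
    name: str
    binding_templates: List[BindingTemplate] = field(default_factory=list)


@dataclass
class BusinessEntity:
    name: str
    services: List[BusinessService] = field(default_factory=list)


def find_access_points(businesses: List[BusinessEntity], tmodel_key: str) -> List[Tuple[str, str]]:
    """Return (business, access point) pairs whose binding template references tmodel_key."""
    return [
        (business.name, template.access_point)
        for business in businesses
        for service in business.services
        for template in service.binding_templates
        if tmodel_key in template.tmodel_keys
    ]


# Hypothetical registry contents mirroring the Order example.
ORDER_STANDARD = TModel("uuid:order-porttype", "Order service standard",
                        "http://standards.example.org/order.wsdl")
BINDING_A = TModel("uuid:a-binding", "Company A Order binding",
                   "http://a.example.com/order-binding.wsdl")
BINDING_B = TModel("uuid:b-binding", "Company B Order binding",
                   "http://b.example.com/order-binding.wsdl")

registry = [
    BusinessEntity("Company A", [BusinessService("Order", [
        BindingTemplate("http://a.example.com/order", [ORDER_STANDARD.key, BINDING_A.key])])]),
    BusinessEntity("Company B", [BusinessService("Order", [
        BindingTemplate("http://b.example.com/order", [ORDER_STANDARD.key, BINDING_B.key])])]),
]

providers = find_access_points(registry, ORDER_STANDARD.key)
```

Searching by the industry standard tModel finds both companies, while searching by a company's own binding tModel finds only that company's endpoint.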

When looking for a Web service, a service consumer queries the UDDI registry searching for a business that offers the type of service desired. Users can search the registry for services by service type or by service provider, and queries can be qualified using the taxonomies.

For the most part, a UDDI registry is used at development time to locate available services that can be used to implement a solution. A developer queries the registry using a browser interface, using human intellect to qualify and select an appropriate service from all the potential services registered in the UDDI registry. The developer uses the UDDI registry to find a specific service implementation and to retrieve the service access point and its WSDL description. The developer then uses the WSDL file to generate the appropriate communication code needed for the client application.

A UDDI registry can also be used at runtime. A UDDI registry is itself a Web service, and applications can access the UDDI registry using SOAP messages. A developer can program the application to query a UDDI registry to look for a service that implements a particular service type, locate the implementation WSDL file, and generate a dynamic proxy to perform the communication.
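As a rough sketch of what such a runtime query might look like, the following builds a SOAP message that asks a registry for services referencing a given tModel. The element names and namespace follow my reading of the UDDI Version 2 inquiry API and should be checked against the specification; the tModel key is made up, and nothing is sent over the network.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
UDDI_NS = "urn:uddi-org:api_v2"  # assumed UDDI v2 inquiry namespace


def build_find_service(tmodel_key: str) -> str:
    """Build a find_service inquiry that searches by service type (tModel)."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    find = ET.SubElement(body, f"{{{UDDI_NS}}}find_service", {"generic": "2.0"})
    bag = ET.SubElement(find, f"{{{UDDI_NS}}}tModelBag")
    ET.SubElement(bag, f"{{{UDDI_NS}}}tModelKey").text = tmodel_key
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical key for the service type the application needs.
inquiry = build_find_service("uuid:order-porttype")
```

An application would post this message to the registry's inquiry URL and then read the access point and WSDL location from the entries the registry returns.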

UDDI Versus Other Discovery Systems

Traditional SOA systems don't use URLs to reference services, and they don't use DNS to resolve the network address. Instead they use their advertising and discovery systems as a naming service to resolve network addresses at runtime. These systems don't rely on the Internet to enable communication.

Web services rely on the native features of the Internet (URLs, DNS, and TCP/IP) to enable communication. You don't need to use UDDI at runtime to obtain the physical address of a service. UDDI is an optional service used to help match service consumers with service providers. It is used primarily to categorize services. It is also used to publish corporate or industry standard service types and schemas and to indicate that services conform to those standards.

Traditional SOA systems don't use their advertising and discovery services to categorize services. Instead these systems assume that the client knows which service it wants to use. Only CORBA provides something somewhat comparable to UDDI. CORBA defines an optional service called a Trader, which helps match clients to services. The Trader service allows you to define arbitrary properties about a service. These properties could be used to implement taxonomies, but the Trader doesn't allow you to search based on these taxonomies. You must search based on a service signature. You can use the properties only to qualify the search.

From my perspective, the most important difference between UDDI and traditional middleware discovery mechanisms is UDDI's ability to characterize services by their service types. Traditional SOA discovery systems don't support the concept of service types. Only UDDI provides a mechanism to publish standard service types, to indicate that a service conforms to the standard, and to search for services based on those service types. This feature perfectly complements the dynamic binding features of WSDL to support dynamic interaction.

Communication (SOAP)

Simple Object Access Protocol (SOAP) is an XML protocol used to communicate with Web services. SOAP provides a simple, consistent, and extensible mechanism that allows one application to send an XML message to another application.

SOAP provides an envelope for an XML message. Just as you put a letter into an envelope to send it through the postal service, you put an XML message into an envelope to send it across the network. The SOAP envelope provides a container for the XML message.

You can think of a SOAP envelope as being similar to a shipping container. A shipping container can be carried by a variety of transport systems, such as boat, rail, and truck. A SOAP envelope can likewise be carried by a variety of transports, in this case communication protocols. The most common way to transfer SOAP messages is to use HTTP, although you can also transfer messages using other Web protocols, such as SMTP and FTP, or using non-Web protocols such as IBM WebSphere MQ or a JMS implementation. You can select the transport protocol based on the requirements of your specific application.
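For the common HTTP case, a SOAP 1.1 message travels as the body of an HTTP POST with a text/xml content type and a SOAPAction header. The sketch below only prepares such a request with Python's standard library and does not send it; the make_soap_request helper, the URL, and the action value are assumptions for illustration.

```python
import urllib.request


def make_soap_request(url: str, envelope_xml: str, soap_action: str = "") -> urllib.request.Request:
    """Package an already-serialized SOAP envelope as an HTTP POST request."""
    return urllib.request.Request(
        url,
        data=envelope_xml.encode("utf-8"),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": f'"{soap_action}"',  # SOAP 1.1 expects a quoted value
        },
        method="POST",
    )


# Hypothetical endpoint and action; pass the result to urllib.request.urlopen to send it.
request = make_soap_request(
    "http://a.example.com/order", "<placeholder/>", "urn:example:submitOrder"
)
```

Switching to another transport changes only this packaging step. The envelope itself stays the same.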

A SOAP message consists of two parts, as shown in Figure 3-8. The SOAP header contains directive information, and the SOAP body contains the message payload.

Figure 3-8. A SOAP message contains a header and a body and is packaged in a transport packet.

The SOAP header includes system-level information most often used to manage or secure the message. The SOAP header can include information such as security credentials, transaction context, message correlation information, session identifiers, or management information. If no system-level information is required, the SOAP header can be omitted.

The SOAP body contains the message payload—the information that is being sent to the target application. In the example shown in Figure 3-8, the payload is a purchase order.
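To make the envelope structure concrete, here is a minimal sketch that assembles a SOAP 1.1 envelope around a purchase order. The purchase order namespace, element names, and the build_envelope helper are invented for the example; only the SOAP envelope namespace comes from the SOAP 1.1 specification.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
PO_NS = "urn:example:purchase-order"  # hypothetical payload namespace


def build_envelope(payload: ET.Element, header_blocks=()) -> ET.Element:
    """Wrap a payload in an Envelope; the Header is added only when needed."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    if header_blocks:
        header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
        header.extend(header_blocks)
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return envelope


# The payload: a purchase order with one line item.
order = ET.Element(f"{{{PO_NS}}}PurchaseOrder")
ET.SubElement(order, f"{{{PO_NS}}}Item", {"sku": "LAMP-42", "quantity": "3"})

# A system-level header block, here a message correlation identifier.
correlation = ET.Element(f"{{{PO_NS}}}CorrelationId")
correlation.text = "msg-0001"

with_header = build_envelope(order, [correlation])
without_header = build_envelope(order)
```

Calling ET.tostring on either envelope yields the XML that would be handed to the transport.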

The WSDL description associated with the Web service defines the structure of the SOAP message. The contents of the message payload should conform to the input and output messages defined in the WSDL what part. The WSDL how part defines how the messages should be packaged in the SOAP envelope, and it defines the information that should be included in the SOAP header. The WSDL how part also specifies which transport protocol should be used to transfer the message.

A SOAP message can be transferred directly from the SOAP sender to the SOAP receiver, or it can be routed through any number of SOAP intermediaries. A SOAP intermediary can perform a variety of functions, such as auditing or logging a message, storing the message for reliability, checking security credentials, encrypting or decrypting the message, validating the payload, transforming the payload, or routing the message. A SOAP intermediary can operate transparently to the SOAP senders and receivers.
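One way to picture intermediaries is as a chain of functions, each of which receives the envelope and passes it along. The sketch below is a toy model under that assumption; the function names, the payload namespace, and the in-process "routing" are all hypothetical, and real intermediaries are separate network nodes.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
BODY = f"{{{SOAP_NS}}}Body"


def require_payload(envelope: ET.Element) -> ET.Element:
    """A validating intermediary: reject messages that carry no payload."""
    body = envelope.find(BODY)
    if body is None or len(body) == 0:
        raise ValueError("SOAP Body has no payload")
    return envelope


def audit(log: list):
    """Build an auditing intermediary that records the payload element name."""
    def intermediary(envelope: ET.Element) -> ET.Element:
        log.append(envelope.find(BODY)[0].tag)
        return envelope
    return intermediary


def route(envelope: ET.Element, intermediaries, receiver):
    """Pass the envelope through each intermediary, then to the receiver."""
    for step in intermediaries:
        envelope = step(envelope)
    return receiver(envelope)


# Hypothetical message and receiver.
message = ET.Element(f"{{{SOAP_NS}}}Envelope")
ET.SubElement(ET.SubElement(message, BODY), "{urn:example:orders}PurchaseOrder")

audit_log = []
result = route(message, [require_payload, audit(audit_log)], lambda env: "accepted")
```

Neither the sender nor the receiver needs to know which steps ran, which is the sense in which an intermediary can operate transparently.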

SOAP supports two ways to structure messages: document-style and RPC-style. Document-style SOAP messaging supports very loosely coupled communications between two applications. The SOAP sender sends a message, and the SOAP receiver determines what to do with it. The SOAP sender doesn't really need to know anything about the implementation of the service other than the format of the message and the access point URL. It is entirely up to the SOAP receiver to determine, based on the contents of the message, how to process it. The formats of the document input and output messages are defined by XML Schema definitions, which can be defined in the Web service's WSDL document or in a separate schema. Formatting a SOAP message according to a specified schema is called literal encoding.

A more tightly coupled communication scheme uses the SOAP RPC convention. An RPC-style input message simulates an RPC invocation. It specifies the name of the procedure to be invoked, and it contains a set of input parameters. An RPC-style output message simulates an RPC response. It contains the return value and any output parameters of the procedure. You can define the format of the RPC input and output messages literally, using XML Schema, or you can dynamically construct the messages using a data model called SOAP encoding. This SOAP encoding data model is designed to make it easy to map complex object-oriented data structures to XML.
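The difference between the two styles shows up in what the SOAP body contains. The following sketch builds one message of each kind; the application namespace, the PurchaseOrder document, and the getPrice procedure are hypothetical, and the RPC parameters are written as plain text rather than with SOAP encoding.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
APP_NS = "urn:example:orders"  # hypothetical application namespace


def wrap(payload: ET.Element) -> ET.Element:
    """Place a payload inside a SOAP Envelope and Body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Body").append(payload)
    return envelope


def document_message(order_id: str, sku: str, quantity: int) -> ET.Element:
    """Document style: the body carries a business document and nothing else."""
    order = ET.Element(f"{{{APP_NS}}}PurchaseOrder", {"id": order_id})
    ET.SubElement(order, f"{{{APP_NS}}}Item", {"sku": sku, "quantity": str(quantity)})
    return wrap(order)


def rpc_message(procedure: str, **params: object) -> ET.Element:
    """RPC style: the body names the procedure and holds one child per parameter."""
    call = ET.Element(f"{{{APP_NS}}}{procedure}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return wrap(call)


doc_style = document_message("po-1", "LAMP-42", 3)
rpc_style = rpc_message("getPrice", sku="LAMP-42", quantity=3)
```

In the document-style message the receiver decides what to do with a PurchaseOrder. In the RPC-style message the sender has already chosen the getPrice procedure and its parameters, which is what makes the coupling tighter.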

The advantage of using document or RPC messages structured using a literal XML Schema definition is that the messages can be validated using the schema. RPC messages constructed using SOAP encoding cannot be validated because there is no schema that describes the structure. The advantage of using SOAP encoding is that it provides a simpler and more efficient method to represent and transfer object-oriented data in XML, but it limits your ability to validate messages in transit. SOAP encoding has also been a source of interoperability issues.

Extending SOAP

SOAP provides a built-in extension mechanism that allows you to add advanced middleware functionality to the basic communication environment. As mentioned earlier, you can pass directive or control information in a SOAP header. Either your SOAP runtime server or an intermediary can automatically process these directives. You can use this extension mechanism to add automatic middleware functionality, such as security, auditing, transactions, reliable delivery, load balancing, asynchronous communications, long-term conversations, and version control.
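A header processor can be sketched as a lookup from header block names to handler functions. SOAP 1.1 lets a sender mark a header block with mustUnderstand="1", in which case a receiver that does not recognize the block must reject the message rather than ignore it. The code below is a simplified model of that rule; the extension namespace, the header names, and the MustUnderstandFault class are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
EXT_NS = "urn:example:extensions"  # hypothetical header vocabulary
MUST_UNDERSTAND = f"{{{SOAP_NS}}}mustUnderstand"


class MustUnderstandFault(Exception):
    """Raised when a mandatory header block has no registered handler."""


def process_headers(envelope: ET.Element, handlers: dict) -> list:
    """Run the handler for each header block; return the tags that were handled."""
    header = envelope.find(f"{{{SOAP_NS}}}Header")
    handled = []
    for block in (header if header is not None else []):
        handler = handlers.get(block.tag)
        if handler is not None:
            handler(block)
            handled.append(block.tag)
        elif block.get(MUST_UNDERSTAND) == "1":
            raise MustUnderstandFault(block.tag)
        # Optional blocks that nobody understands are simply skipped.
    return handled


def make_envelope(*blocks: ET.Element) -> ET.Element:
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header").extend(blocks)
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    return envelope


# A transaction header we understand and an optional hint we do not.
transaction = ET.Element(f"{{{EXT_NS}}}TransactionId")
transaction.text = "tx-17"
hint = ET.Element(f"{{{EXT_NS}}}CacheHint")

seen = []
handlers = {f"{{{EXT_NS}}}TransactionId": lambda block: seen.append(block.text)}
handled = process_headers(make_envelope(transaction, hint), handlers)
```

New middleware functionality is added by registering another handler. The application code that processes the body does not change.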

Securing SOAP Messages

For example, let's take a look at how you can use SOAP headers to support security. Quite a few people have the mistaken impression that SOAP isn't secure. In fact, it's easy to add security to SOAP messages. The simplest way to implement security is to use a secure communication protocol, such as Secure Sockets Layer/Transport Layer Security (SSL/TLS). Most SOAP implementations support HTTP Secure (HTTPS), which runs HTTP over SSL/TLS. HTTPS automatically encrypts the SOAP messages before sending them across the network.

In some cases, encryption of network traffic may not provide you with sufficient protection. Perhaps you need to control access to sensitive services, or perhaps you need to provide a digital signature with a message. In these circumstances, you must apply security at the application layer rather than at the network layer. SOAP permits you to do so using SOAP headers.

Security is an expansive topic, so let's start with some basic groundwork. There are five functions that fall under this topic:

  • Authentication is the process that you use to verify a user's or an application's identity. There are a number of mechanisms that you might use for authentication, such as digital certificates or a login process.

  • Authorization is the process that you use to determine whether an authenticated entity has permission to perform a particular action or function. You may want to define access control policies for all services at a given location, for individual services, or for specific operations in a service.

  • Confidentiality prevents unauthorized access to the contents of the message. You usually use encryption to ensure confidentiality.

  • Integrity ensures that any unauthorized modification of the message can be detected. You usually use a message digest or a digital signature to ensure message integrity.

  • Nonrepudiation provides proof that a particular user or application sent a message. A digital signature provides proof that the signed data were sent from a specific authenticated identity. All or part of a message can be signed.

You can secure your SOAP communications by including security information in the SOAP header. Figure 3-9 shows a secured SOAP message. In this example, we're signing the contents of the SOAP body, and we're passing authentication information and the digital signature in the SOAP header.

Figure 3-9. The contents of the SOAP body in this secured SOAP message have been digitally signed. The SOAP header contains authentication information and the digital signature.
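The idea behind Figure 3-9 can be sketched in a few lines, with one important substitution: instead of a true digital signature, this example protects the body with a keyed hash (HMAC) from Python's standard library. An HMAC relies on a secret shared by both parties, so it demonstrates integrity checking but not nonrepudiation, and it is not the WS-Security or XML Signature format. The Security, Username, and BodyMac header elements are invented for the example.

```python
import base64
import hashlib
import hmac
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SEC_NS = "urn:example:security"  # hypothetical header vocabulary
HEADER, BODY = f"{{{SOAP_NS}}}Header", f"{{{SOAP_NS}}}Body"


def _body_mac(envelope: ET.Element, key: bytes) -> bytes:
    """Compute an HMAC-SHA256 over the canonical XML form of the SOAP body."""
    body_xml = ET.tostring(envelope.find(BODY), encoding="unicode")
    canonical = ET.canonicalize(body_xml)  # C14N 2.0, available in Python 3.8+
    return hmac.new(key, canonical.encode("utf-8"), hashlib.sha256).digest()


def sign(envelope: ET.Element, user: str, key: bytes) -> ET.Element:
    """Add a header block carrying the sender's identity and the body MAC."""
    header = envelope.find(HEADER)
    if header is None:
        header = ET.Element(HEADER)
        envelope.insert(0, header)  # the Header must precede the Body
    security = ET.SubElement(header, f"{{{SEC_NS}}}Security")
    ET.SubElement(security, f"{{{SEC_NS}}}Username").text = user
    mac = base64.b64encode(_body_mac(envelope, key)).decode("ascii")
    ET.SubElement(security, f"{{{SEC_NS}}}BodyMac").text = mac
    return envelope


def verify(envelope: ET.Element, key: bytes) -> bool:
    """Recompute the MAC and compare it with the one in the header."""
    claimed = envelope.findtext(f"{HEADER}/{{{SEC_NS}}}Security/{{{SEC_NS}}}BodyMac")
    if claimed is None:
        return False
    return hmac.compare_digest(base64.b64decode(claimed), _body_mac(envelope, key))


# Hypothetical message: a purchase order signed by "alice".
secured = ET.Element(f"{{{SOAP_NS}}}Envelope")
amount = ET.SubElement(ET.SubElement(secured, BODY), "{urn:example:orders}Amount")
amount.text = "100"
sign(secured, "alice", b"shared-secret")
```

If anyone changes the amount after signing, verify returns False, because the recomputed value no longer matches the one carried in the header.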

You can send either login information (user ID and password) or a security token within the SOAP header. If you send login information, you want to make sure that you encrypt the message. A security token is an encrypted value that represents security information, such as your identity. If you use a security token you don't need to encrypt the message just to protect your identity information. You can get a security token from an authentication authority, such as a single sign-on service or a certificate authority. Examples of security tokens are digital certificates and Kerberos tickets. There is an Organization for the Advancement of Structured Information Standards (OASIS) standard called the Security Assertion Markup Language (SAML), which defines a way for you to exchange security information in XML, enabling an XML-based single sign-on facility. You can use a SAML authentication assertion as a security token.

You can set up your Web services environment to automatically manage security on behalf of your applications. You normally implement the security functionality in a SOAP header processor, which gets called automatically by the SOAP runtime system. Alternatively you can implement security by routing your messages through one or more security intermediaries. Either way, your security processors can perform functions such as authenticating the user, mapping the user's credentials to a known role in your security system, verifying that the user is entitled to perform the requested operation, encrypting or decrypting the contents of the message, and validating a digital signature. These security processors may be supplied for you by your Web services platform, or they may be additional services that you install to augment your environment.
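The sequence a security processor follows (authenticate, map to a role, authorize) can be sketched as follows. Everything here is a stand-in: the token table plays the part of an authentication authority, and the users, roles, and operation names are hypothetical.

```python
# Hypothetical lookup tables standing in for real security infrastructure.
TOKENS = {"token-abc": "alice", "token-xyz": "bob"}   # security token -> user
ROLES = {"alice": "buyer", "bob": "auditor"}           # user -> role
PERMISSIONS = {                                        # role -> allowed operations
    "buyer": {"submitOrder", "getOrderStatus"},
    "auditor": {"getOrderStatus"},
}


class SecurityFault(Exception):
    """Raised when a request fails authentication or authorization."""


def check_request(token: str, operation: str) -> str:
    """Authenticate the token, map the user to a role, and authorize the operation."""
    user = TOKENS.get(token)
    if user is None:
        raise SecurityFault("authentication failed")
    role = ROLES.get(user)
    if operation not in PERMISSIONS.get(role, set()):
        raise SecurityFault(f"{user} may not call {operation}")
    return role


role = check_request("token-abc", "submitOrder")
```

Because the check runs before the message reaches the application, the service code itself contains no security logic.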

SOAP Versus Other Communication Systems

Most traditional SOA systems aren't designed to work over the Internet. At the time these systems were being developed, TCP/IP had not yet been adopted as the universal network protocol. At that time many companies still used other network protocols, such as DECnet, SNA, and Internetwork Packet Exchange (IPX), as the corporate backbone, and systems such as DCE and CORBA were designed to run across any of these types of networks. These systems use a dedicated application protocol that can run across any of these networks. Unfortunately, this application protocol must be installed on every client or server that wants to engage in the conversation. In contrast, SOAP doesn't require a specific application protocol. It can use pervasive Web protocols, such as HTTP and SMTP, so you don't need to install special networking software on every machine.

Traditional SOA systems encode the message data in a binary format that can be interpreted only by that specific middleware system. These binary formats are opaque to firewalls, and intermediaries cannot validate or transform the messages in transit. In contrast, SOAP encodes messages using XML. SOAP messages can be interpreted by any application written in any language running on any platform. Firewalls can inspect the XML to determine whether the requests may pass. Intermediaries can validate and transform the messages in transit, making SOAP much more flexible and adaptable than any other SOA communication system.

Traditional SOA systems aren't as extensible as SOAP is. As a SOAP user, you can include additional control information in a SOAP header to add any kind of middleware functionality to your system. Other SOA systems may provide additional middleware functionality, but you can use only the services provided. You don't have the ability to extend the environment any way you'd like.

Other Web Service Technologies

Although WSDL, UDDI, and SOAP have emerged as the predominant Web services technologies, you can also build Web services using other technologies. There are a number of XML protocols that predate SOAP, and quite a few of the earliest Web services were built using these technologies. Also a number of industries developed XML-based messaging standards before SOAP appeared on the scene.

One of the earliest XML-based B2B integration systems is RosettaNet. RosettaNet is a subsidiary of the Uniform Code Council (UCC), a nonprofit standards organization. The RosettaNet standards were developed by representatives of the semiconductor industry to define standard XML formats to support integrated supply chains. RosettaNet has since been extended and adapted to support other industries beyond semiconductor manufacturing, and it has gained wide adoption.

Companies such as Macromedia and webMethods were some of the early innovators in the area of XML protocols. They developed proprietary XML protocols that are still used widely. Although they still support these early protocols, these companies have since embraced SOAP and now also provide products based on what have become the de facto standard Web services technologies: SOAP, WSDL, and UDDI.

One of the most popular XML protocols is XML-RPC, which is actually an early offshoot of the original SOAP project sponsored by Microsoft, DevelopMentor, and UserLand. XML-RPC is a simpler variant of SOAP. Numerous open source implementations of XML-RPC are available, and the protocol still has a loyal following.
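To give a feel for how simple XML-RPC is, the sketch below uses the xmlrpc.client module from Python's standard library to encode and decode a call without contacting a server. The method name and parameters are hypothetical.

```python
import xmlrpc.client

# Encode a call to a hypothetical "orders.submit" procedure with two parameters.
request_xml = xmlrpc.client.dumps(("LAMP-42", 3), methodname="orders.submit")

# A server would decode the request back into parameters and a method name.
params, method = xmlrpc.client.loads(request_xml)

# Encode and decode a response that carries a single return value.
response_xml = xmlrpc.client.dumps(("order-0001",), methodresponse=True)
result, _ = xmlrpc.client.loads(response_xml)
```

The request is a methodCall element with a methodName and a list of typed params. There is no envelope, no header, and no separate description language.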

ebXML

At the same time that Microsoft, IBM, and their friends were developing SOAP, two standards organizations joined forces to develop a new set of international standards for electronic business. The Electronic Business using Extensible Markup Language (ebXML) project was a joint effort of the United Nations Centre for Trade Facilitation and Electronic Business (UN/CEFACT) and OASIS. The ebXML framework defines a comprehensive set of standards for business-to-business integration.

One primary distinction between the ebXML framework and the SOAP/WSDL/UDDI architecture is that ebXML is designed specifically to support B2B integration. It includes built-in support for security and reliability that may not be necessary for in-house, lightweight, or device-based integration projects. The SOAP/WSDL/UDDI architecture provides a general-purpose Web services infrastructure that can support a wide variety of applications. SOAP can be used for internal application projects as well as B2B integration projects. SOAP can also be used in very small systems, such as a personal digital assistant (PDA) or mobile handset, for personal application integration.

The ebXML messaging system is based on SOAP, but beyond this basic level, the ebXML framework diverges from what has become the de facto standard for Web services. The ebXML Message Service (ebMS) protocol extends SOAP by adding support for attachments, security, and reliability. Other SOAP systems can't communicate with ebMS systems without knowledge of the ebMS SOAP extensions. The ebXML message payload is generally conveyed as an attachment rather than within the SOAP body, and ebXML doesn't provide a service description language on a par with WSDL.

Instead ebXML provides a number of description systems that allow you to describe and negotiate business relationships. The ebXML Collaboration Protocol Profile and Agreement (CPPA) specification provides a mechanism for specifying the details of how you support B2B integration. Your CPPA descriptions generally refer to a Business Process Specification Schema (BPSS), which describes a choreographed interchange of messages that must be exchanged to complete a specific business transaction. The ebXML framework also uses a different advertising and discovery service called the ebXML Registry and Repository. This service offers capabilities similar to those of UDDI, including support for industry standard service types and categorization through flexible taxonomies. Unlike UDDI, the ebXML Registry and Repository also provides a repository that can manage and maintain XML schemata, CPPA descriptions, BPSS specifications, and other metadata. UDDI doesn't provide a repository. You must store WSDL descriptions and XML Schema definitions in an external file, repository, or content management system.

The ebXML project was completed in April 2001, and the resulting specifications were transferred to OASIS and UN/CEFACT for long-term management. Even though the ebXML specifications are formal industry standards, ebXML has been slow to attract wide industry adoption. From a vendor perspective, the project's strongest support comes from Sun Microsystems. On the flip side, IBM has been noncommittal, and Microsoft has been dismissive of ebXML. Even so, ebXML has won endorsement from a number of powerful industry groups, including the Open Applications Group, the OpenTravel Alliance, and RosettaNet. A handful of large software vendors, such as Commerce One, Documentum, PeopleSoft, and Sterling Commerce, as well as a few software startups, such as Bind Systems, Killdara, and XML Global, are developing products that support ebXML.

One of the most interesting ebXML-based projects is a data exchange platform for automotive retailers called DealerSphere, a joint project of EDS, Killdara, The Cobalt Group, and Sun Microsystems. DealerSphere proposes to create a standard, Web-enabled “data broker” that will facilitate B2B information exchange. DealerSphere will define a standard data and transaction interface that promises to enable seamless integration with popular dealer management systems.

Executive Summary

We've covered a lot of ground in this chapter. (I'm sure you'll be pleased to hear that the rest of the book isn't quite as technical as this chapter.) Let's spend a moment reviewing some of the basic concepts.

A Web service is a service-oriented application that communicates over the Web using XML messages. Web services represent the convergence of three core technologies: the Web, XML, and the SOA.

The Web provides the basic infrastructure that supports Web services. The Web helps solve the Traditional Middleware Blues associated with lack of pervasiveness. The Web is pervasive and provides universal connectivity. It is free and unencumbered as well as completely platform-independent.

Web services communicate using XML. XML helps solve the Traditional Middleware Blues associated with lack of heterogeneous interoperability and high total cost of ownership. XML is the lingua franca of the Web. Any application, written in any language, running on any platform, can understand XML. In addition, XML is flexible and adaptable. You can validate and transform XML in transit. XML helps ensure that Web services are loosely coupled, reducing maintenance costs and increasing reusability.

The SOA is the most common architecture used to support application-to-application integration. It's the foundation for most RPC-based middleware systems, including DCOM, CORBA, RPC, and RMI. The SOA is familiar and intuitive to your developers.

The SOA specifies mechanisms for describing services, advertising and discovering services, and communicating with services. The most common technologies used to implement these functions in Web services are WSDL, UDDI, and SOAP. Figure 3-10 shows these three technologies working together.

Figure 3-10. A client finds a service and its WSDL via UDDI. It uses the WSDL to generate a proxy. The client uses the proxy to talk to the service.

WSDL is an XML language for describing Web services. A WSDL document describes what a service does, how it communicates, and where to find it. You can compile a WSDL document and generate a client proxy, which contains all the code you need to communicate with the service.

UDDI facilitates the advertising and discovery process. It is a Web service registry. A service provider uses UDDI to advertise its business and services. When you register a service you can categorize it using a variety of taxonomies. These taxonomies make it easier for service consumers to find services that match their specific requirements. You can also associate your service with the industry standards that you support. A service consumer uses UDDI to find a service and its WSDL description. The consumer can search by business, by taxonomy, by service type, or by a combination of factors.

SOAP is an XML protocol that Web services use to communicate. SOAP provides a simple, consistent, and extensible mechanism that allows one application to send an XML message to another application. The WSDL description associated with the Web service defines the structure of the XML message.

Together, these technologies implement an extensible, lightweight SOA-based infrastructure. The SOA has evolved over the past 15 years to support high performance, scalability, reliability, and availability. But most SOA-based systems suffer from the Traditional Middleware Blues. They don't support heterogeneity. They don't work across the Internet. They rely on tightly coupled connections. They're expensive to use and even more expensive to maintain. Web services take all the best features of the SOA and combine them with the Web and XML. The result is an architecture that eliminates the Traditional Middleware Blues. Web services support heterogeneous, low-cost, flexible integration.
