Appendix C. Attaching and securing binary data in SOAP

In chapters 6 and 7, you saw how to encrypt and sign a message or parts of a message. The techniques you learned there allow you to encrypt and sign any data that is part of the SOAP message. As you recall from chapter 2, XML does not allow binary data. What if you want to send binary data with SOAP? And how do you encrypt and sign such data? We will answer these questions in this appendix.

XML does not allow arbitrary bytes (more precisely, octets, because a byte is not guaranteed to be 8 bits on some rarely encountered platforms) to be included in element content. Only characters defined by the Unicode character set (excluding a few) are allowed, represented using a character encoding scheme such as US-ASCII or UTF-8. This preserves the textual character of XML, allowing XML code to be inspected by humans using text editors.

But what if we want to include binary data (data containing arbitrary octets) such as certificates, signatures, compressed data, encrypted text, and images? In chapter 4, we introduced a common trick, base64 encoding, which encodes arbitrary bytes as character data. Such encoded data can be embedded in XML, as seen in multiple places in chapters 4 through 7.

Using base64 encoding presents its own problems in a few common cases. It bloats data to four thirds of its original size, because every 3 octets are encoded as 4 characters. This is fine if the data to encode is only a few kilobytes, but when it comes to dealing with large amounts of data (such as images), the size increase leads to an unacceptable performance penalty. The extra computational cost to encode and decode large amounts of data is also a concern (although it pales before the cost we’re already paying in most cases for encryption).
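The four-thirds figure can be checked directly. The following is a minimal Python sketch; the helper name base64_overhead is introduced here only for illustration.

```python
import base64
import os


def base64_overhead(n_bytes: int) -> float:
    """Return encoded_size / original_size for n_bytes of random data."""
    raw = os.urandom(n_bytes)
    encoded = base64.b64encode(raw)  # no line breaks are inserted
    return len(encoded) / len(raw)


# Every group of 3 input octets becomes 4 output characters.
for size in (3, 300, 30_000):
    print(size, base64_overhead(size))
```

For inputs whose length is a multiple of three, the ratio is exactly 4/3. Other lengths are padded up to the next group of four characters, so the ratio is slightly higher for very small inputs.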

Multiple approaches have been proposed to address these concerns. They differ in their impact on the encryption and signature techniques that are already in use. In this appendix, we are going to show the two most popular approaches currently vying for adoption: SOAP with Attachments (SwA) and SOAP Message Transmission Optimization Mechanism (MTOM). We will especially look at their impact on how we encrypt and sign SOAP messages.

C.1. SOAP with Attachments (SwA)

These days, all of us are familiar with the concept of attaching files to email messages. For a long while, this was not possible: email messages were restricted to text, using only 7-bit characters and no more than 1,000 characters per line. A standard named Multipurpose Internet Mail Extensions (MIME) changed all that. For the purposes of our discussion here, MIME provides two significant features:

  1. A way to package arbitrary data items into a single message.
  2. A way to declare metadata about the data items within the MIME package, such as the relationship among the items in the package (for example, whether they are alternative representations of the same item or simply disparate items packaged together), the type of data within each item (text, HTML, or a GIF image, for instance), and how each item is encoded (7-bit text, 8-bit text, base64-encoded text, or binary data).

Although MIME seems to have been designed explicitly for use in email applications (as the name reflects), it was found to be of use in other applications as well. SwA takes advantage of MIME to attach arbitrary data items to a SOAP message. Listing C.1 shows a simple example.

Listing C.1. Example of an SwA message

SwA requires the MIME message to be declared at the transport level as consisting of multiple related parts, using Multipart/Related as the main content type declaration. In this example, we assume HTTP as the transport protocol, and an HTTP header named Content-Type is used to declare the body as a Multipart/Related message. Observe that a blank line separates the HTTP headers from the MIME message in the HTTP message body.

MIME relies on a boundary marker string to separate different items in a multipart message. The boundary string used should not occur anywhere within the contents of any part.

In Multipart/Related messages, one of the parts is identified as a root so that applications know which part to process first. The type parameter should provide the content-type of the root part. In our case, the type is text/xml, as our root part is a SOAP envelope. Each part in the MIME package is required to be identified using a Content-ID or a Content-Location header. In Multipart/Related messages, a start parameter can be used to identify the root part using its Content-ID. If the start parameter is not specified, the first part is assumed to be the root.

The start of a message part is indicated by two dashes (--), followed by the boundary marker we declared in the transport-level Content-Type header. Each MIME part has its own MIME headers. A blank line indicates the end of the MIME headers for a part. The SOAP envelope is the first part in this example. In this case, the body consists of a request to create an account in our brokerage. The second part in this example is a JPEG image. Notice that we declare the content transfer encoding for the JPEG image in this part to be binary.

Where necessary, references can be made in the SOAP message to content in other parts. For example, here we refer to the JPEG image attached as the second MIME part. The Content-ID of the referenced part is converted to a URI using cid: as the prefix, and the href attribute (hypertext reference, used widely in HTML) establishes the reference.
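The conversion between a Content-ID header value and a cid: URI is mechanical. The sketch below assumes the rules of RFC 2392 (strip the angle brackets, percent-encode characters that are not URL-safe). The function names and the sample identifier are hypothetical.

```python
from urllib.parse import quote, unquote


def content_id_to_cid(content_id: str) -> str:
    """Turn a Content-ID value such as '<proof@example.com>' into a cid: URI."""
    bare = content_id.strip()
    if bare.startswith("<") and bare.endswith(">"):
        bare = bare[1:-1]
    # Keep '@' readable; other characters outside the unreserved set
    # are percent-encoded.
    return "cid:" + quote(bare, safe="@")


def cid_to_content_id(uri: str) -> str:
    """Recover the Content-ID header value that a cid: URI points to."""
    scheme, _, rest = uri.partition(":")
    if scheme.lower() != "cid" or not rest:
        raise ValueError(f"not a cid: URI: {uri!r}")
    return "<" + unquote(rest) + ">"


print(content_id_to_cid("<proof@example.com>"))
```

A receiver would apply the second function to each href value it finds in the envelope and then look up the MIME part carrying the matching Content-ID header.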

The end of a MIME multipart package is marked using two dashes (--), followed by the boundary marker we declared in the transport-level Content-Type header and two more dashes (--).
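The packaging rules walked through above can be approximated in a few lines of Python. This is a sketch under stated assumptions, not a reproduction of the listing: the boundary string, the Content-ID values, the createAccount element, the urn:example:brokerage namespace, and the placeholder image bytes are all hypothetical. Only the identityProof element and the overall layout come from the text.

```python
CRLF = b"\r\n"
ROOT_ID = "<envelope@example.com>"   # hypothetical Content-ID of the root part
IMAGE_ID = "<proof@example.com>"     # hypothetical Content-ID of the JPEG part

ENVELOPE = (
    b'<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    b"<soap:Body>"
    b'<createAccount xmlns="urn:example:brokerage">'
    b'<identityProof href="cid:proof@example.com"/>'
    b"</createAccount>"
    b"</soap:Body></soap:Envelope>"
)
FAKE_JPEG = b"\xff\xd8\xff\xe0" + b"\x00" * 16 + b"\xff\xd9"  # placeholder bytes


def transport_content_type(boundary: str) -> str:
    """Value of the transport-level Content-Type header."""
    return (
        f'Multipart/Related; boundary="{boundary}"; '
        f'type="text/xml"; start="{ROOT_ID}"'
    )


def build_swa_body(boundary: str, envelope: bytes, image: bytes) -> bytes:
    """Assemble a two-part body: the SOAP envelope first, then the image."""
    delimiter = b"--" + boundary.encode("ascii")
    # The boundary marker must not occur inside any part.
    for content in (envelope, image):
        if delimiter in content:
            raise ValueError("boundary occurs inside a part; choose another")
    parts = [
        ([b"Content-Type: text/xml; charset=UTF-8",
          b"Content-Transfer-Encoding: 8bit",
          b"Content-ID: " + ROOT_ID.encode("ascii")], envelope),
        ([b"Content-Type: image/jpeg",
          b"Content-Transfer-Encoding: binary",
          b"Content-ID: " + IMAGE_ID.encode("ascii")], image),
    ]
    lines = []
    for headers, content in parts:
        lines.append(delimiter)          # "--" + boundary starts a part
        lines.extend(headers)
        lines.append(b"")                # blank line ends the part's headers
        lines.append(content)
    lines.append(delimiter + b"--")      # trailing "--" closes the package
    return CRLF.join(lines) + CRLF


print("Content-Type:", transport_content_type("MIME_boundary"))
print(build_swa_body("MIME_boundary", ENVELOPE, FAKE_JPEG)[:120])
```

Note that the image bytes are written as they are, with no base64 step, which is where the size advantage over inlined data comes from.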

That is how any binary data can be attached to a SOAP message. Let’s understand the pros and cons of this mechanism before going on to the alternative approach.

C.1.1. Issues with SwA

SwA is often criticized for two reasons:

  1. MIME attachments break the idea of SOAP messages being XML. Because all implementations make that assumption, they break when confronted with a non-XML message. Or, they have to be retrofitted to accommodate the possibility of MIME packaging. Use of the XML Encryption and XML Signature standards in WS-Security now needs to be reviewed to see what enhancements are needed to secure attachments as well.
  2. An application needs to scan through a part looking for the boundary marker in order to locate where the part ends. The scan obviously takes a few more computational cycles than would be required if we knew the part’s length in advance.
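To see where the scanning cost in the second point comes from, consider a minimal splitter. It is a sketch rather than a full MIME parser: it assumes CRLF line endings and well-formed input, and the function name and sample data are hypothetical.

```python
from typing import List


def split_parts(body: bytes, boundary: str) -> List[bytes]:
    """Split a multipart body into raw parts (headers, blank line, content)."""
    delimiter = b"--" + boundary.encode("ascii")
    parts = []
    pos = body.find(delimiter)               # linear scan for the first marker
    while pos != -1:
        after = pos + len(delimiter)
        if body[after:after + 2] == b"--":   # closing delimiter: no more parts
            break
        line_end = body.find(b"\r\n", after)
        if line_end == -1:
            raise ValueError("truncated delimiter line")
        start = line_end + 2
        # The part's length is unknown, so every byte up to the next
        # marker has to be examined to find where the part ends.
        end = body.find(b"\r\n" + delimiter, start)
        if end == -1:
            raise ValueError("part is not terminated by a boundary marker")
        parts.append(body[start:end])
        pos = end + 2                        # the next delimiter starts here
    return parts


SAMPLE = (
    b"--B\r\nContent-ID: <a@example.com>\r\n\r\nfirst"
    b"\r\n--B\r\nContent-ID: <b@example.com>\r\n\r\nsecond"
    b"\r\n--B--\r\n"
)
for raw in split_parts(SAMPLE, "B"):
    headers, _, content = raw.partition(b"\r\n\r\n")
    print(headers, content)
```

With a length-prefixed format, the inner search for the next marker would be replaced by a single slice of known size.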

An alternative attachment scheme known as Direct Internet Message Encapsulation (DIME) attempted to address the performance problem. It did not address the bigger issue with attachments: their impact on WS-Security. At the same time, attempts to marry SwA with WS-Security gained momentum, which led to the abandonment of DIME. The Web Services Interoperability Organization (WS-I) picked SwA over DIME.

There are multiple ways to standardize support for SwA in WS-Security. We will look at the most popular choice next.

C.1.2. WS-Security SwA Profile

A WS-Security profile has already been defined for securing attachments made using SwA. The underlying ideas are quite simple.

  • Use cid: URIs to reference attachments.
  • Define new transforms to distinguish between the two possible targets when referring to an attachment: the attachment’s content together with its MIME headers, or the attachment’s content alone.
  • Define canonicalization algorithms for MIME headers and attachment content.
  • Sign all attachments or use an application-specific mechanism to detect malicious addition or removal of attachments.
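The last point leaves the detection mechanism to the application. One possible check is sketched below, under the assumption that the application has already verified the signature and collected the cid: URIs it covers. The function name and the sample identifiers are hypothetical, and percent-encoding in the URIs is ignored for brevity.

```python
from typing import Iterable, Set, Tuple


def check_attachment_coverage(
    received_ids: Iterable[str], signed_cids: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Compare attachments on the wire with those covered by a signature.

    received_ids: Content-ID header values of the attachments received.
    signed_cids:  cid: URIs referenced by the verified signature.
    Returns (added, removed) as sets of bare Content-IDs.
    """
    received = {cid.strip().strip("<>") for cid in received_ids}
    signed = {uri[len("cid:"):] for uri in signed_cids if uri.startswith("cid:")}
    added = received - signed      # on the wire but not signed
    removed = signed - received    # signed but missing from the wire
    return added, removed


added, removed = check_attachment_coverage(
    ["<proof@example.com>", "<extra@example.com>"],
    ["cid:proof@example.com", "cid:gone@example.com"],
)
print("unsigned attachments:", added)
print("missing attachments:", removed)
```

A receiver following the sign-everything policy would reject the message if either set is non-empty.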

Although these ideas seem simple, they may not be widely implemented, as SwA may itself be replaced by a competing standard named MTOM. We describe MTOM next.

C.2. SOAP MTOM

An alternative way of marrying attachments with WS-Security has emerged, in the form of SOAP Message Transmission Optimization Mechanism (MTOM). MTOM considers the use of attachments as merely a pragmatic way of representing parts of a SOAP envelope over the wire, and asks applications to pretend that attachments are present inline in XML in base64-encoded form. That is, by the time applications see the message, the attachments are no longer there; they are contained within the message. This minor adjustment in the way we look at attachments brings back the pure-XML character of SOAP messages without sacrificing the size advantage that binary attachments give us over inlined base64-encoded data.

How does MTOM impact WS-Security? An application that relies on WS-Security to secure attachments will have to first create the pure XML form of the message (or in the case of a receiver, recreate it from the attachments received on the wire). Once this is done, the application can pretend that attachments are not even used.

How do the applications know where to inline each attachment? The XML-binary Optimized Packaging (XOP) specification, which MTOM relies on, defines an xop:Include element to indicate where each attachment (identified by its Content-ID using the cid: URI scheme) should be included in the original XML document.
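On the receiving side, recreating the pure XML form amounts to replacing each xop:Include element with the base64 encoding of the attachment it points to. The following Python sketch uses the standard library only. The XOP and SOAP 1.2 namespace URIs are the published ones, while the function name, the sample envelope, the urn:example:brokerage namespace, and the Content-ID are hypothetical.

```python
import base64
import xml.etree.ElementTree as ET
from typing import Dict
from urllib.parse import unquote

XOP_INCLUDE = "{http://www.w3.org/2004/08/xop/include}Include"


def inline_attachments(envelope_xml: str, attachments: Dict[str, bytes]) -> str:
    """Replace each xop:Include with the base64 text of its attachment.

    attachments maps bare Content-IDs (no angle brackets) to raw bytes.
    """
    root = ET.fromstring(envelope_xml)
    # ElementTree has no parent pointers, so visit every element and
    # inspect its children. The list() copies make removal safe.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag != XOP_INCLUDE:
                continue
            href = child.get("href", "")
            if not href.startswith("cid:"):
                raise ValueError(f"unsupported reference: {href!r}")
            content_id = unquote(href[len("cid:"):])
            if content_id not in attachments:
                raise ValueError(f"no attachment with Content-ID {content_id!r}")
            parent.remove(child)
            parent.text = base64.b64encode(attachments[content_id]).decode("ascii")
    return ET.tostring(root, encoding="unicode")


ENVELOPE = (
    '<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">'
    "<soap:Body>"
    '<createAccount xmlns="urn:example:brokerage">'
    "<identityProof>"
    '<xop:Include xmlns:xop="http://www.w3.org/2004/08/xop/include" '
    'href="cid:proof@example.com"/>'
    "</identityProof>"
    "</createAccount></soap:Body></soap:Envelope>"
)
print(inline_attachments(ENVELOPE, {"proof@example.com": b"\xff\xd8\xff\xe0"}))
```

Signature verification and decryption would then operate on the returned document exactly as they would on a message that never used attachments.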

XOP/MTOM are not restricted to MIME attachments; they allow other schemes of packing an XML document along with the pieces extracted out of the XML document for optimization in transmission. In the case of MIME-based attachments, the Content-Type information for each attachment is preserved using an xmlmime:contentType attribute, where the xmlmime prefix refers to a special namespace established by a specification that standardizes how media types can be assigned to binary data in XML.

Listing C.2 shows all of these ideas in action.

Listing C.2. An example SOAP message serialized as a MIME message, in accordance with MTOM

XOP mandates the use of application/xop+xml as the Content-Type of the root part. This aids in detection of XOP-optimized transmissions. In addition, XOP requires us to preserve the Content-Type and SOAPAction header values from the original message using a type parameter as part of the Content-Type value.

As we have seen with SwA, Multipart/Related messages can provide the root part’s Content-ID using a start parameter. Another optional parameter, start-info, can be added to further aid the recipient in processing the envelope. Here, we are providing the Content-Type and SOAPAction header values from the original message in start-info.

If you compare the MTOM wire representation example shown here with the SwA example shown previously, you will find only a few small changes. Instead of using text/xml as the root part’s type, we are using application/xop+xml. We added a start-info parameter to the Content-Type header. And we added xop:Include as a child of the identityProof element instead of directly adding an href attribute to it. The main difference is in how applications process the message.

In this appendix, you have seen two standards, SwA and MTOM, that address how attachments can be made in SOAP. While attachments made to SOAP messages in compliance with SwA can be encrypted and signed as specified by the SwA profile, MTOM makes encrypting/signing attachments no different than encrypting/signing SOAP messages without attachments. Practitioners of web services security who care about interoperability in the short run will need to understand and implement the WS-Security SwA profile. Others can take advantage of MTOM and do nothing special when encrypting/signing SOAP messages with attachments.

Suggestions for further reading
