MIME Header Fields

MIME headers come in two flavors: MIME message headers and MIME part headers. MIME message headers are just additional RFC 822-style message headers. They denote that a message is MIME compliant and inform a receiving MUA of the structure and encoding of the message. MIME part headers reside in a message body and describe the contents of each part of a multipart message.

If a MIME header is part of a message header block, it applies to the entire message. If it appears at the beginning of a message part, it applies only to that part.

MIME message headers are:

  • MIME-Version

  • Content-Type

  • Content-Transfer-Encoding

  • Content-ID

  • Content-Description

  • Content-Disposition (experimental)

MIME part headers can be any valid MIME header except the MIME-Version header. In other words, MIME part headers must begin with “Content-”.

If a later MIME version is defined, its headers will start with “Content-”. As with RFC 822 messages, any user-defined headers that are used should start with “X-” to avoid any potential conflicts with emerging standards.

MIME headers, whether message headers or part headers, may occur in any order.

MIME messages have, at least, the minimum simple text headers (described in Chapter 2) and additional MIME headers. These are called MIME message headers, to distinguish them from the headers used at the beginning of each body part, the so-called MIME part headers. The MIME message headers give the MIME version, the type of message structure used, and the string that separates the parts of the MIME message. For example:

Date: Tue, 07 Apr 1998 14:38:12 +1000
From: Bradley Marshall <[email protected]>
MIME-Version: 1.0
To: [email protected]
Subject: Some MIME Headers
Content-Type: multipart/mixed; boundary="------------4FEA92EA2642085E2EF0E7AA"

The MIME-Version header gives the version number of MIME used for this message. So far, the only MIME version used has been 1.0. There have been minor changes to the MIME standard, but not enough to justify a version change in the format. This version information should be used by any MIME-compliant MUA to make certain that the message will be decoded properly. If a different version number is detected, the MUA should fail gracefully if it is not equipped to handle it.

RFC 822 comments should be ignored when parsing the MIME-Version header. A header that looks like this:

MIME-Version: 1.0 (Implementation by Fred Nerk, Esq.)

is parsed as a MIME version of 1.0. The comments (wherever they appear in the header) should be ignored.
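The comment-stripping rule can be sketched in a few lines of Python. This is only an illustration, not a full RFC 822 parser: the helper names strip_comments and parse_mime_version are made up for this example, and quoted strings are deliberately not handled because they do not occur in a MIME-Version value.

```python
def strip_comments(value):
    """Drop parenthesized RFC 822 comments, including nested ones."""
    kept = []
    depth = 0          # current comment nesting level
    skip_next = False  # True right after a backslash (quoted pair)
    for ch in value:
        if skip_next:
            skip_next = False
            if depth == 0:
                kept.append(ch)
        elif ch == "\\":
            skip_next = True
            if depth == 0:
                kept.append(ch)
        elif ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            kept.append(ch)
    return "".join(kept)


def parse_mime_version(value):
    """Return (major, minor) from a MIME-Version header value."""
    cleaned = "".join(strip_comments(value).split())  # remove all whitespace
    major, dot, minor = cleaned.partition(".")
    if not dot or not major.isdigit() or not minor.isdigit():
        raise ValueError("unparseable MIME-Version: %r" % value)
    return int(major), int(minor)


print(parse_mime_version("1.0 (Implementation by Fred Nerk, Esq.)"))
```

An MUA built on such a helper could compare the result with (1, 0) and fall back to treating the message as plain text when it sees a version it does not understand.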

The Content-Type header serves a dual purpose. First, it identifies the type of data in the message well enough that an MUA can parse it for display. Data is identified as being one of many defined media types. Each media type consists of a top-level type and a subtype. Top-level media types are:

  • text

  • image

  • audio

  • video

  • application

  • multipart

  • message

The text media type obviously refers to content that is textual in nature. The image, audio, and video types should also be obvious. The application type is used to specify that the data is to be handled in a particular fashion. For example, a PostScript document (destined for printing) has a media type of application/postscript, even though it is technically in a textual format and can be sent as text/plain. Sending it with the application/postscript media type gives the receiving MUA more information about what the attachment is so it knows how to treat it. In this case, an MUA might decide to show the attachment in a PostScript viewer, such as Aladdin Enterprises’ Ghostscript.

Arbitrary binary data can be attached to a message by using the application/octet-stream media type. Indeed, anytime an attachment is made to a message where the MUA does not recognize the type of file (and hence cannot assign it a known media type), a media type of application/octet-stream must be used. Similarly, if an attachment is received by an MUA and the Content-Type value is unrecognized, a media type of application/octet-stream should be assumed.
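As a rough sketch of this fallback rule, the following Python uses the standard mimetypes module to guess a media type from a filename and falls back to application/octet-stream when no guess is available. The function names and the set of handled types are hypothetical, chosen only for this example.

```python
import mimetypes

FALLBACK = "application/octet-stream"


def media_type_for(filename):
    """Sending side: pick a media type, or fall back to octet-stream."""
    guessed, _compression = mimetypes.guess_type(filename)
    return guessed or FALLBACK


def effective_media_type(declared, handled_types):
    """Receiving side: treat an unrecognized type as octet-stream."""
    declared = declared.strip().lower()
    return declared if declared in handled_types else FALLBACK


# Hypothetical set of types that this imaginary MUA knows how to display.
handled = {"text/plain", "text/html", "image/jpeg"}

print(media_type_for("photo.jpg"))
print(media_type_for("dump.zzunknownext"))
print(effective_media_type("application/x-mystery", handled))
```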

The last two top-level media types are known as composite types and are used when further processing by a MIME parser is required. That is, anytime a message contains more than one body part, a composite type will be used as a “wrapper” for them. For example, a message might have a MIME message header specifying a Content-Type of multipart/mixed and two message parts, one containing some text (text/plain) and the other containing an image in JPEG format (image/jpeg).
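The two-part message just described can be built with Python's standard email package. This is a minimal sketch: the subject, filename, and attachment bytes are placeholders, and the bytes are not a real JPEG image.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Some MIME Headers"

# First part: plain text. At this point the message is a single text/plain body.
msg.set_content("Here is the picture I mentioned.\n")

# Second part: adding an attachment converts the message to multipart/mixed.
placeholder_bytes = b"\xff\xd8\xff\xe0 placeholder, not a real image"
msg.add_attachment(placeholder_bytes, maintype="image", subtype="jpeg",
                   filename="photo.jpg")

part_types = [part.get_content_type() for part in msg.iter_parts()]
print(msg.get_content_type())
print(part_types)
```

Printing the message with print(msg.as_string()) shows the generated boundary marker and the MIME part headers in front of each part.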

If a message contains another message, a message top-level media type can be used to ensure that its integrity is kept while parsing. For example, a forwarded message might have a message part (an attachment) of type message/rfc822, meaning that a simple plain-text RFC 822-compliant message is attached to the new message.

Each top-level media type has one or more subtypes defined under it. For example, plain ASCII text is denoted by text/plain and hypertext markup is denoted by text/html. A reasonably complete list of common MIME media types is given in Appendix B, MIME Media Types.

In the previous example, a media type of multipart/mixed shows that there will be more than one piece of content in the message, each of which will be described by a MIME part header.

The Content-Type header may also include parameters, which follow the media type and a semicolon. The example shows a MIME boundary marker as a parameter. Each message must have a MIME boundary marker if the message consists of more than one part in order to separate the parts. Similarly, since parts can be nested as we shall see later, MIME part Content-Type headers may also include MIME boundary markers.

Anytime a multipart media type is used, a boundary parameter must be given. Anytime a text media type is used, a charset parameter should be given; if it is omitted, US-ASCII is assumed.

When parsing parameters, MIME-compliant implementations are required to ignore any that they cannot comprehend.
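To make the parameter rules concrete, here is a small Python sketch that parses the Content-Type header from the earlier example with the standard email package. The x-future-param parameter is invented for this illustration to stand in for a parameter the implementation does not understand; the code simply never asks for it.

```python
from email.parser import HeaderParser

raw_headers = (
    "MIME-Version: 1.0\n"
    "Subject: Some MIME Headers\n"
    "Content-Type: multipart/mixed;\n"
    ' boundary="------------4FEA92EA2642085E2EF0E7AA";\n'
    " x-future-param=whatever\n"
    "\n"
)

# HeaderParser reads only the header block and leaves the body unparsed.
msg = HeaderParser().parsestr(raw_headers)

print(msg.get_content_maintype())   # top-level type
print(msg.get_content_subtype())    # subtype
print(msg.get_boundary())           # boundary parameter, quotes removed
```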

If no Content-Type header is specified, the message is assumed to be in US-ASCII plain text. This allows any traditional simple text messages to be treated as MIME compliant, since they consist of this type of data. Of course, if one wanted to explicitly state the media type of a part as US-ASCII plain text, one could do that:

Content-Type: text/plain; charset=us-ascii

The MIME standard suggests that this media type be assumed when an invalid Content-Type header is discovered. The problem with this suggestion is that a syntactically invalid Content-Type header is likely to be an error in message creation or transmission, and a well-behaved MUA may attempt to recover as best it can. Simply assuming US-ASCII plain text may limit the MUA’s ability to solve a simple problem.
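Python's standard email package happens to follow the standard's suggestion, which makes the behavior easy to observe. The sketch below is illustrative only; the sample messages are invented. An MUA that wanted to attempt recovery instead would have to inspect the raw header value itself, which remains available.

```python
from email.parser import HeaderParser

parser = HeaderParser()

no_header = parser.parsestr("Subject: plain old mail\n\nhello\n")
bad_header = parser.parsestr("Content-Type: this is not a media type\n\nhello\n")
explicit = parser.parsestr("Content-Type: text/plain; charset=us-ascii\n\nhello\n")

# Missing and syntactically invalid headers both fall back to text/plain.
print(no_header.get_content_type())
print(bad_header.get_content_type())

# The raw (invalid) value is still there for an MUA that wants to recover.
print(bad_header["Content-Type"])

# Only the explicit header carries a charset parameter.
print(explicit.get_content_charset())
```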

The Content-Transfer-Encoding header is the single most important piece of MIME information in a message. This header shows the type of encoding performed on a message or message part and therefore gives information on how to decode it.

As with the other MIME headers, if the Content-Transfer-Encoding header is part of a message header block, it applies to the entire message. If it appears at the beginning of a message part, it applies only to that part.

Since the Simple Mail Transfer Protocol limits email messages to US-ASCII 7-bit characters and lines of no more than 998 data characters (see Chapter 9, The Extended Simple Mail Transfer Protocol), all email messages transiting the Internet must adhere to those restrictions. That means that any binary data or textual data in other character sets must be encoded into a limited ASCII representation. The way that this is done is dependent on the data involved. The Content-Transfer-Encoding value shows the recipient how it was done.

Content-Transfer-Encoding values can be any of the following:

  • 7bit

  • 8bit

  • binary

  • quoted-printable

  • base64

  • Custom, or user-defined schemes

All values are case insensitive: BINARY or Binary is the same as binary.

The first three (7bit, 8bit, and binary) indicate that no encoding has been performed on the data. The quoted-printable and base64 values, on the other hand, refer to specific types of encoding that reduce the data into 7-bit form.
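The effect of the two real encodings can be seen with Python's standard base64 and quopri modules. This is only a demonstration on made-up sample data, and the is_seven_bit helper is a name invented for this example.

```python
import base64
import quopri

binary_data = bytes(range(256))                    # every possible octet value
text_data = "Café au lait".encode("iso-8859-1")    # mostly ASCII, one 8-bit octet


def is_seven_bit(data):
    """True when no octet has its high bit set."""
    return all(octet < 128 for octet in data)


b64 = base64.encodebytes(binary_data)   # output is wrapped into short lines
qp = quopri.encodestring(text_data)     # only the non-ASCII octet is escaped

print(is_seven_bit(binary_data), is_seven_bit(b64))
print(is_seven_bit(text_data), is_seven_bit(qp))
print(qp)
```

Quoted-printable leaves mostly-ASCII text readable, while base64 suits arbitrary binary data at the cost of making the content larger and unreadable until decoded.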

Seven-bit data is simple US-ASCII text with the restrictions placed on it that are specified in RFC 821.[8] It is the only type of data that is guaranteed to be “safe” to transport across the Internet mail system. Eight-bit data allows the rest of the extended ASCII character set and is allowed by some mail systems. Binary data can consist of any characters at all.
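The restrictions on 7-bit data can be expressed as a short check. The Python sketch below follows the footnote's rules and assumes a limit of at most 998 data characters per line, not counting the CRLF; the function name is_7bit_clean is invented for this example.

```python
MAX_LINE_LENGTH = 998  # assumed limit on data characters per line, excluding CRLF


def is_7bit_clean(data):
    """Check a byte string against the 7bit restrictions."""
    # No NUL octets and nothing above 127.
    if any(octet == 0 or octet > 127 for octet in data):
        return False
    for line in data.split(b"\r\n"):
        # CR and LF may appear only as part of a CRLF pair.
        if b"\r" in line or b"\n" in line:
            return False
        if len(line) > MAX_LINE_LENGTH:
            return False
    return True


print(is_7bit_clean(b"hello\r\nworld\r\n"))
print(is_7bit_clean(b"bare\nlinefeed"))
print(is_7bit_clean("café\r\n".encode("iso-8859-1")))
```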

MIME translates binary files and encodes them into 7-bit ASCII text.

7bit, 8bit, and binary Content-Transfer-Encoding values indicate that no encoding at all has been performed on the contents, but that the type of data is as indicated. This allows MTAs along the message transit path to encode the data as needed to pass through their parts of the network. Note that only 7-bit data will pass through the Internet’s mail transport system, but that 8-bit and binary data may pass through some attached networks’ mail systems. In general, only 7bit, quoted-printable, and base64 encodings are used for Internet transport.

In practice, many messages are sent to the Internet that contain 8-bit or binary data. It is the responsibility of the first Internet-connected MTA to translate the message to 7-bit, quoted-printable, or base64, as appropriate, so that the message will be assured of transport.

Many MUAs now create Internet-safe encodings by default. The 8-bit and binary encodings are simply not used.

The algorithms by which data is encoded into one form or another are addressed in the section “MIME Encoding,” later in this chapter.

If you wish to use your own encoding scheme for a message part (that is, define your own Content-Transfer-Encoding value), you may do so, but the name of the encoding scheme must begin with “X-” to prevent naming collisions with any future standards. Note that if you do use your own encoding scheme and the message is received by an MUA that doesn’t know about your special encoding, your message will not be properly decoded. While a custom scheme may seem like a good idea at the time, new or custom Content-Transfer-Encoding schemes are strongly discouraged because they undermine interoperability. If you need to use a new encoding scheme for reasons of your own, you should ensure that you understand the severe restrictions placed on its use by RFC 2045 and RFC 2048. For example, a message part that has a Content-Type of multipart simply cannot have any other Content-Transfer-Encoding than 7bit, 8bit, or binary.
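These rules lend themselves to a simple validity check. The following Python sketch covers only the points made in this section (case insensitivity, the “X-” prefix for custom schemes, and the multipart restriction); the function name check_transfer_encoding is hypothetical and the check is not a complete RFC 2045 validator.

```python
STANDARD_ENCODINGS = {"7bit", "8bit", "binary", "quoted-printable", "base64"}
IDENTITY_ENCODINGS = {"7bit", "8bit", "binary"}  # no encoding performed


def check_transfer_encoding(encoding, content_type):
    """Return the normalized encoding name, or raise ValueError."""
    value = encoding.strip().lower()  # values are case insensitive
    if value not in STANDARD_ENCODINGS and not value.startswith("x-"):
        raise ValueError("unknown encoding without X- prefix: %r" % encoding)
    maintype = content_type.split("/", 1)[0].strip().lower()
    if maintype == "multipart" and value not in IDENTITY_ENCODINGS:
        raise ValueError("multipart parts must be 7bit, 8bit, or binary")
    return value


print(check_transfer_encoding("BINARY", "multipart/mixed"))
print(check_transfer_encoding("X-My-Scheme", "application/octet-stream"))
```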

The body of one message can reference the body of another message by use of the Content-ID header. This header uses the same syntax as the Message-ID header in RFC 822 messages. The section “Dynamic Headers,” in Chapter 2, describes the Message-ID header. In general, it provides a world-unique identifier for an email message. Content-ID values must also be world-unique.
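Because the syntax matches that of Message-ID, a generator for one can serve for the other. As a small illustration, Python's email.utils.make_msgid produces values of this form; the domain example.org is a placeholder.

```python
from email.utils import make_msgid

# Each call combines the time, the process ID, and random bits with the domain.
first_id = make_msgid(domain="example.org")
second_id = make_msgid(domain="example.org")

print("Content-ID: " + first_id)
print(first_id != second_id)
```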

The use of the Content-ID header is uncommon and optional but becomes mandatory if a Content-Type of message/external-body is used. The external-body MIME type is used to refer to attachments that are held on a remote server and are not actually attached to the message itself. To date, commercial vendors have not implemented this part of the standard and are unlikely to, now that the World Wide Web exists. It is easier to send a URL than to create messages with external-body references.

The Content-ID header is also used to augment the multipart/alternative media type. For examples of its use in message/external-body and multipart/alternative entities, see the section “MIME Message Creation Gotchas,” in Chapter 4.

Free text descriptions of the contents of a message part may be given by using the Content-Description header. The Content-Description header is always optional. For example, one could add a textual description to a binary file attached to an email message like this:

Content-Description: The attachment below is PGP-encoded and
        will appear as gibberish to those whose MUAs can't directly
        handle PGP.

Descriptions in the US-ASCII character set are assumed by default, but descriptions in other character sets may be given by using a special character encoding mechanism. This mechanism is fully described in RFC 2047. In summary, the content of a header that is not in US-ASCII is encoded by starting with “=?”, ending with “?=”, and having two question marks (?) in between. The original text is encoded according to an encoding method, which puts the text into a US-ASCII representation. The encoding includes information on the character set and encoding method used. All of this is done to avoid breaking existing Internet mail system elements.

If you use an MUA that supports RFC 2047 encoding, it should hide the details of the encoding and decoding process from you. It should take input in a given character set and encode it into the created message. Similarly, on receipt of a message that includes encoded text, the text should be decoded and displayed to the user in the original non-US-ASCII character set.
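The round trip an MUA performs can be sketched with Python's standard email.header module. The sample description and the choice of the ISO-8859-1 character set are assumptions made for this example.

```python
from email.header import Header, decode_header, make_header

description = "Résumé en français"

# Encode: the result starts with "=?", ends with "?=", and names the
# character set and encoding method between the question marks.
encoded = Header(description, charset="iso-8859-1").encode()
print("Content-Description: " + encoded)

# Decode: recover the original text for display to the user.
decoded = str(make_header(decode_header(encoded)))
print(decoded)
```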

The Content-Disposition header is used to provide presentation hints to an MUA and is not an Internet standard. It is experimental and not currently widely implemented.

This header was intended to tell an MUA whether to display an attachment inline (if possible) or as an external attachment, for which additional user interaction is necessary. Some modern MUAs have begun to use this header to denote which attachments should be viewed inline. Consider this clipping from a message forwarded by Netscape Mail:

--------------87B146AF565BEF68E095AC24
Content-Type: message/rfc822
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Return-Path: <[email protected]>
Received: from mailhost.plugged.net.au
        by morris.staff.plugged.com.au (8.8.8/8.8.8) with ESMTP id PAA20154
        for <dwood@morris>; Tue, 7 Apr 1998 15:08:33 +1000
...

This clipping is the top of a MIME message part. The first line is a MIME boundary and is followed by the MIME part headers. We can tell by the Content-Type header that the content of the part is another email message. In other words, one email has been sent as an attachment within another. This is what happens when a MIME-compliant MUA forwards a message to someone else.

Next, the Content-Transfer-Encoding header shows that the attachment as a whole was not encoded. Finally, the Content-Disposition header shows that the attachment is to be displayed inline by the receiving MUA, if that MUA is capable of doing so. This means that the receiver will be able to read the forwarded message directly, instead of being forced to explicitly act on it as an attachment.
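The same walk through the part headers can be done programmatically. The Python sketch below parses a cut-down, invented message of the same shape (the addresses, subject, and boundary are placeholders, not the message shown above) and reads the three part headers discussed here.

```python
from email import message_from_string

raw = """\
MIME-Version: 1.0
Subject: Fwd: hello
Content-Type: multipart/mixed; boundary="BOUNDARY"

--BOUNDARY
Content-Type: text/plain; charset=us-ascii

See the forwarded message.
--BOUNDARY
Content-Type: message/rfc822
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

From: sender@example.org
Subject: hello

Original body text.
--BOUNDARY--
"""

msg = message_from_string(raw)
forwarded = msg.get_payload()[1]   # second part: the attached message
inner = forwarded.get_payload(0)   # the RFC 822 message inside that part

print(forwarded.get_content_type())
print(forwarded["Content-Transfer-Encoding"])
print(forwarded.get_content_disposition())
print(inner["Subject"])
```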

Note that most MUAs will allow their users to set preferences in regard to display of attachments. It is up to the MUA author to decide how literally to implement support for the Content-Disposition header.

For more information about the Content-Disposition header, see RFC 1806.



[8] No octets with ASCII decimal values of zero or more than 127. Carriage return and linefeed characters will occur only as part of a line-ending CRLF sequence.
