Chapter 16. Essential XHTML

Probably the biggest XML application today is XHTML, which is the W3C's reformulation of HTML 4 as an XML application. XHTML is true XML, which means that XHTML documents are XML documents that can be checked for well-formedness and validity.

There are two big advantages to using XHTML. First, HTML predefines all its elements and attributes, and that's not something you can change—unless you use XHTML. Because XHTML is really XML, you can extend it with your own elements, and we'll see how to do that in the next chapter. Need <INVOICE>, <DELIVERY_DATE>, and <PRODUCT_ID> elements in your Web page? Now you can add them. (This aspect of XHTML isn't supported by the major browsers yet, but it's coming.) The other big advantage, as far as HTML authors are concerned, is that you can display XHTML documents in today's browsers without modification. That's the whole idea behind XHTML—it's supposed to provide a bridge between XML and HTML. XHTML is true XML, but you can use it today in browsers. And that has made it very popular.

Here's an example; this page is written in standard HTML:

<HTML>
    <HEAD>
        <TITLE>
            Welcome to my page
        </TITLE>
    </HEAD>
    <BODY>
        <H1>
            Welcome to HTML!
        </H1>
    </BODY>
</HTML>

Here's the same page, written in XHTML, with the message changed from Welcome to HTML! to Welcome to XHTML!:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Welcome to my page
        </title>
    </head>

    <body>
        <h1>
            Welcome to XHTML!
        </h1>
    </body>
</html>

I'll go through exactly what's happening here in this chapter.

Save XHTML documents with the extension .html so that browsers treat them as HTML. This document displays the same way as the previous HTML document, except that the heading now reads Welcome to XHTML!, as you can see in Figure 16.1.

Figure 16.1. An XHTML document in Netscape.


Take a look at this XHTML document; as you can see, it's true XML, starting with the XML declaration:

<?xml version="1.0"?>
    .
    .
    .

Next comes a <!DOCTYPE> declaration:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    .
    .
    .

This is just a standard <!DOCTYPE> declaration; in this case, it indicates that the document element is <html>. Note the lowercase here: <html>, not <HTML>. XML is case-sensitive, and all element and attribute names in XHTML are lowercase. (The <!DOCTYPE> declaration is not an element, so its keyword stays uppercase.) That's the XHTML standard; if you're accustomed to using uppercase tag names, it'll take a little getting used to.

The DTDs that XHTML uses are public DTDs, created by W3C. Here, the formal public identifier (FPI) for the DTD that I'm using is "-//W3C//DTD XHTML 1.0 Transitional//EN", which is one of several DTDs available, as we'll see. I'm also giving the URI for the DTD, which for this DTD is "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd".
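To see the two identifiers as an XML parser sees them, here is a minimal Python sketch that reads the FPI and the DTD's URI out of the <!DOCTYPE> declaration. It uses the standard library's expat bindings; the names XHTML_DOC and read_doctype are my own, chosen for illustration, and the page content is condensed from the example above. The parser only reports the identifiers; it does not download the DTD.

```python
import xml.parsers.expat

# The example page from this chapter, condensed (illustrative constant name).
XHTML_DOC = """<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head><title>Welcome to my page</title></head>
    <body><h1>Welcome to XHTML!</h1></body>
</html>
"""


def read_doctype(document):
    """Return (root_name, public_id, system_id), or None if there is no DOCTYPE."""
    found = {}

    # expat calls this handler once, when it reaches the DOCTYPE declaration.
    def on_doctype(name, system_id, public_id, has_internal_subset):
        found["doctype"] = (name, public_id, system_id)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(document, True)  # True marks this as the final chunk of input
    return found.get("doctype")


root_name, public_id, system_id = read_doctype(XHTML_DOC)
print(root_name)   # document element named in the DOCTYPE
print(public_id)   # the formal public identifier (FPI)
print(system_id)   # the URI of the DTD
```

For the page above, this should report html as the document element, followed by the Transitional FPI and the W3C URI quoted in the text.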

Using an XHTML DTD, browsers can validate XHTML documents, at least theoretically. In fact, browsers such as Internet Explorer will read in the DTD and check the document against it, although, as we've seen, you must explicitly check whether errors occurred because the browser won't announce them.

Note also that the URI for the DTD is at W3C itself: "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd". Now imagine 40 million browsers trying to validate XHTML documents all at the same time by downloading XHTML DTDs like this one from the W3C site—quite a problem. To avoid bottlenecks like this, you can copy the XHTML DTDs and store them locally (I'll give their URIs and discuss this in a few pages), or do without a DTD in your documents. However, my guess is that when we get fully enabled validating XHTML browsers, they'll have the various XHTML DTDs stored internally for immediate access, without having to download the XHTML DTDs from the Internet. (As it stands now, it takes Internet Explorer 10 to 20 seconds to download a typical XHTML DTD on a typical modem line.)

After the <!DOCTYPE> declaration comes the <html> element, which is the document element. It starts the actual document content:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    .
    .
    .

I'm using three attributes of the <html> element here, as is usual:

  • xmlns defines an XML namespace for the document.

  • xml:lang sets the language for the document when it's interpreted as XML.

  • The standard HTML attribute lang sets the language when the document is treated as HTML.

Note in particular the namespace used for XHTML: http://www.w3.org/1999/xhtml, which is the official XHTML namespace. All the XHTML elements must be in this namespace.
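To make the effect of these attributes concrete, here is a small Python sketch that parses the page with the standard library's ElementTree and looks at how the namespace shows up. The constant names are mine, and the page is condensed from the example above. ElementTree writes a namespaced name as {namespace-URI}local-name, which makes it easy to see that every element, not just <html>, ends up in the XHTML namespace.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"          # the official XHTML namespace
XML_NS = "http://www.w3.org/XML/1998/namespace"    # namespace bound to the xml: prefix

# The example page from this chapter, condensed (illustrative constant name).
XHTML_DOC = """<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head><title>Welcome to my page</title></head>
    <body><h1>Welcome to XHTML!</h1></body>
</html>
"""

root = ET.fromstring(XHTML_DOC)

# The default namespace declared by xmlns applies to <html> and its descendants.
print(root.tag)

# xml:lang and lang are two distinct attributes: one namespaced, one plain.
print(root.get("{%s}lang" % XML_NS))
print(root.get("lang"))

# Lookups must use the namespace; a bare "body" does not match.
heading = root.find("{%s}body/{%s}h1" % (XHTML_NS, XHTML_NS))
print(heading.text)
print(root.find("body"))
```

The last line prints None, because an element named body with no namespace is not the same element as body in the XHTML namespace.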

The remainder of the page is very much like the HTML document we saw earlier; the only real difference is that the tag names are now lowercase:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Welcome to my page
        </title>
    </head>

    <body>
        <h1>
            Welcome to XHTML!
        </h1>
    </body>
</html>
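Because this page is true XML, any XML parser can check it for well-formedness. The Python sketch below is one way to do that with the standard library; is_well_formed and the sample strings are hypothetical names and data of my own. Note that it tests well-formedness only. It does not validate the page against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


# The example page from this chapter, condensed.
XHTML_PAGE = """<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head><title>Welcome to my page</title></head>
    <body><h1>Welcome to XHTML!</h1></body>
</html>
"""

# XML is case-sensitive, so <BODY> is not closed by </body>.
MIXED_CASE = "<html><BODY><h1>Welcome to XHTML!</h1></body></html>"

# Every element must be closed; here <title> never is.
UNCLOSED = "<html><head><title>Welcome to my page</head></html>"

print(is_well_formed(XHTML_PAGE))  # True
print(is_well_formed(MIXED_CASE))  # False
print(is_well_formed(UNCLOSED))    # False
```

The two failing samples show habits that HTML browsers tolerate but XML does not: mixing the case of tag names and leaving elements unclosed.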
			
