Chapter 2. Document Structure

Users often encounter compatibility problems when working with electronic documents created on different computers because the format is tied to the software and hardware used to create the document. They may not have the software required to open and work with the document. If they are able to open the file, they may find the document illegible because they don’t have the necessary fonts. The Web came into being in part to address the challenges of sharing electronic documents. HTML is a “device-independent” document format that can be read by any number of devices.

Perhaps the most fundamental attribute of a device-independent format is the ability to separate content and presentation. When content is handled separately from presentation, documents can be accessed without requiring specific fonts, operating systems, software, and display formats. Content can be read using different devices—graphical browsers, text-only browsers, speech synthesizers, printed output, PDAs, Web-enabled appliances, and so on. The actual rendering of the pages is left up to the client software.

Historically, designers avoided HTML structural elements because the way browsers displayed them did not match our design aesthetic. Instead, we used presentational formatting to gain control over visual design and layout. We used FONT and B tags instead of H1 through H6 tags to control the size and style of headings; we used line breaks instead of P tags to control the margin between paragraphs; we used tables to position elements on the page; we used images for page elements. This design approach became common practice, and structural markup fell by the wayside. As a result, today’s Web is a vast system for exchanging static electronic documents, and not the smart, accessible Web envisioned by its creators.

Cascading Style Sheets, or CSS, has changed all of that. CSS provides the means for designers to control page display while maintaining the separation of content and presentation. Because CSS is now well supported among modern browsers, we can abandon old practices and begin building the Web as it was intended. Presentation markup is unnecessary, as are the workarounds we once relied on to gain control over visual design. Instead, we can build Web pages using structural HTML elements, and use CSS to design their display. We can separate content and presentation by managing content in structured HTML documents and presentation in style sheets.

When we manage content and presentation separately, our design decisions need not get in the way of access. Some users require access only to content. Nonvisual users, for example, do not gain from accessing presentation information. Other users need access to content with a customized display. When our pages are free of presentation markup, users can access the content—whether rendered with our styles, user-defined styles, software-defined styles, or no styles—depending on the requirements of the user.

2.1. Basic Principles

2.1.1. Separate content and presentation

When information is displayed on a printed page, its content and presentation are inextricably bound. The expression of the information is tied to its visual design, and the reader must be able to access and interpret the information as presented. When access is subject to requirements such as 20/20 vision, the information is bound to be inaccessible to some readers.

The Web is a medium designed to remove access requirements and to make content accessible to all. Universal access to Web content is achieved in a number of ways, one of which is device independence. With device independence, access to Web content is not bound to a certain operating system, software, or even a computer; indeed, some kitchen appliances read Web documents. Document display does not require a certain typeface, size, margins, or colors. In fact, Web content can be displayed in any size and color—the choice of display is in the hands of the user.

The key to device independence and universal access is in the separation of content and presentation. Divorcing content from presentation is a foreign concept for many of us, however. When designing a document, we are accustomed to making choices about the visual properties of elements such as headings, paragraphs, and lists. We use typography to differentiate these elements: We make headings big and bold, paragraphs indented, and lists indented and bulleted.

Web documents can be built using the same design approach—B for bold, I for italic, and so on. The notion of one static visual design, however, is contrary to the nature of the medium. Web documents are meant to adapt, providing not one view but many. They are also machine readable, and machines can’t make much sense of visual markup. What does bold mean? What does an indent signify? Many Web functions rely on access to document structure in order to take meaningful actions with Web documents.

Rather than building documents using visual design practices, we need to build structure into our Web documents. Headings must be encoded as headings, paragraphs as paragraphs, lists as lists. Creating an HTML document is not a visual process but an intellectual one, in which each element is identified and assigned the appropriate semantic meaning. The resulting document is rich with meaning beyond what is displayed on the screen, and can be rendered by any Web-enabled device in the appropriate format (Figure 2.1).
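
As a rough illustration of the difference, the fragment below marks up the same content twice: first with visual markup, then with structural markup. The text content is invented for the example.

```html
<!-- Visual markup: looks like a heading, a paragraph, and a list,
     but software sees only styled text and line breaks -->
<font size="5"><b>Planting Guide</b></font><br /><br />
Most herbs need full sun and good drainage.<br /><br />
&#8226; Basil<br />
&#8226; Thyme<br />

<!-- Structural markup: each element is identified for what it is -->
<h1>Planting Guide</h1>
<p>Most herbs need full sun and good drainage.</p>
<ul>
  <li>Basil</li>
  <li>Thyme</li>
</ul>
```

Both versions can be made to look alike on screen, but only the second tells software which text is the heading and which items belong to the list.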

FIGURE 2.1 Wikipedia uses structural markup to identify headings, lists, and paragraphs, and CSS to define their visual properties.

Structured documents can be styled in several ways. Software that renders Web pages applies styles to HTML elements. For example, the STRONG tag produces bold text in most browsers, and EM for emphasis produces italicized text. The visual formatting of heading levels—large to small, bold and/or italic—is assigned using styles. Web designers could simply develop structured documents and leave the rendering to the client software (Figure 2.2).
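
Browser default formatting can be thought of as a built-in style sheet. The rules below are an illustrative approximation of such defaults, assumed for the sake of the example; they are not the actual definitions used by Safari or any other browser.

```html
<style type="text/css">
  /* Approximate browser defaults; real values vary by browser */
  h1     { font-size: 2em; font-weight: bold; }
  h2     { font-size: 1.5em; font-weight: bold; }
  strong { font-weight: bold; }
  em     { font-style: italic; }
  ul     { list-style-type: disc; padding-left: 40px; }
</style>
```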

FIGURE 2.2 When structured documents are displayed without styles, the client software determines the visual formatting. Here, the structured Wikipedia page is displayed without styles in Safari. Select Safari style definitions are shown in the inset.

Most designers prefer to have a hand in the visual design of their pages, however, and CSS provides the means to control page display while maintaining the separation of content and presentation. With CSS, we can control the interpretation of the tags we use to create a structured document. We can assign font size and weight to headings, indents to paragraphs, custom bullets to lists, and more. CSS has far more formatting options than standard HTML presentation markup. And since content and presentation are separate, one document can have many different designs simply by applying a different style sheet (Figure 2.3).
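
A minimal sketch of this kind of control might look like the following. The specific values, the sample text, and the file name bullet.gif are assumptions made for illustration.

```html
<style type="text/css">
  /* Font size and weight for headings */
  h1 { font-size: 1.6em; font-weight: normal; }

  /* Indented paragraphs with no blank line between them */
  p  { text-indent: 1.5em; margin: 0; }

  /* Custom bullets; square is the fallback if the image is unavailable */
  ul { list-style-type: square; list-style-image: url("bullet.gif"); }
</style>

<h1>Planting Guide</h1>
<p>Most herbs need full sun and good drainage.</p>
<ul>
  <li>Basil</li>
  <li>Thyme</li>
</ul>
```

The markup itself carries no formatting, so swapping in a different set of rules changes the design without touching the content.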

FIGURE 2.3 When content and presentation are separate, alternate formats are easy to provide by applying a different style sheet. Through linked style sheets, Boxes and Arrows allows users to choose between regular (1), large-font (2), and print (3) versions.

In a nutshell. Content that is encoded without display requirements can be accessed by any software or device. Use HTML documents for content, and CSS for presentation.

2.1.2. Mark up document structure

A markup language like HTML wraps standard document elements in identifying “tags” that describe the meaning of each element, such as titles, headings, paragraphs, lists, tables, links, addresses, citations, and quotes. The resulting document is “machine-friendly”—it can be read and interpreted by software. When a document has structural markup, software can make use of it by displaying the document title in the browser title bar, by providing a list of links or headings, and so on. Screen reader software can use structure to modulate tone by reading headings more slowly than main text, or by reading links using a different voice. Search engines can index a structured document more accurately than a plain text document because phrases marked as headings help the software determine the document’s subject and primary focus.

Many of today’s Web documents do not contain structural markup, or they make use of only the most basic tags: TITLE, BODY, maybe a P or two. Many documents do contain structural markup, but for visual purposes, such as BLOCKQUOTE for margins and TABLE for page layout. Most other markup is presentation markup: tags that describe the visual attributes of page elements. These include such tags as BR for line breaks, FONT for setting type size and typeface, B for bold, and I for italic.

On the surface, a nonstructured document may look no different than a structured one. Whether a designer marks paragraphs with a P or two BRs (line breaks) is not visually apparent. However, the logical structure underlying a well-structured document adds a layer of meaning that gives power and utility to the Web. Software can read text documents; with structured text documents, software can both read and derive meaning. A truly interconnected Web requires documents that can be cataloged and connected by software. To do this well, software needs structure.

Take the title of this book. <i>Access by Design</i> is visually identifiable as a book title because it is italicized and uses title case—two conventions that denote book titles. However, software cannot recognize the phrase as a title because I means italics—nothing more. On the other hand, <cite>Access by Design</cite> is universally identifiable as a book title because the HTML tag CITE is used to denote citations. When a book title is tagged for structure, software can do useful things, such as scan the Web for all instances where the book is cited in other Web documents. In this case, instances marked with I would fall through the cracks.

To build structured documents, encode content using structural markup. Identify page sections—header, navigation, content, footer—and the elements contained within the sections—headings, paragraphs, lists, and tables. Instead of thinking about what each element should look like, think about what each element is—and tag it using the appropriate HTML structural tag (Figure 2.4). Avoid meaningless tags, such as FONT, BR, B, and I, and do not misuse structural tags for presentation purposes, such as tagging paragraphs with the BLOCKQUOTE tag to create margins. When it comes time to think about visual design, turn to CSS to define the appearance of structural elements.
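
The fragment below sketches how several structural tags might be combined on one page. The headings, link targets, and maintainer line are hypothetical; the quoted sentence is taken from this chapter.

```html
<h1>Reading Notes</h1>

<!-- CITE identifies the book title as a citation -->
<p>In <cite>Access by Design</cite>, the author argues for structural markup.</p>

<!-- BLOCKQUOTE is used for an actual quotation, not for margins -->
<blockquote>
  <p>Use HTML documents for content, and CSS for presentation.</p>
</blockquote>

<h2>Topics covered</h2>
<ul>
  <li><a href="structure.html">Document structure</a></li>
  <li><a href="styles.html">Style sheets</a></li>
</ul>

<address>Maintained by the site editor</address>
```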

FIGURE 2.4 Table of common structural tags.

In a nutshell. Semantic markup produces content that can be read and interpreted by software. When encoding content, tag the meaning of document elements using structural HTML.

2.1.3. Use style sheets for presentation

Before CSS, designers avoided structural tags because the browser, not the designer, determined their visual appearance. Browsers did not always make the most elegant formatting decisions: headings were huge, lists were bulleted and indented, and paragraphs were separated by a blank line, to name a few. To get around these defaults and to gain control over visual formatting, we constructed pages using nonstructural tags, such as FONT and B, tables for layout, and images for page elements.

These methods served the visual aspect of the Web, but at a cost. Web pages coded with presentation markup are heavy with unnecessary code. Complex layout tables are easily broken, and changes to page elements designed as images require significant effort. As a result, we tend not to make changes to sites designed this way because, like a house of cards, one small change could bring the entire design crashing down.

Now that CSS is well supported in browsers, there is no need to resort to “old-school” design methods. We can use styles to override browser formatting and to apply visual formatting to structural elements. CSS offers more control, in fact, than presentation markup—for example, typographic control over leading and tracking is available using CSS.
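
As one possible sketch, leading corresponds to the CSS line-height property and tracking to letter-spacing. The values below are arbitrary choices for illustration.

```html
<style type="text/css">
  /* Override the browser's default heading spacing */
  h1, h2 { margin: 1em 0 0.25em 0; }

  /* Tracking: extra space between letters in second-level headings */
  h2 { font-size: 1.2em; letter-spacing: 0.1em; text-transform: uppercase; }

  /* Leading: line height of one and a half times the font size */
  p  { line-height: 1.5; margin: 0 0 0.75em 0; }
</style>
```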

Style sheets offer control over nonvisual attributes as well. With aural styles, we can design the audible experience of our Web pages. For example, we can control the voice type and inflection used for reading different elements—perhaps using a female voice for links and a male voice for all other content. Another useful aural style is the “speak” style, which can be used to tell screen reader software whether to speak (XEROX) or spell out (WWW) abbreviations and acronyms.
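
The aural properties defined in CSS 2 could be used along these lines. This is a sketch only: the choice of elements is an assumption, and support for aural properties differs widely among screen readers.

```html
<style type="text/css" media="aural">
  /* Generic voices for general content and for links */
  body { voice-family: male; }
  a    { voice-family: female; }

  /* Spell out abbreviations letter by letter */
  abbr { speak: spell-out; }
</style>

<p>Search the <abbr title="World Wide Web">WWW</abbr> for
   <a href="results.html">more examples</a>.</p>
```

Text that is not given the spell-out value keeps the default value, normal, and is read as words.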

Other benefits of CSS are consistency and ease of use. When content is marked up structurally, one CSS document can control the design of all linked documents. With one master style sheet, all headings share the same visual properties, as do all paragraphs, all lists, and so on. Changing visual attributes is easy: one small change to the style sheet is all that’s needed to change the page background color or to use a different typeface. Major redesigns can be accomplished without ever touching the content pages (Figure 2.5).

FIGURE 2.5 CSS Zen Garden is an illustration of the range of designs that can be accomplished using Cascading Style Sheets (CSS). Different style sheets are applied to the same HTML document to produce these (and many more) distinctive pages.

When content and presentation are separate, custom views are easy to provide by applying different style sheets. Designers can offer different versions—for example, a large-type version, a high-contrast version, a printing version, and different versions for various devices. Users with specific needs can then apply their own custom style sheet that meets their access requirements.
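
One way to offer such versions is with alternate style sheets, declared in the document head. The file names and titles here are hypothetical.

```html
<!-- Preferred style sheet, applied by default -->
<link rel="stylesheet" type="text/css"
      href="default.css" title="Default" />

<!-- Alternates, applied only when the user selects them -->
<link rel="alternate stylesheet" type="text/css"
      href="large.css" title="Large type" />
<link rel="alternate stylesheet" type="text/css"
      href="contrast.css" title="High contrast" />
```

Some browsers list the titled style sheets in a menu; elsewhere, a site may need to supply its own control for switching between them.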

In a nutshell. Style sheets provide control and flexibility for designers and users. Use style sheets to control the presentation of Web pages.

2.1.4. Design pages that function without style sheets

Under some circumstances, styles may not be part of a user’s experience of Web pages. Some users use browsers that do not support styles. Some users turn off style sheets or apply a custom style sheet. Nonvisual users do not access visual styles. These users may encounter difficulties if designers rely on styles, for example, to divide functional areas of a page or to group related elements. To support users who do not access styles, structured documents must be functional, comprehensible, and usable without the formatting supplied via style sheets.

For pages to function without style sheet formatting, content in the HTML document must be logical when read or viewed in sequence. The sequence of page elements must follow a logical order: for example, header, navigation, content, and then footer. Moreover, the content belonging to each section must be contained within the corresponding element.

In general, pages that are structured using basic HTML tags will work best when users access them without styles, or with user-defined styles. Pages that are designed using nonstructural elements do not adapt as well as pages designed using structural elements, such as paragraphs and lists. Browser or user-defined styles and screen readers cannot account for custom elements—<div id="banner">, <div id="footer">—and will not have the means to differentiate these elements. However, if elements are designed using structural markup—<p id="banner">, <ul id="footer">—software can differentiate these elements by accessing the structural tags.
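
A page body arranged along these lines might look like the sketch below, which reads sensibly from top to bottom with no styles applied. The id values follow the examples above; the text and link targets are invented.

```html
<body>
  <!-- Header -->
  <p id="banner">Example Site</p>

  <!-- Navigation, marked up as a list of links -->
  <ul id="navigation">
    <li><a href="index.html">Home</a></li>
    <li><a href="about.html">About</a></li>
  </ul>

  <!-- Content -->
  <h1>Page Heading</h1>
  <p>Main content comes after the navigation and before the footer.</p>

  <!-- Footer -->
  <ul id="footer">
    <li><a href="contact.html">Contact</a></li>
    <li><a href="copyright.html">Copyright</a></li>
  </ul>
</body>
```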

In a nutshell. Some users do not access styles. Design pages that are comprehensible and usable without style sheet formatting.

2.2. Markup

2.2.1. Write valid code

Building to standards means following a standard set of specifications that defines the syntax or rules for the structure of HTML or other code. One of the benefits of building to standards is that results can be measured against specifications to ensure that a project meets code. Web pages are built primarily using the standards for markup (HTML and XHTML) and presentation (CSS) developed by the World Wide Web Consortium (W3C). When Web pages are built to standards, they are more likely to function properly with browsers that also conform to specifications.

To ensure standards compliance, we need to begin each page with a DOCTYPE, or document type declaration. DOCTYPE tells software which set of specifications to use in handling pages:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

In the above example, the document type is identified (XHTML 1.0 Transitional) along with the URL for the DTD, or document type definition. For page access, DOCTYPE tells browsers which set of standards to follow, increasing the likelihood that pages will display properly across software and devices. For page validation, DOCTYPE tells validation software which set of specifications to measure against in assessing standards compliance.
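
For context, a minimal page built on this DOCTYPE could be sketched as follows. The title, heading, and paragraph text are placeholders.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <title>Example page</title>
</head>
<body>
  <h1>Example page</h1>
  <p>Every element is closed and every tag is lowercase, as XHTML requires.</p>
</body>
</html>
```

A page like this can be submitted to a validator, which reads the DOCTYPE to decide which rules to check the markup against.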

In a nutshell. Valid code is essential to solid document structure. Identify document type and validate pages to ensure quality and compatibility.

2.2.2. Use linked style sheets

Style can be applied to a Web page either by including style information in the HTML document or by linking to one or more external style sheets from the HTML document. Linked styles have several advantages over embedded styles.

Linked styles save bandwidth. When style information is embedded within an HTML page, the style information must be downloaded with each page. On the other hand, linked styles are downloaded once and cached by the browser, then applied to each page in a site that references the style sheet file.
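
The two approaches can be compared side by side. The fragment shows two alternative versions of the same document head; the file name site.css and the style rule are assumptions for the example.

```html
<!-- Version 1, embedded: these rules travel with every page that contains them -->
<head>
  <title>Page one</title>
  <style type="text/css">
    body { background-color: #fff; font-family: Georgia, serif; }
  </style>
</head>

<!-- Version 2, linked: the same rules live in one file that the browser can cache -->
<head>
  <title>Page one</title>
  <link rel="stylesheet" type="text/css" href="site.css" />
</head>
```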

Linked styles promote consistency and reduce site maintenance workload. When all presentation information is referenced within one file, all pages share the same design. A site-wide design change, such as a different page background or font replacement, requires one change in one document, rather than potentially thousands of individual pages. This ease of maintenance clearly benefits the site developer, but users also stand to gain. Design consistency makes sites more usable and accessible because users only have to learn the user interface once to use the site. Inconsistent designs require the user to relearn the interface at each page.

Linked styles enable alternate views. We can use linked styles to provide alternate styles for different access methods: for example, the “print” media type for printed pages, or the “handheld” media type for use on small devices, such as PDAs. We can also use alternate styles to provide options for viewing our pages, such as a large-text or high-contrast view (Figure 2.6).
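
Media-specific style sheets are typically selected with the media attribute on the link element. A sketch, with hypothetical file names:

```html
<!-- Applied on computer screens -->
<link rel="stylesheet" type="text/css" href="screen.css" media="screen" />

<!-- Applied when the page is printed -->
<link rel="stylesheet" type="text/css" href="print.css" media="print" />

<!-- Intended for small handheld devices -->
<link rel="stylesheet" type="text/css" href="handheld.css" media="handheld" />
```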

FIGURE 2.6 Using linked style sheets, BBC gives users the option (1) of viewing a low-graphics version (2) of its news pages.

In a nutshell. Linked style sheets promote design consistency and produce faster downloads. Include style information in a linked style sheet rather than on each Web page.
