CHAPTER 7


The Truth About HTML5’s Other New Elements

We’ve covered structural markup in some depth; now let’s get down to the nitty-gritty. HTML5 redefines several inline and block-level elements and introduces a few more. We’ll run through some of the changes and additions and then consider the broader philosophy behind these elements.

Be Bold or Die Trying

Let’s start our look at these elements with something as seemingly innocuous as the <b>, <i>, <strong>, and <em> tags and what’s changed in HTML5.

Only in web standards land can we turn something as straightforward as bold and italic into a complicated mix of dogmatism, high-level theory, and broken pragmatic reality.

When web standards took off, we all endeavored to separate presentation from content. No longer would font tags and tables clutter our markup. Instead, we’d style our pages with CSS and describe our content in a meaningful (in other words, semantic) way with appropriate tags.

This put the poor old <b> and <i> tags in a tough spot. They were ostensibly presentational (they described how text should look, not what it meant), and we were running away from presentational tags as fast as our fingers would carry us. So, we all embraced the <em> tag (for “emphasis”) instead of <i>, and the <strong> tag (for “strong emphasis”) instead of <b>. These tags described the meaning of the text: it was emphasized, and how “emphasized” text looked (or sounded) was (theoretically) up to the browser or screen reader. We could then use <b> and <i> for purely stylistic reasons and <strong> and <em> for semantic purposes. It was a subtle difference, but a difference nevertheless.

I embraced the change (you may have, too), thinking that it mattered. But it didn’t. Yes, it helped draw a line in the sand between presentational and semantic markup, but this was splitting rather narrow hairs. There was no pragmatic benefit. Here are some examples:

  • We just swapped one for the other: Given <strong> still bolded text and <em> still italicized text, we all just swapped <b> for <strong> and <i> for <em>, and that was that. The difference between what was “emphasized” and what was just bold or italicized styling without any particular emphasis was lost, given we kept using them for presentational purposes anyway. WYSIWYG editors were particularly guilty of this. The difference was just too subtle.
  • Screen readers ignored them altogether: The main benefit of these “semantic” tags (supposedly) was that screen readers could read the text with “emphasis” or “strong emphasis.” In fact, screen readers, by and large, ignore them altogether. (See www.paciellogroup.com/blog/2008/02/screen-readers-lack-emphasis/ for further discussion.)
  • Search engines don’t care: Google treats <strong> and <b>, and <em> and <i>, exactly the same. (See Matt Cutts’ video here: http://www.youtube.com/watch?v=awto_wCeOJ4.)

So, for all the dogmatism about these elements, the reality is pretty simple—use whatever you want. The humans who read it won’t care, and the machines that read it (screen readers and search engines) don’t care either.

But where do these elements fit in HTML5?

I guess if you’re writing a spec, you have to try to make some sense of how these elements are used, with some emphasis (pun intended) on how they should be used. Here’s what the spec says (emphasis added):

<i>: The i element represents a span of text in an alternate voice or mood, or otherwise offset from the normal prose.

<em>: The em element represents stress emphasis of its contents.

<b>: The b element represents a span of text to be stylistically offset from the normal prose without conveying any extra importance.

<strong>: The strong element represents strong importance for its contents.
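To make those definitions concrete, here is a rough sketch of how the spec would have us choose between the four elements. The sentences and the product name are invented for illustration; only the choice of element follows the definitions quoted above.

```html
<!-- <i>: alternate voice or mood, such as a foreign phrase -->
<p>She muttered <i lang="fr">c'est la vie</i> and carried on.</p>

<!-- <em>: stress emphasis that changes how the sentence is read -->
<p>I said I <em>might</em> come, not that I would.</p>

<!-- <b>: stylistically offset text with no extra importance -->
<p>The <b>WonderWidget 3000</b> arrived in a plain box.</p>

<!-- <strong>: strong importance -->
<p><strong>Warning:</strong> do not open the box near water.</p>
```

With default browser styles, the first two come out italic and the last two come out bold, whichever meaning the author had in mind.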

HTML5doctor.com has an entire article on how this might work in theory (see http://html5doctor.com/i-b-em-strong-element/), but it’s really pure fiction. If you think people will actually mark up their documents in this way, I have 15 billion web pages I’d like to show you. And Ian Hickson himself likes to say this (www.webstandards.org/2009/05/13/interview-with-ian-hickson-editor-of-the-html-5-specification/):

[I]f they [browser vendors] don’t implement it, the spec is nothing but a work of fiction. […] I don’t want to be writing fiction.

If the HTML5 spec documented actual behavior (that is, “paving cowpaths”), the spec would just say <b> and <strong> make text bold, <i> and <em> make text italic, and screen readers tend to ignore them altogether. That’s the reality. Everything else is fiction.

This may seem like small fry, but we’ve touched on a bigger philosophical question: how much of marking up a document in HTML is word processor–like formatting, and how much is marking up the meaning of the text? For most web authors—usually our clients using the content management systems we set up for them—it’s about word processor–like formatting, and that’s OK. We’ll return to this shortly.

Wrap Your Anchor Around This, and Other Bits and Pieces

Let’s do a quick roundup of some other features and elements available in HTML5.

Wrap Anchors Around Block-Level Elements

We can now do things like wrap a link around an <h1> heading and paragraph, which could be useful for items such as blog posts. We need to set the wrapping <a> element to display:block; or there could be unexpected behavior. There were issues with this in early versions of Firefox (3.5), but they’ve since been fixed. I still recommend testing thoroughly when wrapping links around block-level elements.
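As a rough sketch, a linked blog-post teaser might look like the following. The class name post-link, the URL, and the text are all made up for illustration.

```html
<style>
  /* Make the wrapping link a block so the whole teaser is clickable */
  a.post-link { display: block; }
</style>

<a class="post-link" href="/blog/my-first-post/">
   <h1>My First Post</h1>
   <p>A short teaser paragraph that is also part of the link.</p>
</a>
```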

<mark>

There’s a new <mark> element we can use to highlight text (with appropriate CSS) instead of, say, <span class="mark">keyword</span>. This could highlight search keywords in search results, for example.
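A minimal sketch of the search-results case, assuming the search term was “anchor” (the color value and the sentence are placeholders):

```html
<style>
  /* Hypothetical highlight style; browsers typically give <mark> a yellow background by default */
  mark { background-color: #ff9; }
</style>

<p>Wrap your <mark>anchor</mark> around block-level elements.</p>
```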

<figure> and <figcaption>

The <figure> and <figcaption> elements let us mark up a photo, chart, table, code snippet, or any other self-contained content that’s referenced from “the main flow of the document,” as the spec says. So, we might have this:

<figure>
   <img src="myphoto.jpg" alt="My photo">
   <figcaption>Yup, this is my photo.</figcaption>
</figure>

(See the spec for more examples: www.whatwg.org/specs/web-apps/current-work/multipage/grouping-content.html#the-figure-element.)

These elements may be mildly helpful for accessibility (that is, screen readers could read out the figure and its caption), but it’s a complex issue. See this extensive write-up by Steve Faulkner for more: www.paciellogroup.com/blog/2011/08/html5-accessibility-chops-the-figure-and-figcaption-elements/.

These elements also suffer from the same IE6–8 no-JavaScript styling problem we discussed earlier.

<time>

The new <time> element was included mostly for microformats (well before Schema.org was born) but should be useful for future micro-semantic initiatives. Beyond that, <time> is deceptively complex. It’s the drama queen of HTML5 elements, and if <time> were a TV show, it would be The Bold and the Beautiful.

In 2011 alone it was killed off by Ian Hickson, then half-revived in the W3C HTML5 spec, and then re-added by Hickson in an improved way to the HTML5-but-we-just-call-it-HTML WHATWG spec. Bruce Lawson blogged about <time>’s removal and reappearance at www.brucelawson.co.uk/2011/goodbye-html5-time-hello-data/ and at www.brucelawson.co.uk/2011/the-return-of-time/.

And it had been the subject of a great deal of debate on the WHATWG mailing list well before all the 2011 drama (Ian Hickson summed up one debate in 2009 here: http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-March/018888.html).

It’s worth pondering how the HTML5 editor can arbitrarily kill off an element on a whim, add a new one (<data>), and then reinvent the previously dead element in the face of a backlash, in what is supposed to be a specification that browser makers can implement.

(If you’re a sucker for punishment or are coming off some kind of Charlie Sheen-esque bender and really need some sleep, there’s also an 8,000+ word WHATWG wiki entry on <time> here: http://wiki.whatwg.org/wiki/Time.)

So, how do we use this new, risen-from-the-grave version of <time>?

In its current incarnation, the <time> element allows a variety of strings such as a year string (2011), a month string (2011-11), a date string (2011-11-12), a time string (14:54) with or without seconds and fractional seconds, combinations of date and time strings (2011-11-12T14:54:39.92922), and more complex strings with timezone offsets (2011-11-12T06:54:39.92922-08:00).

For example, you could use it like this:

<p>The Y2K bug destroyed civilization on <time>2000-01-01</time>.</p>

This is more liberal than the original incarnation of the <time> element, and for a full list of valid strings, see the spec: www.whatwg.org/specs/web-apps/current-work/#the-time-element.
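As a sketch, the main string forms described above could each be marked up as follows. The surrounding sentences are invented, and each <time> element's content is the machine-readable string itself.

```html
<p>Written in <time>2011</time>.</p>                           <!-- year -->
<p>Drafted in <time>2011-11</time>.</p>                        <!-- month -->
<p>Posted on <time>2011-11-12</time>.</p>                      <!-- date -->
<p>Doors open at <time>14:54</time>.</p>                       <!-- time -->
<p>Saved at <time>2011-11-12T14:54:39</time>.</p>              <!-- date and time -->
<p>Logged at <time>2011-11-12T06:54:39-08:00</time>.</p>       <!-- with timezone offset -->
```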

The <time> element also allows a machine-readable date/time value that can be stuck in the datetime attribute, with something more human-friendly in the <time> tags (or indeed, nothing at all), such as this:

<p>The Y2K bug destroyed civilization at the <time datetime="2000-01-01">beginning of this year</time>.</p>

This is handy for micro-semantics, such as Schema.org microdata.
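For instance, a Schema.org Event could carry its start date in a <time> element. This is only a sketch: the event name and date are invented, and it assumes the Event type and its name and startDate properties from Schema.org.

```html
<div itemscope itemtype="http://schema.org/Event">
   <span itemprop="name">HTML5 Meetup</span>
   starts on
   <!-- Microdata takes the property value from the datetime attribute -->
   <time itemprop="startDate" datetime="2011-11-12T14:54">November 12 at 2:54 p.m.</time>.
</div>
```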

You can also add a boolean pubdate attribute to indicate when an <article> (or the overall document if it’s not within an <article>) was published:

<p>This Y2K article was published on <time pubdate datetime="2000-01-01T01:42">Jan 1, 2000</time>.</p>

(Remember, a boolean attribute means “yes” when it’s present, in other words, “this is the publication date,” and “no” when it’s absent, so it doesn’t need a value.)

<details> and <summary>

The new <details> element functions as a show/hide box without having to use JavaScript. It has a boolean attribute (that is, stand-alone, with no value) of open, which tells the browser to display the box as open by default. But if the attribute is absent, it will be collapsed, with the <summary> element describing what appears when collapsed.

Here’s an example:

<details>
   <summary>Show/hide me</summary>
   <p>You can see this when expanded</p>
</details>

This would give the result shown in Figure 7-1.


Figure 7-1. The <details> element closed (above) and open (below)

The spec suggests it could be used in complex forms (and uses OS X’s file info window as an example) where you want to show or hide certain settings or form inputs. Browser vendors are still working out how they should style this by default. Currently, only Chrome, Safari, and Opera support it.
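For the form use case, a sketch might look like this. The field names and labels are made up, and the open attribute is included to show the box expanded by default.

```html
<form>
   <label>Name <input type="text" name="name"></label>

   <!-- open: the box starts expanded; remove it to start collapsed -->
   <details open>
      <summary>Advanced settings</summary>
      <label><input type="checkbox" name="notify"> Email me about replies</label>
   </details>
</form>
```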

This is a strange addition to the spec and is one of the WHATWG’s curious little innovations. Common patterns of JavaScript- or CSS-powered behavior have become quite prevalent in recent years (think tabs, drop-down menus, pop-overs, lightboxes, and so on), and yet there’s no desire to have that functionality replicated in pure HTML. A show/hide triangle control was, however, deemed worthy of being included in the spec. Such are the little mysteries of the WHATWG’s HTML5.

<small>

Some existing HTML4 elements have also been redefined.

For example, the <small> element now means “fine print,” not “visually small.” I find the idea of redefining an element this late in the game weird, but there you go.
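In practice, the “fine print” reading might be marked up along these lines (the offer text is invented for illustration):

```html
<p>Only $9.99 a month! <small>Price excludes tax. Terms and conditions apply.</small></p>
```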

<address>

I didn’t even know there was an <address> element. It’s a block-level element that, in HTML5, is for contact information for a given section (for example, an <article>, perhaps in the <article>’s <footer>) or the document itself. The spec says it’s explicitly not for arbitrary postal addresses, which should just be in <p> tags. If someone from the WHATWG finds out you’ve used it for an arbitrary postal address, expect to have a finger shaken very firmly in your direction.
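Here is a rough sketch of the distinction, with an invented author, email address, and street address:

```html
<article>
   <h1>Be Bold or Die Trying</h1>
   <p>Article text goes here.</p>
   <footer>
      <!-- Contact information for this article's author -->
      <address>
         Written by <a href="mailto:jane@example.com">Jane Author</a>.
      </address>
   </footer>
</article>

<!-- An arbitrary postal address goes in a plain paragraph instead -->
<p>123 Example Street<br>Springfield</p>
```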

<cite>

In HTML5, <cite> has been redefined to exclude the previously acceptable use of citing people’s names. It’s now only for works. This really annoyed Jeremy Keith, who wrote about it on 24 Ways (see http://24ways.org/2009/incite-a-riot). Again, it’s weird the HTML editor can just redefine elements on a whim. It raises the question of whether we should bother with these elements for “inline semantics” at all, which brings us to…

Should We Even Use These Obscure Little Tags?

If and when new functionality arrives in browsers or other agents (and not just the bowels of the HTML5 specification), sure, some of these elements may prove handy from time to time.

But let’s step back and consider the bigger picture of the purely “semantic” text-level elements. We’ve already touched on the question of simple word processor–style formatting versus marking up meaning when discussing <b> versus <strong> and <i> versus <em>. Now let’s consider the <address> element, for example. In November 2009, Jack Osborne wrote the following on HTML5doctor.com (http://html5doctor.com/the-address-element/):

The address element has been around since the HTML3 spec was drafted in 1995, and it continues to survive in the latest drafts of HTML5. But nearly fifteen years after its creation, it's still causing confusion among developers. So how should we be using address in our documents?

Perhaps, after 15 years, it’s time for a rethink. What’s our aim here? Are we going to give it another 15 years? After 30 years, will the Web finally be using <address> correctly? And if it is, so what?

Fifteen years ago we may have assumed that “one day” someone will do something useful with our carefully marked-up pages. We now know better. It’s time to reevaluate. We’ve spent 15 years experimenting with HTML to see what works in terms of semantics and functionality. It’s time to take stock of the results.

If HTML5 were truly paving cowpaths here, it would open up the definition (instead of tightening it) for elements such as <address> and <cite> or, better still, make them obsolete altogether. We don’t need them. They don’t do anything. Micro-semantics on top of HTML make them obsolete. The search engines have demonstrated through Schema.org (and earlier initiatives such as Rich Snippets) that they want micro-semantics, not redefined HTML elements. Authors have little use for them. So, why keep them?

This is the truth we need to acknowledge when it comes to these finer aspects of markup. HTML for documents has proven to be pretty lousy for anything but basic semantics that are explicitly tied to formatting (header, paragraph, list, link, and so on) and providing generic page structure (using <div>s, now with some ARIA roles sprinkled liberally), but that’s its beauty; it’s what makes it so universal and accessible.

Digging into the details of HTML5’s markup reveals yet another mixed bag, containing some interesting inclusions, some baffling ones, a lot of squabbling over some incredibly minor issues, and a lack of a coherent vision to really take markup, and the Web, forward.

Then again, criticism of HTML in this regard is hardly new. Here’s Clay Shirky in his piece “In Praise of Evolvable Systems” (www.shirky.com/writings/evolve.html) from—wait for it—1996:

HTTP and HTML are the Whoopee Cushion and Joy Buzzer of Internet protocols, only comprehensible as elaborate practical jokes. For anyone who has tried to accomplish anything serious on the Web, it's pretty obvious that of the various implementations of a worldwide hypertext protocol, we have the worst one possible.

Except, of course, for all the others.

And it was ever thus.
