CHAPTER 5: The Truth About HTML5 Micro-semantics and Schema.org

A common claim made about the new HTML5 structural elements is that they are “more semantic.”

In my view, the new elements are “more semantic” in the same way fruit-flavored candy bars are “more nutritious”—not at all.

Nevertheless, the question of semantics in HTML5 gives us an excellent excuse to take a quick trip through the big picture of “semantic” markup. We’ll look at where semantic markup came from and what semantic markup promised to deliver but never quite did, and we’ll finish with a quick look at something you can use right now—new schemas put forward by the major search engine companies (Google, Microsoft, and Yahoo) that will ideally improve the display of your search results.

By the end of this chapter, your markup nerd-dar will be so finely tuned you’ll be able to separate the markup poseurs using semantic as a mere buzzword from the hard-core markup wonks who are still waiting for the Semantic Web to arrive, any day now...

Semantics in a Nutshell

When it comes to the Web, there are actually two kinds of “semantics”: the nitty-gritty markup of a given web page and the so-called Semantic Web. Let’s start with the semantic markup we practice every day as web designers.

“Semantic markup” was one of the cornerstones of the web standards movement. In 2003 Jeffrey Zeldman, perhaps the best-known advocate for semantic markup and web standards, wrote this on his blog (www.zeldman.com/daily/0303a.shtml):

CSS combined with lean semantic markup makes sites faster, more portable, and more accessible. The combination helps sites work in more existing environments and is the best hope of preparing them for environments that have not yet been developed.

This was a major change in both theory and practice for web designers. We’d keep all the styling information about a page in a separate CSS file and describe the content with “lean, semantic markup,” as Zeldman put it.

Here’s a (slightly reworked) example of semantic markup Zeldman used in a 2002 Digital Web article (www.digital-web.com/articles/999_of_websites_are_obsolete/). First, Zeldman borrowed some “unsemantic” markup from an e-commerce site to show what we were moving away from (try not to shudder when you read it):

<td width="100%"><font face="verdana,helvetica,arial" size="+1" color="#CCCC66"><span class="header"><b>Join now! </b></span></font></td>

And then, with CSS handling the styling, the markup simply became this:

<h2>Join now!</h2>

And lo and behold, there it was: lean, semantic markup that we pretty much take for granted now. It was a big, and extremely worthwhile, shift in practice.

But what makes this example “semantic” and not the first one? Semantic is just a fancy way of saying “meaningful,” and because the text is wrapped in a heading tag (<h2>), it now means something to browsers (and screen readers): “This is a heading.” Screen readers can (and do) use these headings to navigate around a document, and browsers can give these elements default styling (for example, rendering the heading as a block-level element).

It also makes it easy for us humans to read. When we scan the markup, there’s no doubt about what this text is—it’s a heading. Simple, right?

This highlights the two key groups that matter in “semantic markup”: humans and machines (browsers, screen readers, search engines, and so on). It should be both “human readable” and “machine readable”—Semantic Markup 101.

“Machine readable” semantic markup has other benefits. Search engines can scan, index, and search our content in a way that’s much harder (if not impossible) with Flash sites or web sites consisting purely of images (as print designers are occasionally wont to churn out).

That said, Google doesn’t care much about what markup you use.

These Problems Have Been Solved

Here’s the thing: the solution to these problems has been around for more than a decade, no matter what flavor of HTML you are using. Search engines can index our content, screen readers can understand it, and our lean, semantic markup makes it easy to read and maintain.

Then the pedants took over.

People in web design circles began to think, “Well, if semantic is good, then more semantic must be better, right?”

Not really.

Beyond the point of human readability and basic machine readability, “more semantic” doesn’t mean anything (irony ahoy!). But this hasn’t stopped people debating which elements are more semantic or more appropriate, which nine times out of ten is about as useful as debating whether it’s “splade” (or, more correctly, “splayd” for you sticklers out there) or “spork.” (Splade, obviously).

There’s No Such Thing As “More” Semantic

I humbly propose that the unqualified use of “more semantic” be banned from web design discussions about HTML elements posthaste.

Whenever you hear someone going on about something being “more semantic,” ask them this simple question:

“For who?”

If all they can come back with is “But it’s ... MORE SEMANTIC!” they’re just making a vague claim about nothing. But if they say something like “More semantic for screen readers,” that’s a valid claim we can evaluate.

Do screen readers really do anything different for these “more semantic” elements? Are they supported at all? Or do they cause bugs like the HTML5 elements did when they were first used? (See www.accessibleculture.org/blog/2010/11/html5-plus-aria-sanity-check/.)

(Remember: because of the no-JavaScript IE6–8 issues, using HTML5 elements for accessibility is about as useful as dieting on doughnuts.)

Likewise, if they say “But it’s more semantic for search engines,” we can evaluate that specific claim. What do Google’s developer guidelines say? What does the SEO community think? And so on.

But please, no more unqualified claims of “But it’s more semantic” when discussing HTML5. These dubious assumptions have been attaching themselves like barnacles to the good ship Web Standards for years, and it’s time we revved up the high-pressure hose and cleaned them off (assuming that’s how barnacles are, in fact, removed).

OK, mini-rant over. The human readability and basic machine readability problems have been solved, this is where we’re at, and we may hope that HTML5 will take us forward. But before we get to HTML5’s approach, let’s talk about the Big Idea™ behind semantic markup.

Big Ideas in Semantic Markup: The Semantic Web

What if we could take the “machine-readable” part of semantic markup further? What if the machines (and browsers in particular) could read our markup and know not just what content appeared but what given blocks of content actually meant?

That’s the big idea behind semantic markup. If we can describe the content of our pages accurately and specifically, then machines can do cool stuff with the data.

This is (or perhaps was) partly the idea behind the Semantic Web—a big, broad concept that would be driven by the XML-ified Web. (Read more about it here: http://en.wikipedia.org/wiki/Semantic_Web.) The Web would be a perfectly described library of documents, marked up in excruciating detail with XML. An XML-based future was something many influential people believed in. In fact, in the earlier markup example from 2002 and the use of <h2>, Zeldman described web standards as a way we can “transition from HTML, the language of the Web’s past, to XML, the language of its future.”

However, as we saw in Chapter 1, the move to XML died, and with it the dream of a true Semantic Web. Instead, the Web became a wonderful platform for applications, went social, and kept on being the Web we know and love. But it wasn’t the capital-S Semantic Web people had hoped for.

We need to keep this history in mind when people talk about “semantic” elements in any situation, whether it’s HTML5 or whatever future HTML evolves. What kind of “semantics” are they referring to—basic human- and machine-readable semantics we all use every day or the dead-end dream of the XML-powered Semantic Web?

Semantics: Not Dead Yet (Or: Google & Co Drop a Micro-Semantic Bombshell)

There’s actually a third option that sits between the lean, semantic markup we use now and the pie-in-the-sky Semantic Web, called microdata (and microformats), which adds a layer of metadata to our markup.

(A variety of approaches compete here, particularly microformats, microdata, and RDFa. But I’ll just be referring to the overall concept as micro-semantics, which is also known as “structured data.”)

With micro-semantics, we simply embed semantic data into our existing HTML document. Let’s look at how micro-semantics could help daily life on the Web.

E-commerce with Real (Micro) Semantics

Let’s use online shopping as an example. Here, truly semantic markup could theoretically help desktop browsers (in other words, all of us), the visually impaired using screen readers, and search engines.

Desktop browsers: Let’s say we’re shopping online for a new TV and doing our research by visiting a bunch of web sites to compare features and prices for specific models. In most cases (well, if you’re as obsessive about research as I am), this means copying and pasting the relevant information from each site into a separate document—which is both tedious and prone to error.
Now, imagine if these e-commerce sites all marked up their pages with a <productdetail> tag and nested <price> and <specs> tags. Our browser could easily find the product detail part of the page, and with a single click we could add the price and specs into a comparison shopping list. With specific, meaningful tags, your browser—a machine—can find, compile, and sort certain information for you very quickly. After all, that’s what they do best.
Screen readers: It could also help the blind or vision-impaired. Imagine a blind person doing the same research for a new sound system. If the e-commerce pages were marked up with these <price> and <specs> tags, their screen reader could theoretically read out just the price and product specs. They could then save those details to their comparison shopping list and move on. But until it happens, they have to try to navigate around highly complex pages by having headings and content read to them.
Search engines: With the prices and specs marked up correctly, Google, Bing, and other search engines could display the price of a product fairly reliably in their search results and improve the whole search experience. (This is actually possible right now, which we’ll get to).

Those are just a few examples of what’s possible when we have truly semantic markup. Machines—browsers, screen readers, and search engines—can easily pick out useful information and do cool things with it (such as create a comparison shopping list).

The problem is, to use different tags to describe this data, the HTML spec would need a squillion different tags. Every kind of content—from poems to products to policy documents—would need its own tags so the machines knew what the content was. The list of HTML tags would literally be a small dictionary or, rather, a very large dictionary as more and more tags were added to the spec. Authors writing about HTML would quite likely lose their minds.

The good news is we can mark up our content and make this comparison shopping possible (especially the search engine example) without needing any more HTML tags. We simply annotate our existing HTML with attributes and values that machines can read. (I’ll talk more about this soon).

Adding a handful of new elements HTML5-style, however, is not a path to “more semantic” documents. They don’t help machines do much with the data, and our markup becomes more cluttered—hardly a way to make it more readable.

Instead, we need a new mechanism to describe this data. Ideally that’s where HTML5 will lead us.

Can the Real Semantics Please Stand Up?

I know what you’re thinking. “If only we had a way of adding tags that didn’t pollute the entire spec. Some sort of eXtensible Markup Language.” But as we saw in Chapter 1, we tried that, and it failed.

Clearly we need a way to extend HTML that doesn’t involve adding a dictionary’s worth of elements to the spec or trying to XML-ify the Web.

There is a third option, and a bunch of people have been working on various solutions for quite a few years.

Here’s the idea in a nutshell: just attach attributes with values from an agreed bunch of terms to our existing HTML. Here’s an example (I’ve made up the attribute and value):

<div class="myclass" semanticdata="mysemanticvalue"> ... content ...</div>

As you can see, it’s pretty simple. But it’s worth teasing out the terminology because the different terms and implementations can make a simple idea seem far more complex than it actually is.

We need to distinguish between several pieces of the micro-semantics pie:

The infrastructure: We can take different technical approaches when adding semantic data to a document (that is, what HTML infrastructure we use). It boils down to which attributes we use—the existing class attribute (microformats), the new HTML5 attributes such as itemprop (microdata), or attributes such as property and content (RDFa). Not surprisingly, people interested in the nitty-gritty get all worked up over which is best. But it’s what we say, not how we say it, that’s much more interesting.
The vocabularies: What we say—the kind of data that we stick in these attributes—is where the rubber hits the road. And we need to work together to make it work. The people implementing the data (web designers) and the companies that might do something with it (for example, search engines) need to agree on a stable set of terms—a vocabulary—to describe a review, person, or event so everyone is on the same page.
The community: And then there are the communities built around different ways to implement this infrastructure and these vocabularies and do cool things with them.
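To make the infrastructure point concrete, here is a rough sketch of one fact (a person's name and job title) written three ways. The person is invented, and I'm assuming the hCard microformat for the first version and the Schema.org Person vocabulary (covered later in this chapter) for the other two. The RDFa version assumes the simpler RDFa 1.1 style with vocab, typeof, and property attributes, so treat it as illustrative rather than as the only way to write RDFa.

```html
<!-- Microformats (hCard): the meaning rides on agreed class names -->
<div class="vcard">
  <span class="fn">Jane Doe</span>,
  <span class="title">Web Designer</span>
</div>

<!-- Microdata: HTML5's itemscope, itemtype, and itemprop attributes -->
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Jane Doe</span>,
  <span itemprop="jobTitle">Web Designer</span>
</div>

<!-- RDFa (assumed 1.1 style): vocab, typeof, and property attributes -->
<div vocab="http://schema.org/" typeof="Person">
  <span property="name">Jane Doe</span>,
  <span property="jobTitle">Web Designer</span>
</div>
```

All three render identically in a browser. The only differences are which attributes carry the metadata and which vocabulary the values are drawn from.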

One group that has been doing cool things with micro-semantic data is the microformats community. They have an active community (http://microformats.org/), a microformats way to use HTML as infrastructure (the class attribute), and specific microformat vocabularies. These are the various parts of the micro-semantics pie and demonstrate how communities have been able to come together to do semantics in a meaningful way on the Web.

You may have heard of and perhaps implemented microformats in the past. Unfortunately, as I write, their future has been more or less killed off by the search giants, which have proposed a new way forward for micro-semantics, or “structured data.”

Why Should We Care About Micro-semantics?

In 2011 Google, Microsoft, and Yahoo launched what may be the biggest effort to get real semantics into HTML documents in the history of the Web.

And how did they launch it? With a blog post and a web site that had all the pizzazz of a “My First HTML Page” template knocked out during a hurried lunch break (see Figure 5-1). And they also managed to single-handedly annoy everyone already invested in the process who’ve been evangelizing micro-semantics for years. Not a good start.

Figure 5-1. Schema.org. Who said semantics weren’t sexy? Oh ... everyone. Right.

Schema.org: The Future of Semantics?

In mid-2011 a handful of engineers from Google, Microsoft, and Yahoo decided they didn’t like the current, community-driven approaches and announced they were picking HTML5’s microdata as the winning infrastructure (that is, the HTML attributes we should use to add micro-semantic data). And so they released Schema.org (http://schema.org/)—a list of vocabularies, or “schemas,” that the major search engines would use to display richer search results.

In this way, all three parts of the micro-semantic pie were changed. The infrastructure (HTML5’s microdata), the vocabularies (Schema.org), and the drivers (corporations, not communities) were all new.

(You can read Google’s announcement at http://googlewebmastercentral.blogspot.com/2011/06/introducing-schemaorg-search-engines.html, Microsoft’s announcement at www.bing.com/community/site_blogs/b/search/archive/2011/06/02/bing-google-and-yahoo-unite-to-build-the-web-of-objects.aspx, and Yahoo’s at http://developer.yahoo.com/blogs/ydn/introducing-schema-org-collaboration-structured-data-44741.html.)

Figure 5-2 shows an example of a richer search result.

Figure 5-2. Google “iPhone review,” and you’ll get results similar to this one. Note how much metadata is included: rating, reviewer, date, and breadcrumbs are all present here.
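As a hedged sketch of the kind of markup that could sit behind a result like this, here is a review annotated with microdata and the Schema.org Review and Rating types. The reviewer, date, and score are invented, and this is not the markup of any real site. Whether a search engine actually shows the extra detail is up to the search engine.

```html
<div itemscope itemtype="http://schema.org/Review">
  <!-- What is being reviewed -->
  <h2>Review: <span itemprop="itemReviewed">iPhone</span></h2>

  <!-- Reviewer and date: both values are made up for illustration -->
  <p>
    Reviewed by <span itemprop="author">Jane Doe</span> on
    <time itemprop="datePublished" datetime="2011-06-10">June 10, 2011</time>
  </p>

  <!-- The rating is a nested item with its own type -->
  <div itemprop="reviewRating" itemscope itemtype="http://schema.org/Rating">
    Rating: <span itemprop="ratingValue">4</span> out of
    <span itemprop="bestRating">5</span>
  </div>
</div>
```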

Couldn’t We Do This Before?

This is similar to the Rich Snippets micro-semantics initiative Google launched in 2009, which you may have heard about (or even implemented). But Rich Snippets supported only a handful of existing vocabularies and let authors choose between microdata, microformats, and RDFa. (Plus, it was supported only by Google.)

Now we have one “approved” infrastructure for implementation (microdata), one set of vocabularies at a central location, and a big reason for implementing them: support in Google, Bing, and Yahoo.

That’s a big deal.

(Keep in mind this is purely for search result display, not search ranking. It’s important our clients know the difference).

What’s remarkable isn’t the search giants choosing one infrastructure but rather the 300-odd vocabularies that will potentially define semantics on the Web for years to come. And it was all done behind closed doors with no standards process (or community involvement) whatsoever.

The Semantic Web We’ve Been Waiting For?

Make no mistake, this is the biggest, actually-supported thing to happen for semantics on the Web since, well, pretty much forever.

Way back in Chapter 1 we looked at how XML was supposed to transform semantics on the Web but didn’t. (It was just Architecture Astronauts at work.) We’ve also looked at how HTML5 adds a few semantic elements that either are harmful or add up to very little. (Adding more elements to HTML proper isn’t a solution for semantics.)

This approach of micro-semantics promises a middle way interested communities have been exploring for some time. Let’s run through the existing approaches before we look at the Schema.org launch (and everything that was so horribly wrong with it).

Microformats

The microformats community has been developing and advocating micro-semantics with reasonable success for years, after kicking off in 2004 (see http://microformats.org/wiki/history-of-microformats). This is from http://microformats.org/about:

Designed for humans first and machines second, microformats are a set of simple, open data formats built upon existing and widely adopted standards. Instead of throwing away what works today, microformats intend to solve simpler problems first by adapting to current behaviors and usage patterns (e.g. XHTML, blogging).

For example, in February 2011, all of Facebook’s events were published using microformats (see http://microformats.org/2011/02/17/facebook-adds-hcalendar-hcard). And with the appropriate browser extension (such as the Google Calendar extension for Chrome), a button would appear next to an event, which you could click to add the details to your calendar. Pretty neat, eh?
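For a feel of what this looks like in markup, here is a minimal hCalendar sketch. The event details are invented and this is not Facebook's actual markup. The class names (vevent, summary, dtstart, location) come from the hCalendar microformat, and the abbr element carries a machine-readable date in its title attribute.

```html
<div class="vevent">
  <!-- The event's name -->
  <span class="summary">Web Standards Meetup</span>

  <!-- Human-readable date, with the machine-readable version in title -->
  <abbr class="dtstart" title="2011-03-15T18:30">March 15, 6:30pm</abbr>

  <!-- Where it happens (a made-up venue) -->
  <span class="location">The Fictional Cafe, Melbourne</span>
</div>
```

A calendar extension only needs to look for class="vevent" and read the nested values to offer an "add to calendar" button.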

How did Tantek Çelik, one of the founders of Microformats.org, react to the Google and Microsoft Schema.org announcement (http://twitter.com/#!/t/status/77083481494142976)?

#schemaorg spits in the eyes of every person and company that worked on open vocabularies like vCard, iCalendar, etc.

Ouch.

RDFa

Microformats was a simple, straightforward, limited-by-design approach to micro-semantics.

RDFa (or Resource Description Framework in Attributes) was the W3C’s much more complex (but more flexible) approach to machine-readable data that’s been kicking around since 1997 as just “RDF.” (RDFa was started in 2004.) It never really captured developer interest in any significant way, but it’s still hanging around.

As debate raged about the Schema.org announcement mid-June, Mark Pilgrim quipped the following (http://twitter.com/#!/diveintomark/status/80980932957450240—link now 404s; this was before Pilgrim’s Internet disappearing act):

The W3C: failing to make RDF palatable since 1997

Zing.

But there have been some interesting real-world uses, such as the GoodRelations vocabulary for e-commerce (www.heppnetz.de/projects/goodrelations/) that could drive the e-commerce example we looked at earlier.

Web designers generally prefer the simplicity of microformats to the flexibility and complexity of RDFa. Nevertheless, a community interested in micro-semantics had grown around RDFa.

How did Manu Sporny, the current chair of the W3C’s RDFa Working Group, react to Google and Microsoft’s Schema.org announcement? In “The False Choice of Schema.org” (http://manu.sporny.org/2011/false-choice/), he said this:

Schema.org is the work of only a handful of people under the guise of three very large companies. It is not the community of thousands of Web Developers that RDFa and Microformats relied upon to build truly open standards. This is not how we do things on the Web.

Yikes.

Microdata

Finally we have microdata, the new format used in Schema.org.

Nothing compels web authors to add esoteric metadata to their pages like several competing, slightly different metadata formats. So Ian Hickson, the HTML5 editor, decided microformats was too cold and RDFa was too hot, and he invented a third approach, microdata, that he felt was just right (so to speak). (Here’s Hickson’s lengthy WHATWG post introducing the feature: http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-May/019681.html.)

Note that microdata, as far as the HTML5 spec is concerned, is about providing the infrastructure (with new, valid attributes) for adding micro-semantics. It doesn’t specify what those vocabularies should be or who should invent or maintain them. It is completely separate from the actual vocabularies on Schema.org (for example).
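To see how separate the infrastructure is from the vocabulary, here is a small sketch that uses the microdata attributes with a completely made-up vocabulary. The example.com URL and the property names modelName and screenSize are hypothetical and exist only for illustration.

```html
<!-- itemscope starts an item; itemtype names its vocabulary (hypothetical URL) -->
<div itemscope itemtype="http://example.com/vocab/Television">
  <!-- itemprop labels a value; these property names are invented -->
  <h2 itemprop="modelName">Acme ViewMaster 42</h2>
  <p>Screen size: <span itemprop="screenSize">42 inches</span></p>
</div>
```

This is perfectly reasonable microdata, but it is useless to a search engine that has never heard of the vocabulary. That is the gap an agreed set of terms is meant to fill.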

And this is the format that won, in a blessed-by-the-tech-giants sense.

(For a lengthier discussion of the various formats and the implications of Schema.org, see Henri Sivonen’s excellent “Schema.org and Pre-Existing Communities” at http://hsivonen.iki.fi/schema-org-and-communities/.)

Microdata and Schema.org

Now Google, Microsoft, and Yahoo are pushing not only a single format (microdata) but also a single set of vocabularies for real semantics on the Web.

Everything has a specific vocabulary (or “schema”): books, movies, events, organizations, places, people, restaurants, products, reviews ... you name it. (See the full list here: http://schema.org/docs/full.html.)
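Here is a rough sketch of how the TV-shopping example from earlier in the chapter might look with the Schema.org Product and Offer types. The product and price are invented. The property names (name, description, offers, price, priceCurrency) are taken from Schema.org, but check the current schema pages before relying on them.

```html
<div itemscope itemtype="http://schema.org/Product">
  <h2 itemprop="name">Acme ViewMaster 42</h2>
  <p itemprop="description">A 42-inch LED television.</p>

  <!-- The price lives in a nested Offer item -->
  <div itemprop="offers" itemscope itemtype="http://schema.org/Offer">
    <!-- meta carries the currency code without displaying it -->
    <meta itemprop="priceCurrency" content="USD">
    $<span itemprop="price">499.00</span>
  </div>
</div>
```

No <price> or <specs> tags required: the existing elements stay, and the attributes do the describing.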

There are even schemas for identifying parts of web pages themselves, including the header, footer, sidebar, and navigation. I guess ARIA, HTML5, and so on, weren’t enough.

If this takes off, and that’s a big “if,” it will be a huge revolution in how we mark up our pages—bigger than XHTML, HTML5, and whatever flavor of HTML comes next.

Has the Semantic Web finally arrived?

How Not to Launch an Initiative

“schema.org ... there’s just nothing quite like throwing away years of vocabulary/ontology work”

—Jay Myers, June 3, 2011; http://twitter.com/#!/jaymyers/status/76344419867037696

Well, not if the tepid launch of this new initiative is anything to go by. It was pretty much a textbook case of what not to do.

Here are a few things they could have handled slightly better:

Consultation: The Schema.org announcement came from nowhere—no consultation with the community, no heads up, just a desire to “get something out there.”
Outreach: It’s generally not a great idea to piss off the people who’ve spent years advocating something similar to what you’re launching. Instead of getting the microformats and/or RDFa communities on board (or at least encouraging a migration path), Google, Microsoft, and Yahoo completely ignored them. And that made them very unhappy campers.
Human face: Schema.org launched as an utterly generic site with almost no human aspect, just a “feedback” button. Who edits it? Who thought it up? Who do we talk to? What’s the process? Is there a process? It’s a complete mystery as far as the web site goes. Indeed, the FAQ asks “Who is managing schema.org on an ongoing basis?” and answers “Google, Bing, and Yahoo are managing schema.org on an ongoing basis.” Right, well, I guess we can always just contact Google, Bing, and Yahoo then. (To be fair, they eventually got a Schema.org blog up here: http://blog.schema.org/.)
Newbie friendly: For most web site designers, micro-semantics is a pretty “out there” concept. And while Schema.org does have a “Getting started” guide (http://schema.org/docs/gs.html), it needs a much friendlier explanation of the how, why, and what of micro-semantics (including schema examples) if they want anyone besides the in-the-know über nerds to pick it up.

A not-comically-bad web site: The list of schemas was originally presented in an ASCII art–style list, with dire markup like this (http://schema.org/docs/full.html):

<br> &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;&nbsp;<a name=Movie><a href=../Movie>Movie</a>:&nbsp;<span class="slot">duration</span>,&nbsp;<span class="slot">director</span>,&nbsp;<span class="slot">actors</span>,&nbsp;<span class="slot">producer</span>,&nbsp;<span class="slot">trailer</span> <br>&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span class="slot">productionCompany</span>,&nbsp;<span class="slot">musicBy</span><br>

(That describes movies, by the way.) How are we supposed to take these micro-semantics seriously when they can’t even use basic HTML semantics on their own web site? (Update: In 2012 this markup was improved by changing the list to ... a giant table, complete with nested tables and spacer cells. Go figure.)

And let’s not mention the huge number of schemas listed (more than 300!), the fact microdata hasn’t been implemented correctly (see http://jenitennison.com/blog/node/156), or the issues about patents (see www.seobythesea.com/?p=5608 halfway down). What a mess.

All this for potentially the biggest change to web semantics since the Web kicked off.

What Do the People Behind Schema.org Think?

Kavi Goel, a product manager at Google, participated in a session at SemTech 2011 (the “Semantic Technology Conference”) that discussed Schema.org. And some of the responses don’t exactly inspire confidence. (See the W3C’s official transcript here: www.w3.org/2011/06/semtech-bof-notes-smaller.html.)

Here’s an example (slightly abridged):

Ivan Herman: Schema.org is out there, ... how do you envisage the process for the future whereby schema.org might be a place where new vocabs are developed. I [sic] place to make it a more open social process?

Kavi Goel: I don’t have a great answer right now. I don’t think any one company wants to own this in its entirety. By going with 3, we showed we [Google] weren’t just doing it. [...]

Then it leaves the question of where is the completely open discussion ... We don’t have an answer yet, but this is important. We’ll need to sort out the stuff that’s out there.

Kevin Marks: Ours [microformats] has an edit button, yours has a feedback button. The CORE of microformats is we reach agreement. YOU said “we did it in a closed room”. You haven’t shown your work, your evidence, how others can get involved. This is the most worrying thing.

Kavi Goel: That’s a totally valid point. Microformats did a great job creating an open community.

There’s no good answer for why we didn’t do that.

Coming to microformats with a whole bunch of new things could have been an option. We did want to get something out there.

Earlier in the discussion, Goel said this:

The achievement was to get something out there. We know it’s not perfect. We can make it better. We hope this can be a step toward great adoption.

Here’s hoping. The rush to “get something out there” seems to have done more harm than good at this stage, but they can redeem themselves. We now have one format and one set of vocabularies to use for micro-semantics on the Web. If Google (and/or Microsoft) actually throws some resources at it and someone at either company actually takes ownership of the project, it could be a very big deal indeed.

To the credit of those involved in Schema.org, consultation is finally taking place, and interested parties are discussing a way forward. See, for example, “Schema.org Workshop—A Path Forward” at http://semanticweb.com/schema-org-workshop-a-path-forward/.

Also see the sporadically updated Schema.org blog for further outreach efforts: http://blog.schema.org/.

Wrapping Up: Semantics and HTML

The waves from the Schema.org announcement are still rippling out across the Web as I write. But even so, we can still say a few things about semantics, HTML, and what we should do:

The semantic problem: True semantics that describe meaning, and not just structure, happen in a layer on top of HTML. This seems to be the solution to the longstanding problem of semantics on the Web. XML won’t bring us true semantics, nor will more HTML tags. It’s a layer of micro-semantics on top of our existing HTML that will.
Microformats and RDFa are probably dead ends: Microformats has done really well over the years, and I love its simple format. But the decision by Google and Microsoft makes these formats look like dead ends, and the micro-semantics ecosystem (including browser add-ons and validators) will presumably move to microdata and Schema.org vocabularies. Of course, Schema.org could flop hopelessly too, and the microformats community (for one) could keep plugging away. (Google is not dropping any support, in any case).
Get involved: It’s worth reading up on and experimenting with microformats tools that already exist (such as browser extensions and bookmarklets) to get a taste for what’s possible with micro-semantics. But the fact Schema.org appears to be the future means we as a community need to study the various schemas and provide feedback.
Questions remain: Many questions about the process (will there be one?) and future of Schema.org remain unanswered; even the instigators can’t answer them, as Kavi Goel demonstrated. And there are bigger questions about its widespread use. Will mainstream adoption lead to attempts to spam search engines? (People will certainly try.) Will it all turn to “metacrap” (www.well.com/~doctorow/metacrap.htm)? We will see.
It’s ready to go: Google and Microsoft’s “It’s better to ask for forgiveness than permission” attitude with Schema.org means the standards process won’t be going on for years; it’s good to go right now. And if it is widely adopted, our online shopping example may eventually become a reality. At the time of this writing, Schema.org continues to be the dominant player in the semantic markup-adjusting-things arena, though adoption is slow and largely limited to web gurus who know enough to implement it (like you, gentle reader). For now, here’s the February 2012 announcement of using Schema.org micro-semantics to describe videos, which is “now the recommended way to describe videos on the Web”: http://googlewebmastercentral.blogspot.com/2012/02/using-schemaorg-markup-for-videos.html. And here’s a 2013 article from Search Engine Land describing the importance of Schema.org’s microdata in Google’s Hummingbird search algorithm: http://searchengineland.com/schema-org-7-things-for-seos-to-consider-post-hummingbird-172163.
It’s being used right now: Companies such as eBay, IMDB, Rotten Tomatoes, and others have implemented Schema.org’s semantics and are benefiting from improved display of their search engine results right now, as this article demonstrates: www.seomoz.org/blog/schema-examples.

Ultimately, Schema.org is a case of glass half-full/glass half-empty. We now have a well-supported, standard set of semantic schemas we can easily add to any HTML structure. And if we search with Google, Bing, or Yahoo, we can get tangible results. The chicken-and-egg problem of adding semantic data has been solved, the format has been chosen, and the schemas have been released.

But rushing the launch (which was underwhelming, to say the least), abandoning any standards process whatsoever for the vocabularies, and trampling years of existing work are heavy prices to pay.
