Chapter 5. Important Concepts from the W3C RDF Vocabulary/Schema

When discussing the Resource Description Framework (RDF) specification, we’re really talking about two different specifications—a Syntax Specification and a Schema Specification. As described in Chapter 3 and Chapter 4, the Syntax Specification shows how RDF constructs relate to each other and how they can be diagrammed in XML. For instance, elements such as rdf:type and pstcn:bio are used to describe a specific resource, providing information such as the resource’s type and the author of the resource. The different namespace prefixes associated with each element (such as rdf: and pstcn:) represent the schema that particular element is defined within.

In the context of RDF/XML, a vocabulary or schema is a rules-based dictionary that defines the elements of importance to a domain and then describes how these elements relate to one another. It provides a type system that can then be used by domain owners to create RDF/XML vocabularies for their particular domains. For example, the pstcn:bio element is from a custom vocabulary created for use with this book while the rdf:type element is from the RDF vocabulary. These are different vocabularies and have different vocabulary owners, but both follow rules defined within the RDF Vocabulary Description Language 1.0: RDF Schema.

However, before getting into the details of the RDF Schema, consider the following: if RDF is a way of describing data, then the RDF Schema can be considered a domain-neutral way of describing the metadata that can then be used to describe the data for a domain-specific vocabulary.

If all this seems convoluted, then you’ll appreciate reading more about the concept of metadata, its importance to existing applications, and how RDF fits into the concept, all discussed in the next section.

Tip

The material in this chapter references the RDF Vocabulary Description Language 1.0: RDF Schema. The most recent version of the document can be found at http://www.w3.org/TR/rdf-schema/.

RDF Vocabulary: Describing the Data

The last few chapters have emphasized that the RDF specification is about metadata—data about data. This is a key RDF concept; by creating a domain-neutral specification to describe resources, the same specification can then be used with many different domains but still be processed by the same RDF agents or parsed by the same RDF parsers.

Because of the importance of understanding metadata’s role within RDF, we’ll start by taking a closer look at the concept of metadata, particularly as it’s used in applications today.

Metadata’s Role in Existing Applications

If you’ve worked with any kind of relational database such as Oracle, Sybase, MySQL, or Microsoft’s SQL Server, you’ve used metadata. The way that these database management systems can be used for many different applications, and to store many different types of data, is by using metadata structures.

For instance, an application database might have three database tables such as CUSTOMER, ORDER, and CUSTOMER_ORDER, with both the CUSTOMER and ORDER tables related to the third CUSTOMER_ORDER table through primary/foreign key relationships, as diagrammed in Figure 5-1.

Figure 5-1. Three related database tables

The ORDER table could have other fields associated with it, such as ORDER_DATE and TOTAL_COST, containing the order date and cost, respectively. Additional information could be stored about the fields themselves: for example, that ORDER_DATE is a timestamp and a required value, while TOTAL_COST is a currency value that can be null.

To create storage specifically designed to store CUSTOMER, ORDER, and CUSTOMER_ORDER might be effective for one application but won’t be useful for another application that needs to store information about objects such as STUDENT and CLASS (for an academic setting). In other words, change the domain and the domain-specific storage constructs become pretty useless.

To facilitate multiple uses of the same storage mechanism for different domains, the relational database schema defines elements such as database tables, primary and foreign keys, and columns that provide a domain-neutral description of the different aspects of the CUSTOMER, ORDER, and CUSTOMER_ORDER objects. In SQL Server, the information would be stored in constructs such as TABLES, COLUMNS, and KEY_COLUMN_USAGE. Each of these contains a row for every element of its kind within the domain being described. Therefore, TABLES would contain one row for each of the application data objects CUSTOMER, CUSTOMER_ORDER, and ORDER; COLUMNS would contain one row for each table column; and so on. More complex information, such as column constraints and foreign key relationships, is also stored as individual rows within some metadata table.
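The idea of metadata stored as rows can be tried out with a few lines of Python. This is only a sketch, and it swaps SQLite (through Python’s bundled sqlite3 module) for SQL Server: the sqlite_master catalog and PRAGMA table_info stand in for the TABLES and COLUMNS views, and the column definitions are assumptions made up for the illustration.

```python
import sqlite3

# An in-memory database holding the three example tables. The column
# names and types are assumptions made for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER (
        CUSTOMER_ID INTEGER PRIMARY KEY,
        NAME        TEXT NOT NULL
    );
    CREATE TABLE "ORDER" (
        ORDER_ID   INTEGER PRIMARY KEY,
        ORDER_DATE TIMESTAMP NOT NULL,
        TOTAL_COST NUMERIC
    );
    CREATE TABLE CUSTOMER_ORDER (
        CUSTOMER_ID INTEGER NOT NULL REFERENCES CUSTOMER(CUSTOMER_ID),
        ORDER_ID    INTEGER NOT NULL REFERENCES "ORDER"(ORDER_ID),
        PRIMARY KEY (CUSTOMER_ID, ORDER_ID)
    );
""")

# The catalog holds one metadata row per table.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# table_info returns one metadata row per column:
# (cid, name, type, notnull, dflt_value, pk)
order_columns = {
    row[1]: {"type": row[2], "required": bool(row[3])}
    for row in conn.execute('PRAGMA table_info("ORDER")')
}

print(table_names)
print(order_columns)
```

The domain objects (CUSTOMER, ORDER) show up here as ordinary data values in the catalog, which is the same shift in perspective that RDF Schema relies on.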

Tip

Within any tablelike structure, you can think of metadata as column headers converted to rows. The describer then becomes the described.

At runtime, the database management system hides the higher-level nature of the data storage by allowing applications to access objects such as CUSTOMER, CUSTOMER_ORDER, and ORDER, directly, as if they were actual objects rather than mappings between domain elements and a generic relational database schema. This process works so well that there are few companies in the world that don’t have at least one relational database, and many have several.

The concept of runtime metadata can be extended to applications other than relational databases. Large multiuse applications such as PeopleSoft, SAP, and Oracle Financials also make use of it. Even without viewing each of these applications’ actual data stores, one can assume that the applications allow extensions to their systems by the expedient of recording metadata as records rather than as columns within a table. With this, the vendors can create a generic application that follows a well-defined business model, such as a Customer Relationship Management (CRM) system, that can then be extended and used within many different types of businesses.

RDF acts in a manner similar to a relational database system or these large, multiple-purpose application frameworks. Within RDF, instead of creating a custom XML vocabulary to describe resources, you use a predefined syntax and schema that allow you to store information about the resource domain, but in such a way that automated RDF processes can access and process the data regardless of the domain.

Based on this domain-neutral approach, you don’t store information about a web resource in a domain-specific XML element called WEB_PAGE; instead, you store it in an rdf:Description element and use RDF to define the properties for this new resource. This same syntax can then be used to describe online books, photos, or even an article on giant squids (as demonstrated in Chapter 2). Most importantly, the same automated processes can manipulate the information regardless of either the resource or the domain.

Within relational database systems, the metadata process works because the schema used to capture the business information follows specific rules and makes use of a common set of system objects, such as tables and columns. The same applies to RDF: for all this to work, the RDF Schema also has to be described, and that’s where the concept of metadata about metadata enters the picture. It is at this point that the RDF Schema enters the RDF specification universe.

RDF Schema: Metadata Repository

In the last section, you had a chance to see that relational databases can provide storage for a multitude of domains through the use of a set of objects that store information about every aspect of the domain, but in a neutral manner. These objects form what is known as the database system’s system objects or metadata schema objects.

Within SQL Server, the objects can be queried through a custom view called the INFORMATION_SCHEMA, which contains references to elements such as the aforementioned TABLES and COLUMNS, though the actual internal tables are hidden to allow the SQL Server architects to make changes if necessary without impacting the exposed view.

The basic elements underlying the INFORMATION_SCHEMA view, such as TABLES and COLUMNS, aren’t specific to any one relational database vendor; they’re based on the relational database schema, defined within an industry standard. All of these elements are then governed by well-understood (and mathematically proven) rules and procedures. Because of this, you can use different relational database systems and be assured that for certain basic objects and functionality, the exposed behavior is the same regardless of the type of system. Within an Oracle database, you can have at most one primary key for a table; this same rule applies to a table within Microsoft’s SQL Server and a table within Sybase.

In other words, the relational database schema, its objects, rules, and regulations are the metadata used to define and describe the metadata (TABLES, COLUMNS) that are then used to describe and manage domain-specific data (CUSTOMER, ORDER, CUSTOMER_ORDER).

Tip

A key characteristic of the relational data model is that data is viewed logically rather than physically. Data is viewed within the context of its use rather than its physical storage method. For more on the relational model, see the classic article on the subject, “A Relational Model of Data for Large Shared Data Banks” from E. F. Codd, found at http://www.acm.org/classics/nov95/toc.html. Read more about the association between relational data and RDF in Chapter 10.

The RDF Schema provides the same functionality as the relational database schema. It provides the resources necessary to describe the objects and properties of a domain-specific schema—a vocabulary used to describe objects and their attributes and relationships within a specific area of interest.

The best way to fully understand how the RDF Schema works is by looking at the elements that make up the schema.

Core RDF Schema Elements

The RDF Schema elements are marked by a specific namespace, identified within a document with the following namespace declaration:

xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"

Within the Schema Specification, there is a core group of classes and properties used to describe domain-specific RDF elements. These, combined with a specific set of constraints (described later in Section 5.3), form the foundation of the schema.

Note

RDF Schema elements are defined in the RDFMS as well as within the RDF Schema. RDFMS elements are identified with the rdf namespace:

xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"

Overview of the RDF Classes

There are surprisingly few RDF Schema classes:

rdfs:Resource

All resources within RDF are implicitly members of this class.

rdfs:Class

Type or category of resource.

rdfs:Literal

Literals within RDF documents, such as text strings.

rdfs:XMLLiteral

Literals within RDF documents that use XML syntax.

rdfs:Container

Superclass of all container classes.

rdfs:ContainerMembershipProperty

Class of the properties (such as rdf:_1 and rdf:_2) that state container membership.

rdfs:Datatype

Data typing information.

Taking a closer look at each of these classes, the rdfs:Resource element is used to describe a basic resource within the RDF. It is the set of these elements that literally forms both the reason and focus of the entire RDF specification.

Example 5-1 demonstrates a very simple RDF/XML document that contains a description of an article, including the article’s title and author.

Example 5-1. Demonstrating the explicit resource property type
<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:pstcn="http://burningbird.net/postcon/elements/1.0/">
  <rdf:Description rdf:about="http://burningbird.net/articles/monsters3.htm">
    <pstcn:author>Shelley Powers</pstcn:author>
    <pstcn:title>Architeuthis Dux</pstcn:title>
  </rdf:Description>
</rdf:RDF>
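Because every RDF/XML document follows the same domain-neutral structure, a generic program can pull the statements out of Example 5-1 without knowing anything about articles. The sketch below uses Python’s standard xml.etree.ElementTree module; it handles only rdf:Description elements with literal-valued child properties, so treat it as an illustration rather than a real RDF parser. The function name is an assumption of this example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The document from Example 5-1.
EXAMPLE_5_1 = """<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:pstcn="http://burningbird.net/postcon/elements/1.0/">
  <rdf:Description rdf:about="http://burningbird.net/articles/monsters3.htm">
    <pstcn:author>Shelley Powers</pstcn:author>
    <pstcn:title>Architeuthis Dux</pstcn:title>
  </rdf:Description>
</rdf:RDF>
"""


def literal_statements(rdf_xml):
    """Return (subject, predicate, literal) tuples for simple property elements."""
    root = ET.fromstring(rdf_xml)
    statements = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree reports tags as "{namespace}local"; joining the
            # two parts gives the full predicate URI.
            namespace, local_name = prop.tag[1:].split("}", 1)
            value = (prop.text or "").strip()
            statements.append((subject, namespace + local_name, value))
    return statements


statements = literal_statements(EXAMPLE_5_1)
for statement in statements:
    print(statement)
```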

Every resource within an RDF document, such as the article shown in Example 5-1, has a common ancestor class: rdfs:Resource. Because of this commonality, you generally won’t see an explicit use of rdfs:Resource within an RDF vocabulary document. However, if you did, you would see it used in a manner similar to the following schema representation of the article resource from Example 5-1:

<rdfs:Class rdf:ID="Article">
<rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource" />
</rdfs:Class>

The RDF fragment also uses rdfs:Class. All new resource types are identified by an rdfs:Class statement, including the rdfs:Resource element itself. The Class element is very similar to its same-name counterpart in object-oriented development—a unique object that can be described and can have associated behaviors.

Tip

In the RDF Schema Specification, the rdf:Description element is also used to identify a particular class.

Within the RDF Schema, RDF properties (discussed in the next section) have a given range of allowable values, such as rdfs:Class, rdf:Property, or rdfs:Literal. The last is used to describe what the Schema Specification terms self-denoting nodes, which are nodes that can contain literals such as strings. An example of one such property is rdfs:comment, used as follows:

<rdfs:comment>This is a comment within the RDF Schema</rdfs:comment>

The comment’s value is a text string, a literal, parsed out in its entirety without additional processing.

rdfs:Container is the superclass of all RDF container classes: rdf:Bag, rdf:Seq, and rdf:Alt. The members of the rdfs:ContainerMembershipProperty class are the properties that tie a container to each of its members (usually denoted by _1, _2, _3, and so on). It also contains rdfs:member.

The rdfs:Datatype class is the class of all data types and is, itself, a subclass of rdfs:Class; each data type that is an instance of rdfs:Datatype is, in turn, a subclass of rdfs:Literal. The data type values follow the constraints defined for RDF data types, covered in Chapter 2 and Chapter 3, which means that there is a mapping of both the value and the data type.

Tip

Actual instances of data types are recorded using rdf:datatype within each instance, basically associating each field with a specific data type. However, you can specify a data type within the schema, also, but there’s nothing associated with RDF Schemas that would enforce data types between the schema and each instance. This disconnect can potentially lead to some problems, as detailed in Chapter 6.

rdfs:XMLLiteral is a subclass of rdfs:Literal and an instance of rdfs:Datatype, and is the class of all XML literals. This is somewhat equivalent to CDATA within XML and HTML, and allows one to embed XML that is not processed as RDF/XML into the RDF/XML document. Associated with the XML is an arbitrary but fixed pattern:

"<rdf-wrapper xml:lang='" lang "'>" str "</rdf-wrapper>"

According to the document, rdf-wrapper is arbitrary but fixed. This means that the format remains the same, but the actual element names can differ. This makes sense—whatever is contained within the field designated as XMLLiteral would, we assume, follow standard XML formatting.

In addition to the RDF Schema classes, a few RDF classes cross the boundary between the metalanguage and instances of the same. These are:

rdf:Statement

Class of all RDF statements

rdf:Bag, rdf:Seq, and rdf:Alt

Container classes

rdf:List

Class of all RDF lists

rdf:Property

Resources that are RDF properties

The rdf:Statement class includes as members all reified RDF statements within a vocabulary (all resources that have an rdf:type of rdf:Statement).

The container classes (rdf:Bag, rdf:Seq, and rdf:Alt) are used to group members; how the members are positioned within the grouping depends on the type of container class (Chapter 4 goes into detail on the container classes).

The rdf:List class has as members all RDF lists within a vocabulary, just as rdfs:Container takes in all RDF containers.

The rdf:Property class is used to define the attributes that, in turn, describe the resource. In Example 5-1, the attributes for Article are author and title. The minimum RDF Schema definition that could describe the RDF/XML used in this example resource is shown in Example 5-2.

Example 5-2. RDF Schema for Article
<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:pstcn="http://burningbird.net/postcon/elements/1.0/">

<rdfs:Class rdf:about="http://burningbird.net/postcon/elements/1.0/Article">
  <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
</rdfs:Class>

<rdf:Property rdf:about="http://burningbird.net/postcon/elements/1.0/title">
  <rdfs:domain rdf:resource="http://burningbird.net/postcon/elements/1.0/Article" />
</rdf:Property>

<rdf:Property rdf:about="http://burningbird.net/postcon/elements/1.0/author">
  <rdfs:domain rdf:resource="http://burningbird.net/postcon/elements/1.0/Article" />
</rdf:Property>
</rdf:RDF>

In this document, Article is defined as an rdfs:Resource (a subclass of the Resource class), and each property of Article (title and author) is related to it through the use of the RDF Schema domain constraint (discussed in Section 5.3).

The Article class is associated with the Resource class through the subClassOf property, while title and author are declared as instances of the Property class by using rdf:Property as their element type. The subClassOf property and the other core RDF properties are discussed next.

Demonstrations of the RDF Schema Properties

The RDF specification’s purpose is purely to define resources and associated facts, and then provide a way to allow these resource/fact mappings to interact. This is accomplished through capturing statements about the resource, with each statement consisting of a specific property such as title and author for the Article resource. The RDF Schema is no exception—statements about each resource are captured as individual properties. The only difference between the two is that one is an instance of business data (such as Article), and the other is metadata (related to the RDF model).

Following are the core properties (from both the RDF and RDFS namespaces) that are essential to the RDF Schema:

  • rdfs:subClassOf

  • rdfs:subPropertyOf

  • rdfs:seeAlso

  • rdfs:isDefinedBy

  • rdfs:member

  • rdfs:comment

  • rdfs:label

  • rdf:type

  • rdf:subject

  • rdf:predicate

  • rdf:object

  • rdf:first

  • rdf:rest

  • rdfs:domain

  • rdfs:range

  • rdf:value

The rdf:value property was described in Chapter 3. The rdfs:subClassOf property identifies a class that is a subclass of another. For instance, in Example 5-2, Article is a subclass of the more generic Resource class, which all resources belong to. Article could also be a subclass of another class such as WebPage, which is, in turn, a subclass of Resource, as demonstrated in the following RDF/XML snippet:

<rdfs:Class rdf:ID="WebPage">
  <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
</rdfs:Class>

<rdfs:Class rdf:ID="Article">
  <rdfs:subClassOf rdf:resource="http://burningbird.net/schema#WebPage"/>
</rdfs:Class>

The use of inheritance within the RDF Schema classes allows us to define superclasses such as WebPage. New subclasses of WebPage then inherit not only the properties and constraints of the superclass Resource but also the additional properties and constraints from WebPage.
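To see what this inheritance chain amounts to, the following Python sketch walks rdfs:subClassOf links to collect every superclass of a class. The dictionary mirrors the WebPage and Article snippet; the prefixed names are shorthand for the full URIs, and the function name is hypothetical rather than part of any RDF toolkit.

```python
# Direct rdfs:subClassOf statements from the WebPage/Article snippet,
# written with prefixed names as shorthand for the full URIs.
SUB_CLASS_OF = {
    "pstcn:WebPage": {"rdfs:Resource"},
    "pstcn:Article": {"pstcn:WebPage"},
}


def superclasses(cls, sub_class_of):
    """Return every class reachable from cls by following subClassOf links."""
    found = set()
    pending = [cls]
    while pending:
        current = pending.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in found:
                found.add(parent)
                pending.append(parent)
    return found


article_ancestors = superclasses("pstcn:Article", SUB_CLASS_OF)
print(sorted(article_ancestors))
```

An Article is therefore also a WebPage and a Resource, even though only the first of those relationships was stated directly.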

The rdfs:subPropertyOf property is used when one property is a refinement of another property. For instance, in the Article schema, one of the properties is author. This property could be further refined to specify whether an author is a primary or secondary author, via the primaryAuthor and secondaryAuthor subproperties, respectively. Example 5-3 shows the use of this property refinement through the rdfs:subPropertyOf property.

Example 5-3. RDF Schema example of property refinement
<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:pstcn="http://burningbird.net/postcon/elements/1.0/">

<rdfs:Class rdf:about="http://burningbird.net/postcon/elements/1.0/Article">
  <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
</rdfs:Class>

<rdf:Property rdf:about="http://burningbird.net/postcon/elements/1.0/title">
  <rdfs:domain rdf:resource="http://burningbird.net/postcon/elements/1.0/Article" />
</rdf:Property>

<rdf:Property rdf:about="http://burningbird.net/postcon/elements/1.0/author">
  <rdfs:domain rdf:resource="http://burningbird.net/postcon/elements/1.0/Article" />
</rdf:Property>

<rdf:Property rdf:about="http://burningbird.net/postcon/elements/1.0/primaryAuthor">
  <rdfs:domain rdf:resource="http://burningbird.net/postcon/elements/1.0/Article" />
  <rdfs:subPropertyOf rdf:resource="http://burningbird.net/postcon/elements/1.0/author" />
</rdf:Property>

<rdf:Property rdf:about="http://burningbird.net/postcon/elements/1.0/secondaryAuthor">
  <rdfs:domain rdf:resource="http://burningbird.net/postcon/elements/1.0/Article" />
  <rdfs:subPropertyOf rdf:resource="http://burningbird.net/postcon/elements/1.0/author" />
</rdf:Property>
</rdf:RDF>
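One practical consequence of property refinement is that a statement made with a subproperty also counts as a statement made with its parent property. The Python sketch below illustrates that reading with the properties from Example 5-3; the prefixed names are shorthand for the full URIs, and the helper name and the sample statement are assumptions of this example.

```python
# rdfs:subPropertyOf statements from Example 5-3 (prefixed names are
# shorthand for the full URIs).
SUB_PROPERTY_OF = {
    "pstcn:primaryAuthor": "pstcn:author",
    "pstcn:secondaryAuthor": "pstcn:author",
}


def with_inferred(statements, sub_property_of):
    """Add the parent-property statement implied by each subproperty statement."""
    result = set(statements)
    pending = list(statements)
    while pending:
        subject, predicate, value = pending.pop()
        parent = sub_property_of.get(predicate)
        if parent is not None and (subject, parent, value) not in result:
            implied = (subject, parent, value)
            result.add(implied)
            pending.append(implied)  # the parent may itself be refined
    return result


# A sample statement invented for the illustration.
ARTICLE = "http://burningbird.net/articles/monsters3.htm"
stated = {(ARTICLE, "pstcn:primaryAuthor", "Shelley Powers")}
expanded = with_inferred(stated, SUB_PROPERTY_OF)
for statement in sorted(expanded):
    print(statement)
```

An application that only knows about author can therefore still find the article, even though the data was recorded with the more specific primaryAuthor.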

The rdfs:seeAlso property is used to identify another resource that contains additional information about the resource being described. An example of using this property could be the following RDF fragment, showing the relationship between an article and a document maintaining the history of the article, identified as a class called ArticleHistory:

<rdfs:Class rdf:about="http://burningbird.net/postcon/elements/1.0/ArticleHistory">
  <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
</rdfs:Class>

<rdfs:Class rdf:about="http://burningbird.net/postcon/elements/1.0/Article">
  <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
  <rdfs:seeAlso rdf:resource="http://burningbird.net/postcon/elements/1.0/ArticleHistory" />
</rdfs:Class>

Within the RDFS vocabulary, rdfs:seeAlso is also used to link the vocabulary document with a second document:

<rdf:Description rdf:about="http://www.w3.org/2000/01/rdf-schema#">

  <rdfs:seeAlso rdf:resource="http://www.w3.org/2000/01/rdf-schema-more"/>
</rdf:Description>

With this, additional schema elements can be added to the vocabulary without having to edit or modify the original schema.

According to the RDF Schema Specification, rdfs:seeAlso can be refined through the rdfs:subPropertyOf property to provide additional information about the manner in which the one resource provides additional information about the second resource:

<rdf:Property rdf:about="http://burningbird.net/postcon/elements/1.0/historyProvidedBy">
  <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"/>
  <rdfs:subPropertyOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#seeAlso" />
</rdf:Property>

<rdfs:Class rdf:about="http://burningbird.net/postcon/elements/1.0/ArticleHistory">
  <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
</rdfs:Class>

<rdfs:Class rdf:about="http://burningbird.net/postcon/elements/1.0/Article">
  <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
  <pstcn:historyProvidedBy rdf:resource="http://burningbird.net/postcon/elements/1.0/ArticleHistory" />
</rdfs:Class>

The rdfs:isDefinedBy property identifies the namespace for the resource, preventing any ambiguity or confusion about namespace ownership. For example, if a resource is identified by a GUID (Globally Unique Identifier), the rdfs:isDefinedBy property could be attached to the Resource class, to provide the URI for the schema.

Within the RDF Schema vocabulary, the rdf:Statement class is defined to be a part of the RDF syntax namespace:

<rdfs:Class rdf:about="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement">
  <rdfs:isDefinedBy rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>
  <rdfs:label xml:lang="en">Statement</rdfs:label>
  <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
  <rdfs:comment>The class of RDF statements.</rdfs:comment>
</rdfs:Class>

However, the rdfs:Literal class is defined to be a part of the RDF Schema namespace:

<rdfs:Class rdf:about="http://www.w3.org/2000/01/rdf-schema#Literal">
  <rdfs:isDefinedBy rdf:resource="http://www.w3.org/2000/01/rdf-schema#"/>
  <rdfs:label xml:lang="en">Literal</rdfs:label>
  <rdfs:comment>This represents the set of atomic values, eg. textual strings.</rdfs:comment>
</rdfs:Class>

The rdfs:member property is a superproperty for each numbered container element (such as _1, _2, and so on).

Tip

At the time of this writing, the RDF Working Group is working to resolve whether rdfs:member should be a member of rdfs:ContainerMembershipProperty. Check the RDF Schema specification for final resolution of this issue.

Two properties provide human readability to an RDF model. The rdfs:comment property is used to provide documentation of resources, and rdfs:label provides a readable version of the resource’s name. In addition, you can attach the XML attribute xml:lang to the rdfs:label element and provide different labels for different languages.

You can add comments to an RDF/XML document using XML comments such as the following:

<!--Class defining Web articles-->
<rdfs:Class rdf:about="http://burningbird.net/postcon/elements/1.0/Article">
  <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
</rdfs:Class>

However, XML comments never become part of the RDF model. To attach documentation to an element formally, so that the documentation itself is accessible to RDF parsers and other automated processes, you need RDF Schema elements designed for schema documentation. These elements are rdfs:comment and rdfs:label. The rdfs:comment property provides a description of the resource, while rdfs:label provides a human-readable version of its name.

Adding documentation to Example 5-3 results in the RDF/XML shown in Example 5-4. As you can see, just a few extra lines can provide considerable information.

Example 5-4. Using RDF Schema documentation elements
<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:pstcn="http://burningbird.net/postcon/elements/1.0/">

<rdfs:Class rdf:about="http://burningbird.net/postcon/elements/1.0/Article">
  <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
  <rdfs:comment>Unique Online article</rdfs:comment>
  <rdfs:label xml:lang="en">Article</rdfs:label>
</rdfs:Class>

<rdf:Property rdf:about="http://burningbird.net/postcon/elements/1.0/title">
  <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"/>
  <rdfs:domain rdf:resource="http://burningbird.net/postcon/elements/1.0/Article" />
  <rdfs:comment>Online Article Title</rdfs:comment>
  <rdfs:label xml:lang="en">Title</rdfs:label>
</rdf:Property>

<rdf:Property rdf:about="http://burningbird.net/postcon/elements/1.0/author">
  <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"/>
  <rdfs:domain rdf:resource="http://burningbird.net/postcon/elements/1.0/Article" />
  <rdfs:comment>Primary author of article</rdfs:comment>
  <rdfs:label xml:lang="en">Author</rdfs:label>
</rdf:Property>

</rdf:RDF>

When viewing the schema in Example 5-3, you can understand what’s being described because you have this chapter to provide information. However, in real life, a vocabulary and schema may not have associated documentation, or the link between the documentation and the vocabulary may not be maintained. By providing both comments and a readable label, you’re providing information to the users about exactly what’s being defined. This is no different than providing inline documentation and using good naming practices within code among application developers.
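Because rdfs:label and rdfs:comment are ordinary elements rather than XML comments, a program can read them back. The following Python sketch does so with the standard xml.etree.ElementTree module, using a trimmed-down version of the schema from Example 5-4. It is an illustration under simplifying assumptions (every node is identified by rdf:about, and the function name is made up for this example), not a general RDF Schema reader.

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
RDFS = "{http://www.w3.org/2000/01/rdf-schema#}"
# The xml: prefix is predefined by XML; this is its namespace in ElementTree.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# A trimmed-down version of the schema in Example 5-4.
SCHEMA = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdfs:Class rdf:about="http://burningbird.net/postcon/elements/1.0/Article">
    <rdfs:comment>Unique Online article</rdfs:comment>
    <rdfs:label xml:lang="en">Article</rdfs:label>
  </rdfs:Class>
  <rdf:Property rdf:about="http://burningbird.net/postcon/elements/1.0/title">
    <rdfs:comment>Online Article Title</rdfs:comment>
    <rdfs:label xml:lang="en">Title</rdfs:label>
  </rdf:Property>
</rdf:RDF>
"""


def documentation(schema_xml, lang="en"):
    """Map each described URI to its (label, comment) pair for one language."""
    docs = {}
    for node in ET.fromstring(schema_xml):
        uri = node.get(RDF + "about")
        labels = [
            element.text
            for element in node.findall(RDFS + "label")
            if element.get(XML_LANG) == lang
        ]
        label = labels[0] if labels else None
        docs[uri] = (label, node.findtext(RDFS + "comment"))
    return docs


docs = documentation(SCHEMA)
for uri, (label, comment) in docs.items():
    print(uri, label, comment)
```

Asking for a language that the schema doesn’t provide simply yields no label, which is where additional xml:lang variants of rdfs:label would come into play.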

The rdf:type property defines the type of resource. As mentioned earlier, all resources are of type Resource, as well as being a more granular type, such as Article. The type property designates that the resource being referenced is an instance of this class.

Within an RDF/XML document, the rdf:type is usually assumed and isn’t explicitly given. However, you can explicitly use the rdf:type property to remove any possibility of confusion between the RDF/XML document and N-Triples or an RDF graph generated from the document. This holds true for RDF Schema vocabulary documents. For instance, you can attach an rdf:type property to the author property to refine the definition, though its use in a schema is usually redundant.

<rdf:Property rdf:about="http://burningbird.net/postcon/elements/1.0/author">
  <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property" />
  <rdfs:domain rdf:resource="http://burningbird.net/postcon/elements/1.0/Article" />
  <rdfs:comment>Primary author of article</rdfs:comment>
  <rdfs:label xml:lang="en">Author</rdfs:label>
</rdf:Property>

The rdf:subject, rdf:predicate, and rdf:object properties are used with reification to explicitly define an RDF statement. In addition, the rdf:first and rdf:rest properties are used to explicitly define the relationships within a collection. Since both reification and collections are covered in depth in Chapter 4, I won’t repeat the details here.

The remaining two properties, rdfs:domain and rdfs:range, are described in the next section.

Refining RDF Vocabularies with Constraints

Within RDF Schema, constraints define class associations for properties. In addition, there are subclasses of both Property and Resource that are specific to constraints.

In Example 5-4, the rdfs:domain property was used to associate a property with the resource it modified. It was used with both the author and title properties to associate them with the Article resource. The RDF Schema property is further constrained to be used only with properties by specifying an rdfs:domain of Property for the rdfs:domain itself.

An RDF property can be used for more than one resource type. Something such as title can then be used for Article but can also be used for Person (a person’s work title), as well as something such as Painting (title of a painting). The only limitation is the domain scope. The rdfs:range property is used to specify the classes the property can reference as values. Unlike the domain element, only one RDF range constraint can be attached to any property as described in this chapter. This is equivalent to the restriction in most programming languages that a variable can be of only one data type, constraining the allowable values that the variable can contain. (The final W3C Recommendation does allow several range constraints on one property, but a value must then be an instance of every class named, so listing more classes narrows the range rather than widening it.)

To specify more than one class as a range constraint for a property (more than one data type, if you will), you can use a master class for all classes that will be designated by a specific range and then use inheritance to extend the class with subclasses.

In Example 5-5, a new class is added to the example schema called Directory. This class has one property, contains, which is used to identify web resources the directory contains. A new contains property is created and tied back to the class through the rdfs:domain property.

The web resources can be articles or examples; to allow both in the new contains range, a master class, WebPage, is created; it is then refined through the use of rdfs:subClassOf to create Article and Example classes. The master class is used as the value for the rdfs:range property of the contains property.

Example 5-5. Using RDF Schema constraints to refine an RDF schema
<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:pstcn="http://burningbird.net/postcon/elements/1.0/">

<rdfs:Class rdf:about="http://burningbird.net/postcon/elements/1.0/WebPage">
  <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
</rdfs:Class>

<rdfs:Class rdf:about="http://burningbird.net/postcon/elements/1.0/Article">
  <rdfs:subClassOf rdf:resource="http://burningbird.net/postcon/elements/1.0/WebPage"/>
</rdfs:Class>

<rdfs:Class rdf:about="http://burningbird.net/postcon/elements/1.0/Example">
  <rdfs:subClassOf rdf:resource="http://burningbird.net/postcon/elements/1.0/WebPage"/>
</rdfs:Class>

<rdfs:Class rdf:about="http://burningbird.net/postcon/elements/1.0/Directory">
  <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
</rdfs:Class>

<rdf:Property rdf:about="http://burningbird.net/postcon/elements/1.0/contains">
  <rdfs:domain rdf:resource="http://burningbird.net/postcon/elements/1.0/Directory" />
  <rdfs:range rdf:resource="http://burningbird.net/postcon/elements/1.0/WebPage" />
</rdf:Property>

</rdf:RDF>
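
The effect of the master class on the range of contains can be sketched in a few lines of Python. This is only an illustration, not part of the PostCon tooling: the schema string is an abbreviated copy of Example 5-5, the helper names (load_schema, is_subclass, in_range) are made up for this sketch, and rdfs:range is treated as a check on values in the way this section describes.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
PSTCN = "http://burningbird.net/postcon/elements/1.0/"

# Abbreviated copy of the Example 5-5 schema.
SCHEMA = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}">
  <rdfs:Class rdf:about="{PSTCN}WebPage">
    <rdfs:subClassOf rdf:resource="{RDFS}Resource"/>
  </rdfs:Class>
  <rdfs:Class rdf:about="{PSTCN}Article">
    <rdfs:subClassOf rdf:resource="{PSTCN}WebPage"/>
  </rdfs:Class>
  <rdfs:Class rdf:about="{PSTCN}Example">
    <rdfs:subClassOf rdf:resource="{PSTCN}WebPage"/>
  </rdfs:Class>
  <rdfs:Class rdf:about="{PSTCN}Directory">
    <rdfs:subClassOf rdf:resource="{RDFS}Resource"/>
  </rdfs:Class>
  <rdf:Property rdf:about="{PSTCN}contains">
    <rdfs:domain rdf:resource="{PSTCN}Directory"/>
    <rdfs:range rdf:resource="{PSTCN}WebPage"/>
  </rdf:Property>
</rdf:RDF>"""


def load_schema(xml_text):
    """Return (parents, ranges) read from a simple RDF Schema document."""
    root = ET.fromstring(xml_text)
    parents = {}  # class URI -> set of direct superclass URIs
    ranges = {}   # property URI -> range class URI
    for cls in root.findall(f"{{{RDFS}}}Class"):
        uri = cls.get(f"{{{RDF}}}about")
        parents[uri] = {
            sub.get(f"{{{RDF}}}resource")
            for sub in cls.findall(f"{{{RDFS}}}subClassOf")
        }
    for prop in root.findall(f"{{{RDF}}}Property"):
        rng = prop.find(f"{{{RDFS}}}range")
        if rng is not None:
            ranges[prop.get(f"{{{RDF}}}about")] = rng.get(f"{{{RDF}}}resource")
    return parents, ranges


def is_subclass(cls, ancestor, parents):
    """True if cls is ancestor or reaches it by following rdfs:subClassOf."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return False


def in_range(prop, value_class, parents, ranges):
    """True if a value of value_class is acceptable for prop."""
    return is_subclass(value_class, ranges[prop], parents)


parents, ranges = load_schema(SCHEMA)
for name in ("Article", "Example", "Directory"):
    print(name, in_range(PSTCN + "contains", PSTCN + name, parents, ranges))
```

Article and Example are accepted because each reaches WebPage through rdfs:subClassOf; Directory is rejected because its only superclass is rdfs:Resource.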

RDF Schema Alternatives

RDF isn’t the only specification related to describing schemas. XML documents (and their SGML predecessors) have long been validated through the use of Document Type Definitions (DTDs), described in the first release of the XML specification and still in heavy use. DTDs generally define how elements relate to one another within a schema; for example, they allow applications to check whether a specific element is required or whether one or more elements can be contained within another.

While useful for validating how elements within a schema relate to one another, DTDs have long had their critics. First of all, DTDs are based on a syntax totally unrelated to XML. This forces a person to become familiar with not one but two syntaxes in order to create an XML document that is valid as well as well formed. The following DTD fragment defines an Items element, its child item, and the contents of item:

<!ELEMENT Items (item*)>
<!ELEMENT item (productName, quantity, USPrice, comment?, shipDate?)>
<!ATTLIST item 
   partNum CDATA #REQUIRED>
<!ELEMENT productName (#PCDATA)>
<!ELEMENT quantity (#PCDATA)>
<!ELEMENT USPrice (#PCDATA)>
<!ELEMENT comment (#PCDATA)>
<!ELEMENT shipDate (#PCDATA)>

As you can see, the DTD syntax is fairly intuitive; however, syntactic elegance or not, DTDs do not provide the same type of functionality as the RDF specification. XML DTDs define how elements within a vocabulary relate to one another, not how they relate to the world at large, and the description of their contents is pretty vague. #PCDATA and its attribute cousin, CDATA, just mean “text.” RDF provides a means of recording data within a global context, not just how elements in one specific vocabulary relate to one another.

Another mechanism to record schemas is defined by the W3C XML Schema 1.0 Specification. This specification is more closely related to the functionality used to define a relational table or to describe an object in object-oriented development. As with the DTD syntax, schemas define elements in relation to one another. W3C XML Schema goes beyond DTDs, though, by providing a means of recording data types for the elements and attributes, a functionality long needed with XML vocabularies, as shown in the following fragment based on the specification:

<xsd:element name="Items">
 <xsd:complexType>
  <xsd:sequence>
   <xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
    <xsd:complexType>
     <xsd:sequence>
      <xsd:element name="productName" type="xsd:string"/>
      <xsd:element name="quantity">
       <xsd:simpleType>
        <xsd:restriction base="xsd:positiveInteger">
         <xsd:maxExclusive value="100"/>
        </xsd:restriction>
       </xsd:simpleType>
      </xsd:element>
      <xsd:element name="USPrice"  type="xsd:decimal"/>
      <xsd:element ref="comment"   minOccurs="0"/>
      <xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
     </xsd:sequence>
     <xsd:attribute name="partNum" type="SKU" use="required"/>
    </xsd:complexType>
   </xsd:element>
  </xsd:sequence>
 </xsd:complexType>
</xsd:element>

As you can see, W3C XML Schema is an effective specification for defining XML elements and their relationships, and it records much more information about associated data types than DTDs provide.
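
To make the data typing concrete, here is a rough Python sketch of what the quantity declaration in the fragment asks a validator to check: the text must be an xsd:positiveInteger and must fall below the maxExclusive value of 100. The function name is_valid_quantity is invented for this illustration, and the sketch covers only this one restriction, not W3C XML Schema validation in general.

```python
import re

# Lexical form of xsd:positiveInteger: an optional plus sign, then digits.
_INTEGER = re.compile(r"\+?[0-9]+")


def is_valid_quantity(text):
    """Mimic xsd:positiveInteger restricted by maxExclusive="100"."""
    # Surrounding XML whitespace is ignored before the value is checked.
    lexical = text.strip(" \t\r\n")
    if not _INTEGER.fullmatch(lexical):
        return False
    # Drop the sign and leading zeros so that "007" is read as 7.
    significant = lexical.lstrip("+").lstrip("0") or "0"
    # More than two significant digits already means 100 or more.
    return len(significant) <= 2 and 1 <= int(significant) < 100


for sample in ("1", "99", "100", "0", "-5", "ten", " 42 "):
    print(repr(sample), is_valid_quantity(sample))
```

A DTD that declares quantity as #PCDATA would accept every one of these samples, including "ten"; the typed declaration rejects all but "1", "99", and " 42 ".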

A third approach, RELAX NG Compact Syntax, offers a combination of DTD readability and W3C XML Schema data typing, though it also has a mathematical foundation that in some ways has more in common with RDF than with DTDs or W3C XML Schema. The same example in RELAX NG Compact Syntax looks like this:

Items = element Items { item* }
item =
  element item {
    att.partNum, productName, quantity, USPrice, comment?, shipDate?
  }
att.partNum = attribute partNum { text }
productName = element productName { text }
quantity = element quantity { xsd:positiveInteger {maxExclusive="100"}}
USPrice = element USPrice { xsd:decimal }
comment = element comment { text }
shipDate = element shipDate { xsd:date }

start = Items

All of these schema approaches facilitate automated processing of XML. Still, the various XML schema tools can’t replace the functionality provided by the RDF specification. To overgeneralize, XML tools are concerned with describing markup representations and their contents, while RDF tools are concerned with describing models. You can get a model from a representation or vice versa, but the two approaches focus on different things.

The RDF specification defines information about data within a particular context. It provides a means of recording information at a metadata level that can be used regardless of the domain. RDF’s relationship with XML is that XML is used to serialize an RDF model; RDF is totally unconcerned whether XML is valid (that is, conforming to a DTD, RELAX NG description, or W3C XML Schema) as long as the XML used to serialize an RDF model is well formed. In addition, concepts such as data types and complex and simple element structures—focal points within the W3C XML Schema—again focus on XML as used to define data, primarily for data interchange; they have nothing to do with recording data about data in order to facilitate intelligent web functionality.
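
The point that RDF needs only well-formed XML can be sketched with Python's standard XML parser, which checks well-formedness and does not validate against any DTD or schema. The triples function below is a hypothetical helper that handles only the simplest RDF/XML patterns (typed nodes with rdf:about and property elements carrying rdf:resource or text), and the example.org addresses are placeholders rather than real resources.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PSTCN = "http://burningbird.net/postcon/elements/1.0/"

# No DOCTYPE and no schema reference: the document is only well formed.
GOOD = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:pstcn="{PSTCN}">
  <pstcn:Directory rdf:about="http://example.org/directory">
    <pstcn:contains rdf:resource="http://example.org/article1.htm"/>
  </pstcn:Directory>
</rdf:RDF>"""

# Removing a closing tag makes the XML malformed.
BROKEN = GOOD.replace("</pstcn:Directory>", "")


def tag_to_uri(tag):
    """Turn ElementTree's '{namespace}local' form into a plain URI."""
    return tag[1:].replace("}", "", 1)


def triples(xml_text):
    """Return (subject, predicate, object) tuples from simple RDF/XML."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError if not well formed
    result = []
    for node in root:
        subject = node.get(f"{{{RDF}}}about")
        if node.tag != f"{{{RDF}}}Description":
            # A typed node element implies an rdf:type statement.
            result.append((subject, RDF + "type", tag_to_uri(node.tag)))
        for prop in node:
            obj = prop.get(f"{{{RDF}}}resource")
            if obj is None:
                obj = (prop.text or "").strip()
            result.append((subject, tag_to_uri(prop.tag), obj))
    return result


for triple in triples(GOOD):
    print(triple)

try:
    triples(BROKEN)
except ET.ParseError as err:
    print("not well formed:", err)
```

The first document yields two statements even though nothing has validated it; the second is rejected only because the XML itself is malformed.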
