5.1. XML Overview

As a power user or developer, chances are you have at least heard of XML. XML is to data what Hypertext Markup Language (HTML) is to displaying Web pages in a browser: XML describes the data to be used, whereas HTML describes how information is to be presented on a page. Another big difference between the two is that HTML is primarily for the Web, whereas XML can be used anywhere data is stored or exchanged, whether on a single desktop system, between multiple business systems, or over the Internet.

The World Wide Web Consortium (W3C), founded in 1994, maintains the HTML standard and also sets the standards for XML. Work on XML began at the W3C in 1996.

5.1.1. What Is XML?

XML is a standard format for data files. Although the format itself is agreed upon, the actual tags used depend on the technology, system, or development language working with the data. Many applications take advantage of XML, most commonly through XML documents. An XML document can hold a single table or an entire database. The Office applications are a good example: most of them, including Word, can both import and export XML. You can see an example of exporting XML in the next section of this chapter.

As with HTML, XML files use tags. In XML, however, the tags specify only what the data is, not how the data looks. Another difference between HTML and XML is that XML has much stricter rules for creating tags. If you follow those rules when you create your document, you are said to have created a well-formed document.

5.1.2. XML Documents

While the specific tags included in XML differ based on the technology using them, XML documents are said to be well formed if they conform to the following basic rules of XML:

  • Each XML document must have a unique root element (an element encompassing the entire document).

  • The document has matching start and end tags.

  • The elements do not overlap.

  • Certain reserved characters, such as < and &, are part of the XML syntax. They are not interpreted as the characters themselves if used in the data portion of an element, so they must be escaped there (for example, as &lt; and &amp;).
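The rules above can be checked mechanically, because any XML parser rejects a document that is not well formed. The following is a minimal sketch using Python's standard library; the sample strings and the helper name is_well_formed are made up for illustration and do not come from the chapter's sample files.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts text as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical snippets, one per rule in the list above.
samples = {
    "single root, matched tags": "<customers><name>Ana</name></customers>",
    "two root elements": "<name>Ana</name><name>Maria</name>",
    "missing end tag": "<customers><name>Ana</customers>",
    "overlapping elements": "<a><b>text</a></b>",
    "unescaped ampersand": "<name>Smith & Sons</name>",
    # escape() turns & into &amp; (and < > into &lt; &gt;) before the data is embedded.
    "escaped ampersand": "<name>" + escape("Smith & Sons") + "</name>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first and last samples should pass. Note that this tests well-formedness only; it says nothing about whether the data matches a schema.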

Besides the *.xml file itself, additional files can be created when you work with an XML document. Because XML describes what the data is, separately from how the data is presented, you can present the same data in different ways by using separate specification files.

5.1.3. Standard XML Files

XML documents can be made up of a single file if necessary. When only one file is used, however, it is up to the systems reading the data to figure out the type of data they're dealing with. When using more than one document to specify the data, you can specify properties such as data types and other attributes.

Here are some of the extensions and types of files used for XML documents:

Extension    Description
*.xml        The XML data document. This is a static snapshot of the data itself.
*.xsd        The schema file. The schema is based on the persisted table or query and follows the W3C XSD standard.
*.xsl        The presentation document. The XSL document specifies how the data in the XML is to be displayed, transforming the data for presentation purposes. The *.xsl extension is also used for XSLT documents; XSLT is the part of XSL that handles transformations. An XSLT transformation produces a new output file and can create other types of files, such as HTML, from the XML.
*.htm        The final package. This ties the *.xml (data) and *.xsl (presentation) together to be used on the Web.

There are ways to embed the definition information inside the *.xml file, so that you do not have to include the *.xsd file. However, this is not the recommended practice. One reason is that every business system receiving the file would need to be able to read the embedded definitions. Another is that it is generally accepted that, just as you want to keep the presentation of the data separate from the data itself, it is a good idea to keep the definition separate as well.

If you are passing the data to another business application, then you will probably just send the *.xml and *.xsd files. In the case of InfoPath, if you are going to create a form based on the structure of existing XML data, you have only to specify the *.xsd file. You will see more on this in the section titled "Creating an InfoPath Form Using an Existing XML Document" later in this chapter.

One of the best ways to understand XML files is to see what they actually look like. For the purposes of this section, you will see the XML files created by exporting the tblCustomers table to XML. You can see the original table structure in Figure 5-1.

Figure 5-1

Although this isn't an Access book, if you open Chapter 5.mdb in the samples folder, you can right-click the tblCustomers table in the Tables tab of the database window and choose Export... from the menu. In the Export table dialog box, set XML (*.xml) as the type of document to export to and click Export. You are then presented with the XML Export dialog box displayed in Figure 5-2.

Figure 5-2

In Figure 5-2, you can see three of the basic types of files that can be created. To base an InfoPath form on the file structure, you need only the XSD. To get a good look at what the XML looks like, however, both the data and the schema were exported.

5.1.3.1. The XML Data Document (*.xml)

The XML data document is just that, data. So, the tblCustomers table would look like this:

<?xml version="1.0" encoding="UTF-8"?>
<dataroot xmlns:od="urn:schemas-microsoft-com:officedata" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="tblCustomers.xsd" generated="2004-09-20T16:04:16">
   <tblCustomers>
      <CustomerID>ALFKI</CustomerID>
      <CompanyName>Alfreds Futterkiste</CompanyName>
      <ContactName>Maria Anders</ContactName>
      <ContactTitle>Sales Representative</ContactTitle>
      <Address>Obere Str. 57</Address>
      <City>Berlin</City>
      <PostalCode>12209</PostalCode>
      <Country>Germany</Country>
      <Phone>030-0074321</Phone>
      <Fax>030-0076545</Fax>
   </tblCustomers>
   <tblCustomers>
      <CustomerID>ANATR</CustomerID>
      <CompanyName>Ana Trujillo Emparedados y helados</CompanyName>
      <ContactName>Ana Trujillo</ContactName>
      <ContactTitle>Owner</ContactTitle>
      <Address>Avda. de la Constitución 2222</Address>
      <City>México D.F.</City>
      <PostalCode>05021</PostalCode>
      <Country>Mexico</Country>
      <Phone>(5) 555-4729</Phone>
      <Fax>(5) 555-3745</Fax>
   </tblCustomers>
   ...
   <tblCustomers>
      <CustomerID>WOLZA</CustomerID>
      <CompanyName>Wolski  Zajazd</CompanyName>
      <ContactName>Zbyszek Piestrzeniewicz</ContactName>
      <ContactTitle>Owner</ContactTitle>
      <Address>ul. Filtrowa 68</Address>
      <City>Warszawa</City>
      <PostalCode>01-012</PostalCode>
      <Country>Poland</Country>
      <Phone>(26) 642-7012</Phone>
      <Fax>(26) 642-7012</Fax>
   </tblCustomers>
</dataroot>

The first line, <?xml version="1.0" encoding="UTF-8"?>, is the XML declaration. It specifies the XML version and the character encoding being used.

The <dataroot> start tag carries information about the XML file as a whole, such as the schema file being used and when the file was generated.

The next tag, <tblCustomers>, names the table and opens one record. The tags for the fields in that record follow, and the end tag </tblCustomers> closes the record; this pattern repeats for every record in the table. If multiple tables were included, the elements for the next table would follow those for tblCustomers.

There are additional customer records included in tblCustomers.xml, but they are represented with an ellipsis so as not to waste space.

Finally, the end tag for the dataroot is displayed: </dataroot>. You can see from the preceding listing that no information about the structure of the data was included other than the names for the table and fields. The other information was specified in the *.xsd file. Field tags in the *.xml document match up directly with field tags in the schema document.
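To illustrate how a receiving system might walk this structure, here is a minimal sketch in Python using the standard library's ElementTree module. The two records are copied from the listing above; the namespace declarations on <dataroot> are left out to keep the sample short, and the function name read_records is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# Two records copied from the tblCustomers.xml listing (some fields omitted).
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<dataroot generated="2004-09-20T16:04:16">
   <tblCustomers>
      <CustomerID>ALFKI</CustomerID>
      <CompanyName>Alfreds Futterkiste</CompanyName>
      <City>Berlin</City>
   </tblCustomers>
   <tblCustomers>
      <CustomerID>ANATR</CustomerID>
      <CompanyName>Ana Trujillo Emparedados y helados</CompanyName>
      <City>México D.F.</City>
   </tblCustomers>
</dataroot>"""


def read_records(xml_bytes: bytes, record_tag: str) -> list[dict]:
    """Return one dict per record element, mapping field tag to field text."""
    root = ET.fromstring(xml_bytes)
    return [
        {field.tag: field.text for field in record}
        for record in root.findall(record_tag)
    ]


# Encode to bytes so the parser honors the declared UTF-8 encoding.
root = ET.fromstring(SAMPLE.encode("utf-8"))
print("generated:", root.get("generated"))

records = read_records(SAMPLE.encode("utf-8"), "tblCustomers")
for record in records:
    print(record["CustomerID"], "|", record["CompanyName"], "|", record["City"])
```

Every value comes back as text, because the data document alone says nothing about data types; that information lives in the schema file.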

5.1.3.2. The Schema File (*.xsd)

The schema file holds the definitions not only for the individual fields but also for the XML file as a whole. The <xsd:element> tags in the following listing are used to specify each element in the *.xml file used for tblCustomers. Other tags are used to specify different attributes matching the properties set for the table and fields in Access.

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:od="urn:schemas-microsoft-com:officedata">
   <xsd:element name="dataroot">
      <xsd:complexType>
         <xsd:sequence>
            <xsd:element ref="tblCustomers" minOccurs="0" maxOccurs="unbounded"/>
         </xsd:sequence>
         <xsd:attribute name="generated" type="xsd:dateTime"/>
      </xsd:complexType>
   </xsd:element>
   <xsd:element name="tblCustomers">
      <xsd:annotation>
         <xsd:appinfo>
            <od:index index-name="PrimaryKey" index-key="CustomerID " primary="yes" unique="yes" clustered="no"/>
            <od:index index-name="City" index-key="City " primary="no" unique="no" clustered="no"/>
            <od:index index-name="CompanyName" index-key="CompanyName " primary="no" unique="no" clustered="no"/>
            <od:index index-name="FavoriteShipperID" index-key="FavoriteShipperID " primary="no" unique="no" clustered="no"/>
            <od:index index-name="PostalCode" index-key="PostalCode " primary="no" unique="no" clustered="no"/>
            <od:index index-name="Region" index-key="Region " primary="no" unique="no" clustered="no"/>
         </xsd:appinfo>
      </xsd:annotation>
      <xsd:complexType>
         <xsd:sequence>
            <xsd:element name="CustomerID" minOccurs="0" od:jetType="text" od:sqlSType="nvarchar">
               <xsd:simpleType>
                  <xsd:restriction base="xsd:string">
                     <xsd:maxLength value="5"/>
                  </xsd:restriction>
               </xsd:simpleType>
            </xsd:element>
            <xsd:element name="CompanyName" minOccurs="1" od:jetType="text" od:sqlSType="nvarchar" od:nonNullable="yes">
               <xsd:simpleType>
                  <xsd:restriction base="xsd:string">
                     <xsd:maxLength value="40"/>
                  </xsd:restriction>
               </xsd:simpleType>
            </xsd:element>
            ...
            <xsd:element name="FavoriteShipperID" minOccurs="0" od:jetType="longinteger" od:sqlSType="int" type="xsd:int"/>
         </xsd:sequence>
      </xsd:complexType>
   </xsd:element>
</xsd:schema>

When you are passing data from one business system to another, as long as the system can read XML, the two file types discussed here are all you need to hand off. If you want to specify how the presentation is handled, use the *.xsl file. For our purposes the first two files do the job.

As with the tblCustomers.xml file, there are additional elements included in the tblCustomers.xsd file, but they are represented by an ellipsis so as not to waste space.
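Because the schema is itself an XML document, a program can read the field definitions out of it with an ordinary XML parser. The sketch below, again in Python with the standard library, pulls the name, required flag, and maximum length for each field. It embeds only the three field definitions shown in the listing, and the function name describe_fields is an assumption for this example. It reads the schema; it does not validate data against it, which the standard library cannot do on its own.

```python
import xml.etree.ElementTree as ET

XSD_NS = {"xsd": "http://www.w3.org/2001/XMLSchema"}

# Trimmed copy of the tblCustomers.xsd listing: only the fields shown in the text.
SCHEMA = """<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:od="urn:schemas-microsoft-com:officedata">
   <xsd:element name="tblCustomers">
      <xsd:complexType>
         <xsd:sequence>
            <xsd:element name="CustomerID" minOccurs="0" od:jetType="text" od:sqlSType="nvarchar">
               <xsd:simpleType>
                  <xsd:restriction base="xsd:string">
                     <xsd:maxLength value="5"/>
                  </xsd:restriction>
               </xsd:simpleType>
            </xsd:element>
            <xsd:element name="CompanyName" minOccurs="1" od:jetType="text" od:sqlSType="nvarchar" od:nonNullable="yes">
               <xsd:simpleType>
                  <xsd:restriction base="xsd:string">
                     <xsd:maxLength value="40"/>
                  </xsd:restriction>
               </xsd:simpleType>
            </xsd:element>
            <xsd:element name="FavoriteShipperID" minOccurs="0" od:jetType="longinteger" od:sqlSType="int" type="xsd:int"/>
         </xsd:sequence>
      </xsd:complexType>
   </xsd:element>
</xsd:schema>"""


def describe_fields(schema_bytes: bytes, table: str) -> list[dict]:
    """List name, required flag, and max length for each field of a table element."""
    root = ET.fromstring(schema_bytes)
    table_el = root.find(f"xsd:element[@name='{table}']", XSD_NS)
    fields = []
    for el in table_el.findall("xsd:complexType/xsd:sequence/xsd:element", XSD_NS):
        max_len = el.find("xsd:simpleType/xsd:restriction/xsd:maxLength", XSD_NS)
        fields.append({
            "name": el.get("name"),
            # minOccurs defaults to 1 in XSD, so only an explicit 0 means optional.
            "required": el.get("minOccurs", "1") != "0",
            "max_length": int(max_len.get("value")) if max_len is not None else None,
        })
    return fields


fields = describe_fields(SCHEMA.encode("utf-8"), "tblCustomers")
for field in fields:
    print(field)
```

The field names returned here are the same names used as tags in the data document, which is how the two files are matched up.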

5.1.3.3. Try It Out: Exporting tblShippers from Access to XML

To give you more exposure to working with XML files and transferring them between systems, you will use Access to export the tblShippers table from the Chapter 5.mdb database. You will also use the *.xsd file you create here in a section later in this chapter:

  1. Open the Chapter 5.mdb database in Access.

  2. Click on the Tables tab.

  3. Right-click the tblShippers table, and choose Export... from the menu. The Export table dialog box opens.

  4. Select XML (*.xml) for the Save as type. The dialog then looks like Figure 5-3.

    Figure 5-3

  5. Click Export. The XML Export dialog box then appears. For this example you once again use the default of creating the *.xml and *.xsd files.

  6. Click OK to create the files.

To test the files, go to the folder they were created in and open them in Notepad, WordPad, or an XML editor.
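If you would rather check the exported pair with a script than by eye, the following Python sketch parses the data file, reads the schema file name from the xsi:noNamespaceSchemaLocation attribute on the root element, and confirms that the schema file sits next to it. The function name check_export is an assumption, and the demo writes small stand-in files to a temporary folder (the ShipperID field is hypothetical) rather than reading the real export, whose location depends on where you saved it.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

XSI = "http://www.w3.org/2001/XMLSchema-instance"


def check_export(xml_path) -> dict:
    """Parse an exported data file and look for the schema file it names."""
    xml_path = Path(xml_path)
    root = ET.parse(xml_path).getroot()  # raises ParseError if not well formed
    schema_name = root.get(f"{{{XSI}}}noNamespaceSchemaLocation")
    schema_path = xml_path.parent / schema_name if schema_name else None
    schema_found = schema_path is not None and schema_path.exists()
    if schema_found:
        ET.parse(schema_path)  # the schema must be well-formed XML too
    return {
        "root": root.tag,
        "records": len(root),  # number of child elements under the root
        "schema": schema_name,
        "schema_found": schema_found,
    }


# Stand-in files; the single ShipperID field is a hypothetical placeholder.
DATA = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    f'<dataroot xmlns:xsi="{XSI}" xsi:noNamespaceSchemaLocation="tblShippers.xsd">\n'
    "   <tblShippers><ShipperID>1</ShipperID></tblShippers>\n"
    "</dataroot>\n"
)
SCHEMA = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"/>\n'
)

with tempfile.TemporaryDirectory() as folder:
    Path(folder, "tblShippers.xml").write_text(DATA, encoding="utf-8")
    Path(folder, "tblShippers.xsd").write_text(SCHEMA, encoding="utf-8")
    report = check_export(Path(folder, "tblShippers.xml"))

print(report)
```

To run the check against your own export, pass the path of the tblShippers.xml file you created to check_export instead of the temporary file.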

Now that you have created an XML file, read on to see how to use these files for InfoPath forms.
