MSXML supports two basic APIs for processing XML: DOM and SAX (the Simple API for XML). Let's start with DOM.
As I've mentioned, the DOM method involves parsing an XML document and loading it into a tree structure in memory. An XML document parsed via DOM is known as a DOM document (or just DOM, for short). Listing 8.9 presents a simple VB application that demonstrates parsing an XML document via DOM and querying it for a particular node set. (You can find the source code for this app in the CH08domltest subfolder on the CD accompanying this book.)
Private Sub Command1_Click()
    ' Default document, used when Text1 is empty
    Dim bstrDoc As String
    bstrDoc = "<Songs> " & _
        "<Song title='One More Day' artist='Diamond Rio' />" & _
        "<Song title='Hard Habit to Break' artist='Chicago' />" & _
        "<Song title='Forever' artist='Kenny Loggins' />" & _
        "<Song title='Boys of Summer' artist='Don Henley' />" & _
        "<Song title='Cherish' artist='Kool and the Gang' />" & _
        "<Song title='Dance' artist='Lee Ann Womack' />" & _
        "<Song title='I Will Always Love You' artist='Whitney Houston' />" & _
        "</Songs>"

    Dim xmlDoc As New DOMDocument30

    If Len(Text1.Text) = 0 Then
        Text1.Text = bstrDoc
    End If

    ' loadXML returns False if the text is not well-formed XML
    If Not xmlDoc.loadXML(Text1.Text) Then
        MsgBox "Error loading document"
    Else
        Dim oNodes As IXMLDOMNodeList
        Dim oNode As IXMLDOMNode
        Dim sName As String
        Dim sData As String

        ' Default query, used when Text2 is empty
        If Len(Text2.Text) = 0 Then
            Text2.Text = "//Song/@title"
        End If

        Set oNodes = xmlDoc.selectNodes(Text2.Text)

        ' Display the name and XML of each matching node
        For Each oNode In oNodes
            If Not (oNode Is Nothing) Then
                sName = oNode.nodeName
                sData = oNode.xml
                MsgBox "Node <" + sName + ">:" _
                    + vbNewLine + vbTab + sData + vbNewLine
            End If
        Next

        Set xmlDoc = Nothing
    End If
End Sub
We begin by instantiating a DOMDocument object. The DOMDocument object is the key to everything else we do with DOM using MSXML. We next call DOMDocument.loadXML to parse the XML document and load it into the DOM tree. Once the document is loaded into memory, we can query it via XPath queries or manipulate it further by making DOMDocument method calls. In this example, we call the selectNodes method to query the document via XPath. DOMDocument's selectNodes method returns a node list object, which we can then loop through using For Each. For each node in the node set, we display the node name followed by its contents. Parsing an XML document via DOM turns the document into a memory object that we can then work with just as we would any other object. We're able to access and manipulate the document as though it were an object because that's exactly what it is.
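To make the "manipulate it further" part of this concrete, here is a minimal sketch rather than part of the listing. The routine name ShowDomBasics and the sample data are illustrative assumptions; the calls themselves (setProperty, parseError, createElement, setAttribute, appendChild, selectNodes) are standard MSXML 3.0 members, and the project is assumed to reference MSXML 3.0. The sketch reports why a load failed instead of showing a generic message, adds a node to the tree, and then queries the modified document.

```vb
' Sketch only: ShowDomBasics is a hypothetical routine, not part of Listing 8.9
Private Sub ShowDomBasics()
    Dim xmlDoc As New DOMDocument30

    ' MSXML 3.0 defaults to the older XSLPattern dialect for selectNodes;
    ' request full XPath explicitly
    xmlDoc.setProperty "SelectionLanguage", "XPath"

    If Not xmlDoc.loadXML("<Songs><Song title='Forever' artist='Kenny Loggins' /></Songs>") Then
        ' parseError describes what went wrong and where
        With xmlDoc.parseError
            MsgBox "Line " & .Line & ", position " & .linepos & ": " & .reason
        End With
        Exit Sub
    End If

    ' Manipulate the tree: build a new <Song> element and attach it to the root
    Dim oSong As IXMLDOMElement
    Set oSong = xmlDoc.createElement("Song")
    oSong.setAttribute "title", "Cherish"
    oSong.setAttribute "artist", "Kool and the Gang"
    xmlDoc.documentElement.appendChild oSong

    ' Query the modified document
    Dim oNodes As IXMLDOMNodeList
    Set oNodes = xmlDoc.selectNodes("/Songs/Song[@artist='Kool and the Gang']")
    MsgBox "Matches: " & oNodes.length & vbNewLine & xmlDoc.xml
End Sub
```

Because the whole document lives in memory, the newly appended node is immediately visible to the next query; no reparsing is required.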
Unlike DOM, SAX is not a W3C standard; it is a de facto standard that was developed collaboratively by members of the XML-DEV mailing list. Rather than giving an application access to XML data by materializing the document entirely in memory, SAX is an event-driven API. An application processes an XML document via SAX by responding to SAX events. As the SAX processor reads through the document, it raises an event each time it encounters a new node or section of the document. It then triggers the appropriate application event handler code and passes the relevant data about the event to the application. The application can then decide what to do in response: it could store the event data in some type of tree structure, as is the case with DOM processing; it could ignore the event; it could search the event data for a particular node or value; or it could take some other action. Once the application handles the raised event, the SAX processor continues processing the document. At no point does it store the entire document in memory as DOM does. It's really just a parsing mechanism to which an application can attach its own functionality. This is, in fact, the case with MSXML's DOM loader: SAX is its underlying parsing mechanism. MSXML's DOM loader sets up SAX event handlers that store the data passed to them via SAX in a DOM tree.
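As a rough illustration of the "search the event data for a particular node" option, the following sketch shows a content handler that simply counts elements as they stream past and stores nothing else. The class name SongCounter, its Count field, and the file name songs.xml are hypothetical; the interface members are those of MSXML's IVBSAXContentHandler. VB requires every member of an implemented interface to be present, so the events we don't care about get empty bodies.

```vb
' Class module SongCounter (hypothetical name)
Option Explicit
Implements IVBSAXContentHandler

' Number of <Song> elements seen so far
Public Count As Long

Private Sub IVBSAXContentHandler_startElement(strNamespaceURI As String, _
        strLocalName As String, strQName As String, _
        ByVal oAttributes As MSXML2.IVBSAXAttributes)
    ' React only to the element we are searching for; ignore everything else
    If strLocalName = "Song" Then Count = Count + 1
End Sub

' The remaining events are ignored, but the members must still exist
Private Sub IVBSAXContentHandler_endElement(strNamespaceURI As String, _
        strLocalName As String, strQName As String)
End Sub

Private Sub IVBSAXContentHandler_characters(strChars As String)
End Sub

Private Property Set IVBSAXContentHandler_documentLocator(ByVal RHS As MSXML2.IVBSAXLocator)
End Property

Private Sub IVBSAXContentHandler_startDocument()
    Count = 0
End Sub

Private Sub IVBSAXContentHandler_endDocument()
End Sub

Private Sub IVBSAXContentHandler_startPrefixMapping(strPrefix As String, strURI As String)
End Sub

Private Sub IVBSAXContentHandler_endPrefixMapping(strPrefix As String)
End Sub

Private Sub IVBSAXContentHandler_ignorableWhitespace(strChars As String)
End Sub

Private Sub IVBSAXContentHandler_processingInstruction(strTarget As String, strData As String)
End Sub

Private Sub IVBSAXContentHandler_skippedEntity(strName As String)
End Sub

' Possible use from a form (songs.xml is a hypothetical file):
'   Dim reader As New SAXXMLReader
'   Dim counter As New SongCounter
'   Set reader.contentHandler = counter
'   reader.parseURL App.Path & "\songs.xml"
'   MsgBox "Songs found: " & counter.Count
```

The handler's only state is a single counter, so its memory use stays constant no matter how large the document is.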
Because SAX doesn't persist document data in memory, it inherently uses far less memory than DOM, and the difference grows with the size of the document. SAX is also much more trouble to use: the application must track for itself any context it needs from one event to the next. By persisting documents in memory, DOM makes working with XML documents as easy as working with any other kind of object.
Listing 8.10 shows some VB code that demonstrates how to use SAX. It consists of three main modules: the main form, a content handler class, and an error handler class. (You can find the full source code for this application in the SAX subfolder under the CH08 folder on the CD accompanying this book.)
' Main form
Option Explicit

Private Sub Command1_Click()
    'Create the SAX reader object
    Dim reader As New SAXXMLReader

    'Set up the event handlers
    Dim CHandler As New ContentHandler
    Set reader.ContentHandler = CHandler
    Dim EHandler As New ErrorHandler
    Set reader.ErrorHandler = EHandler

    Text1.text = ""
    On Error GoTo ErrorTrap
    'Parse the file named in Text2, located in the application folder
    reader.parseURL App.Path & "\" & Text2.text
    Exit Sub

ErrorTrap:
    Text1.text = Text1.text & "Error: " & Err.Number & " : " & Err.Description
End Sub

' Content handler
Option Explicit
Implements IVBSAXContentHandler

Private Sub IVBSAXContentHandler_startElement(strNamespaceURI As String, _
        strLocalName As String, strQName As String, _
        ByVal attributes As MSXML2.IVBSAXAttributes)
    Form1.Text1.text = Form1.Text1.text & "__ELEMENT START__" & vbCrLf & _
        "<" & strLocalName
    'Append each attribute as name="value"
    Dim i As Integer
    For i = 0 To (attributes.length - 1)
        Form1.Text1.text = Form1.Text1.text & " " & _
            attributes.getLocalName(i) & "=""" & attributes.getValue(i) & """"
    Next
    Form1.Text1.text = Form1.Text1.text & ">" & vbCrLf
End Sub

Private Sub IVBSAXContentHandler_endElement(strNamespaceURI As String, _
        strLocalName As String, strQName As String)
    Form1.Text1.text = Form1.Text1.text & "__ELEMENT END__" & vbCrLf & _
        "</" & strLocalName & ">" & vbCrLf
End Sub

Private Sub IVBSAXContentHandler_characters(text As String)
    'Convert bare line feeds so they display correctly in the text box
    text = Replace(text, vbLf, vbCrLf)
    Form1.Text1.text = Form1.Text1.text & "__CHARACTERS__" & vbCrLf & _
        text & vbCrLf
End Sub

Private Property Set IVBSAXContentHandler_documentLocator(ByVal RHS As MSXML2.IVBSAXLocator)
    Form1.Text1.text = Form1.Text1.text & "__DOCUMENT_LOCATOR__" & vbCrLf
End Property

Private Sub IVBSAXContentHandler_endDocument()
    Form1.Text1.text = Form1.Text1.text & "__DOCUMENT END__" & vbCrLf
End Sub

Private Sub IVBSAXContentHandler_endPrefixMapping(strPrefix As String)
    Form1.Text1.text = Form1.Text1.text & "__PREFIX MAPPING__" & vbCrLf & _
        strPrefix & vbCrLf
End Sub

Private Sub IVBSAXContentHandler_ignorableWhitespace(strChars As String)
    Form1.Text1.text = Form1.Text1.text & "__IGNORABLE WHITESPACE__" & vbCrLf & _
        strChars & vbCrLf
End Sub

Private Sub IVBSAXContentHandler_processingInstruction(target As String, data As String)
    Form1.Text1.text = Form1.Text1.text & "__PROCESSING INSTRUCTION__" & vbCrLf & _
        "<?" & target & " " & data & "?>" & vbCrLf
End Sub

Private Sub IVBSAXContentHandler_skippedEntity(strName As String)
    Form1.Text1.text = Form1.Text1.text & "__SKIPPED ENTITY__" & vbCrLf & _
        strName & vbCrLf
End Sub

Private Sub IVBSAXContentHandler_startDocument()
    Form1.Text1.text = Form1.Text1.text & "__DOCUMENT START__" & vbCrLf
End Sub

Private Sub IVBSAXContentHandler_startPrefixMapping(strPrefix As String, strURI As String)
    Form1.Text1.text = Form1.Text1.text & "__START PREFIX MAPPING__" & _
        strPrefix & " " & strURI & " " & vbCrLf
End Sub

' Error handler
Option Explicit
Implements IVBSAXErrorHandler

Private Sub IVBSAXErrorHandler_fatalError(ByVal lctr As IVBSAXLocator, _
        msg As String, ByVal errCode As Long)
    Form1.Text1.text = Form1.Text1.text & "Fatal error: " & msg & _
        " Code: " & errCode
End Sub

Private Sub IVBSAXErrorHandler_error(ByVal lctr As IVBSAXLocator, _
        msg As String, ByVal errCode As Long)
    Form1.Text1.text = Form1.Text1.text & "Error: " & msg & _
        " Code: " & errCode
End Sub

Private Sub IVBSAXErrorHandler_ignorableWarning(ByVal oLocator As MSXML2.IVBSAXLocator, _
        strErrorMessage As String, ByVal nErrorCode As Long)
    'Warnings are deliberately ignored
End Sub
As I said earlier, an application makes use of the SAX engine by invoking the SAX parser and responding to the events it raises. To use MSXML's SAX engine in a VB application, you implement SAX interfaces such as IVBSAXContentHandler, IVBSAXErrorHandler, IVBSAXDeclHandler, IVBSAXDTDHandler, and IVBSAXLexicalHandler. Implementing these interfaces amounts to setting up event handlers to respond to the events they define. In this example code, I've implemented IVBSAXContentHandler and IVBSAXErrorHandler via the ContentHandler and ErrorHandler classes.
We begin by instantiating a SAXXMLReader object. This object will process an XML document we pass it and raise events as appropriate as it reads through the document. The code in the ContentHandler and ErrorHandler classes will respond to these events and write descriptive text to the main form.
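The error handler can say more than the message and code alone. The locator passed to each error event exposes the position at which the parser stopped, through the lineNumber and columnNumber properties of IVBSAXLocator. The following is a sketch of one possible variation on the ErrorHandler class, not the version on the CD; it assumes the same Form1 and Text1 used by the listing.

```vb
' Class module: a variation on ErrorHandler that reports where the error occurred
Option Explicit
Implements IVBSAXErrorHandler

Private Sub IVBSAXErrorHandler_fatalError(ByVal oLocator As MSXML2.IVBSAXLocator, _
        strErrorMessage As String, ByVal nErrorCode As Long)
    ' The locator reports the parser's position in the document
    Form1.Text1.Text = Form1.Text1.Text & _
        "Fatal error at line " & oLocator.lineNumber & _
        ", column " & oLocator.columnNumber & ": " & _
        strErrorMessage & " Code: " & nErrorCode & vbCrLf
End Sub

Private Sub IVBSAXErrorHandler_error(ByVal oLocator As MSXML2.IVBSAXLocator, _
        strErrorMessage As String, ByVal nErrorCode As Long)
    Form1.Text1.Text = Form1.Text1.Text & _
        "Error at line " & oLocator.lineNumber & _
        ", column " & oLocator.columnNumber & ": " & _
        strErrorMessage & " Code: " & nErrorCode & vbCrLf
End Sub

Private Sub IVBSAXErrorHandler_ignorableWarning(ByVal oLocator As MSXML2.IVBSAXLocator, _
        strErrorMessage As String, ByVal nErrorCode As Long)
    ' Warnings are ignored in this sketch
End Sub
```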