Navigating a Non-XML Document with XPath

One of the most interesting aspects of .NET’s XML implementation is that you can read non-XML data as if it were XML by subclassing its abstract types. You already saw this with XmlReader and XmlWriter.

Because you can load an XmlDocument from any XmlReader, you can call that XmlDocument’s SelectSingleNode( ) and CreateNavigator( ) methods to navigate the document via XPath. You can also create an XPathDocument from any XmlReader and use its CreateNavigator( ) method. Beyond that, you can write a custom implementation of XPathNavigator to navigate any source document with XPath.
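
As a minimal sketch of the first two options, the following program feeds an XmlReader to both XmlDocument and XPathDocument. The class name ReaderSources, the sample XML, and its element names are made up for illustration; any XmlReader subclass could stand in for the XmlTextReader used here.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;

public static class ReaderSources {
  // Hypothetical sample data; any XmlReader could supply it instead.
  public const string SampleXml =
    "<po><item sku='a1'/><item sku='b2'/></po>";

  // Loads an XmlDocument from a reader and selects one node with XPath.
  public static string FirstSku( ) {
    XmlReader reader = new XmlTextReader(new StringReader(SampleXml));
    XmlDocument document = new XmlDocument( );
    document.Load(reader);
    reader.Close( );
    XmlNode node = document.SelectSingleNode("/po/item/@sku");
    return node.Value;
  }

  // Builds an XPathDocument from a reader and counts the matching nodes.
  public static int CountItems( ) {
    XmlReader reader = new XmlTextReader(new StringReader(SampleXml));
    XPathDocument document = new XPathDocument(reader);
    reader.Close( );
    XPathNavigator navigator = document.CreateNavigator( );
    XPathNodeIterator iterator = navigator.Select("/po/item");
    return iterator.Count;
  }

  public static void Main( ) {
    Console.WriteLine(FirstSku( ));   // the sku of the first item
    Console.WriteLine(CountItems( )); // the number of item elements
  }
}
```

Both documents are fully loaded before the reader is closed, so the XPath calls never touch the reader again.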

Using a custom XmlReader

You created and used XmlPyxReader in Chapter 4. Because XmlPyxReader is an XmlReader like any other, you can pass it to XPathDocument’s constructor and navigate the result using XPath:

XmlReader reader = new XmlPyxReader(filename);
XPathDocument document = new XPathDocument(reader);

XPathNavigator navigator = document.CreateNavigator( );
XPathNodeIterator iterator = navigator.Select(xpathExpression);
Console.WriteLine("{0} nodes matched.", iterator.Count);
while (iterator.MoveNext( )) {
  Console.WriteLine(iterator.Current.LocalName);
}
reader.Close( );

Using a custom XPathNavigator

The next possibility is to skip over the custom XmlReader and go directly to a custom XPathNavigator. Like custom XmlReaders, custom XPathNavigators have numerous methods and properties to implement.

You’ve already seen that Angus Hardware uses XML files to manage their purchase orders. But what about when they want to manage the purchase orders at a higher level? For example, POs may come in from any of several clients, and they should be managed by the date on which they arrive.

Angus Hardware’s IT department maintains POs in a filesystem, with a structure as shown in Figure 6-1.

Figure 6-1. Purchase order directory structure

Obviously, they’d like to be able to find particular purchase orders quickly, either by client or by date. What they need is a custom XPathNavigator.

First, some design specifics. Each directory and PO file in the PO tree will be represented as an element. None of these elements have attributes. At the very lowest level, the PO XML file itself will be represented as an element with a name in the form ponumber.
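
To experiment with this design, it helps to have a small tree on disk. The sketch below creates one PO file under a base directory. The class name SamplePoTree and every name below POs (clientA, year2002, month06, day15, po1001.xml) are hypothetical; they only follow the client, year, month, day, and po prefixes used in this section.

```csharp
using System;
using System.IO;

public static class SamplePoTree {
  // Creates POs/clientA/year2002/month06/day15/po1001.xml under baseDir
  // and returns the full path of the PO file. All names are hypothetical.
  public static string Create(string baseDir) {
    string dayDir = baseDir;
    string[] levels = { "POs", "clientA", "year2002", "month06", "day15" };
    foreach (string level in levels) {
      dayDir = Path.Combine(dayDir, level);
    }
    Directory.CreateDirectory(dayDir);  // creates every missing level

    string poFile = Path.Combine(dayDir, "po1001.xml");
    File.WriteAllText(poFile, "<po/>"); // placeholder content
    return poFile;
  }

  public static void Main( ) {
    string baseDir = Path.Combine(Path.GetTempPath( ), "AngusHardwareSample");
    Console.WriteLine(Create(baseDir));
  }
}
```

Under the design above, each directory along that path becomes a nested element, ending at the element for the PO file.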

I’m going to show you one way to write a custom XPathNavigator to navigate a filesystem using XPath. I’ll go through this code one section at a time, beginning with the constructors for the FileSystemNavigator class:

public FileSystemNavigator( ) {
  rootDir = new DirectoryInfo(Environment.CurrentDirectory);
  state.Push(new FileSystemState(rootDir));
}

public FileSystemNavigator(string path) {
  rootDir = new DirectoryInfo(path);
  state.Push(new FileSystemState(rootDir));
}

private FileSystemNavigator(FileSystemInfo d) {
  rootDir = (DirectoryInfo)d;
  state.Push(new FileSystemState(rootDir));
}

The three constructors should handle all the cases of interest: the default, which uses the current directory; one that takes a string, which is the path name; and a private one, used by Clone( ), that takes a FileSystemInfo representing the root directory of the file structure. FileSystemInfo is the base type from which both FileInfo and DirectoryInfo are derived.

The three private instance variables hold the following: the root directory, for later reference; an XmlNameTable, which will be used externally for atomized name comparisons; and a Stack, holding the current node and its ancestors as the document is navigated:

private DirectoryInfo rootDir;
private XmlNameTable nameTable = new NameTable( );
private Stack state = new Stack( );

CurrentState is a property I’ve defined for convenience. It returns the current FileSystemState (an internal class that you’ll see later in this section) by calling Peek( ) on the state instance variable:

private FileSystemState CurrentState {
  get {
    return (FileSystemState)state.Peek( );
  }
}

These two GetChildren( ) convenience methods, the instance version of which calls the static version, return only the child nodes that you’re interested in. For example, if the current node is the year2002 element, you’re only interested in its child elements whose names begin with month, not any other files or directories that happen to be in the year2002 directory:

private FileSystemInfo [ ] GetChildren( ) {
  return GetChildren(CurrentState.Entry);
}

internal static FileSystemInfo [ ] GetChildren(FileSystemInfo entry) {
  if (entry is DirectoryInfo) {
    DirectoryInfo dir = (DirectoryInfo)entry;
    if (dir.Name == "POs") {
      return dir.GetDirectories("client*");
    } else if (dir.Name.StartsWith("client")) {
      return dir.GetDirectories("year*");
    } else if (dir.Name.StartsWith("year")) {
      return dir.GetDirectories("month*");
    } else if (dir.Name.StartsWith("month")) {
      return dir.GetDirectories("day*");
    } else if (dir.Name.StartsWith("day")) {
      return dir.GetFiles("po*.xml");
    } else {
      return dir.GetDirectories("POs");
    }
  }
  return new FileSystemInfo [0];
}

The Clone( ) method is required by the ICloneable interface, which XPathNavigator implements and FileSystemNavigator therefore inherits:

public override XPathNavigator Clone( ) {
  FileSystemNavigator fsn = new FileSystemNavigator(CurrentState.Entry);
  fsn.nameTable = this.nameTable;
  return fsn;
}

The rest of the methods override XPathNavigator’s abstract methods and properties. To save space, I’ve not reproduced here the ones that unconditionally return an empty string (BaseURI, XmlLang, Value, GetAttribute( ), GetNamespace( ), Prefix, NamespaceURI) or false (HasAttributes, MoveToAttribute( ), MoveToFirstAttribute( ), MoveToNextAttribute( ), MoveToNamespace( ), MoveToFirstNamespace( ), MoveToNextNamespace( ), MoveToId( )). These methods and properties are irrelevant here: the filesystem model is not identified by a URL and isn’t itself XML, its pseudo-elements have no value, and it includes neither attributes nor namespaces.

In this model, each node is either the root or an element. If the current directory is the root directory, the type must be XPathNodeType.Root. Otherwise, it is XPathNodeType.Element:

public override XPathNodeType NodeType { 
  get {
    if (state.Count == 1)
      return XPathNodeType.Root;
    else 
      return XPathNodeType.Element;
  }
}

Each element’s Name is simply the name of the current FileSystemInfo entry. Since the filesystem has no namespace, the Name is the same as the LocalName. Within LocalName, the name is added to the nameTable instance variable so that atomized string comparisons use the XmlNameTable properly:

public override string LocalName { 
  get {
    string name = CurrentState.Entry.Name;
    // Add( ) returns the atomized instance, so hand that one back
    return nameTable.Add(name);
  }
}

public override string Name { 
  get { 
    return LocalName;
  }
}

The NameTable property simply returns the nameTable instance variable:

public override XmlNameTable NameTable {
  get {
    return nameTable; 
  }
}
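
Atomization means that equal names added to one XmlNameTable come back as the same string object, so callers can compare them by reference instead of character by character. A minimal sketch of that behavior follows; the class name AtomizedNames is illustrative only.

```csharp
using System;
using System.Xml;

public static class AtomizedNames {
  // Returns true when two separately built, equal strings atomize
  // to the same object in one NameTable.
  public static bool SameAtom( ) {
    XmlNameTable table = new NameTable( );

    // Build the strings at run time so they start as distinct objects.
    string first = new string("year2002".ToCharArray( ));
    string second = new string("year2002".ToCharArray( ));

    // Add( ) returns the table's single instance for that name.
    object atomA = table.Add(first);
    object atomB = table.Add(second);
    return Object.ReferenceEquals(atomA, atomB);
  }

  public static void Main( ) {
    Console.WriteLine(SameAtom( ));
  }
}
```

The reference comparison only works on the strings that Add( ) returns, which is why a navigator should hand back the atomized instance rather than the string it started with.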

Any node with no children is empty. The HasChildren property uses the GetChildren( ) convenience method to get, and count, the child nodes:

public override bool IsEmptyElement { 
  get { 
    return !HasChildren;
  }
}

public override bool HasChildren { 
  get {
    return (GetChildren( ).Length > 0);
  }
}

The MoveTo*( ) methods make sure that the FileSystemState at the top of the stack always reflects the current node. To do this, they change its Entry and Position fields as necessary. Additionally, MoveToParent( ) pops the FileSystemState off the top of the stack, and MoveToFirstChild( ) pushes a new one on. MoveToRoot( ) and MoveToDocumentElement( ) clear the stack and push a new FileSystemState representing the top of the tree:

public override bool MoveToNext( ) {
  if (CurrentState.Position < CurrentState.Siblings.Length - 1) {
    CurrentState.Entry = CurrentState.Siblings[++CurrentState.Position];
    return true;
  } else {
    return false;
  }
}
                
public override bool MoveToPrevious( ) {
  if (CurrentState.Position > 0) {
    CurrentState.Entry = CurrentState.Siblings[--CurrentState.Position];
    return true;
  } else {
    return false;
  }
}
         
public override bool MoveToFirst( ) {
  CurrentState.Position = 0;
  CurrentState.Entry = CurrentState.Siblings[CurrentState.Position];
  return true;
}

public override bool MoveToFirstChild( ) {
  FileSystemInfo [ ] children = GetChildren( );
  if (children.Length > 0) {
    state.Push(new FileSystemState(children[0]));
    return true;
  } else {
    return false;
  }
}

public override bool MoveToParent( ) {
  if (CurrentState.Entry == rootDir) {
    return false;
  } else {
    state.Pop( );
    return true;
  }
}

public override void MoveToRoot( ) {
  state.Clear( );
  state.Push(new FileSystemState(rootDir));
}

public bool MoveToDocumentElement( ) {
  MoveToRoot( );
  return true;
}

public override bool MoveTo( XPathNavigator other ) {
  if (other is FileSystemNavigator) {
    FileSystemNavigator fsn = (FileSystemNavigator)other;
    state = fsn.state;
    return true;
  }
  return false;
}
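
To see how a caller relies on these cursor moves, here is a minimal sketch of a depth-first walk that uses only MoveToFirstChild( ), MoveToNext( ), and MoveToParent( ). It runs against an ordinary XPathDocument so that it stands alone; the class name TreeWalker and the sample XML are assumptions for illustration, and the same Walk( ) method should apply to any XPathNavigator that honors the contract.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.XPath;

public static class TreeWalker {
  // Appends the name of every element below the navigator's current
  // node, depth first, then returns the navigator to where it started.
  public static void Walk(XPathNavigator navigator, StringBuilder names) {
    if (!navigator.MoveToFirstChild( )) {
      return; // no children, and the cursor has not moved
    }
    do {
      if (navigator.NodeType == XPathNodeType.Element) {
        names.Append(navigator.LocalName).Append(' ');
        Walk(navigator, names);
      }
    } while (navigator.MoveToNext( ));
    navigator.MoveToParent( ); // undo the MoveToFirstChild( ) above
  }

  // Walks a small hypothetical document and returns the names visited.
  public static string WalkSample( ) {
    string xml = "<POs><clientA><year2002/></clientA><clientB/></POs>";
    XPathDocument document = new XPathDocument(new StringReader(xml));
    StringBuilder names = new StringBuilder( );
    Walk(document.CreateNavigator( ), names);
    return names.ToString( ).Trim( );
  }

  public static void Main( ) {
    Console.WriteLine(WalkSample( ));
  }
}
```

The walk depends on MoveToFirstChild( ) and MoveToNext( ) leaving the cursor untouched when they return false, which is the behavior a custom navigator needs to reproduce.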

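IsSamePosition( ) compares this XPathNavigator to another one. In general, it returns true when both navigators are over the same document and positioned on the same context node. Here, that means the other navigator is also a FileSystemNavigator and its current state equals this one’s: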

public override bool IsSamePosition( XPathNavigator other ) {
  if (other is FileSystemNavigator) {
    FileSystemNavigator fsn = (FileSystemNavigator)other;
    if (CurrentState.Equals(fsn.CurrentState)) {
      return true;
    }
  }
  return false;
}
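
The behavior that callers expect from Clone( ), IsSamePosition( ), and MoveTo( ) can be checked against the built-in navigator. The following sketch uses an XPathDocument navigator rather than FileSystemNavigator; the class name PositionContract and the sample XML are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

public static class PositionContract {
  // Returns the results of three IsSamePosition( ) checks, in order:
  // right after Clone( ), after moving the copy, and after MoveTo( ).
  public static bool[] Check( ) {
    string xml = "<POs><clientA/><clientB/></POs>";
    XPathDocument document = new XPathDocument(new StringReader(xml));
    XPathNavigator original = document.CreateNavigator( );
    original.MoveToFirstChild( );   // original is now on POs

    XPathNavigator copy = original.Clone( );
    bool afterClone = original.IsSamePosition(copy);

    copy.MoveToFirstChild( );       // only the copy moves, to clientA
    bool afterMove = original.IsSamePosition(copy);

    original.MoveTo(copy);          // original jumps to the copy's node
    bool afterMoveTo = original.IsSamePosition(copy);

    return new bool[] { afterClone, afterMove, afterMoveTo };
  }

  public static void Main( ) {
    foreach (bool result in Check( )) {
      Console.WriteLine(result);
    }
  }
}
```

A clone starts at the same position as its source but moves independently afterward, and MoveTo( ) brings the two back together. A custom navigator is expected to behave the same way.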

As I’ve already described, the FileSystemState class is used internally in the FileSystemNavigator to hold the data about a filesystem entry, which is represented as a node in the XML tree, and about its siblings:

internal class FileSystemState {
  public FileSystemInfo Entry;
  public int Position;
  public FileSystemInfo [ ] Siblings;

  public FileSystemState(FileSystemInfo dir) {
    Entry = dir;
    Position = 0;
    if (dir is DirectoryInfo) {
      Siblings = FileSystemNavigator.GetChildren(((DirectoryInfo)dir).Parent);
    } else {
      Siblings = FileSystemNavigator.GetChildren(((FileInfo)dir).Directory);
    }
  }

  public override bool Equals(object other) {
    // Two states are equal when they refer to the same filesystem
    // path; comparing hash codes alone could match unequal states
    FileSystemState state = other as FileSystemState;
    return state != null && state.Entry.FullName == Entry.FullName;
  }

  public override int GetHashCode( ) {
    return Entry.FullName.GetHashCode( );
  }
}

Example 6-3 shows the complete FileSystemNavigator program.

Example 6-3. FileSystemNavigator
using System;
using System.Collections;
using System.IO;
using System.Xml;  
using System.Xml.XPath;

public class FileSystemNavigator : XPathNavigator {    

  // Constructors

  public FileSystemNavigator( ) {
    rootDir = new DirectoryInfo(Environment.CurrentDirectory);
    state.Push(new FileSystemState(rootDir));
  }

  public FileSystemNavigator(string path) {
    rootDir = new DirectoryInfo(path);
    state.Push(new FileSystemState(rootDir));
  }

  private FileSystemNavigator(FileSystemInfo d) {
    rootDir = (DirectoryInfo)d;
    state.Push(new FileSystemState(rootDir));
  }

  // Private instance variables

  private DirectoryInfo rootDir;
  private XmlNameTable nameTable = new NameTable( );
  private Stack state = new Stack( );

  // private properties

  private FileSystemState CurrentState {
    get {
      return (FileSystemState)state.Peek( );
    }
  }

  // private methods

  private FileSystemInfo [ ] GetChildren( ) {
    return GetChildren(CurrentState.Entry);
  }

  internal static FileSystemInfo [ ] GetChildren(FileSystemInfo entry) {
    if (entry is DirectoryInfo) {
      DirectoryInfo dir = (DirectoryInfo)entry;
      if (dir.Name == "POs") {
        return dir.GetDirectories("client*");
      } else if (dir.Name.StartsWith("client")) {
        return dir.GetDirectories("year*");
      } else if (dir.Name.StartsWith("year")) {
        return dir.GetDirectories("month*");
      } else if (dir.Name.StartsWith("month")) {
        return dir.GetDirectories("day*");
      } else if (dir.Name.StartsWith("day")) {
        return dir.GetFiles("po*.xml");
      } else {
        return dir.GetDirectories("POs");
      }
    }
    return new FileSystemInfo [0];
  }

  // public methods, from ICloneable

  public override XPathNavigator Clone( ) {
    FileSystemNavigator fsn = new FileSystemNavigator(CurrentState.Entry);
    fsn.nameTable = this.nameTable;
    return fsn;
  }

  // public methods, from XPathNavigator

  public override string BaseURI {
    get {
      return String.Empty;
    }
  }

  public override string XmlLang {
    get {
      return String.Empty;
    }
  }

  public override XPathNodeType NodeType { 
    get {
      if (state.Count == 1)
        return XPathNodeType.Root;
      else 
        return XPathNodeType.Element;
    }
  }
   
  public override string LocalName { 
    get {
      string name = CurrentState.Entry.Name;
      // Add( ) returns the atomized instance, so hand that one back
      return nameTable.Add(name);
    }
  }

  public override string NamespaceURI { 
    get { 
      return nameTable.Add(string.Empty); 
    } 
  }

  public override string Name { 
    get { 
      return LocalName;
    }
  }

  public override string Prefix { 
    get { 
      return nameTable.Add(string.Empty); 
    }
  }

  public override string Value { 
    get {
      return string.Empty; 
    } 
  }

  public override bool IsEmptyElement { 
    get { 
      return !HasChildren;
    }
  }

  public override XmlNameTable NameTable {
    get {
      return nameTable; 
    }
  }

  public override bool HasAttributes {
    get {
      return false;
    }
  }

  public override string GetAttribute( string localName, string namespaceURI ) {
    return string.Empty;
  }

  public override bool MoveToAttribute( string localName, string namespaceURI ) {
    return false;
  }
    
  public override bool MoveToFirstAttribute( ) {
    return false;
  }

  public override bool MoveToNextAttribute( ) {
    return false;
  }

  public override string GetNamespace(string prefix) {
    return String.Empty;
  }

  public override bool MoveToNamespace(string prefix) {
    return false;
  }

  public override bool MoveToFirstNamespace(XPathNamespaceScope namespaceScope) {
    return false;
  }

  public override bool MoveToNextNamespace(XPathNamespaceScope namespaceScope) {
    return false;
  }
    
  public override bool HasChildren { 
    get {
      return (GetChildren( ).Length > 0);
    }
  }
        
  public override bool MoveToNext( ) {
    if (CurrentState.Position < CurrentState.Siblings.Length - 1) {
      CurrentState.Entry = CurrentState.Siblings[++CurrentState.Position];
      return true;
    } else {
      return false;
    }
  }
    
  public override bool MoveToPrevious( ) {
    if (CurrentState.Position > 0) {
      CurrentState.Entry = CurrentState.Siblings[--CurrentState.Position];
      return true;
    } else {
      return false;
    }
  }
   
  public override bool MoveToFirst( ) {
    CurrentState.Position = 0;
    CurrentState.Entry = CurrentState.Siblings[CurrentState.Position];
    return true;
  }

  public override bool MoveToFirstChild( )  {
    FileSystemInfo [ ] children = GetChildren( );
    if (children.Length > 0) {
      state.Push(new FileSystemState(children[0]));
      return true;
    } else {
      return false;
    }
  }

  public override bool MoveToParent( ) {
    if (CurrentState.Entry == rootDir) {
      return false;
    } else {
      state.Pop( );
      return true;
    }
  }

  public override void MoveToRoot( ) {
    state.Clear( );
    state.Push(new FileSystemState(rootDir));
  }

  public bool MoveToDocumentElement( ) {
    MoveToRoot( );
    return true;
  }

  public override bool MoveTo( XPathNavigator other ) {
    if (other is FileSystemNavigator) {
      FileSystemNavigator fsn = (FileSystemNavigator)other;
      state = fsn.state;
      return true;
    }
    return false;
  }

  public override bool MoveToId( string id ) {        
    return false;
  }

  public override bool IsSamePosition( XPathNavigator other ) {
    if (other is FileSystemNavigator) {
      FileSystemNavigator fsn = (FileSystemNavigator)other;
      if (fsn.CurrentState.Equals(CurrentState)) {
        return true;
      }
    }
    return false;
  }
}  

internal class FileSystemState {
  public FileSystemInfo Entry;
  public int Position;
  public FileSystemInfo [ ] Siblings;

  public FileSystemState(FileSystemInfo dir) {
    Entry = dir;
    Position = 0;
    if (dir is DirectoryInfo) {
      Siblings = FileSystemNavigator.GetChildren(((DirectoryInfo)dir).Parent);
    } else {
      Siblings = FileSystemNavigator.GetChildren(((FileInfo)dir).Directory);
    }
  }
  
  public override bool Equals(object other) {
    // Two states are equal when they refer to the same filesystem
    // path; comparing hash codes alone could match unequal states
    FileSystemState state = other as FileSystemState;
    return state != null && state.Entry.FullName == Entry.FullName;
  }
  
  public override int GetHashCode( ) {
    return Entry.FullName.GetHashCode( );
  }
}

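Now you can navigate the PO directory structure using XPath. The code is the same as before, with one small change to the program in Example 6-2: the navigator is constructed directly as a FileSystemNavigator instead of being created from a document: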

XPathNavigator navigator = new FileSystemNavigator( );
XPathNodeIterator iterator = navigator.Select(xpathExpression);
Console.WriteLine("{0} nodes matched.", iterator.Count);
while (iterator.MoveNext( )) {
  Console.WriteLine(iterator.Current.LocalName);
}