The Auto Attendant: generating VoiceXML using XSLT and ASP.NET

4.3. The Auto Attendant: generating VoiceXML using XSLT and ASP.NET

One of the best things about VoiceXML is the fact that it is XML! This means we can do all of those things that we love to do to XML: parse it, validate it, explore it with XPath, transform it with XSLT, then write it back out again. While the previous voice application examples accessed business data using relational databases or business objects, this example will look at generating VoiceXML from an XML data store.

In this section we're going to create an ASP.NET-based VoiceXML application that uses XSLT to render an XML phone directory as VoiceXML. Before delving into the nuts and bolts of this example, we should first familiarize ourselves with the ASP.NET environment and with XSLT.

4.3.1. ASP.NET

ASP.NET is the next generation of Microsoft's Active Server Pages (ASP) technology. .NET is the new software development paradigm from the Microsoft camp, based on the Common Language Runtime (CLR). This allows developers to write code in whatever language they are comfortable with (excluding Java), compile their source into .NET assemblies, and run them on the CLR. Thus, C++, C#, Visual Basic, Fortran, and Eiffel code (along with code written in many other languages) can all fully interoperate.

What's more, the .NET environment used to develop and run desktop-based GUI applications is almost identical to that used to develop and deploy Web-based applications. Both desktop- and Web-based GUIs are built by dragging and dropping UI controls onto a canvas, then “wiring” them to one another and to databases using event-driven programming. One of the goals of the .NET framework is to catalyze the development of rich HTML-based user interfaces that run on a remote server and are accessed through Internet Explorer.

The .NET framework doesn't stop at HTML, however. For example, there are WML-based controls available for authoring WAP-based applications. These generate WML (Wireless Markup Language) optimized for document delivery over a WAP (Wireless Application Protocol) network to portable devices. More germane to this book, Microsoft has also begun work on SALT, an XML-based language that is somewhat analogous to VoiceXML in that it represents voice-driven dialog interfaces. SALT does differ in many ways from VoiceXML, as is discussed briefly in 4.3.7, “Conclusions.” Thus, if a developer makes some minimal effort to separate model and view/controller, the .NET framework provides the infrastructure to deploy applications on a desktop, on a Web page, on a PDA, on a cell phone, and in a voice browser.

One other powerful addition to the .NET framework is its extensive support for XML processing. The .NET framework includes implementations of the W3C DOM, XPath, and XSLT standards, along with a streaming, forward-only XML reader that fills the role SAX plays on other platforms. In addition, it provides the HTTP-based XML Remote Procedure Call protocol called SOAP and various other XML-based services for the development of enterprise applications.

In short, ASP.NET is the HTTP server-based portion of the .NET framework. It facilitates the development of XML-aware, event-driven, component-oriented interfaces delivered over HTTP, and enterprise-accessible business logic. This example will employ ASP.NET Web forms and controls to create the VoiceXML Auto Attendant application.

4.3.1.1. Web forms

A Web form can be thought of as an active Web page. The Web form for an HTML page typically consists of static HTML code, dynamic ASP code and directives, and ASP controls. This is represented as a file with a .aspx extension. IIS, Microsoft's HTTP server product, knows how to pre-process any file with a .aspx extension as an ASP.NET Web form before serving it to an HTTP client.

Web forms are not limited to HTML. For our example we'll be using a Web form comprising VoiceXML code and ASP controls that produce VoiceXML.

ASP allows you to include code that is run on the server during the processing of an active server page. ASP.NET takes this feature one step further by facilitating the separation of presentation logic and business logic with the notion of a “code-behind” class. A .aspx page, upon being served, is transformed into a class whose Render() method writes the contents of the active page. You can set this class's base class by specifying a code-behind class. Everything you implement in the code-behind class will be inherited by, and accessible from, the ASP-page class. We will use this feature in our example.

4.3.1.2. Controls

Controls are reusable UI components that can be plugged into a Web form. For HTML developers, examples of controls include DataGrids and Tree Views - reusable parameterizable chunks of markup. Controls can be implemented in one of two ways. They can be implemented using ASP-style markup just like a .aspx file. This type of control is called a user control and is stored as a .ascx file. A control can also be implemented simply as a class (written in C#, Visual Basic, etc.) that inherits System.Web.UI.Control. Such a control can draw itself by implementing the Render() method.

User controls stored as .ascx files are in many ways similar to Web forms stored as .aspx files. They consist mostly of markup and ASP code, directives, and controls (controls often contain other controls). Just like .aspx files, a .ascx file can have code-behind classes.

To include a control in a Web form or within another control, you must first tell IIS what your control is. This is done with the Register directive. For example, if we have a control TestControl.ascx and want to use it from our Web form MainPage.aspx, we need to put the following at the top of MainPage.aspx:

<%@ Register TagPrefix="Demo" TagName="TestControl" 
                              Src="TestControl.ascx" %>

We arbitrarily chose Demo and TestControl as convenient names for us to use while writing our .aspx page. We can now insert the following into MainPage.aspx amidst other markup that constitutes the page:

<Demo:TestControl id="TestControl1" runat="server" />

This will cause the contents of TestControl.ascx to be inserted into MainPage.aspx where the above control tag is located. We have set two attributes on our TestControl. The id attribute gives this control a unique ID. If there is a member of type Control declared in the code-behind class whose variable name matches the above id, this instance of TestControl could be programmatically accessed using the value of id as a variable name.

Had TestControl.ascx or its code-behind class implemented a class property called Message, we could have initialized that property using the ASP.NET control tag syntax as follows:

<Demo:TestControl Message="Hello World!" id="TestControl1" 
                                         runat="server" />

One can design sophisticated controls that draw themselves based on database information, respond to and/or generate events, and can be manipulated programmatically. In addition, these controls can produce whatever markup we need: HTML, XHTML, WML, VoiceXML, etc.

4.3.2. XSLT

Extensible Stylesheet Language Transformations (XSLT) is a technology for transforming documents conforming to one XML DTD into documents conforming to another XML DTD or into a non-XML notation. XSLT stylesheets specify how the transformation is to take place. These transformations can be arbitrarily complex, and a full treatment of XSLT is outside the scope of this book. The goal of this section is to give the reader a feel for what XSLT stylesheets look like and how they are used, and perhaps to provide encouragement to learn more about this powerful data-processing technique. A more in-depth, but still brief, tutorial on XSLT is given in The XML Handbook, and a thorough treatment is given in Definitive XSLT and XPath by Ken Holman, both published in this series.

Let's assume we have data in one XML format and we want to convert it to another XML format. Perhaps the nature of the data is to remain the same, say a list of names and phone numbers, but some program needs to read these from an XML document of a different structure. For example, we may have our phone numbers stored as shown in Example 4-27, but need them transformed into the XML document shown in Example 4-28.

Example 4-27. The input of our transformation

<entry name="Mike Wilkenson" phoneNumber="212-555-1234"/>
<entry name="Jim Johnson" phoneNumber="312-555-4433"/>
<entry name="Delores McGuire" phoneNumber="213-555-1212"/>

Example 4-28. The required output of our transformation

<person>
  <name>Mike Wilkenson</name>
  <phoneNumber>212-555-1234</phoneNumber>
</person>
<person>
  <name>Jim Johnson</name>
  <phoneNumber>312-555-4433</phoneNumber>
</person>
<person>
  <name>Delores McGuire</name>
  <phoneNumber>213-555-1212</phoneNumber>
</person>

One might simply write a program using either the SAX or DOM programmer interfaces to solve this problem. While the SAX or DOM approach to this sort of problem entails writing a bit of code to handle SAX events or to iterate over a DOM tree, the XSLT approach is a bit different. With XSLT, you author an XSLT stylesheet composed of one or more template rules. A template rule describes what a node in the input document should be transformed into in the output document.

A simplified way to think of how an XSLT processor does its job is as follows. An XSLT processor loads a stylesheet and then, in theory, traverses the input XML document node by node. For each node, the XSLT processor will see what rules “match” that node as determined by the rule's match attribute. For example, if the match attribute contains the value entry, we say that this rule matches an element node in the input document with the element-type name entry. This is called the template rule's pattern. When the rule does match a node, it will generate in the output document whatever is specified by the contents of the xsl:template element. This is called the template rule's template.

For our example task above we need a single rule that simply takes each entry element in the input document and converts it into a person element in the output document. Doing so entails reading the name and phoneNumber attributes of the incoming entry elements and creating name and phoneNumber children in the outgoing person elements. The XSLT stylesheet that does this is shown in Example 4-29.

Example 4-29. An XSLT stylesheet with a single template rule

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
                version="1.0">
  <xsl:output method="xml"/>
  <xsl:template match="entry">
    <person>
      <name>
        <xsl:value-of select="@name"/>
      </name>
      <phoneNumber>
        <xsl:value-of select="@phoneNumber"/>
      </phoneNumber>
    </person>
  </xsl:template>
</xsl:stylesheet>

The template rule in this example matches all entry elements. When the XSLT processor encounters such an element in the input document, it will produce, in the output document, elements as specified by this template rule's template, in this case a person element containing name and phoneNumber elements.
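
Note that the output in Example 4-28 is a sequence of person elements with no enclosing root element. If the consumer of the output needs a single well-formed document, one possible variation adds a second template rule for the root of the input. The sketch below assumes that the entry elements of Example 4-27 are wrapped in a root element called directory, and it emits a root element called people; both names are hypothetical and chosen only for illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:output method="xml" indent="yes"/>

  <!-- Assumed input root: <directory>. Assumed output root: <people>.
       Neither name comes from the original examples. -->
  <xsl:template match="/directory">
    <people>
      <!-- Hand each entry child to whichever rule matches it. -->
      <xsl:apply-templates select="entry"/>
    </people>
  </xsl:template>

  <!-- The same entry rule as in Example 4-29. -->
  <xsl:template match="entry">
    <person>
      <name><xsl:value-of select="@name"/></name>
      <phoneNumber><xsl:value-of select="@phoneNumber"/></phoneNumber>
    </person>
  </xsl:template>
</xsl:stylesheet>
```

The xsl:apply-templates instruction tells the processor to continue matching rules against the selected nodes; here it restricts that step to the entry children of the assumed root.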

You'll notice that we are able to populate these output elements based on the input element's name and phoneNumber attributes using the xsl:value-of element. The “@” symbol indicates that we are selecting an attribute value from the incoming node. Had we simply written name for the select attribute of the xsl:value-of instruction, the XSLT processor would have tried to find a child element of the incoming entry element named name and insert the contents of that element here.
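
To make the distinction concrete, suppose the input carried the same data in child elements rather than in attributes. The following sketch assumes a hypothetical input in which each entry element contains a name child and a phoneNumber child; that input format is not one of the examples above. Only the select expressions change.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:output method="xml"/>

  <!-- Assumed input (hypothetical):
       <entry>
         <name>Mike Wilkenson</name>
         <phoneNumber>212-555-1234</phoneNumber>
       </entry> -->
  <xsl:template match="entry">
    <person>
      <name>
        <!-- No "@": selects the text of the child element "name". -->
        <xsl:value-of select="name"/>
      </name>
      <phoneNumber>
        <!-- Likewise selects the child element "phoneNumber". -->
        <xsl:value-of select="phoneNumber"/>
      </phoneNumber>
    </person>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the attribute-based input of Example 4-27, this variant would produce empty name and phoneNumber elements, because no such child elements exist there.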

What we have covered of XSLT in this section is hardly enough to justify adding XSLT to your resume, but it will be sufficient for us to complete our Auto Attendant example.

4.3.3. Requirements analysis

Let's return to our original task of implementing an auto attendant in VoiceXML. Consider the scenario where a small VoiceXML consulting company has their employee directory stored in an XML file containing employee names, extensions, and departments. An excerpt from this XML file is shown in Example 4-30.

Example 4-30. An excerpt from the phone directory XML document

<!DOCTYPE PhoneDirectory[
<!ELEMENT PhoneDirectory (person*)>
<!ELEMENT person     (name, extension, department)>
<!ELEMENT name       (#PCDATA)>
<!ELEMENT extension  (#PCDATA)>
<!ELEMENT department (#PCDATA)>
]><PhoneDirectory>
    <person>
      <name>John Smith</name>
      <extension>101</extension>
      <department>sales</department>
    </person>
    <person>
      <name>Mary Jones</name>
      <extension>102</extension>
      <department>engineering</department>
    </person>
    <person>
      <name>Vincent Contuccio</name>
      <extension>103</extension>
      <department>customer support</department>
    </person>
</PhoneDirectory>

Our task is to create a VoiceXML application that answers the phone, asks the caller for the name of the employee they would like to speak to, and goes to that employee's personalized VoiceXML document. Each employee's first assignment is to create their own personalized VoiceXML document, which may perform tasks such as transferring the call to a phone, recording voice mail from the caller, or sending an SMS (Short Message Service) message to the employee's cell phone or pager. Each document is named after the employee's telephone extension, for example 101.vxml, and stored on the central server in a directory named for the employee's department, for example sales/.
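
As an illustration of what such a personalized document might contain, here is a minimal sketch of a sales/101.vxml that attempts a bridged transfer and reports back if the employee cannot be reached. It is not taken from the original example: the form id, the prompts, the timeout, and the destination number (a placeholder) are all assumptions.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<!-- Hypothetical personalized document for extension 101. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="john_smith">
    <!-- dest is a placeholder number, not a real extension mapping. -->
    <transfer name="call" dest="tel:+15555550101"
              bridge="true" connecttimeout="20s">
      <prompt>Please hold while I connect you to John Smith.</prompt>
      <filled>
        <!-- For a bridged transfer, the form item variable holds
             the outcome of the call attempt. -->
        <if cond="call == 'busy' || call == 'noanswer'">
          <prompt>John is not available right now.</prompt>
        </if>
      </filled>
    </transfer>
  </form>
</vxml>
```

An employee who prefers voice mail could replace the transfer element with a record element; the auto attendant itself does not need to know which approach each document takes.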

4.3.4. Design solution

As in the previous examples, we begin by analyzing our call flow as shown in Figure 4-4.

Figure 4-4. The auto attendant call flow

As this is a simple application, we can implement it all as a single menu dialog whose choice elements are dynamically rendered from the phone directory XML document. Since both our phone directory and our resulting VoiceXML are in XML format, we use XSLT to do this rendering.

We implement this dynamic menu dialog as Welcome.aspx, an ASP.NET dynamic document. The document contains the skeleton of the VoiceXML page but delegates the dynamic generation of the choice elements to a control, XSLTransformerControl.

Welcome.aspx begins, as shown in Example 4-31, with a Page directive used by IIS at serve time to configure parameters such as the scripting language used, the code-behind class for this page, and instructions for managing events. This is followed by a Register directive which informs IIS that this page will use a control prefixed with AA and it should look in the AutoAttendant Assembly's AutoAttendant namespace to find the appropriate control class.

Example 4-31. The ASP.NET directives for `Welcome.aspx`

<%@ Page language="c#" Codebehind="Welcome.aspx.cs" 
         AutoEventWireup="false" Inherits="AutoAttendant.Welcome" %>
<%@ Register TagPrefix="AA" NameSpace="AutoAttendant" 
                                         Assembly="AutoAttendant" %>

The next portion of Welcome.aspx, shown in Example 4-32, is the template for our VoiceXML document to be retrieved by a VoiceXML interpreter. We open our vxml element, open a menu element, and give our menu a prompt with which we welcome callers. We also provide a custom catch block for nomatch and noinput here.

Example 4-32. The VoiceXML intro for `Welcome.aspx`

<?xml version="1.0" encoding="iso-8859-1"?>
<vxml version="2.0">
  <menu id="main_directory">
    <prompt>
      You've reached the VoiceXML auto attendant.  Please say
      either the name or the extension of the person you would like
      to reach.
    </prompt>
    <catch event="nomatch noinput">
      <prompt>Sorry. That's not a valid employee name.</prompt>
      <reprompt />
    </catch>

Within our menu, where we would expect to find our choice elements, we instead find our XSLTransformerControl element, shown in Example 4-33.

Example 4-33. The `XSLTransformerControl` tag in `Welcome.aspx`

    <AA:XSLTransformerControl 
                XMLSrc="c:\AutoAttendant\PhoneDirectory.xml"
                StyleSheetSrc="c:\AutoAttendant\PhoneDirectory.xsl" 
                id="PhoneDir" runat="server" />

The XSLTransformerControl control takes as input an XML source file and an XSLT stylesheet file. Its output is inserted into the final rendered ASP page. This control gives us a convenient way to insert XSLT output into an ASP.NET page. We will cover the implementation of XSLTransformerControl in the next section.

We set the control's XMLSrc property to the file name of our phone directory XML file. Likewise, we set the StyleSheetSrc property to the file name of our XSLT stylesheet. We also provide a unique ID for this control using the id attribute. Finally, with the runat attribute we tell IIS to replace this tag with a fully rendered XSLTransformerControl, that is, with the output generated by its Render() method.

The remainder of Welcome.aspx simply closes the menu element and the vxml element as shown in Example 4-34.

Example 4-34. The remainder of `Welcome.aspx`

  </menu>
</vxml>

4.3.5. The `XSLTransformerControl`

In this section we will create the XSLTransformerControl ASP.NET control. This control takes as parameters the filename of an XML source file and the filename of an XSLT stylesheet, then renders itself as the output of the XSLT stylesheet applied to the XML source file. In the previous section we invoked this control using the following element:

<AA:XSLTransformerControl 
                XMLSrc="c:\AutoAttendant\PhoneDirectory.xml"
                StyleSheetSrc="c:\AutoAttendant\PhoneDirectory.xsl" 
                id="PhoneDir" runat="server" />

When the Web form is rendered, the above element will be replaced, in the final output, with the results of the XSL transformation on PhoneDirectory.xml.

Since we don't need the features of user controls, we'll implement XSLTransformerControl simply as a C# class, XSLTransformerControl.cs. This class begins with typical C# introductory declarations shown in Example 4-35.

Example 4-35. The introductory portion of `XSLTransformerControl.cs`

using System.Collections;
using System.ComponentModel;
using System.Data;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

namespace AutoAttendant
{
  /// <summary>
  /// Renders the result of applying an XSLT stylesheet to an XML file.
  /// </summary>
  public class XSLTransformerControl : Control
  {

Here we declare what namespaces we will be using, the namespace in which we are defining our class, and the class definition itself. This is followed by declarations for class member properties XMLSrc and StyleSheetSrc, both of type string, as shown in Example 4-36.

Example 4-36. Declaration of the `XMLSrc` and `StyleSheetSrc` class member properties

  /// <summary>
  /// The source file name for the XML directory.
  /// </summary>
  string _xmlSrc;
  public string XMLSrc
  {
    get{return _xmlSrc;}
    set{_xmlSrc = value;}
  }

  /// <summary>
  /// The source file name for the XSLT stylesheet template.
  /// </summary>
  string _styleSheetSrc;
  public string StyleSheetSrc
  {
    get{return _styleSheetSrc;}
    set{_styleSheetSrc = value;}
  }

Recall that we are able to set these properties from our control element in Welcome.aspx through attributes with the same names as these properties. This allows the VoiceXML developer to set class properties without eliciting the aid of a C# programmer.

The remainder of this class is the interesting part, the Render() method shown in Example 4-37. This method is called by IIS when it's time to render all of the controls on an ASP.NET page and deliver them to the client - in our case, a VoiceXML interpreter client.

Example 4-37. `XSLTransformerControl`'s `Render()` method

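  /// <summary>
  /// Overrides the function that renders this control's output.
  /// </summary>
  /// <param name="htmlWriter">Writer for the page's output stream.</param>
  override protected void Render(HtmlTextWriter htmlWriter)
  {
    // Load the XML source document for XPath/XSLT processing.
    XPathDocument myXPathDocument = new XPathDocument(XMLSrc);
    XslTransform myXslTransform = new XslTransform();
    // Wrap the page's writer so the transform can write XML into it.
    XmlTextWriter xmlwriter = new XmlTextWriter(htmlWriter);
    myXslTransform.Load(StyleSheetSrc);
    myXslTransform.Transform(myXPathDocument, null, xmlwriter);
    // Push any pending output through to the page's writer.
    xmlwriter.Flush();
  }
  } // end of class
} // end of namespace AutoAttendant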

This Render() method takes the XMLSrc and StyleSheetSrc properties as set from the ASP.NET page and uses them to find the source document and the stylesheet document. It then creates an instance of XslTransform, loads the stylesheet into it, and instructs it to process the input XML document by calling the Transform() method. Notice how the Transform() method takes as its third parameter xmlwriter. Two lines earlier we created this object as a wrapper around htmlWriter passed in as a parameter to the Render() method. Hence, as the Transform() method writes its output to xmlwriter, xmlwriter will in turn write this output to htmlWriter. htmlWriter will then write its output into the final stream sent back to the HTTP client.

4.3.6. Putting it all together

Let's consider a use scenario. A caller calls in to the company's VoiceXML interpreter, which is configured to go immediately to Welcome.aspx as its first VoiceXML document. IIS receives the request and, because of the .aspx extension, knows to interpret this file before sending it back to the requesting client. In doing so it converts the <AA:XSLTransformerControl/> element into an instance of XSLTransformerControl and then sets that instance's XMLSrc and StyleSheetSrc properties from the element's attributes.

The XSLT stylesheet for transforming our phone directory into choice elements is shown in Example 4-38.

Example 4-38. The XSLT stylesheet to convert the phone directory into VoiceXML

<xsl:stylesheet version="1.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml"/>
  <xsl:template match="person">
    <choice>
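      <xsl:attribute name="next">
        <!-- Literal text goes in xsl:text so that no stray newline or
             indentation ends up inside the attribute value. -->
        <xsl:value-of select="department"/>
        <xsl:text>/</xsl:text>
        <xsl:value-of select="extension"/>
        <xsl:text>.vxml</xsl:text>
      </xsl:attribute>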
      <xsl:value-of select="name"/>
    </choice>
  </xsl:template>
</xsl:stylesheet>

All this stylesheet needs to do is generate a choice element for each person element in the phone directory. It uses xsl:value-of elements to construct the destination URL and insert it into the next attribute of the choice element. It also uses an xsl:value-of element to insert the person's name into the body of the choice element. The VoiceXML interpreter will use this to generate a grammar for the menu.
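
XSLT 1.0 also offers attribute value templates, which can express the same URL construction more compactly. The following is an alternative sketch rather than the stylesheet used in this example; under the same input it should produce equivalent choice elements.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml"/>
  <xsl:template match="person">
    <!-- Each {...} is an XPath expression evaluated against the
         current person element and replaced by its string value. -->
    <choice next="{department}/{extension}.vxml">
      <xsl:value-of select="name"/>
    </choice>
  </xsl:template>
</xsl:stylesheet>
```

Because the whole value sits inside a single attribute, there is no opportunity for indentation or line breaks in the stylesheet to leak into the generated URL.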

Once all of the controls are instantiated, IIS renders the page, including each of its contained controls, out to the requesting client. This causes the XSLTransformerControl's Render() method to be called which creates an XSLT processor and processes the XMLSrc file with the StyleSheetSrc file specified in the ASP.NET control element. What comes out may look like the dialog shown in Example 4-39.

Example 4-39. The fully rendered `Welcome.aspx` page returned to the VoiceXML interpreter

<?xml version="1.0" encoding="iso-8859-1"?>
<vxml version="2.0">
  <menu id="main_directory">
    <prompt>
      You've reached the VoiceXML auto attendant.  Please say
      either the name or the extension of the person you would like
      to reach.
    </prompt>
    <catch event="nomatch noinput">
      <prompt>Sorry. That's not a valid employee name.</prompt>
      <reprompt />
    </catch>
    <choice next="sales/101.vxml">John Smith</choice>
    <choice next="engineering/102.vxml">Mary Jones</choice>
    <choice 
       next="customer support/103.vxml">Vincent Contuccio</choice>
  </menu>
</vxml>

If the user utters the name “John Smith,” the VoiceXML interpreter will be redirected to John's custom VoiceXML document sales/101.vxml.

4.3.7. Conclusions

This simple example has only scratched the surface of using ASP.NET for rendering VoiceXML, insofar as it used only ASP.NET's implementation of XSLT.

ASP.NET provides all sorts of sophisticated features for serving dynamic pages, including run-time databinding, which allows for easy yet flexible database integration in dynamic pages. In addition, ASP.NET provides a framework for translating HTTP POST requests into application events, so application programmers can write event-driven Web applications in much the same way that event-driven desktop applications are written.

While the ASP.NET framework to date has been focused mostly on the delivery of visual markup - HTML, WML, etc. - Microsoft has its sights on voice as well. It has been driving the creation of a new speech dialog markup specification called SALT, which is covered briefly in Chapter 5, “Voice services.” This language, while similar to VoiceXML in many ways, is more geared towards an event-driven model where the logic of what to do next is implemented on the server. This model can be contrasted with VoiceXML's Form Interpretation Algorithm model, where more of the processing is off-loaded to the VoiceXML interpreter.
