Recognizing Speech with Phrase List Grammars


Using a list grammar in your app requires creating a collection of strings that represent the recognizable phrases, which is then added to the SpeechRecognizer’s grammar set, like so:

string[] numbers = {"one", "two", "three"};
SpeechRecognizerUI recognizerUI = new SpeechRecognizerUI();
SpeechGrammarSet grammarSet = recognizerUI.Recognizer.Grammars;
grammarSet.AddGrammarFromList("Numbers", numbers);

A speech recognizer is not limited to a single grammar. Multiple list grammars may be added to the same grammar set using successive calls to AddGrammarFromList. AddGrammarFromList accepts a key, which allows you to disable or enable the particular phrase list, and an IEnumerable<string> of phrases.
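To make this concrete, the following sketch (with illustrative key names and phrases) adds two list grammars to the same grammar set and then disables one of them via its key:

SpeechRecognizerUI recognizerUI = new SpeechRecognizerUI();
SpeechGrammarSet grammarSet = recognizerUI.Recognizer.Grammars;

/* Each list grammar is registered under its own key. */
grammarSet.AddGrammarFromList("Actions", new[] { "open", "close" });
grammarSet.AddGrammarFromList("Colors", new[] { "red", "green", "blue" });

/* The key can later be used to enable or disable a particular grammar. */
grammarSet["Colors"].Enabled = false;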

An instance of a speech recognizer has one grammar set. The speech recognizer loads its grammar set at the beginning of a recognition operation unless it has been preloaded. If its grammar set is empty, the speech recognizer falls back on the dictation grammar. You see how to preload a grammar later in the chapter.
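Preloading is covered in detail later; in brief, it amounts to a single asynchronous call on the recognizer, made after the grammars have been added:

/* Loading the grammar set up front avoids a delay
   at the start of the first recognition operation. */
await recognizerUI.Recognizer.PreloadGrammarsAsync();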


Note

There are restrictions when combining types of grammars in a single grammar set. In the case of the dictation and web search grammars, no other grammar types can be added to the set. In other words, if a grammar set contains a dictation or web search grammar, it cannot contain any list grammars or SRGS grammars.
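For example, a set that uses one of the predefined grammars must contain only that grammar. The following fragment (using an illustrative key name) adds the predefined dictation grammar; adding a list grammar to the same set is not supported:

SpeechRecognizerUI recognizerUI = new SpeechRecognizerUI();

/* The dictation grammar must be the sole member of the grammar set. */
recognizerUI.Recognizer.Grammars.AddGrammarFromPredefinedType(
    "dictation", SpeechPredefinedGrammar.Dictation);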


Create a Voice Paint App with a List Grammar

In the sample code for this section you see how to create a phrase list grammar that allows the user to use speech recognition to add colored shapes to a canvas. The sample code for this section is located in the Speech/VoicePaint/PhraseList directory of the WPUnleashed.Examples project in the downloadable sample code.

The VoicePaintViewModel class defines two list grammars. The first defines the actions that can be performed on the page:

string[] actionPhrases = { "[add] circle", "[add] square", "clear" };

In a list grammar, square brackets indicate that a word is optional; in the previous excerpt, the word “add” may or may not be uttered by the user.

The second list grammar defines a set of colors:

string[] colorPhrases = { "red", "green", "blue" };

When navigating to the VoicePaintView.xaml page, the app asks the user for an action. If the user responds with an action to add either a circle or a square, the app prompts the user for the color of the new shape.

A SpeechRecognizerUI instance is created in the viewmodel’s GetSpeechRecognizerUI method. This method adds the list grammars using the AddGrammarFromList method, as shown in the following excerpt:

SpeechRecognizerUI GetSpeechRecognizerUI()
{
    if (recognizerUI == null)
    {
        recognizerUI = new SpeechRecognizerUI();
        SpeechGrammarSet grammarSet = recognizerUI.Recognizer.Grammars;

        grammarSet.AddGrammarFromList(actionsKey, actionPhrases);
        grammarSet.AddGrammarFromList(colorsKey, colorPhrases);
    }

    return recognizerUI;
}

The viewmodel’s Prompt method retrieves the SpeechRecognizerUI object and selectively enables the actions list grammar while disabling the colors list grammar. See Listing 23.1. This prevents the speech recognizer from recognizing phrases from the colors list grammar when it should expect a phrase from the actions grammar.

The SpeechRecognizerUI uses three built-in screens to guide the user through the speech recognition process. The first is a listening screen, which requests speech input from the user. The second is a confirmation screen, which presents the recognized speech to the user and allows the user to confirm or discard it. If the speech input matches more than one phrase in the enabled grammars with similar confidence, the third screen, a disambiguation screen, is displayed, which allows the user to select the intended phrase. The disambiguation screen can present up to 20 phrases from the enabled grammars.


Note

The disambiguation screen is not displayed when using the dictation or web search grammars.


The properties of the SpeechRecognizerUI’s Settings property affect whether and how the built-in screens are presented.

The ListenText property of the SpeechRecognizerUISettings class is set to display an instructive message to the user on the built-in listening screen, to let him or her know what kind of information your app is expecting—for example, “Select a color.” If you do not specify a value for this property, the text “Listening...” is displayed.

The ExampleText property allows you to provide one or more purely instructional examples to the user, such as “blue” or “Zurich, Switzerland.” If no value is specified, this portion of the listening screen is blank.

The SpeechRecognizerUISettings class also includes two other properties: ReadoutEnabled and ShowConfirmation.

ReadoutEnabled determines whether the phone speaks successfully recognized text back to the user from the built-in confirmation screen, and whether it speaks options from the built-in disambiguation screen. By default ReadoutEnabled is true; however, if TTS readout is disabled in the settings for speech on the phone, readout is disabled even if ReadoutEnabled is set to true.

ShowConfirmation determines whether the confirmation screen is displayed upon recognition of a phrase. The confirmation screen displays when a call to the RecognizeWithUIAsync method produces a successful recognition, and it shows the text of the recognized speech. By default ShowConfirmation is true. Set it to false to prevent the screen from being displayed, which also prevents the readout of recognized speech regardless of the ReadoutEnabled property.

If the disambiguation screen is presented, the phrases are read out to the user if fewer than five phrases are displayed, the ReadoutEnabled property is not set to false, and “Play audio confirmations” is enabled in the phone’s speech settings.
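Bringing these settings together, a typical configuration resembles the following (the prompt strings are illustrative only):

SpeechRecognizerUISettings settings = recognizerUI.Settings;
settings.ListenText = "Select a color.";          /* Heading on the listening screen. */
settings.ExampleText = "'red', 'green', 'blue'";  /* Example phrases shown beneath it. */
settings.ReadoutEnabled = true;                   /* Speak recognized text back to the user. */
settings.ShowConfirmation = true;                 /* Display the confirmation screen. */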

A successful recognition is determined by the ResultStatus property of the SpeechRecognitionUIResult return value.
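Statuses other than Succeeded indicate, for example, that the user cancelled the operation or that the recognizer was busy. A sketch of handling the result (ProcessText is a hypothetical handler) follows:

SpeechRecognitionUIResult uiResult = await recognizerUI.RecognizeWithUIAsync();

switch (uiResult.ResultStatus)
{
    case SpeechRecognitionUIStatus.Succeeded:
        ProcessText(uiResult.RecognitionResult.Text); /* Hypothetical handler. */
        break;
    case SpeechRecognitionUIStatus.Cancelled:
        /* The user dismissed the UI; no result is available. */
        break;
    default:
        /* Busy, Preempted, or PrivacyPolicyDeclined. */
        break;
}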

In the sample, when the viewmodel receives an action to add a shape, it then disables recognition of the action grammar and prompts the user for a color. See Listing 23.1.

LISTING 23.1. VoicePaintViewModel.Prompt Method


async void Prompt()
{
    SpeechRecognizerUI recognizer = GetSpeechRecognizerUI();
    recognizer.Recognizer.Grammars[actionsKey].Enabled = true;
    recognizer.Recognizer.Grammars[colorsKey].Enabled = false;

    recognizer.Settings.ListenText = "Say an action.";
    recognizer.Settings.ExampleText = " 'add circle', 'square', 'clear' ";

    SpeechRecognitionUIResult uiResult = await recognizer.RecognizeWithUIAsync();

    if (uiResult.ResultStatus == SpeechRecognitionUIStatus.Succeeded)
    {
        recognizer.Recognizer.Grammars[actionsKey].Enabled = false;

        string recognitionText = uiResult.RecognitionResult.Text;

        if (recognitionText == actionPhrases[0]) /* Add Circle */
        {
            Color outlineColor = await AskColor();
            AddShape(ShapeType.Circle, outlineColor);
        }
        else if (recognitionText == actionPhrases[1]) /* Add Square */
        {
            Color outlineColor = await AskColor();
            AddShape(ShapeType.Square, outlineColor);
        }
        else if (recognitionText == actionPhrases[2]) /* Clear */
        {
            ClearShapes();
        }
        else
        {
            throw new Exception("Unknown recognition response: " + recognitionText);
        }
    }
}


The AskColor method works in much the same way, as shown in Listing 23.2. The colors list grammar is enabled and the user is prompted for a color—either red, green, or blue. The color phrase is interpreted, and a System.Windows.Media.Color object is returned to the caller.

LISTING 23.2. VoicePaintViewModel.AskColor Method


async Task<Color> AskColor()
{
    SpeechRecognizerUI recognizer = GetSpeechRecognizerUI();
    recognizer.Recognizer.Grammars[colorsKey].Enabled = true;

    recognizer.Settings.ListenText = "What color would you like your shape to be?";
    recognizer.Settings.ExampleText = " 'red', 'green', 'blue' ";

    SpeechRecognitionUIResult uiResult = await recognizer.RecognizeWithUIAsync();

    if (uiResult.ResultStatus == SpeechRecognitionUIStatus.Succeeded)
    {
        string recognizedText = uiResult.RecognitionResult.Text;
        Color color;
        if (recognizedText == colorPhrases[0]) /* Red. */
        {
            color = Colors.Red;
        }
        else if (recognizedText == colorPhrases[1]) /* Green. */
        {
            color = Colors.Green;
        }
        else if (recognizedText == colorPhrases[2]) /* Blue. */
        {
            color = Colors.Blue;
        }
        else
        {
            throw new Exception("Unknown color response: " + recognizedText);
        }

        return color;
    }

    return Colors.White;
}


A custom abstract class named Shape serves as the base class for the Circle and Square classes. It contains two positional properties named Top and Left, two size properties named Width and Height, and an OutlineColor property that defines the color of the shape’s border when displayed on a page.
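Stripped to its essentials, the Shape class hierarchy might be sketched as follows (the sample’s implementation may differ, for example by adding property change notification):

public abstract class Shape
{
    /* Position of the shape on the canvas. */
    public double Top { get; set; }
    public double Left { get; set; }

    /* Size of the shape. */
    public double Width { get; set; }
    public double Height { get; set; }

    /* Color of the shape's border when displayed. */
    public Color OutlineColor { get; set; }
}

public class Circle : Shape { }
public class Square : Shape { }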

Shape objects are stored in an ObservableCollection<Shape> in the viewmodel. The AddShape method, shown in Listing 23.3, creates a new Shape of the specified ShapeType and positions it alongside the previous shape, offset by a small margin.

LISTING 23.3. VoicePaintViewModel.AddShape Method


void AddShape(ShapeType shapeType, Color outlineColor)
{
    double left = 0;
    double top = 0;

    Shape lastShape = shapes.LastOrDefault();
    if (lastShape != null)
    {
        left = lastShape.Left + 20;
        top = lastShape.Top + 20;
    }

    if (shapeType == ShapeType.Circle)
    {
        Circle circle = new Circle
            {
                Left = left, Top = top, OutlineColor = outlineColor
            };
        Shapes.Add(circle);
    }
    else if (shapeType == ShapeType.Square)
    {
        Square square = new Square
            {
                Left = left, Top = top, OutlineColor = outlineColor
            };
        Shapes.Add(square);
    }
}


The view displays the shapes using an ItemsControl that is bound to the viewmodel’s Shapes ObservableCollection. See Listing 23.4. Items are positioned absolutely using the Shape’s Top and Left properties.

Seasoned WPF developers may wonder why the ItemsControl does not simply declare a Canvas in an ItemsPanelTemplate. Unfortunately, in XAML for Windows Phone, the ItemsControl does not allow you to set the ItemContainerStyle, which would otherwise allow the Canvas.Left and Canvas.Top properties to be bound correctly. Instead, each shape is placed in its own Canvas.

The DataTemplate used to display each shape type is determined by the TypeTemplateSelector (presented in Chapter 11, “Creating Expansive and Engaging Apps with the Pivot and Panorama”).

LISTING 23.4. VoicePaintView.xaml ItemsControl (excerpt)


<ItemsControl ItemsSource="{Binding Shapes}">
    <ItemsControl.ItemTemplate>
        <DataTemplate>
            <p:TypeTemplateSelector Content="{Binding}"
                               HorizontalAlignment="Left" VerticalAlignment="Top">
                <p:TypeTemplateSelector.Resources>
                    <DataTemplate x:Name="CircleTemplate">
                        <Canvas>
                            <Ellipse Width="{Binding Width}" Height="{Binding Height}"
                                 StrokeThickness="5"
                                 Stroke="{Binding OutlineColor,
                                 Converter={StaticResource ColorToBrushConverter}}"
                                 d:DataContext="{d:DesignInstance l:Circle}"
                                 Canvas.Left="{Binding Left}"
                                 Canvas.Top="{Binding Top}" />
                        </Canvas>
                    </DataTemplate>
                    <DataTemplate x:Name="SquareTemplate">
                        <Canvas>
                            <Border Width="{Binding Width}" Height="{Binding Height}"
                                BorderBrush="{Binding OutlineColor,
                                Converter={StaticResource ColorToBrushConverter}}"
                                BorderThickness="5" CornerRadius="4"
                                d:DataContext="{d:DesignInstance l:Square}"
                                Canvas.Left="{Binding Left}"
                                Canvas.Top="{Binding Top}" />
                        </Canvas>
                    </DataTemplate>
                </p:TypeTemplateSelector.Resources>
            </p:TypeTemplateSelector>
        </DataTemplate>
    </ItemsControl.ItemTemplate>
</ItemsControl>


An AppBarIconButton is used to execute a command named PromptCommand in the viewmodel, which calls the viewmodel’s Prompt method and begins the speech recognition process:

<u:AppBar>
    <u:AppBarIconButton
            Command="{Binding PromptCommand}"
            Text="Begin"
            IconUri="/Speech/Images/TalkAppBarIcon.png" />
</u:AppBar>
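
The command itself can be defined as a simple delegate-based ICommand; this sketch assumes a DelegateCommand class like the one used elsewhere in the book:

readonly DelegateCommand promptCommand;

public ICommand PromptCommand
{
    get { return promptCommand; }
}

public VoicePaintViewModel()
{
    /* Begins the speech recognition process when the app bar button is tapped. */
    promptCommand = new DelegateCommand(arg => Prompt());
}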


Best Practice

The SpeechRecognizerUI class implements IDisposable. A SpeechRecognizerUI instance may retain native resources while it is being used by your app. It is, therefore, wise to dispose of your SpeechRecognizerUI object when the user navigates away from the page, as shown in the following excerpt:

public void CleanUp()
{
    if (recognizerUI != null)
    {
        recognizerUI.Dispose();
        recognizerUI = null;
    }
}

