Enabling speech recognition in the web application

We haven't discussed speech recognition in any of our projects so far. Performing speech recognition from a web browser can give better results than many offline speech recognizers, because in Google Chrome the web-based recognizer uses Google's speech recognition system, which is one of the best speech recognition systems available today. So let's see how we can implement speech recognition in our application.

The Web Speech API specification (https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html) provides speech recognition and synthesis APIs for a web application, but many web browsers do not fully support it. Google introduced its own speech recognition and synthesis platform and integrated it into these APIs, so they work well in Google Chrome.

Let's look at the procedure to enable web-based speech recognition APIs.

The first step is to check whether the browser supports the speech APIs. We can check this using the following code:

    if (!('webkitSpeechRecognition' in window)) {
        // Speech API not supported here...
    } else {
        // Let's do some cool stuff :)
    }
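The support check can also be wrapped in a small helper. The following is only a sketch under a few assumptions: the name getRecognitionConstructor is hypothetical, and the helper also looks for the unprefixed SpeechRecognition name defined by the specification, which this section does not otherwise use. The global object is passed in as a parameter so that the helper can be exercised outside a browser.

```javascript
// Hypothetical helper (not part of the speech API): returns the speech
// recognition constructor found on the given global object, or null.
function getRecognitionConstructor(globalObject) {
    if (!globalObject) {
        return null;
    }
    // Prefer the unprefixed name from the specification, then fall back
    // to the webkit-prefixed name used in this section.
    return globalObject.SpeechRecognition ||
           globalObject.webkitSpeechRecognition ||
           null;
}
```

In a browser, this would be called as getRecognitionConstructor(window); a null result means that the speech API is not available.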

If the browser supports speech recognition, we can start the speech recognizer.

First, we have to create a speech recognizer object, which will be used throughout the code:

     var recognition = new webkitSpeechRecognition();

Now we will configure the speech recognizer object. If we want continuous recognition, we need to set the continuous property to true. This is suitable for dictation.

    recognition.continuous = true;

The following setting enables interim results, which are delivered before the recognizer has settled on a final result:

    recognition.interimResults = true;

Now we configure the recognition language and the maximum number of alternative transcripts returned for each result:

    recognition.lang = "en-US"; 
    recognition.maxAlternatives = 1;
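The four settings above can be grouped in one place. The following is a minimal sketch, assuming a hypothetical helper named configureRecognizer and an options object whose shape is invented here for illustration; the defaults are the values used in this section.

```javascript
// Hypothetical helper: copies the settings discussed above onto a
// recognizer object, falling back to this section's values.
function configureRecognizer(recognition, options) {
    var settings = options || {};
    function pick(value, fallback) {
        return value !== undefined ? value : fallback;
    }
    recognition.continuous = pick(settings.continuous, true);
    recognition.interimResults = pick(settings.interimResults, true);
    recognition.lang = pick(settings.lang, "en-US");
    recognition.maxAlternatives = pick(settings.maxAlternatives, 1);
    return recognition;
}
```

With the recognizer created earlier, a call such as configureRecognizer(recognition, { lang: "en-GB" }) would change only the language and keep the other values.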

After configuring the speech recognition object, we can fill in the callback functions. Each callback handles one event of the speech recognition object. Let's look at the main callbacks.

The onstart callback is called when the recognition starts. Here we may add some visual feedback, such as a flashing red light, to alert the user:

    recognition.onstart = function() {
        // Recognition has started; show the visual feedback here.
    };

Similarly, when the speech recognition finishes, the onend callback is called, and you can give some visual feedback here too:

    recognition.onend = function() {
        // Recognition has ended; clear the visual feedback here.
    };
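Because onstart and onend usually drive the same piece of visual feedback, they can be wired to a single function. The following is a sketch only: attachStatusHandlers and the onStatusChange parameter are hypothetical names, and what the page actually does with the status (a red light, a label, and so on) is left to the caller.

```javascript
// Hypothetical helper: routes onstart and onend to one callback so the
// page can switch its visual feedback on and off in a single place.
function attachStatusHandlers(recognition, onStatusChange) {
    recognition.onstart = function() {
        onStatusChange(true);   // the recognizer is listening
    };
    recognition.onend = function() {
        onStatusChange(false);  // the recognizer has stopped
    };
    return recognition;
}
```

In a page, the callback might show or hide an indicator element, depending on the boolean it receives.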

The following callback, onresult, delivers the recognition results, both interim and final:

    recognition.onresult = function(event) {
        // Defensive check: stop if the event carries no results.
        if (typeof event.results === 'undefined') {
            recognition.stop();
            return;
        }

After getting the results, and still inside the onresult callback, we have to iterate over the results list to get the text output:

        for (var i = event.resultIndex; i < event.results.length; ++i) {
            if (event.results[i].isFinal) {
                console.log("final results: " + event.results[i][0].transcript);
            } else {
                console.log("interim results: " + event.results[i][0].transcript);
            }
        }
    };
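Instead of logging each result, the same loop can collect the text so the page can display it. The following is a minimal sketch, assuming a hypothetical function named collectTranscripts; it reads only the event fields already used above (resultIndex, results, isFinal, and transcript) and takes the first alternative of each result.

```javascript
// Hypothetical helper: walks the results of a result event and returns
// the concatenated final and interim text as separate strings.
function collectTranscripts(event) {
    var collected = { finalText: "", interimText: "" };
    if (!event || typeof event.results === "undefined") {
        return collected;
    }
    for (var i = event.resultIndex; i < event.results.length; ++i) {
        // Index 0 is the first alternative for this result.
        var transcript = event.results[i][0].transcript;
        if (event.results[i].isFinal) {
            collected.finalText += transcript;
        } else {
            collected.interimText += transcript;
        }
    }
    return collected;
}
```

Inside onresult, the handler would call collectTranscripts(event) and write the two strings to the page, for example showing the interim text in a lighter color until it becomes final.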

Now we can start the speech recognition through a user-defined function called startButton(). Whenever this function is called, the recognition will start.

    <div onclick="startButton(event);"></div>

    function startButton(event) {
        recognition.start();
    }
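Calling start() on a recognizer that is already running raises an error according to the specification, so a click handler may want to remember whether recognition is active. The following is a simplified sketch with a hypothetical createToggle function; it tracks the state in its own flag, which is an assumption made for brevity. A fuller version would also reset the flag when the recognizer ends on its own.

```javascript
// Hypothetical helper: returns a function that starts recognition on one
// call and stops it on the next, so start() is never called twice in a row.
function createToggle(recognition) {
    var active = false;  // simplified state; not updated by onend
    return function toggle() {
        if (active) {
            recognition.stop();
        } else {
            recognition.start();
        }
        active = !active;
        return active;   // true if recognition was just started
    };
}
```

The returned function could replace the body of startButton(), so that the same element both starts and stops the recognition.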
