Converting text to speech using HTML5 audio

If we were to build a web-based navigation application today, most of the components would already be available. There are Google Maps or OpenStreetMap components to display maps, as well as API services that provide driving directions.

But what about voice-based navigation guidance? Wouldn't that require another API service that converts text to speech?

Thanks to HTML5 audio and Emscripten (a C to JavaScript compiler), we can now use a free text-to-speech engine called eSpeak that works fully in the browser.

In this example, we're going to use eSpeak to generate speech from text entered by the user on a simple page. Most of the work consists of preparation: we need to set up speak.js, the JavaScript build of eSpeak.

Getting ready

We need to download speak.js from http://github.com/html5-ds-book/speak-js. Click the Download ZIP button and save the archive to a newly created folder. Extract the archive in that folder; it should create a subfolder called speak-js-master.

How to do it...

Perform the following steps:

  1. Create the page index.html containing a text input field and a Speak button:
    <!doctype html>
    <html>
      <head>
        <meta charset="utf-8">
        <script src="http://ajax.googleapis.com/ajax/libs/jquery/1.8.2/jquery.min.js"></script>
        <script src="speak-js-master/speakClient.js"></script>
        <script src="example.js"></script>
      </head>
      <body>
        <div id="audio"></div>
        <input type="text" id="text" value="" placeholder="Enter text here">
        <button id="speak">Speak</button>
      </body>
    </html>
  2. Create example.js and add a click handler to the button that passes the entered text to speak:
    $(function() {
        $("#speak").on('click', function(){
            speak($("#text").val());
        });
    });
  3. From the command line, install http-server (this requires Node.js) if it is not already installed, then start the server from the folder that contains index.html:
    npm install -g http-server
    http-server
  4. Open http://localhost:8080 in your browser and test the demo.
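
The click handler in step 2 passes whatever is in the text field straight to speak. As a minimal sketch, the input could be cleaned up first so that blank input does not trigger speech generation. Both normalizeSpeechText and speakIfNotEmpty are hypothetical helper names that are not part of speak.js, and the speakFn parameter is assumed to be the global speak function loaded from speakClient.js.

```javascript
// Hypothetical helper (not part of speak.js): collapse runs of whitespace
// and trim both ends so that blank input can be detected reliably.
function normalizeSpeechText(raw) {
    return String(raw == null ? '' : raw).replace(/\s+/g, ' ').trim();
}

// Hypothetical wrapper: calls speakFn only when there is something to say.
// speakFn is assumed to be the global speak() defined by speakClient.js.
// Returns true if speech was requested, false otherwise.
function speakIfNotEmpty(raw, speakFn) {
    var text = normalizeSpeechText(raw);
    if (text.length === 0) {
        return false;
    }
    speakFn(text);
    return true;
}
```

With these helpers in example.js, the body of the click handler could become speakIfNotEmpty($("#text").val(), speak);.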

How it works...

The engine that converts the text to speech is eSpeak (http://espeak.sourceforge.net/). This engine is written in C, but the only language natively supported by browsers is JavaScript. How can we use this engine in the browser?

Emscripten is a compiler designed to work around this limitation. It takes LLVM bitcode generated by an LLVM compiler from C or C++ source code and converts it to JavaScript. Emscripten uses modern JavaScript features such as typed arrays, and relies on the performance of modern optimizing JavaScript JIT compilers.

To avoid blocking the browser, the speech generator is invoked from a web worker created in speakClient.js. The generated WAV data is passed back by the worker, converted to base64 encoding and passed as a data URL to a newly created audio element. This element in turn is appended to the #audio element on the page and playback is activated by calling the play method.
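
The last part of this pipeline, turning the WAV bytes into a playable audio element, can be sketched roughly as follows. This is an illustration of the technique, not the code in speakClient.js: the function names are hypothetical, the audio/wav MIME type is an assumption, and bytes is assumed to be an array or Uint8Array holding the WAV data returned by the worker.

```javascript
// Encode raw bytes as base64. btoa is used in browsers; Buffer is a
// fallback so that the sketch can also be run under Node.js.
function bytesToBase64(bytes) {
    var binary = '';
    for (var i = 0; i < bytes.length; i++) {
        binary += String.fromCharCode(bytes[i]);
    }
    if (typeof btoa === 'function') {
        return btoa(binary);
    }
    return Buffer.from(binary, 'binary').toString('base64');
}

// Build a data URL that an audio element can use as its source.
// The MIME type is an assumption made for this sketch.
function wavToDataUrl(bytes) {
    return 'data:audio/wav;base64,' + bytesToBase64(bytes);
}

// Browser-only part: create an audio element, append it to the
// container (the #audio div in this recipe) and start playback.
function playWav(bytes, container) {
    var audio = document.createElement('audio');
    audio.src = wavToDataUrl(bytes);
    container.appendChild(audio);
    audio.play();
    return audio;
}
```

In the browser, playWav(wavBytes, document.getElementById('audio')) would attach and play the clip. Data URLs keep everything in memory and need no server round trip, at the cost of roughly one third more data than the raw bytes because of the base64 encoding.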

There's more...

eSpeak is licensed under the GNU GPL v3. As such, it might not be suitable for proprietary projects.

More information about Emscripten can be found on the Emscripten wiki at https://github.com/kripken/emscripten/wiki.
