Sound is cool and speech is even cooler, but you'll also want to communicate with your projects through voice commands. This section shows you how to add speech recognition to your robotic projects. This isn't nearly as simple as the speaking part, but thankfully, you have some significant help from the open source development community. You are going to download a set of capabilities named PocketSphinx, which will allow your project to listen to your commands.
The first step is downloading the PocketSphinx capabilities. Unfortunately, this is not quite as user-friendly as the espeak process, so follow along carefully. There are two possible ways to do this. If you have a keyboard, mouse, and display connected, or want to connect via vncserver, you can do this graphically: download the .tar.gz versions of sphinxbase and PocketSphinx and move them to the /home/pi directory of your Raspberry Pi.
However, before you build these, you need two libraries. The first library is libasound2-dev. If you skipped the first two objectives of this chapter, you'll need to download it now using sudo apt-get install libasound2-dev. If you're unsure whether or not it's installed, run the command anyway. The system will let you know if it's already installed.
The second library is Bison. This is a general-purpose, open source parser that will be used by PocketSphinx. To get this package, type sudo apt-get install bison.
Another way to accomplish this is to use wget directly on the command-line prompt of Raspberry Pi. If you want to do it this way, perform the following steps:
To use wget, first find the link to the file you wish to download on your host machine. In this case, go to the Sphinx website hosted by Carnegie Mellon University at http://cmusphinx.sourceforge.net/. This is an open source project that provides you with the speech recognition software. With your smaller, embedded system, you will be using the PocketSphinx version of this code.
Right-click on the sphinxbase-0.8.tar.gz file (if 0.8 is the latest version) and select Copy Link Location. Now open a PuTTY window to Raspberry Pi, and after logging in, type wget and paste the link you just copied. This will download the .tar.gz version of sphinxbase. Now follow the same procedure with the latest version of PocketSphinx.
Before you build these, you need the same two libraries described above, libasound2-dev and Bison. If you have not installed them yet, use sudo apt-get install libasound2-dev and sudo apt-get install bison. If either one is already installed, the system will let you know.
Once everything is installed and downloaded, you can build PocketSphinx. First, your home directory, with the .tar.gz files of both PocketSphinx and sphinxbase, should look as follows:
To unpack and build the sphinxbase module, type sudo tar -xzvf sphinxbase-0.y.tar.gz, where y is the version number; in our example, it is 8. This should unpack all the files from the archive into a directory named sphinxbase-0.8. Now type cd sphinxbase-0.8. Listing the files should show something like the following screenshot:
To build the application, start by issuing the command sudo ./configure --enable-fixed. This command checks that everything is OK with the system and then configures a build.
Now you are ready to actually build the sphinxbase code base. This is a two-step process, which is as follows:
Type make, and the system will build all the executable files.
Type sudo make install, and this will install all the executables onto the system.
Now you need to make the second part of the system: the PocketSphinx code itself. Go to the home directory and decompress and unarchive the code by typing tar -xzvf pocketsphinx-0.8.tar.gz. The files should now be unarchived, and you can build the code. Installing these files is a three-step process as follows:
Type cd pocketsphinx-0.8 to change to the PocketSphinx directory, and then type ./configure to see if you are ready to build the files.
Type make and wait for a while for everything to build.
Type sudo make install.
Several possible additions to your library installations will be useful later if you are going to use your PocketSphinx capability with Python as a coding language. You can install python-dev using sudo apt-get install python-dev and Cython using sudo apt-get install cython. You can also choose to install pkg-config, a utility that can sometimes help deal with complex compiles. Install it using sudo apt-get install pkg-config.
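As a recap of the build steps in this section, here is a minimal Python sketch that only lists the commands in order, together with the directory each one is run from; it does not execute anything. The function name build_plan is hypothetical, and the sketch assumes that sphinxbase and PocketSphinx share the same version number (0.8 in our example) and that both archives sit in /home/pi.

```python
def build_plan(version="0.8", home="/home/pi"):
    """Return (directory, command) pairs for the build steps in this section.

    Assumption: both archives carry the same version number and live in `home`.
    """
    sphinxbase = "sphinxbase-" + version
    pocketsphinx = "pocketsphinx-" + version
    return [
        # sphinxbase: unpack, configure, build, install
        (home, "sudo tar -xzvf " + sphinxbase + ".tar.gz"),
        (home + "/" + sphinxbase, "sudo ./configure --enable-fixed"),
        (home + "/" + sphinxbase, "make"),
        (home + "/" + sphinxbase, "sudo make install"),
        # PocketSphinx: unpack, configure, build, install
        (home, "tar -xzvf " + pocketsphinx + ".tar.gz"),
        (home + "/" + pocketsphinx, "./configure"),
        (home + "/" + pocketsphinx, "make"),
        (home + "/" + pocketsphinx, "sudo make install"),
    ]


if __name__ == "__main__":
    # Print the plan as a checklist; nothing is run.
    for directory, command in build_plan():
        print(directory + "$ " + command)
```

Printing the plan gives you a checklist to type from, which is handy if a newer version number replaces 0.8.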
Once the installation is complete, you'll need to let the system know where your files are. To do this, edit the /etc/ld.so.conf file as root by typing sudo emacs /etc/ld.so.conf. You will add the last line to the file, so it should now look like the following screenshot:
Now type sudo /sbin/ldconfig, and the system will be aware of your PocketSphinx libraries.
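The line added to /etc/ld.so.conf is only visible in the screenshot. As a hedged sketch, assuming the libraries were installed under /usr/local/lib (the usual default for make install, but check your own system), the following Python helpers show what the edit amounts to. They work on the text of the file only and never touch /etc; both function names are hypothetical.

```python
def lists_directory(conf_text, directory="/usr/local/lib"):
    """Return True if `directory` is already an entry in the ld.so.conf text."""
    for raw_line in conf_text.splitlines():
        # Ignore comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if line == directory:
            return True
    return False


def with_directory(conf_text, directory="/usr/local/lib"):
    """Return the text with `directory` appended as the last line if missing."""
    if lists_directory(conf_text, directory):
        return conf_text
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + directory + "\n"
```

Calling with_directory on the current contents of the file shows what it should look like after your edit; running sudo /sbin/ldconfig afterwards is still required.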
Everything is installed, so you can now try your speech recognition. Type cd /home/pi/pocketsphinx-0.8/src/programs to change to the directory that holds a demo program; then type pocketsphinx_continuous. This program takes input from the microphone and turns it into text. After running the command, you'll get a lot of diagnostic output, and then you will see the following screenshot:
The INFO and Warning statements come from the C or C++ code and are there for debugging purposes. Initially, they will warn you that the program cannot find your Mic and Capture elements, but when it finds them, it will print out the READY.... prompt. If you have set things up as previously described, you should be ready to give it a command. Say "hello" into the microphone. When the program senses that you have stopped speaking, it will process your speech and print more diagnostic output, but it should eventually show the commands in the following screenshot:
Notice the 000000000: hello line. It recognized your speech! You can try other words and phrases too. The system is very sensitive, so it may pick up background noise. You are also going to find that it is not very accurate. We'll deal with that in a moment. To stop the program, press Ctrl + C.
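Because the recognized text is buried among the INFO and Warning lines, it can help to filter the output. The following Python sketch picks out lines shaped like the 000000000: hello example. It assumes every result line starts with a numeric utterance ID followed by a colon, which is inferred from that single example; the function names are hypothetical.

```python
import re

# Assumed shape of a result line: digits, a colon, then the recognized text.
RESULT_LINE = re.compile(r"^(\d+):\s*(.*)$")


def extract_hypothesis(line):
    """Return the recognized text from a result line, or None for other lines."""
    match = RESULT_LINE.match(line.strip())
    if match is None:
        return None
    return match.group(2).strip()


def hypotheses(lines):
    """Yield the recognized text from every result line in an iterable of lines."""
    for line in lines:
        text = extract_hypothesis(line)
        if text is not None:
            yield text
```

For example, extract_hypothesis("000000000: hello") returns "hello", while INFO lines and the READY.... prompt return None.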
There are two ways to make your voice recognition more accurate. One is to train the system to understand your voice more accurately. This is a bit complex; if you want to know more, go to the CMU PocketSphinx website.
The second way to improve accuracy is to limit the number of words that your system uses to determine what you are saying. The default has literally thousands of word possibilities, so if two words are close, it may choose the wrong word. To avoid this, you can make your own grammar rules to restrict the words it has to choose from.
The first step is to create a file with the words or phrases that you want the system to recognize. Then, you use a web tool to create two files that the system will use to define your grammar. I'll do this through a vncserver session because I'll need to use a web browser on Raspberry Pi to turn a text file into a set of grammar files. Begin by editing a file; type emacs grammar.txt and insert the text as shown in the following screenshot:
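The contents of the screenshot are not reproduced here, so the phrases below are purely hypothetical examples of robot commands, not the ones used in this chapter. As a minimal sketch, assuming the file holds one word or phrase per line, this Python snippet builds such a file; the function names are also hypothetical.

```python
from pathlib import Path

# Hypothetical command phrases; replace them with your own.
EXAMPLE_PHRASES = ["forward", "backward", "turn left", "turn right", "stop"]


def corpus_text(phrases):
    """Return the file contents: one cleaned phrase per line, blanks skipped."""
    cleaned = [" ".join(phrase.split()) for phrase in phrases if phrase.strip()]
    return "".join(phrase + "\n" for phrase in cleaned)


def write_corpus(path, phrases):
    """Write the phrases to `path` (for example grammar.txt) and return the path."""
    path = Path(path)
    path.write_text(corpus_text(phrases), encoding="ascii")
    return path
```

Calling write_corpus("grammar.txt", EXAMPLE_PHRASES) produces a file you can then edit by hand in emacs.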
Now you must use the CMU web tool to turn this file into two files that the system can use to define its dictionary. On my system, I have already installed Firefox using sudo apt-get install firefox. So, now I can open a web browser window and go to http://www.speech.cs.cmu.edu/tools/lmtool-new.html. If you hit the Browse button, you can find and select the file. It should look something like the following screenshot:
Open the grammar.txt file; then, on the web page, select COMPILE KNOWLEDGE BASE, and a window should pop up, as shown in the following screenshot:
You need to download the .tgz file created; in this case, the TAR1565.tgz file. This will download into your /home/pi/ directory. Move it to the /home/pi/pocketsphinx-0.8/src/programs directory and unarchive it using tar -xzvf followed by the filename.
Now you can invoke the pocketsphinx_continuous program to use this dictionary by typing ./pocketsphinx_continuous -lm 1565.lm -dict 1565.dic, and it will look in that dictionary to find matches to your commands.
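The number in the filenames changes each time you compile a knowledge base. The following Python sketch derives the command from the archive name. It assumes the archive is always named TAR followed by a number, and that the .lm and .dic files inside share that number; this is inferred from the single TAR1565.tgz example. The function name is hypothetical.

```python
import re


def recognizer_command(archive_name, program="./pocketsphinx_continuous"):
    """Build the argument list for pocketsphinx_continuous from an archive name.

    Assumption: the archive is named TAR<number>.tgz and contains
    <number>.lm and <number>.dic, as in the TAR1565.tgz example.
    """
    match = re.fullmatch(r"TAR(\d+)\.tgz", archive_name)
    if match is None:
        raise ValueError("unexpected archive name: " + archive_name)
    base = match.group(1)
    return [program, "-lm", base + ".lm", "-dict", base + ".dic"]
```

The returned list can be passed to subprocess.run, with cwd set to the directory where you unarchived the files.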
You can also do this on your remote computer using Windows or Linux by creating the file in a text editor such as WordPad or Emacs. Once you have created the required grammar files, you can transfer them to your Raspberry Pi using WinSCP if you are using Windows, or scp from the command line if you are using Linux.
Your system can now understand your specific set of commands! In the next section of this chapter, you'll learn how to use this input to have the project respond.