In this section, we will see how to install the Julius speech recognition system and how to connect it to Python. The required packages (such as Julius and the audio tools) are available in Ubuntu's package manager, but the Python wrapper has to be downloaded and installed separately. Let's start with the required components for the installation.
The following are the instructions to install Julius and its Python binding in Ubuntu 14.04.2:
$ sudo apt-get install julius
$ sudo apt-get install pulseaudio-utils
$ sudo apt-get install osspd-alsa
Reload ALSA to bind osspd to alsa:
$ sudo alsa force-reload
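Before moving on, it can help to confirm that the tools installed so far are actually on the PATH. The following is a minimal sketch, assuming Python 3.3 or later for shutil.which. The names REQUIRED_TOOLS, find_tools, and missing_tools are illustrative and not part of Julius or pyjulius:

```python
import shutil

# Commands the steps above should have put on the PATH. The tuple is an
# illustrative assumption: "julius" from the julius package and "padsp"
# from pulseaudio-utils.
REQUIRED_TOOLS = ("julius", "padsp")


def find_tools(names=REQUIRED_TOOLS):
    """Map each command name to its full path, or None if it is not on the PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names=REQUIRED_TOOLS):
    """Return the sorted names of the commands that could not be found."""
    return sorted(name for name, path in find_tools(names).items() if path is None)
```

Calling missing_tools() after the installation should return an empty list. Any name it returns points to a package that still needs to be installed.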
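To install pyjulius, the Python extension for Julius, you first need to install the setup tools in Python. Download the setup script using the wget tool and run it:
$ wget https://bootstrap.pypa.io/ez_setup.py
$ sudo python ez_setup.py
Then, from the pyjulius source folder, install the extension:
$ sudo python setup.py install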
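Download the Julius quickstart files:
$ wget http://www.repository.voxforge1.org/downloads/software/julius-3.5.2-quickstart-linux.tgz
After extracting the archive, run the following command in the extracted folder to start speech recognition on the command line:
$ padsp julius -input mic -C julian.jconf
To connect from Python, start Julius in module mode instead:
$ padsp julius -module -input mic -C julian.jconf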
This command will start a Julius server that listens for clients. To use Julius from Python, we need to connect to this server with client code, as given in the following sections. The Python code is a client that connects to the Julius server and prints the recognized text.
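To make the client/server relationship concrete, here is a minimal sketch that talks to the server with only the standard socket module. It assumes that Julius in module mode sends XML-like text messages, each terminated by a line holding a single period, and that the text can be decoded as UTF-8. Both points should be checked against the Julius documentation. The names split_messages and read_messages are hypothetical helpers, not part of Julius or pyjulius, and pyjulius takes care of this work for you:

```python
import socket

# Assumption: each message from the server ends with a line holding only ".".
TERMINATOR = b"\n.\n"


def split_messages(pending):
    """Split buffered bytes into complete messages.

    Returns a (messages, remainder) pair: the decoded complete messages and
    the trailing bytes of a message that has not fully arrived yet.
    """
    messages = []
    while TERMINATOR in pending:
        raw, pending = pending.split(TERMINATOR, 1)
        messages.append(raw.decode("utf-8", "replace").strip())
    return messages, pending


def read_messages(host="localhost", port=10500):
    """Yield messages from a running server until it closes the connection."""
    sock = socket.create_connection((host, port))
    try:
        pending = b""
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break  # the server closed the connection
            messages, pending = split_messages(pending + chunk)
            for message in messages:
                yield message
    finally:
        sock.close()
```

With the server running, iterating over read_messages() and printing each item would show the raw messages that a client library parses into recognition results.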
The following code is a Python client for the Julius speech recognition server that we started with the -module option. After connecting to the server, it triggers speech-to-text conversion, fetches the converted text, and prints it on the terminal:
#!/usr/bin/env python
import sys

# Import the pyjulius module
import pyjulius

# FIFO (first in, first out) queue suitable for multithreaded programming
import Queue

# Initialize the Julius client object with the localhost address and the
# default port of 10500
client = pyjulius.Client('localhost', 10500)
try:
    # Try to connect to the server
    client.connect()
except pyjulius.ConnectionError:
    # Running the client before the server is started causes a connection error
    print 'Start julius as module first!'
    sys.exit(1)

# Start listening to the server
client.start()
try:
    while 1:
        try:
            # Fetch a recognition result from the server without blocking
            result = client.results.get(False)
        except Queue.Empty:
            continue
        print result
except KeyboardInterrupt:
    print 'Exiting...'
    client.stop()        # send the stop signal
    client.join()        # wait for the thread to die
    client.disconnect()  # disconnect from julius
After connecting to the Julius server, the Python client listens to the server and prints the output it receives.
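The client polls its results queue with a non-blocking get in a tight loop, which keeps the CPU busy while nothing is being said. One possible alternative is a blocking get with a timeout. The sketch below demonstrates the idea with the standard queue module only. The results object here is a stand-in for client.results, assuming it behaves like a standard queue (as the Queue.Empty handling in the client suggests), and next_result is a hypothetical helper name:

```python
try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2, as used by the client code


def next_result(results, timeout=0.5):
    """Wait up to `timeout` seconds for one item; return None if none arrives."""
    try:
        return results.get(True, timeout)
    except queue.Empty:
        return None


# Stand-in for client.results, filled by hand instead of by the recognizer.
results = queue.Queue()
results.put("hello")
results.put("world")

first = next_result(results)
second = next_result(results)
third = next_result(results, timeout=0.01)  # queue is empty, so this is None
```

In the client, the inner try block could then become a call to next_result(client.results) followed by a check for None before printing.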
The acoustic models we used in the preceding programs are already trained, but they may not give accurate results for our speech. To improve the accuracy of the previous speech recognition engines, we can train new language and acoustic models and create a dictionary, or we can adapt the existing acoustic model using our voice. The methods to improve accuracy are beyond the scope of this chapter, so only some links on training or adapting both Pocket Sphinx and Julius are given.
The following link explains how to adapt the existing acoustic model to our voice for Pocket Sphinx:
http://cmusphinx.sourceforge.net/wiki/tutorialadapt
The accuracy of Julius can be improved by writing a recognition grammar. The following link explains how to write a recognition grammar in Julius:
http://julius.sourceforge.jp/en_index.php?q=en_grammar.html
In the next section, we will see how to connect Python and speech synthesis libraries. We will work with the eSpeak and Festival libraries here. These are two popular, free, and effective speech synthesizers available on multiple OS platforms. Precompiled binaries are available in Ubuntu in the form of packages.
eSpeak and Festival are speech synthesizers available on the Ubuntu/Linux platform. These applications can be installed from Ubuntu's software package repository. The following are the instructions and commands to install these packages in Ubuntu:
$ sudo apt-get install espeak
$ sudo apt-get install python-espeak
$ sudo apt-get install festival
Download the Python bindings for Festival using the svn tool (Apache Subversion). Subversion is a free software versioning and revision control system:
$ svn checkout http://pyfestival.googlecode.com/svn/trunk/ pyfestival-read-only
After the download completes, switch to the pyfestival-read-only folder and install this package using the following command:
$ sudo python setup.py install
Here is the code to work with Python and eSpeak. As you will see, it's very easy to work with the Python binding for eSpeak. We need to write only two lines of code to synthesize speech using Python:
from espeak import espeak
espeak.synth("Hello World")
This code will import the eSpeak-Python wrapper module and call the synth function in the wrapper module. The synth function will synthesize the text given as an argument.
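If the Python wrapper is not available, the espeak command-line program installed earlier can be driven from Python instead. The following is a minimal sketch, assuming Python 3.3 or later and the espeak options -s (speed in words per minute) and -v (voice). The function names espeak_command and speak are illustrative and not part of the eSpeak wrapper:

```python
import shutil
import subprocess


def espeak_command(text, speed=175, voice="en", executable="espeak"):
    """Build the argument list for speaking `text` with the espeak program."""
    return [executable, "-s", str(speed), "-v", voice, text]


def speak(text, executable="espeak", **options):
    """Speak `text` aloud; return False if the espeak program is not installed."""
    if shutil.which(executable) is None:
        return False
    subprocess.check_call(espeak_command(text, executable=executable, **options))
    return True
```

For example, speak("Hello World", speed=140) would read the text more slowly than the default, provided espeak is installed.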
The following code shows how to synthesize speech using Python and Festival:
import festival
festival.say("Hello World")
The preceding code will import the Festival-Python wrapper module and call the say function in the Festival module. It will synthesize the text as speech.
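The festival program itself can also be used without the wrapper by sending text to its standard input. The sketch below assumes Python 3.3 or later and the --tts option of the festival command, which reads text from standard input when no file is given. The names festival_request and say are illustrative helpers, not the wrapper's API:

```python
import shutil
import subprocess


def festival_request(text, executable="festival"):
    """Return the command and the stdin payload used to speak `text`."""
    command = [executable, "--tts"]
    payload = (text + "\n").encode("utf-8")
    return command, payload


def say(text, executable="festival"):
    """Speak `text` aloud; return False if festival is missing or fails."""
    if shutil.which(executable) is None:
        return False
    command, payload = festival_request(text, executable)
    process = subprocess.Popen(command, stdin=subprocess.PIPE)
    process.communicate(payload)  # send the text, then wait for festival to exit
    return process.returncode == 0
```

Unlike the espeak program, which accepts the text as a command-line argument, this approach passes the text through a pipe, so long passages do not run into argument length limits.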