With the introduction of Siri, Cortana, Alexa and other voice-based personal assistants, communicating with computers using voice has become the order of the day. Voice communication also makes things more intuitive and definitely cooler. This project will provide the basics needed to install and use text-to-speech synthesizers like eSpeak and Flite on the PocketBeagle featuring the OSD3358-SM or any OSD335x-based project running Debian Linux.
Text-to-Speech Project Video
Setting up the Hardware for the Project
This project has been developed to use off-the-shelf products with minimal soldering. The components we used are listed in the "things" section. In addition to the PocketBeagle, you will need a powered USB hub and a sound card with an AUX cable to output analog signals to a speaker or headphones.
First, you will need to add an additional USB port to your PocketBeagle headers so that it can interface with the sound card.
To connect a USB Type A Female Receptacle Breakout Board to PocketBeagle, make the following connections:
PocketBeagle <--> USB Type A Female Breakout Board
- VB (P1.5) <--> VBUS
- VI (P1.7) <--> VBUS
- USB1- (P1.9) <--> D-
- USB1+ (P1.11) <--> D+
- USB1 ID (P1.13) <--> GND
- USB1 GND (P1.15) <--> GND
For more information, see the PocketBeagle FAQ.
Eshtaartha loves to solder, so we used wires to connect the USB breakout board to PocketBeagle. You can also plug the breakout board into PocketBeagle using male headers.
Power up PocketBeagle and connect to it using SSH. Then connect the USB hub and sound card. Check whether the sound card is detected by running one of the following commands in the terminal:
lsusb (lists all USB devices)
aplay -l (lists all sound cards)
For this project, we will demonstrate the commands at the command line. You can also incorporate them into your Python or C project on your PocketBeagle image.
Setting up ALSA:
The Advanced Linux Sound Architecture (ALSA) is the part of the Linux kernel that provides drivers for sound cards. ALSA manages all sound cards and sound devices on your Linux system and also allows direct interaction with sound devices through its libraries. By default, ALSA chooses sound card 0, device 0 for audio playback. To find out the card number and device number assigned to your USB sound card, use the following command:
debian@beaglebone:~$ aplay -l
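If you would rather pick out the card and device numbers from a script, the listing can be parsed. The following is a minimal Python sketch, assuming the usual English-language layout of the aplay -l output; the function name parse_aplay_list and the SAMPLE text are our own illustrations and were not captured from a board.

```python
import re

# Matches lines such as:
#   card 1: Device [USB Audio Device], device 0: USB Audio [USB Audio]
# This layout is an assumption based on the usual English `aplay -l` output.
CARD_LINE = re.compile(r"^card (\d+): (\S+) \[(.*?)\], device (\d+): ")


def parse_aplay_list(output):
    """Return a list of (card, device, short_name, long_name) tuples."""
    devices = []
    for line in output.splitlines():
        match = CARD_LINE.match(line)
        if match:
            card, short_name, long_name, device = match.groups()
            devices.append((int(card), int(device), short_name, long_name))
    return devices


# Illustrative sample text only; your board will report its own hardware.
SAMPLE = """\
**** List of PLAYBACK Hardware Devices ****
card 1: Device [USB Audio Device], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
"""

for card, device, short_name, long_name in parse_aplay_list(SAMPLE):
    # The "hw:<card>,<device>" form is what the ALSA configuration expects.
    print(f"hw:{card},{device} -> {long_name}")
```

On the board, you could feed the function the real listing, for example with subprocess.run(["aplay", "-l"], capture_output=True, text=True).stdout.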
In case your sound card is assigned a different card or device id, you will have to override the default settings using the following piece of code.
pcm.!default
{
    type plug
    slave
    {
        pcm "hw:1,0" # card 1, device 0; change if necessary
    }
}
ctl.!default
{
    type hw
    card 1 # change card number if necessary
}
Go to the /home/debian directory:
debian@beaglebone:~$ cd /home/debian
Create a file named .asoundrc
debian@beaglebone:~$ nano .asoundrc
Copy and paste the above piece of code.
To save, press Ctrl + o and then hit Enter.
Press Ctrl + x to exit
.asoundrc is a per-user ALSA configuration file. ALSA looks for it in the user's home directory, which is /home/debian for the default debian user.
You can find more information about the .asoundrc configuration file on the alsa-project.org page [1].
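If your card or device number differs from the example, the same configuration can be generated instead of edited by hand. This is a small Python sketch under that assumption; render_asoundrc and ASOUNDRC_TEMPLATE are hypothetical names of ours, and the braces are written on the same line as the keys, which ALSA also accepts.

```python
# Template for the .asoundrc shown above. Doubled braces are literal braces
# in str.format(); {card} and {device} are filled in by render_asoundrc().
ASOUNDRC_TEMPLATE = """\
pcm.!default {{
    type plug
    slave {{
        pcm "hw:{card},{device}"
    }}
}}

ctl.!default {{
    type hw
    card {card}
}}
"""


def render_asoundrc(card, device=0):
    """Return .asoundrc text that makes the given card/device the default."""
    if card < 0 or device < 0:
        raise ValueError("card and device numbers must not be negative")
    return ASOUNDRC_TEMPLATE.format(card=card, device=device)


# Print the configuration for card 1, device 0 (the values used in this project).
print(render_asoundrc(1, 0))
```

To use the result, write the returned text to /home/debian/.asoundrc, for example by redirecting the script output to that file.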
Setting up and using Text-to-Speech Synthesizers:
Before proceeding, make sure your board is connected to the internet.
This is a good tutorial for connecting BeagleBoards to the Internet.
eSpeak Synthesizer:
The first synthesizer we will use is "eSpeak". eSpeak is compact, open source and is available for English and other languages. More information about it is available at http://espeak.sourceforge.net. eSpeak uses a "formant synthesis" method. This allows many languages to be provided in a small size.
To install espeak:
debian@beaglebone:~$ sudo apt-get install espeak
To check if espeak is functional, try:
debian@beaglebone:~$ espeak "Hello"
or
debian@beaglebone:~$ espeak "I can now talk"
You can also specify the speed in words per minute using the -s option:
debian@beaglebone:~$ espeak "I can now talk" -s 120
espeak is capable of handling several languages and producing both male and female voices.
- For a female voice in English (en), try:
debian@beaglebone:~$ espeak -ven+f1 "What are you up to"
+m1, +m2 … +m6 are for male voices.
+f1, +f2 … +f6 are for female voices.
In the above command, en stands for English.
We can even make it whisper! Try:
debian@beaglebone:~$ espeak -ven+whisper "I can whisper"
Now, let’s find out how espeak manages text in different languages:
- German (de):
debian@beaglebone:~$ espeak -vde+m1 "ich kann Deutsch sprechen"
- Spanish (es):
debian@beaglebone:~$ espeak -ves+m1 "puedo hablar español"
- Hindi (hi):
debian@beaglebone:~$ espeak -vhi+m1 "मैं हिंदी बोल सकता हूँ"
- Kannada (kn):
debian@beaglebone:~$ espeak -vkn+m1 "ನಾನು ಕನ್ನಡದಲ್ಲಿ ಮಾತನಾಡಬಹುದು"
- Mandarin Chinese (zh):
debian@beaglebone:~$ espeak -vzh+f1 "我可以说普通话"
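The commands above can also be assembled from a Python program. The sketch below only uses the -v and -s options shown in this project; the helper names espeak_command and say are hypothetical, and say assumes espeak is installed and on the PATH.

```python
import shutil
import subprocess


def espeak_command(text, language="en", variant=None, words_per_minute=None):
    """Build the espeak argument list for the options used in this project."""
    # "-ven+f1" style: language code plus an optional voice variant.
    voice = language if variant is None else f"{language}+{variant}"
    command = ["espeak", f"-v{voice}"]
    if words_per_minute is not None:
        command += ["-s", str(words_per_minute)]
    command.append(text)
    return command


def say(text, **options):
    """Speak the text with espeak; return False if espeak is not installed."""
    if shutil.which("espeak") is None:
        return False
    # Passing a list avoids shell quoting problems with non-ASCII text.
    subprocess.run(espeak_command(text, **options), check=True)
    return True


# Show the command that would be run for a German male voice at 120 wpm.
print(espeak_command("ich kann Deutsch sprechen", language="de",
                     variant="m1", words_per_minute=120))
```

Calling say("I can whisper", variant="whisper") on the board should then behave like the whisper command above. Because the arguments are passed as a list, no shell quoting is needed, which sidesteps the curly-quote problem that often appears when commands are copied from web pages.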
To learn more about espeak, go to http://espeak.sourceforge.net/
Flite Synthesizer:
An alternative synthesizer is Flite (festival-lite). This open source, small, fast run-time text-to-speech synthesis engine was developed at Carnegie Mellon University (CMU). It is primarily designed for small embedded machines and/or large servers.
To install flite:
debian@beaglebone:~$ sudo apt-get install flite
To check if flite is functional, try:
debian@beaglebone:~$ flite -t "Hello"
flite supports various voices in English.
To get the list of available voices, try:
debian@beaglebone:~$ flite -lv
To use a particular voice, try:
debian@beaglebone:~$ flite -voice slt -t "Can you hear me"
flite supports far fewer languages than espeak (at the time of writing this article).
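Since flite covers fewer languages than espeak, a program that has both installed needs some rule for choosing between them. The Python sketch below shows one possible rule, which is our own assumption rather than part of the project: use flite for English when it is available and fall back to espeak otherwise. The name choose_command is hypothetical.

```python
import shutil


def choose_command(text, language="en", flite_voice=None, installed=None):
    """Return an argument list for flite or espeak, or None if neither fits.

    installed is the set of available synthesizer names; when omitted it is
    detected by looking for the executables on the PATH.
    """
    if installed is None:
        installed = {name for name in ("espeak", "flite") if shutil.which(name)}
    # Assumed policy: flite for English, espeak for everything else.
    if language == "en" and "flite" in installed:
        command = ["flite"]
        if flite_voice is not None:
            command += ["-voice", flite_voice]
        return command + ["-t", text]
    if "espeak" in installed:
        return ["espeak", f"-v{language}", text]
    return None


# Show the command chosen when both synthesizers are installed.
print(choose_command("Can you hear me", flite_voice="slt",
                     installed={"espeak", "flite"}))
```

The returned list can be passed directly to subprocess.run. Supplying installed explicitly, as in the example, makes the rule easy to try out on a machine that has neither synthesizer.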
To learn more about flite, go to http://www.festvox.org/flite/
Text to Speech Using other OSD335x based Development Boards
You can add text to speech in this manner using any OSD335x-based development board. We also built the project using the Octavo Systems OSD3358-SM-RED Reference, Evaluation, Development Board. This board uses the same OSD3358-SM System-in-Package as PocketBeagle. The OSD3358-SM-RED includes additional development features such as sensors, Ethernet and USB on the board. The setup is shown below:
You can also run the project on other BeagleBoard.org Foundation BeagleBones that use the OSD335x System-in-Package devices.
Incorporating Text to Speech into a Larger Project
We hope you enjoy adding text to speech to your PocketBeagle project. Perhaps you are using some sensors to set off a voice command, or providing a voice in addition to some other action. Check out the following project which incorporates text-to-speech:
Biometric Door Opener with Facial Recognition & Voice Output
(Turn up the volume to listen and click on the title link to see the full project)
We'd like to hear about your project and the type of application you are adding this feature to. Also, if you run into issues while following the above instructions, please leave a comment below to get help.
References:
[1] https://www.alsa-project.org/main/index.php/Asoundrc