Published July 9, 2018 © GPL3+

ReSpeaker Voice Reception System

The system combines a ReSpeaker Core v2.0 with a speaker. A visitor talks to it, and it sends a message to the person they have come to see.

Intermediate, full instructions provided, 8 hours

Things used in this project

Hardware components

Seeed Studio ReSpeaker Core v2.0

Speaker: 0.25W, 8 ohms

Story

Project Description

The basic job of a reception service is to greet visitors, make them feel welcome, and prevent unauthorized access to the office. We use the ReSpeaker Core v2.0 to build a voice reception service that talks with a visitor and sends a message to the person they have come to see. In the future, we could add a small database of office employee phone numbers, let the employee reply to the system with a message, and have the system use the GPIO functions of the ReSpeaker Core v2.0 to open the door for the visitor. The Python scripts are built on the Microsoft Bing Speech to Text service and the Twilio/Tencent messaging APIs.

Interaction Video

*Guest presses the button*

ReSpeaker: Welcome to x.factory. Who are you here to see?

Guest: I am here to see <employee>.

ReSpeaker: Thank you. What is your name?

Guest: My name is <guest>.

ReSpeaker: Thank you. I will send them a message.

*The device sends a message to the employee*

Message: Hello, <employee>. <guest> is here to see you. Please meet them at the door.

Hardware Setup

Software Setup

1. Apply for Bing/Twilio/Tencent keys: you need a Bing Speech to Text API key, plus either Twilio keys or Tencent keys.

2. Install dependencies

#update the system

sudo apt-get update && sudo apt-get upgrade

#apt install dependencies

sudo apt-get install -y libpulse-dev libasound-dev portaudio19-dev libportaudiocpp0 python python-dev python-pip build-essential swig git python-pyaudio python-numpy python-virtualenv libffi-dev libssl-dev libxml2-dev python3-pyaudio sox

# pip install dependencies

sudo pip install pocketsphinx webrtcvad requests monotonic pyaudio cffi twilio qcloudsms_py evdev pixel_ring respeaker

# Install snowboy

cd ~

git clone --depth 1 https://github.com/Kitt-AI/snowboy.git

cd snowboy

sudo virtualenv --system-site-packages env

source env/bin/activate

sudo python setup.py build

sudo python setup.py bdist_wheel

sudo pip install dist/snowboy*.whl

# Install voice engine

cd ~

git clone https://github.com/voice-engine/voice-engine.git

cd voice-engine

sudo python setup.py bdist_wheel

sudo pip install dist/*.whl

#Install webrtc-audio-processing

cd ~

git clone https://github.com/xiongyihui/python-webrtc-audio-processing.git

cd python-webrtc-audio-processing

git submodule init && git submodule update

python setup.py build

sudo python setup.py install

#Enable access

sudo chmod 0777 /dev/input/event0

3. Download Voice_Reception_System from GitHub

cd ~

git clone https://github.com/SeeedDocument/Voice_Reception_System.git

cd Voice_Reception_System

nano Speech.py

#Modify Bing/Twilio/Tencent keys

python Speech.py

A Quick Start Guide

1. Download the code from GitHub.

2. Extract it onto the ReSpeaker Core v2.0 (by default, the code expects to live under the home folder).

P.S. If you are not yet familiar with the ReSpeaker Core v2.0, please visit http://wiki.seeedstudio.com/ReSpeaker_Core_v2.0. You can try the out-of-box demos on the wiki first.

Adding API keys to the code

1. Location of Bing Speech API key:

2. Location of Tencent cloud service API key:

3. Location of Twilio API keys:
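
Before starting the script, it can help to confirm that none of the required keys were left empty. The sketch below is only an illustration: `missing_keys` is a hypothetical helper that is not part of the project, and the key names are taken from the variables defined in the script further down (`BING_KEY`, `ACCOUNTSID`, `AUTHTOKEN`, `appid`, `appkey`). It assumes that only the messaging provider selected by `TENCENT_TWILIO` needs its keys filled in.

```python
# Hypothetical helper (not part of the project): list the credentials
# that are still empty for the chosen messaging provider.

def missing_keys(settings, use_twilio=True):
    """Return the names of required keys whose value is empty or absent."""
    required = ["BING_KEY"]                      # speech-to-text is always needed
    if use_twilio:
        required += ["ACCOUNTSID", "AUTHTOKEN"]  # Twilio credentials
    else:
        required += ["appid", "appkey"]          # Tencent credentials
    return [name for name in required if not settings.get(name)]


if __name__ == "__main__":
    # Placeholder values only; fill in your own keys.
    settings = {
        "BING_KEY": "",
        "ACCOUNTSID": "",
        "AUTHTOKEN": "",
        "appid": "",
        "appkey": "",
    }
    print(missing_keys(settings, use_twilio=True))
```

An empty list means every key the selected provider needs has a value; it does not check that the keys are actually valid.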

At this point, you should be able to interact with ReSpeaker.

1. Plug in the speaker

2. Press the button on the back of the ReSpeaker Core v2.0

3. You should hear the welcome message and be able to interact with the device. You can say “I want to see Bill” to the ReSpeaker, and the device will then ask for your name. (The default database has a “Bill” entry, so the device can recognize that name.)

Adding names to the database

1. Open these two files in the extracted folder: “username_en.txt” and “allname.txt”

2. Add names to each list, following the format already used in the text files.
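
The exact file layout is not shown here, so the following sketch relies on what the script below appears to expect: `search_user` reads `username_en.txt` with `ConfigParser` and looks up the name in a `[db]` section, and `convert_to_list` reads `allname.txt` one name per line. The helper names `lookup_phone` and `load_names`, and the sample names and phone numbers, are made up for illustration.

```python
import os
import tempfile

try:
    import configparser                      # Python 3
except ImportError:
    import ConfigParser as configparser      # Python 2, as used by the script


def lookup_phone(path, name):
    """Return the phone number stored for `name` in the [db] section, or None."""
    cf = configparser.ConfigParser()
    cf.read(path)
    key = name.lower()
    if cf.has_option("db", key):
        number = cf.get("db", key)
        if number.isdigit():                 # reject anything that is not all digits
            return number
    return None


def load_names(path):
    """Read one lower-cased name per line, dropping commas and blank lines."""
    with open(path) as fd:
        names = [line.strip().lower().replace(",", "") for line in fd]
    return [n for n in names if n]


if __name__ == "__main__":
    tmp = tempfile.mkdtemp()
    users = os.path.join(tmp, "username_en.txt")
    names = os.path.join(tmp, "allname.txt")
    with open(users, "w") as fd:
        # Made-up sample entries: name = phone number
        fd.write("[db]\nbill = 13800000000\nseth = 13900000000\n")
    with open(names, "w") as fd:
        fd.write("bill\nseth\n")
    print(lookup_phone(users, "Bill"))
    print(load_names(names))
```

Unlike the script, this sketch skips blank lines and returns `None` instead of `0` when no number is found.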

Changing the message content

1. Search for the message text in the code and change it.

2. If you send through Tencent Cloud, make sure the new message matches the structure registered in Tencent Cloud exactly; otherwise sending will fail.
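
The script builds the message by string concatenation in two places, one per provider. One way to keep the wording in a single spot is a template string, as in the minimal sketch below. `MESSAGE_TEMPLATE` and `build_message` are hypothetical names that do not exist in the project; the wording is the message shown in the interaction example above.

```python
# Hypothetical sketch: keep the message wording in one template string.
MESSAGE_TEMPLATE = (
    "Hello, {employee}. {guest} is here to see you. "
    "Please meet them at the door."
)


def build_message(employee, guest, template=MESSAGE_TEMPLATE):
    """Fill the employee and guest names into the message template."""
    return template.format(employee=employee, guest=guest)


if __name__ == "__main__":
    print(build_message("Bill", "Jane.Doe"))
```

Changing the wording then means editing only the template, which makes it easier to keep it identical to the text registered with the messaging provider.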

Project Source File

We have uploaded the project resources to the Seeed Studio GitHub.

Voice Reception System Python Script

#!/usr/bin/python
# coding:utf-8
# import Speech SDK

'''
File name: speech.py
'''

from evdev import InputDevice,categorize,ecodes
from ctypes import *
from pixel_ring import pixel_ring
from respeaker.bing_speech_api import BingSpeechAPI,RequestError
from twilio.rest import Client
from twilio.twiml.voice_response import VoiceResponse

from voice_engine.kws import KWS
from voice_engine.ns import NS
from voice_engine.source import Source
from voice_engine.channel_picker import ChannelPicker

from qcloudsms_py import SmsSingleSender
from qcloudsms_py.httpclient import HTTPError

from voice_engine.element import Element
	
import logging
import ConfigParser
import threading,signal
import time
import sys,stat
import os
import re
import json
import copy

if sys.version_info[0] < 3:
    import Queue as queue
else:
    import queue

is_pressed  	= False
is_record		= False
is_music		= False
is_noteinfo 	= False
led_ring		= False
parse_flags 	= False
starttime 		= 0
lasttime  		= 0
recvok			= False

stringcopy = ''




'''
Twilio key
'''
ACCOUNTSID = ''
AUTHTOKEN = ''

'''
Bing key
'''
BING_KEY = ''

'''
tencent key
'''
appid = ''
appkey = ''


'''
LANGUAGE = True: English; LANGUAGE = False: Chinese (not enabled yet)
'''
ENPATH = "/home/respeaker/Voice_Reception_System/username_en.txt"
ALLNAME = "/home/respeaker/Voice_Reception_System/allname.txt"

LANGUAGE = True


'''
Macro Switch
'''
TREAD_ON 	= 1
ON			= 0


'''
TENCENT_TWILIO = True:  TWILIO 
TENCENT_TWILIO = False: Tencent
'''
TENCENT_TWILIO = True



'''
bing Speech to text class
'''
class Bing(Element):
    def __init__(self, key):
        super(Bing, self).__init__()

        self.key = key

        self.queue = queue.Queue()
        self.listening = False
        self.done = False
        self.event = threading.Event()

        # Use the key passed to the constructor rather than the module-level global
        self.bing = BingSpeechAPI(self.key)

    def put(self, data):
        if self.listening:
            self.queue.put(data)

    def start(self):
        self.done = False
        thread = threading.Thread(target=self.run)
        thread.daemon = True
        thread.start()

    def stop(self):
        self.done = True

    def listen(self):
        self.listening = True
        self.event.set()

    def run(self):
		while not self.done:
			self.event.wait()

			def gen():
				count = 0
				while count < 16000 * 6:
					data = self.queue.get()
					if not data:
						break

					yield data
					count += len(data) / 2

			#recognize speech using Microsoft Bing Voice Recognition
			try:
				# text = bing.recognize(gen(), language='zh-CN')
				text = self.bing.recognize(gen())
				global stringcopy
				print('Bing:{}'.format(text).encode('utf-8'))
				stringcopy = format(text).encode('utf-8')
				
			except ValueError:
				print('Not recognized')
				global is_music
				is_music = False
					
				time.sleep(1)
				#play_music('mpg123 /home/respeaker/Voice_Reception_System/notfind.mp3')
				play_music('mpg123 /home/respeaker/Voice_Reception_System/not_clear.mp3')
			except RequestError as e:
				print('Network error {}'.format(e))
				time.sleep(1)
				play_music('mpg123 /home/respeaker/Voice_Reception_System/net_error.mp3')

			self.listening = False
			self.event.clear()
			self.queue.queue.clear()
			global recvok,is_record
			recvok = True
			is_record = True
				
		
'''
speech_server_scheduler
'''
class speech_server_scheduler(object):
	
	def __init__(self,_path,_format):
		self.path 	= _path
		self.format = _format
		'''	
		read file 
		'''
	def get_file_content(self):
		with open(self.path, 'rb') as fp:
			return fp.read()
			
	def bing__parse_speech(self):
		global is_music,is_find,count,stringcopy
		try:                      
			bing.listen()
			text = recv_string()
			stringcopy = ''
			'''
			To add new names to username_en.txt and allname.txt,
			run the script and check the printed text,
			then add the recognized variants to both files.
			For example, if you say "seth", Bing may recognize it as seth, sis, or set;
			add all of them to username_en.txt and allname.txt.
			'''
			print('Recognized text %s' % text)
			textparse = text_parse(text)
			
			if text:           
				print('Recognized %s' % text)
				is_find = False
				array = textparse.parse_array(text)
				
				print("---------------------------\n")
				
				convert_lis = textparse.convert_to_list(ALLNAME)
				
				for strd in convert_lis:
					print(strd)
					print("\n")
				
				# start parse voice 
				for i in array:
					for employee in convert_lis:
						if i == employee :
						#do what you want to do
							print(employee)
							print("\n")
							is_find = True
							is_music = False
							'''
							Thank you. What is your name?
							'''
							play_music("mpg123 /home/respeaker/Voice_Reception_System/Thanks.mp3")
							count=2
							
							while(count):
								
								bing.listen()
								time.sleep(0.1)
								
								gestName = recv_string()
								stringcopy = ''
								is_music = True

								is_music = False
								getGestName = ''
								
								if gestName:
									result = []
									str_list = textparse.parse_array_noneuplow(gestName)
									print(len(str_list))
									list_len = len(str_list)
									
									
									if list_len == 1:
										getGestName = str_list[list_len-1]
									elif list_len == 2:
										getGestName = str_list[0]+'.'+str_list[1]
									elif list_len >= 3:
										getGestName = str_list[list_len-2]+'.'+str_list[list_len-1]
										
									print(getGestName)
									
									time.sleep(0.1)
									
									'''
									Thank you. I will send them a message
									'''
									play_music("mpg123 /home/respeaker/Voice_Reception_System/message.mp3")
									time.sleep(0.1)
									employeeNumber = search_user(''.join(employee).lower(),ENPATH,LANGUAGE)
									print(employeeNumber)
									if employeeNumber == 0:
										print("Phone number error\n")
										return 0
									
									if TENCENT_TWILIO == True :
										TwilioSendMessage(employee,getGestName,employeeNumber)
									else:
										tencentSendMesaage(employee,getGestName,employeeNumber)
										
									print("********Bing_English****************")
									
									count = 0
									
									return 0
									
								else:
									count = count -1
									
							if 	count == 0 :
									return 0
									
								
				if is_find == False:
					
					#Oh, sorry, I didn't find it.
					
					is_music = False
				
					play_music('mpg123 /home/respeaker/Voice_Reception_System/notfind.mp3')
				

				
			else:
				return -1
				
		except Exception as e:               
			print(e.message)  
			

'''
text parse
'''			
class text_parse(object):		
	def __init__(self,_text):
		self.text 	= _text
		
	def parse_array(self,text):
		array =[]
		array = text.lower().split(' ')
		return 	array

	def parse_array_noneuplow(self,text):
		array =[]
		array = text.split(' ')
		return 	array
		
	'''
	 list
	'''
	def convert_to_list(self,text):
		result=[]
		fd = open(text, "r")
		
		for line in fd.readlines():
			result.append(''.join(list(line.lower().rstrip().split(','))))
		'''
		for item in result:
			for it in item :
				print(it)
		'''
		fd.close()
		
		return result


'''
Twilio send message
'''	

def TwilioSendMessage(employee,gestName,employeeNumber):
	"""
	Some example usage of different twilio resources.
	"""
	client = Client(ACCOUNTSID, AUTHTOKEN)

	# Get all messages
	all_messages = client.messages.list()
	print('There are {} messages in your account.'.format(len(all_messages)))

	# Get only last 10 messages...
	some_messages = client.messages.list(limit=10)
	print('Here are the last 10 messages in your account:')
	for m in some_messages:
		print(m)

	# Get messages in smaller pages...
	all_messages = client.messages.list(page_size=10)
	print('There are {} messages in your account.'.format(len(all_messages)))
	print('Sending a message...')
	print('+86'+employeeNumber)
	print('Hello, '+employee+'. '+gestName+' is here to see you. Please meet them at the door.')
	#Hello, <employee>. <guest> is here to see you. Please meet them at the door.
	new_message = client.messages.create(to='+86'+employeeNumber, from_='+15109014046', body='Hello, '+employee+'. '+gestName+' is here to see you. Please meet them at the door.')

'''	
tencent Send Mesaage
'''	

def tencentSendMesaage(employee,gestName,employeeNumber):
	# Enum{0: normal SMS, 1: marketing SMS}
	sms_type = 0
	template_id = 151036
	params = []
	params.append(employee)
	params.append(employeeNumber)
	
	ssender = SmsSingleSender(appid, appkey)
	# Stays None if sending fails, so the final print cannot hit an undefined name
	result = None
	try:
		sendStr = "Hello, "+employee+". "+gestName+" is here to see you. Please meet them at the door."
		print(sendStr)
		print(employeeNumber)
		result = ssender.send(sms_type, 86, employeeNumber,sendStr)
		#result = ssender.send(sms_type, 86, employeeNumber,"Hello,search.jcr is here to see you. Please meet them at the door.")
		#result = ssender.send_with_param(86, employeeNumber,template_id, params)  
		
	except HTTPError as e:
		print(e)
	except Exception as e:
		print(e)
	
	if result is not None:
		print('tencent:{}'.format(result).decode('unicode_escape'))
	

def led_pixel_ring():
	
	pixel_ring.set_brightness(20)
	
	
	while led_ring :
		try:
			pixel_ring.wakeup()
			time.sleep(0.1)
			pixel_ring.off()
			time.sleep(0.1)
			pixel_ring.off()
		except KeyboardInterrupt:
			break

	pixel_ring.off()
	

'''
bing log
'''
def bing_api_call():
	logging.basicConfig(level=logging.DEBUG)
	
def recv_string():
	global recvok 
	while recvok != True:
		time.sleep(0.1)
	
	recvok = False
	print(stringcopy)
	return stringcopy	
		
'''
 parse local file
'''
def search_user(user,path, language):
	print(len(user))
	print(user)
	print(path)
	print(language)
	
	'''
	'''
	cf = ConfigParser.ConfigParser()
	cf.read(path)
	
	if False == language :
		is_ = cf.has_option("db", user[0:len(user)-3])
	else:
		is_ = cf.has_option("db", user)
	
	print(is_)
	
	if is_ :
		if False == language :
			db_phone = cf.get("db", user[0:len(user)-3])
		else:
			db_phone = cf.get("db", user)
			
		print(db_phone)
		
		if db_phone.isdigit() :
			return db_phone
		else:
			return 0
	else:
		return 0
		
def bing_init(self):
	text = ""
	bing = BingSpeechAPI(key=BING_KEY)
	try:                      
		fd = open(self.path)
		conect = fd.read(-1)
		text = bing.recognize(conect)
		fd.close()
		return text
	except Exception as e:               
			print(e.message)
			return text
			

def pre_play(v):
    global is_pressed,is_music,is_record,is_noteinfo,led_ring,parse_flags
		
    while True:
        if is_pressed  == True:
			'''
			Welcome to x.factory. Who are you here to see?
			'''
			os.system("mpg123 /home/respeaker/Voice_Reception_System/welcome.mp3") 
			is_pressed	= False
						
			try:
				#global is_music				
				is_music	= True
				par = speech_server_scheduler('/home/respeaker/Voice_Reception_System/test.wav', 'wav')
				
				'''
				bing api call
				'''
				
				if par.bing__parse_speech() == 0 :
					is_music	= False	
					parse_flags = False
				else:
					parse_flags = True
						
				is_music	= False
					
			except Exception as e:
				is_music 	= False
				is_noteinfo = False
				print(e)
			led_ring 	= False
	time.sleep(0.1)


'''
guest button press handler
'''		
def key_hander(v):
	global is_pressed,is_noteinfo,led_ring,starttime,lasttime	
	key = InputDevice("/dev/input/event0")
	for event in key.read_loop():
		
		if event.type == ecodes.EV_KEY:
			#print(categorize(event)) 
			
			
			'''
			hold time
			'''
			if event.value == 2 :
				lasttime = event.sec
				sp_time = lasttime - starttime 
				'''
				down
				'''
			elif event.value == 1 :
				starttime = event.sec
				'''
				up
				'''
			else:
				led_ring 	= True
				is_pressed 	= True
				is_noteinfo = True
				starttime = 0
				lasttime  = 0
				
							
def play_music(strInfo):
		os.system(strInfo)
	
'''
 CTRL + C
'''
def CtrlC(signum, frame):
	os.kill(os.getpid(),signal.SIGKILL)
	
'''
 main  start ...
'''
if __name__ == "__main__":
	try:
		src = Source(channels=8)
		ch0 = ChannelPicker(channels=src.channels, pick=0)
		ns = NS()
		
		bing = Bing(BING_KEY)
		
		src.pipeline(ch0, ns, bing)
		
		bing_api_call()
		

		signal.signal(signal.SIGINT, CtrlC)
		signal.signal(signal.SIGTERM, CtrlC)

		thread_key  	= threading.Thread(target=key_hander,args=(1,))
		thread_play 	= threading.Thread(target=pre_play, args=(1,))
		
		
		thread_key.setDaemon(True)
		thread_play.setDaemon(True)
		
		
		thread_key.start()
		thread_play.start()
			
		
		src.recursive_start()
		
		if TREAD_ON == ON :
			thread_key.join()
			thread_play.join()
			
			
			
		else :
			while True:
				led_pixel_ring()
				time.sleep(1)
				pass
			
	except Exception as exc:
		print(exc)
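
Two small pieces of logic are buried inside `bing__parse_speech` in the script above: matching the recognized words against the employee list, and turning the guest's reply into a short name. The sketch below pulls them out as standalone functions so they can be tried without the hardware. `find_employee` and `format_guest_name` are hypothetical names, and this is a simplified reading of the script's rules rather than a drop-in replacement.

```python
def find_employee(recognized_text, known_names):
    """Return the first recognized word that appears in known_names, else None."""
    for word in recognized_text.lower().split(" "):
        if word in known_names:
            return word
    return None


def format_guest_name(recognized_text):
    """Shorten the guest's reply to a name.

    One word is kept as-is; two or more words are reduced to the
    last two, joined by a dot (for example "Jane.Doe").
    """
    words = recognized_text.split(" ")
    if len(words) == 1:
        return words[0]
    return words[-2] + "." + words[-1]


if __name__ == "__main__":
    names = ["bill", "seth"]                 # made-up sample list
    print(find_employee("I am here to see Bill", names))
    print(format_guest_name("My name is Jane Doe"))
```

As in the script, punctuation is not stripped, so a reply that the recognizer returns with a trailing period keeps that period in the formatted name.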