This is Part 3 of a series extending an AI-powered conversational speaker. If you haven't yet, complete parts 1 and 2 first:
- AI Conversation Speaker aka Friend Bot: Part 1 Conversation - Hackster.io
- AI Conversation Speaker aka Friend Bot: Part 2 Wake Word - Hackster.io
The Conversational Speaker, informally known as "Friend Bot", uses a Raspberry Pi to enable a spoken conversation with OpenAI large language models. This implementation waits for a wake phrase, listens to speech, processes the conversation through the OpenAI service, and responds back with emotional style.
For more information on the prompt engine used for maintaining conversation context, see its Python, TypeScript, and .NET versions.
For more information about prompt design in general, check out OpenAI's documentation on the subject: https://beta.openai.com/docs/guides/completion/prompt-design.
This project is written in .NET 6, which supports Raspberry Pi OS, Linux, macOS, and Windows.
Build time: 5 minutes
Read time: 5 minutes
The Code
1. Using the repo you cloned in part 1 or 2, check out the hackster-tutorial-3 branch.
cd ~/conversational-speaker
git checkout hackster-tutorial-3
2. Build and run the new code!
cd ~/conversational-speaker/src/ConversationalSpeaker
dotnet build
dotnet run
Usage
Take a look at the Azure Cognitive Services style support page to see which languages support which emotional styles, then play with the "AzureCognitiveServices:SpeechSynthesisVoiceName" and "PromptEngine" settings in src/ConversationalSpeaker/configuration.json.
In the PromptEngine section of src/ConversationalSpeaker/configuration.json, we've added a longer description as well as a few interaction examples. The description, the examples, and recent interactions with the speaker are all sent to OpenAI's GPT-3 models to guide how you want the AI to respond. In this case, we are instructing it to append a specific style modifier to the end of each response. We then give it a few examples of what that looks like in the hope that it will replicate the pattern.
### Computer is a friendly, intelligent, and curious person who is good at conversation. Each response from Computer should end with a word representing Computer's emotion state, such as ~~chat~~, ~~angry~~, ~~cheerful~~, ~~sad~~, ~~excited~~, ~~friendly~~, ~~terrified~~, ~~shouting~~, ~~unfriendly~~, ~~whispering~~, and ~~hopeful~~.
Human: Hello
Computer: Hello! How are you? ~~friendly~~
Human: I am doing well, how about you?
Computer: I am doing very well! ~~excited~~
Human: What are you up to?
Computer: I am just hanging out and talking to people. ~~chat~~
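To make the structure of that prompt concrete, here is a minimal Python sketch of how a description, a few examples, and recent interactions might be joined into one completion prompt. This is an illustration only, not the project's .NET code or the prompt engine's API: the function name `build_prompt`, its parameters, and the `max_recent` cutoff are all hypothetical.

```python
from typing import List, Tuple

Turn = Tuple[str, str]  # (what the human said, what Computer replied)


def build_prompt(description: str,
                 examples: List[Turn],
                 recent: List[Turn],
                 user_input: str,
                 max_recent: int = 5) -> str:
    """Join the description, examples, and recent turns into one prompt.

    Hypothetical helper: the layout mirrors the prompt shown above, but the
    real prompt engine may format things differently.
    """
    lines = [f"### {description}"]

    # Fixed examples come first so the model sees the ~~style~~ pattern.
    for human, computer in examples:
        lines.append(f"Human: {human}")
        lines.append(f"Computer: {computer}")

    # Keep only the most recent turns so the prompt stays short.
    kept = recent[-max_recent:] if max_recent > 0 else []
    for human, computer in kept:
        lines.append(f"Human: {human}")
        lines.append(f"Computer: {computer}")

    # End with an open "Computer:" line for the model to complete.
    lines.append(f"Human: {user_input}")
    lines.append("Computer:")
    return "\n".join(lines)
```

Because the prompt ends on an open "Computer:" line, the model's completion is the next reply, ideally ending in a style marker like the examples do.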
How It Works
For more details on how the code works, check out the README.
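As a rough illustration of the emotional-style step, here is a Python sketch that splits the trailing ~~style~~ marker off a response and wraps the remaining text in SSML using the `mstts:express-as` element. It is not the project's actual .NET implementation. The helper names, the fallback to "chat", and the voice name `en-US-JennyNeural` are assumptions made for this example; check the style support page for what your chosen voice supports.

```python
import re
from typing import Tuple
from xml.sax.saxutils import escape, quoteattr

# Styles listed in the prompt description above.
KNOWN_STYLES = {
    "chat", "angry", "cheerful", "sad", "excited", "friendly",
    "terrified", "shouting", "unfriendly", "whispering", "hopeful",
}

# Matches a trailing marker such as " ~~friendly~~" at the end of a response.
STYLE_PATTERN = re.compile(r"\s*~~\s*([A-Za-z-]+)\s*~~\s*$")


def split_style(response: str, default: str = "chat") -> Tuple[str, str]:
    """Return (spoken text, style); fall back to `default` (an assumption)."""
    match = STYLE_PATTERN.search(response)
    if not match:
        return response.strip(), default
    style = match.group(1).lower()
    text = response[:match.start()].strip()
    # The model may invent a style the voice does not support.
    if style not in KNOWN_STYLES:
        style = default
    return text, style


def to_ssml(text: str, style: str, voice: str = "en-US-JennyNeural") -> str:
    """Wrap text in SSML that asks the voice to speak in the given style."""
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">'
        f"<voice name={quoteattr(voice)}>"
        f"<mstts:express-as style={quoteattr(style)}>"
        f"{escape(text)}"
        "</mstts:express-as></voice></speak>"
    )


if __name__ == "__main__":
    text, style = split_style("I am doing very well! ~~excited~~")
    print(to_ssml(text, style))
```

Falling back to a neutral style when the marker is missing or unrecognized keeps the speaker talking even when the model does not follow the pattern exactly.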
Have fun