I am excited to share a groundbreaking electronics project: building a portable companion robot for college students. This innovative robot is designed to interact verbally with students, helping to identify and address their mental health issues based on established principles of psychiatry.
Problem
Mental health issues have become increasingly prevalent among college students worldwide, with alarming consequences such as rising suicide rates and increased drug use for depression. Statistics from acclaimed journals make it evident that this problem demands urgent attention:
64% of students who drop out of college do so because of mental health problems.
About 75% of mental illnesses manifest by the age of 24.
41% of US college students experience depression.
33% of students receiving mental health services contemplate suicide.
Between 2% and 8% of college students are diagnosed with ADHD.
Despite universities' efforts to address mental health concerns through various channels, including online and offline support systems, many students are hesitant to seek help, even anonymously. This calls for a unique solution that provides a tangible companion for students to interact with, offering support in a non-intimidating manner.
Solution
Introducing the Portable Companion Robot
The proposed solution is a portable companion robot that can be carried in a backpack or placed on a tabletop. This robot serves as a conversational partner, engaging in dialogue with students to identify and discuss their mental health issues. It differentiates itself from existing solutions by providing an interactive and tangible companion, as opposed to impersonal smartphone applications or passive aids.
By eliminating the need for the direct involvement of a psychiatrist, this companion robot allows students to engage with it anytime and anywhere, without any strings attached. Its purpose is to provide a safe space for students to express their concerns and receive guidance based on established principles of psychiatry.
Implementation
We will follow a step-by-step implementation process to bring the portable companion robot to life. Let's dive into the details:
- Microphone Module and ESP32 Integration
The microphone module would be connected to the ESP32 microcontroller, continuously capturing voice data from the surroundings. The ESP32's I2S interface would be utilized to receive the microphone's audio data. This can be achieved by configuring the I2S interface and reading the audio data using appropriate functions from the ESP-Skainet API library.
- Wake-word Detection with Skainet
The ESP-Skainet library would be used for wake-word detection. It provides pre-trained wake-word models that can be used to detect specific keywords, such as "Hi Lexin" in this case. The library would continuously monitor the audio data received from the microphone module and trigger an event when the wake-word is detected. This event would activate the robot and initiate further processing.
- Speech-to-Text Conversion using Google Cloud Speech-to-Text API
The audio data captured after wake-word detection would be sent to the Google Cloud Speech-to-Text API to convert the user's spoken words into text. The ESP32 would make an HTTP POST request to the API endpoint, passing the audio data as the request body. The response from the API would contain the transcribed text of the user's speech. The Arduino JSON library can be used to parse the response and extract the transcribed text.
- Text-to-Speech Conversion using Talkie.h Library
The response generated by the GPT-3.5 API, which is in text format, would be converted into audible speech using the Talkie.h library. This library provides functions to synthesize speech from text using various voice models. The generated audio data would be output through the Fermion piezoelectric speaker module connected to the ESP32.
- Generating Responses using the GPT-3.5 API
The user's transcribed speech, obtained from the Google Cloud Speech-to-Text API, would be used as input to the GPT-3.5 API. The ESP32 would make an HTTP POST request to the GPT-3.5 API endpoint, passing the user's speech as the prompt in the request body. The API would generate a response based on the given prompt using the principles of natural language processing. The ArduinoHTTPClient library would be used to send the request and receive the response from the API.
- Sending SMS Notifications using IFTTT Webhooks
The IFTTT (If This Then That) platform's Webhooks service would be utilized to send SMS notifications. The ESP32 would make an HTTP POST request to the IFTTT Webhooks trigger URL, passing the necessary data for the SMS notification. This would trigger an action on the IFTTT platform to send an SMS to the specified recipient. The ArduinoHTTPClient library can be used to send the HTTP request to the IFTTT Webhooks service.
The software tools and libraries mentioned above (ESP-Skainet, the Google Cloud Speech-to-Text API, Talkie.h, the GPT-3.5 API, ArduinoJSON, and ArduinoHTTPClient) provide the functionality needed to implement these features of the portable companion robot for mental health interaction.
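The steps above chain into one conversation turn, which could be modelled as a small state machine. The following is a minimal sketch in standard C++ so it can be tried off the board; the state and event names are hypothetical, and the choice to send the alert before speaking the reply is my assumption, since the order is not fixed anywhere in the design.

```cpp
// Hypothetical states for one conversation turn (names are illustrative only).
enum class RobotState {
    WaitingForWakeWord,
    Recording,
    Transcribing,
    GeneratingReply,
    Alerting,
    Speaking
};

// Hypothetical events raised by the individual modules.
enum class RobotEvent {
    WakeWordDetected,   // wake-word engine fired
    StopPhraseHeard,    // recording stop condition met
    TranscriptReady,    // speech-to-text returned text
    ReplyReadySafe,     // reply generated, no risk flag
    ReplyReadyAtRisk,   // reply generated, risk flag set
    AlertSent,          // SMS webhook request finished
    SpeechFinished      // speaker finished playing the reply
};

// Returns the next state; events that do not apply leave the state unchanged.
RobotState nextState(RobotState state, RobotEvent event) {
    switch (state) {
        case RobotState::WaitingForWakeWord:
            if (event == RobotEvent::WakeWordDetected) return RobotState::Recording;
            break;
        case RobotState::Recording:
            if (event == RobotEvent::StopPhraseHeard) return RobotState::Transcribing;
            break;
        case RobotState::Transcribing:
            if (event == RobotEvent::TranscriptReady) return RobotState::GeneratingReply;
            break;
        case RobotState::GeneratingReply:
            if (event == RobotEvent::ReplyReadySafe) return RobotState::Speaking;
            if (event == RobotEvent::ReplyReadyAtRisk) return RobotState::Alerting;
            break;
        case RobotState::Alerting:
            // Assumption: the alert goes out first, then the reply is spoken.
            if (event == RobotEvent::AlertSent) return RobotState::Speaking;
            break;
        case RobotState::Speaking:
            if (event == RobotEvent::SpeechFinished) return RobotState::WaitingForWakeWord;
            break;
    }
    return state;
}
```

Keeping the transitions in one function makes it easy to see that the robot always returns to waiting for the wake-word after it has spoken.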
API Description
More specific details about the API functions that would be used for each of the libraries or APIs mentioned:
ESP-Skainet API
- skainet.begin(): Initializes the Skainet library for wake-word detection.
- skainet.detect(): Checks if the wake-word has been detected.
- skainet.isActive(): Returns a boolean value indicating if the Skainet library is active (i.e., wake-word has been detected).
Google Cloud Speech-to-Text API
- HTTPClient http: Creates an instance of the HTTPClient class to make HTTP requests.
- http.begin(endpoint): Initializes the HTTP client with the API endpoint.
- http.addHeader("Content-Type", "application/json"): Adds a request header specifying the content type.
- http.POST(requestBody): Sends an HTTP POST request with the provided request body.
- httpResponseCode: Variable to store the HTTP response code received.
- Response handling (specific functions depend on the library used for HTTP requests, e.g., ArduinoHTTPClient): parsing the response and extracting the transcribed text using JSON parsing functions.
Talkie.h Library
- voice.begin(speakerPin): Initializes the Talkie library for speech synthesis and sets the output pin for the speaker.
- voice.say(response): Converts the provided text into audible speech output through the speaker.
GPT-3.5 API
- HTTPClient http: Creates an instance of the HTTPClient class to make HTTP requests.
- http.begin(gptEndpoint): Initializes the HTTP client with the GPT-3.5 API endpoint.
- http.addHeader("Content-Type", "application/json"): Adds a request header specifying the content type.
- http.addHeader("Authorization", "Bearer " + gptApiKey): Adds a request header with the API key for authentication.
- http.POST(requestBody): Sends an HTTP POST request with the provided request body.
- httpResponseCode: Variable to store the HTTP response code received.
- Response handling (specific functions depend on the library used for HTTP requests, e.g., ArduinoHTTPClient): parsing the response and extracting the generated text.
IFTTT Webhooks
- HTTPClient http: Creates an instance of the HTTPClient class to make HTTP requests.
- http.begin(iftttURL): Initializes the HTTP client with the IFTTT Webhooks trigger URL.
- http.POST(): Sends an HTTP POST request to trigger the IFTTT Webhooks service and sends an SMS notification.
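To make the IFTTT Webhooks call more concrete, here is a minimal sketch of how the trigger URL and a small JSON body could be assembled. It is standard C++ so it can be checked off the board. The function names, the event name, and the key are hypothetical placeholders, and the URL layout is the one shown on the Webhooks documentation page as I understand it, so copy the exact URL from your own IFTTT account rather than trusting this sketch.

```cpp
#include <string>

// Builds a Webhooks trigger URL from an event name and account key.
// Assumption: triggers of the "JSON payload" kind use an extra "/json" path segment.
std::string buildIftttUrl(const std::string& eventName,
                          const std::string& key,
                          bool jsonPayload) {
    std::string url = "https://maker.ifttt.com/trigger/" + eventName;
    if (jsonPayload) {
        url += "/json";
    }
    return url + "/with/key/" + key;
}

// Builds a small JSON body carrying the alert text in "value1".
// Assumption: the message contains no quotes or control characters,
// so no JSON escaping is done here.
std::string buildAlertBody(const std::string& message) {
    return "{\"value1\":\"" + message + "\"}";
}
```

On the ESP32 the two strings could then be handed to the HTTPClient calls listed above, for example http.begin(url.c_str()), http.addHeader("Content-Type", "application/json"), and http.POST(body.c_str()).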
In the following section, I have tried to explain each function and how it has been implemented in code rather than taking the traditional approach of explaining a large chunk of code using inline comments. Let me know in the comments whether I should continue to use this process for future projects as well.
- Connect to Wi-Fi
The code connects the ESP32 to a Wi-Fi network using the provided SSID and password credentials.
#include <WiFi.h>

const char* ssid = "YOUR_SSID";         // obviously I don't want my neighbours to join my network xD
const char* password = "YOUR_PASSWORD"; // confidential; of course you can automate this, I was just lazy

void connectWiFi() {
  WiFi.begin(ssid, password);
  // Wait for Wi-Fi connection
  while (WiFi.status() != WL_CONNECTED) {
    delay(1000);
  }
}

void setup() {
  connectWiFi();
  // ...
}
- Skainet Wake-word Detection
The code initializes the Skainet library for wake-word detection and waits for the wake-word to be spoken.
#include <skainet.h>

Skainet skainet;

void setup() {
  skainet.begin();
  // ...
}

void loop() {
  if (skainet.detect()) {
    // Wake-word detected
    // ...
  }
}
- Audio Recording
The code continuously records audio data from the microphone input pin until the stop condition is met (word 'STOP' spoken three times continuously without any gap).
const int microphonePin = 32;
String audioData = "";

void loop() {
  while (skainet.isActive()) {
    int sample = analogRead(microphonePin);
    audioData += String(sample) + ",";
    delayMicroseconds(100);
  }
}
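The stop condition described above (the word 'STOP' spoken three times in a row) is not visible in the snippet, so here is a minimal sketch of how it could be tracked. It assumes some recognizer hands over one upper-case word at a time; the class name and its interface are hypothetical, and the code is standard C++ so it can be tried off the board.

```cpp
#include <string>

// Counts consecutive occurrences of the stop word in a stream of recognized words.
class StopPhraseDetector {
public:
    explicit StopPhraseDetector(int required = 3) : required_(required) {}

    // Feed one recognized word. Returns true once the stop word has been
    // heard `required_` times in a row; any other word resets the streak.
    bool feed(const std::string& word) {
        if (word == "STOP") {
            ++streak_;
        } else {
            streak_ = 0;
        }
        return streak_ >= required_;
    }

    // Clears the streak, e.g. before a new recording starts.
    void reset() { streak_ = 0; }

private:
    int required_;
    int streak_ = 0;
};
```

In the recording loop, a true return value from feed() would be the signal to stop sampling and move on to the speech-to-text step.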
- Convert Speech to Text (Google Speech-to-Text API)
The code sends the recorded audio data to the Google Speech-to-Text API for conversion into text using the provided API key and endpoint.
#include <HTTPClient.h>

const String apiKey = "YOUR_SPEECH_TO_TEXT_API_KEY";
const String endpoint = "YOUR_SPEECH_TO_TEXT_API_ENDPOINT";

String speechToText(String audioData) {
  String transcript = "";  // stays empty if the request fails
  HTTPClient http;
  http.begin(endpoint + "?key=" + apiKey);
  http.addHeader("Content-Type", "application/json");
  // Build the request body
  String requestBody = "{\"audio\": {\"content\": \"" + audioData + "\"}, \"config\": {\"encoding\": \"LINEAR16\",\"sampleRateHertz\": 16000,\"languageCode\": \"en-US\"}}";
  int httpResponseCode = http.POST(requestBody);
  if (httpResponseCode == HTTP_CODE_OK) {
    String payload = http.getString();  // raw JSON response
    // Process the response and extract the transcribed text into transcript
    // ...
  }
  http.end();  // release the connection
  return transcript;
}
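One detail worth spelling out: the Speech-to-Text REST API expects the "content" field to hold base64-encoded audio bytes, and LINEAR16 means signed 16-bit little-endian samples. Below is a minimal sketch of both steps in standard C++ so it can be tested off the board. The function names are hypothetical, and it assumes the samples are already available as int16_t values (for example from the I2S interface).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Packs signed 16-bit samples as little-endian bytes (the LINEAR16 byte layout).
std::vector<uint8_t> packLinear16(const std::vector<int16_t>& samples) {
    std::vector<uint8_t> bytes;
    bytes.reserve(samples.size() * 2);
    for (int16_t sample : samples) {
        uint16_t u = static_cast<uint16_t>(sample);
        bytes.push_back(static_cast<uint8_t>(u & 0xFF));  // low byte first
        bytes.push_back(static_cast<uint8_t>(u >> 8));    // then high byte
    }
    return bytes;
}

// Standard base64 encoding with '=' padding.
std::string base64Encode(const std::vector<uint8_t>& data) {
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve(((data.size() + 2) / 3) * 4);
    std::size_t i = 0;
    // Encode complete 3-byte groups as 4 characters each.
    while (i + 2 < data.size()) {
        uint32_t n = (static_cast<uint32_t>(data[i]) << 16) |
                     (static_cast<uint32_t>(data[i + 1]) << 8) |
                     static_cast<uint32_t>(data[i + 2]);
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += table[(n >> 6) & 63];
        out += table[n & 63];
        i += 3;
    }
    // Encode the 1 or 2 leftover bytes with padding.
    std::size_t remaining = data.size() - i;
    if (remaining == 1) {
        uint32_t n = static_cast<uint32_t>(data[i]) << 16;
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += "==";
    } else if (remaining == 2) {
        uint32_t n = (static_cast<uint32_t>(data[i]) << 16) |
                     (static_cast<uint32_t>(data[i + 1]) << 8);
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += table[(n >> 6) & 63];
        out += '=';
    }
    return out;
}
```

The result of base64Encode(packLinear16(samples)) is what would go into the "content" field of the request body, in place of a comma-separated list of raw readings.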
- Append Verification Question to User Query
The code appends the verification question to the user query obtained from the speech-to-text conversion.
String userInput = speechToText(audioData);
userInput += " Answer Y for Yes and N for No. Does the above conversation prove that I am suicidal or that I could cause harm to anybody?";
- Generate Response (GPT-3.5 API)
The code sends the user query, including the appended verification question, to the GPT-3.5 API for generating a response using the provided API key and endpoint.
const String gptApiKey = "YOUR_GPT_API_KEY";
const String gptEndpoint = "YOUR_GPT_API_ENDPOINT";

String generateResponse(String userInput) {
  String generatedText = "";  // stays empty if the request fails
  HTTPClient http;
  http.begin(gptEndpoint);
  http.addHeader("Content-Type", "application/json");
  http.addHeader("Authorization", "Bearer " + gptApiKey);
  // Build the request body
  String requestBody = "{\"prompt\": \"" + userInput + "\", \"max_tokens\": 64}";
  int httpResponseCode = http.POST(requestBody);
  if (httpResponseCode == HTTP_CODE_OK) {
    String payload = http.getString();  // raw JSON response
    // Process the response and extract the generated text into generatedText
    // ...
  }
  http.end();  // release the connection
  return generatedText;
}
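Since the transcribed speech is pasted straight into a JSON string, a quote or line break in the user's words would produce an invalid request body. Here is a minimal sketch of escaping the text first, in standard C++ so it can be tested off the board. The function names are hypothetical, and the body keeps the same "prompt" and "max_tokens" shape used above; a chat-style endpoint would expect a "messages" array instead.

```cpp
#include <cstdio>
#include <string>

// Escapes a string so it can be placed between double quotes in JSON.
std::string escapeJson(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (unsigned char c : text) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (c < 0x20) {
                    // Other control characters become \u00XX escapes.
                    char buf[7];
                    std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(c));
                    out += buf;
                } else {
                    out += static_cast<char>(c);
                }
        }
    }
    return out;
}

// Builds the request body with the user input safely escaped.
std::string buildPromptBody(const std::string& userInput, int maxTokens) {
    return "{\"prompt\": \"" + escapeJson(userInput) +
           "\", \"max_tokens\": " + std::to_string(maxTokens) + "}";
}
```

The same escapeJson helper would also be useful for any alert text sent to the webhook.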
- Output Response through Speaker (Talkie.h)
The code uses the Talkie.h library to convert the generated text response into audible speech output through the speaker.
#include <Talkie.h>

Talkie voice;
const int speakerPin = 25;

void setup() {
  voice.begin(speakerPin);
  // ...
}

void loop() {
  String response = generateResponse(userInput);
  voice.say(response);
  // ...
}
- Send SMS Notification (IFTTT Webhooks)
The code sends an SMS notification using the IFTTT Webhooks trigger URL whenever the last letter of the generated response is 'Y'.
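const String iftttURL = "YOUR_IFTTT_TRIGGER_URL";

void sendSMS() {
  HTTPClient http;
  http.begin(iftttURL);
  // Send the SMS notification
  int httpResponseCode = http.POST("");  // empty body; the trigger URL identifies the event
  // ...
  http.end();  // release the connection
}

void loop() {
  if (response.endsWith("Y")) {
    sendSMS();
  }
  // ...
}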
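Checking only the very last character is fragile: a reply that ends in "Y." or a trailing newline would be missed, and the check says nothing about the letter standing on its own. Below is a minimal sketch of a slightly more tolerant check in standard C++, so it can be tested off the board. The function name is hypothetical, and accepting a final "Yes" as well as "Y" is my assumption, not something the prompt guarantees.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Returns true when the last word of the reply is "Y" or "YES" (any letter case),
// ignoring trailing whitespace and punctuation.
bool replyFlagsRisk(const std::string& reply) {
    std::size_t end = reply.size();
    // Skip trailing characters that are not letters or digits.
    while (end > 0 && !std::isalnum(static_cast<unsigned char>(reply[end - 1]))) {
        --end;
    }
    // Walk back to the start of the last word.
    std::size_t start = end;
    while (start > 0 && std::isalnum(static_cast<unsigned char>(reply[start - 1]))) {
        --start;
    }
    std::string token = reply.substr(start, end - start);
    for (char& c : token) {
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    return token == "Y" || token == "YES";
}
```

Because the whole last word is compared, a reply that merely ends in a word such as "happy" does not trigger an alert.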
These functionalities are integrated into the overall code structure to achieve the desired behaviour of the portable companion robot for mental health interaction.
Setting up alert mechanism using IFTTT
- The first step is to create a free account on IFTTT at https://ifttt.com/
- Click on Create to create the applet. The following screen appears.
- Click on Add and search for Webhooks in the search bar. Click Webhooks. The following screen should appear.
- Click on Receive a web request with a JSON payload. This lets IFTTT receive the JSON payload that the device sends to the webhook URL, which is then used to configure the alert message.
- Give a name to the trigger and click on Create Trigger. On the next screen enter the phone number where the message has to be sent.
- Make sure to write a suitable message to convey meaningful information to the end user. In this project, the message alerts the student's guardian when the bot flags a conversation as a sign of risk.
- There is just one final thing that needs to be done. Open up webhook.site in a browser and get the following screen.
- Copy the unique URL and paste it in the AT command sent out to send HTTP messages to the cloud.
- Observe the SMS functionality in practice.
- Log in to the Arduino Cloud website using your credentials and select the Arduino IoT Cloud tab. https://cloud.arduino.cc/home/
- On the Things section, click on Create and you will be greeted with a screen resembling the one below.
- Specify the name of your 'thing' at the top.
- Proceed to the Devices tab and select Create. The following screen will appear. Enter the target device as Firebeetle ESP32.
- Assign a suitable name to the device. I personally named mine 'Doggo'.
- Verify whether your Device has been successfully created or not.
- Return to the Things section with your thing (Mine is Parcel_Collection_Bot). Click on the Network section in the right panel and then click on the configure button. Enter your Wi-Fi credentials along with the device key that you obtained after creating your device.
- Navigate to the Dashboards section. Generate a new Dashboard. Provide it an appealing name. Then click on Add to acquire a collection of widgets that can be integrated into the dashboard.
The final appearance of my dashboard is depicted below. It consists of a chat widget that the student's guardians can use to keep track of their ward's mental health. Whenever the bot judges that the student may be suicidal, a message is sent on this chat as well as over SMS. This works by attaching a String variable 'guardian_msg' in the code to the chat string buffer.
- Proceed to the Things section to establish connections between variables in the code and the status indicators and inputs on the dashboard. Utilize the Add Variable button to generate and configure a new variable as illustrated.
- At this juncture, go to the Sketch section within the Things tab. You will be astonished to discover that the web IDE automatically generates a starting point for your sketch.
- Revise the code to incorporate the desired functionality.
- (NOTE: Remember to add all dependencies)
- Compile the sketch. Download the Arduino Create Agent desktop application to upload the sketch to the ESP32 board. Alternatively, download the sketch and upload it using the Arduino IDE, PlatformIO, or even ESP-IDF.
Once the code is running on the Firebeetle, interact with the messenger chat window to receive messages from the board in case the student is detected to be suicidal by the bot.
Video Demonstration
Circuit Diagram
The following video shows our beautiful campus. The speciality of our campus is its sheer size - 2100 acres, which makes it the largest in India and probably the second largest in Asia.
Our Institute does provide professional mental health advice, but students are often reluctant to seek help for many reasons. I believe my innovation would do a world of good for such students and improve mental health on campus.
Video Credits: Mahesh Majety
Conclusion
Building a portable companion robot for college students can significantly contribute to tackling the global mental health crisis among this demographic. By providing a tangible companion and utilizing advanced technologies, this project offers a unique way to support students that is both convenient and non-intimidating.
Though I must say I am pretty proud of my creation, things can still be improved. Specifically, I would like to:
- Replace the Speech-to-Text implementation currently running on Google Cloud with a more robust alternative. I have two options in mind: the Whisper API by OpenAI, which I was unable to run, and the Picovoice word recognition engines, which run on the target device itself. I failed to run Picovoice as well, since I faced buffering issues and just could not manage the data flow properly. I would like to spend more time off work on these two APIs.
- Improve the form-factor of the bot. Honestly, the dog is cute but bulky. Maybe, I would put all the electronics inside a soft toy.
- Replace Talkie.h with a more natural-sounding alternative. I am yet to come across anything that is computationally light and gives the same functionality. The problem with the Talkie.h voice is that the output sounds very electronic and unnatural. It is comparable to the alarm time readout on a Nokia 1100 (wondering how much of my audience has actually heard it).
- Investigate other LLMs and how they perform with psychiatric advice prompts. Some leads would be helpful, as I am not the expert here. GPT API was the easiest to handle.
- The ultimate goal is to create my own customised LLM built exclusively for doling out psychiatric advice. A long haul, but definitely possible.
Any other suggestions you have for me? Let me know in the comments below and I am in for a discussion.
Want to join the project? Open to collaboration as well.