Pursuing a computer science major can sometimes feel isolating, with long hours spent coding. After graduation, many students enter software engineering or other tech-related careers, often involving extended hours of screen time. To make these experiences more enjoyable, we created Magical Pet—a delightful turtle companion that interacts with your emotions in a fun and engaging way.
Magical Pet features both audio and visual modes, enabling it to interpret your emotions through sound or video. Having this charming companion on your desk adds a cheerful presence to your coding sessions, brightening your day while supporting your emotional well-being.
Related Products, Work, and Inspirations
We drew inspiration from the Keepon toy (https://beatbots.net/my-keepon), a small, adorable device that moves in response to beats and touch. Similarly, we envisioned creating a toy that responds to emotions through movement, but with additional features. For example, our toy leverages OpenAI’s Large Language Models to listen to what you say and respond with a voice, adding a more interactive and personalized experience.
Flow Chart of System Design
We wanted to create a robot turtle that serves as an emotional companion, responding to user emotions with movements, colors, sounds, and small surprises. The plan was to use video input for facial expression analysis and audio input for sentiment detection, so the turtle could identify emotions and react empathetically: for example, glowing cheerful colors and revealing playful tokens when the user is happy, or emitting red light and offering tissues when the user is sad. The system would combine OpenCV, sentiment analysis tools, servo motors, RGB LEDs, speakers, and motorized compartments for dynamic, thoughtful responses.
Summary of Milestone 2
We developed a 3D-printed, interactive box with two lids that open or close based on the user's detected emotions. The lids are connected to the box using chopsticks, allowing them to rotate up and down, and are operated by a single servo motor. This motor rotates 60 degrees in either direction, acting as a seesaw to open the appropriate lid. Rubber bands and strings ensure the lids close securely when not in use.
When the application starts, users can choose between two modes for emotion detection: video mode or audio mode.
Video Mode: The laptop's camera uses the FER (Facial Emotion Recognition) library to analyze facial expressions and classify emotions into six categories: fear, neutral, happy, sad, anger, and disgust. Positive emotions (e.g., happiness) open the treat lid, accompanied by a green LED light, while negative emotions (e.g., fear, sadness, anger, disgust) open the tissue lid, with a red LED light. Neutral emotions keep the box closed, with no lights activated.
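Below is a minimal sketch of how the video-mode classification could look with the FER and OpenCV packages. The positive/negative groupings follow the categories listed above, but the function names and structure are illustrative assumptions, not the project's exact code.

```python
# Hedged sketch of video-mode emotion classification (pip install fer opencv-python).
import cv2
from fer import FER

POSITIVE = {"happy"}
NEGATIVE = {"fear", "sad", "angry", "disgust"}  # FER uses the label "angry"

def classify_frame(detector, frame):
    """Return 'positive', 'negative', or 'neutral' for one webcam frame."""
    emotion, _score = detector.top_emotion(frame)  # e.g. ('happy', 0.93)
    if emotion in POSITIVE:
        return "positive"
    if emotion in NEGATIVE:
        return "negative"
    return "neutral"  # includes frames where no face is found

if __name__ == "__main__":
    detector = FER(mtcnn=True)      # MTCNN face detector for better accuracy
    cap = cv2.VideoCapture(0)       # laptop or wired webcam
    ok, frame = cap.read()
    if ok:
        print(classify_frame(detector, frame))
    cap.release()
```

A "positive" result would trigger the treat lid and green LED, "negative" the tissue lid and red LED, and "neutral" leaves the box closed.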
Audio Mode: The laptop's microphone records a 3-second audio clip, which is transcribed and analyzed using a large language model (LLM) to detect positive or negative sentiment. Positive sentiment opens the treat lid, and negative sentiment opens the tissue lid. Additionally, the LLM acts as a virtual counselor, generating supportive responses based on the user's detected emotions. These responses are saved as audio files and can be played back.
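The audio pipeline could be sketched as follows. We assume the sounddevice and soundfile packages for recording and the openai Python SDK (v1.x) for transcription and sentiment; the specific model names and recording library are assumptions rather than the project's exact configuration.

```python
# Hedged sketch of the audio-mode pipeline: record, transcribe, classify sentiment.
import sounddevice as sd
import soundfile as sf
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def record_clip(path="clip.wav", seconds=3, rate=16000):
    """Record a short clip from the default (Bluetooth) microphone."""
    audio = sd.rec(int(seconds * rate), samplerate=rate, channels=1, dtype="float32")
    sd.wait()
    sf.write(path, audio, rate)
    return path

def detect_sentiment(path):
    """Transcribe the clip and ask the LLM for a positive/negative label."""
    with open(path, "rb") as f:
        transcript = client.audio.transcriptions.create(model="whisper-1", file=f)
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": "Answer with exactly one word: positive or negative."},
            {"role": "user", "content": transcript.text},
        ],
    )
    return reply.choices[0].message.content.strip().lower()
```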
The entire application runs within a Python virtual environment, ensuring all dependencies are properly installed and managed.
We refined the project by integrating all components into a turtle plushie, creating a more user-friendly and visually appealing design. To achieve this, we made precise cuts into the plushie to embed the hardware while leaving flaps that allow access to the internal setup without compromising the turtle’s aesthetic.

We also added push buttons to control whether the turtle runs video or audio mode: red for video and green for audio. This centralizes all control on the turtle itself, rather than requiring "audio" or "video" to be passed as an argument to the run command in the terminal. Pressing the red or green button kills the previous process and starts the new video or audio mode process.

Our original microphone and speaker setup gave us problems, so we switched to a Bluetooth speaker and microphone for improved reliability. The camera is a wired webcam discreetly installed within the plushie. We also designed and 3D-printed a bottom box to house the wires and a single ESP32 microcontroller; after several iterations to get the dimensions right, the box fit inside the turtle's stomach area.

We equipped the turtle with wheels powered by two gear motors, enabling forward and backward movement, and fitted the back wheels with rubber bands to improve traction. RGBW LEDs controlled by the ESP32 display a rainbow pattern for happy emotions and red for negative ones, further enhancing the turtle's expressiveness and interactivity.
Structural and Hardware Components
A servo motor is used to control the movement of the interactive lids. This motor rotates up to 60 degrees in either direction, allowing the lids to open and close based on the detected emotions.
When the application starts, the user presses the red button for video mode or the green button for audio mode. If the turtle is already running video mode, pressing the green button kills the video-mode process and starts audio mode (and vice versa), as in the sketch below.
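One way to implement this switching is a small supervisor on the laptop. Here we assume the ESP32 reports button presses over USB serial as single characters ('V' for the red/video button, 'A' for the green/audio button); the port name, characters, and entry-point script are hypothetical.

```python
# Hedged sketch of button-driven mode switching via subprocess and PySerial.
import subprocess
import serial

PORT = "/dev/ttyUSB0"     # adjust per machine (e.g. COM3 on Windows)
MODE_SCRIPT = "main.py"   # hypothetical entry point taking 'video' or 'audio'

def supervise():
    link = serial.Serial(PORT, 115200, timeout=1)
    current = None
    while True:
        byte = link.read(1).decode(errors="ignore")
        mode = {"V": "video", "A": "audio"}.get(byte)
        if mode is None:
            continue
        if current is not None:
            current.terminate()      # kill the previous mode's process
            current.wait()
        current = subprocess.Popen(["python", MODE_SCRIPT, mode])

if __name__ == "__main__":
    supervise()
```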
To give the turtle mobility, two gear motors are integrated into the design to power its wheels, moving the turtle forward or backward. The motors are controlled via the ESP32, allowing easy integration and communication with the other components.
The ESP32 microcontroller is the central control unit for the project, coordinating all the components. It controls the servo motor for lid movements, the gear motors for the turtle’s wheels, and manages the RGBW LEDs that display mood-based colors.
RGBW LEDs are embedded into the turtle's shell and serve as a visual representation of the turtle's emotional state. These LEDs change colors to match the detected emotion—turning rainbow-colored when a happy emotion is detected and red for negative emotions.
Due to initial issues with the hardware microphone and speaker setup, we switched to a Bluetooth speaker and microphone for improved sound quality and reliability. The wired webcam is used for capturing the user’s facial expressions, enabling emotion detection via facial recognition.
We utilize the Python Facial Emotion Recognition (FER) library to analyze facial expressions and detect whether the user is experiencing positive or negative emotions. This allows the turtle to respond accordingly, based on the user's emotional state.
To bridge the communication between the Python application and the hardware, we use PlatformIO and PySerial. These tools enable seamless integration between the C++ Arduino code and the Python software, allowing signals to be passed back and forth for controlling the servo motors, LEDs, and other components.
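Below is an illustrative sketch of that bridge on the Python side. The single-character command protocol ('T' treat lid, 'S' tissue lid, 'R' rainbow LEDs, 'E' red LEDs, 'F'/'B' wheel motion) is our own example, not the project's actual protocol; the Arduino-side firmware would parse whatever commands are really used.

```python
# Hedged sketch of sending emotion-driven commands to the ESP32 over serial.
import serial

def send_emotion(link: serial.Serial, sentiment: str) -> None:
    """Translate a sentiment label into lid, LED, and wheel commands."""
    if sentiment == "positive":
        commands = b"TRF"   # open treat lid, rainbow LEDs, roll forward
    elif sentiment == "negative":
        commands = b"SEB"   # open tissue lid, red LEDs, roll backward
    else:
        return              # neutral: keep the box closed, lights off
    link.write(commands)
    link.flush()

# Usage (port name is machine-specific):
# link = serial.Serial("/dev/ttyUSB0", 115200, timeout=1)
# send_emotion(link, "positive")
```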
We leveraged OpenAI’s API and a large language model to process audio input from the user. Acting as a counselor, the model responds with words of encouragement based on the detected sentiment. This interaction adds a layer of emotional support, where the turtle provides comforting and uplifting feedback to the user.
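A minimal sketch of the counselor step is shown below, assuming the openai SDK's chat and text-to-speech endpoints; the model names, voice, and helper function are assumptions rather than the project's exact configuration.

```python
# Hedged sketch of generating and saving an encouraging spoken reply.
from openai import OpenAI

client = OpenAI()

def counsel_and_speak(transcript: str, out_path: str = "reply.mp3") -> str:
    """Generate an encouraging reply to the user's words and save it as audio."""
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": "You are a gentle counselor. Respond in one or two "
                        "encouraging sentences."},
            {"role": "user", "content": transcript},
        ],
    )
    text = reply.choices[0].message.content
    speech = client.audio.speech.create(model="tts-1", voice="alloy", input=text)
    speech.write_to_file(out_path)   # the saved file can be played on the Bluetooth speaker
    return out_path
```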