Last year, overwhelmed by the number of times my doorbell interrupted me, I decided to modify it so that ChatGPT could handle visitors directly. That project, called RinGPT, could engage in endless conversations with visitors while notifying me remotely. RinGPT also detected keywords and provided predefined responses.
This year, after working with LLMs, I discovered the concept of AI agents and found it interesting to upgrade RinGPT into a smart receptionist. To test its functionality, I built a small-scale model using a few Legos and a prototype setup.
What is an AI Agent?An AI agent is an autonomous entity that perceives its environment through sensors, processes information, and takes actions to achieve specific goals. It can be software-based (e.g., a chatbot or recommendation system) or integrated into hardware (e.g., a robot or self-driving car).
In this project, the AI agent has access to a set of tools and can autonomously decide which ones to use and in what order, based on its objective.
Project OverviewThis is an experimental prototype. Although it uses AI agents, it does not rely on a dedicated agent framework. Instead, it is built with plain Python and multiple API calls to OpenAI. Despite its simplicity, the project involves a significant amount of code and stands out for interacting with both software and physical components, such as sensor readings and door control.
HardwareThe project is built on a Unihiker board. For those unfamiliar, Unihiker is similar to a Raspberry Pi but comes with a touchscreen, integrated sensors, buttons, and a preinstalled operating system.
To enhance functionality, I connected:
- An optional IO board
- A retained LED push button
- An SG90 servo motor
- A Bluetooth speaker (paired for audio output)Since the Unihiker has a built-in microphone, no additional audio hardware was needed. I also built a small cardboard model of an entrance door for testing.
Software Setup
To configure Unihiker’s WiFi access:
Connect it to a computer via USB.
Connect it to a computer via USB.Access 10.1.2.3 to configure the 2.4GHz WiFi network.
Once connected, SSH into the assigned IP using:
ssh root@<assigned_ip>
password: dfrobot
Required LibrariesInstall the necessary Python libraries using:
pip install openai speech_recognition edge_tts art asyncio textwrap
API Keys & CredentialsYou need to obtain:
An OpenAI API Key from OpenAI PlatformA Telegram Chat ID and Token (this process can be tedious, but tutorials are widely available)Bluetooth Speaker Setup
Pair the Bluetooth speaker once using:
bluetoothctl
default-agent
power on
scan on
trust 00:00:00:00:00:00
pair 00:00:00:00:00:00
connect 00:00:00:00:00:00
Replace 00:00:00:00:00:00
with the speaker’s MAC address.
Button Pressed → The system plays the doorbell sound and greets the visitor.
Records Visitor’s Speech → Converts speech to text using SpeechRecognition
Identifies Visitor Name → Calls OpenAI API with function calling to extract the visitor’s name.
Selects Appropriate Tools → Calls OpenAI again to decide which tools to use.
Executes Selected Tools → Runs functions for checking schedules, light levels, etc.
Determines Final Action → Calls OpenAI again to decide whether to open the door or send a Telegram notification.
Generates Response → Uses ChatGPT to generate a final response.
The AI agent has access to the following tools:
def agenda(name):
def getDayTime():
def getLightConditions():
Additional functions, like unlocking the door or sending a Telegram notification, are handled within the program's flow but are not yet directly assigned to the agent (though they could be in future versions).
The agenda function is currently hardcoded but could easily integrate Google Calendar.
The getLightConditions function is somewhat arbitrary but was included to make use of Unihiker’s light sensor.
LLM Calls & ProcessingNot all calls to OpenAI are the same.
First Call:Extracts the visitor’s name, regardless of how they phrase it, using OpenAI function calling.
Subsequent Calls: Some responses are in JSON format, while others are plain text.
Running the ScriptTo launch RinGPT, run:
python ringpt1.py
To set it to auto-start, follow the instructions here.
LoggingSince RinGPT operates autonomously, a log records all actions and interactions for monitoring purposes.
Demo
Conclusions
Unlike traditional heuristic programming, where actions are predictable, LLMs introduce an element of uncertainty. During testing, the same prompt and code sometimes yielded unexpected results, such as:
Providing extra information in a JSON response and refusing to choose tools due to “uncertainty.” After refining the prompts, I was able to get consistent and reliable results.
Beyond this simple example, the potential of AI agents is huge, especially with models that support fine-tuning and retrieval-augmented generation (RAG).
The real question is: How much autonomy are we willing to give our AI agents?
Comments
Please log in or sign up to comment.