One good thing about problems is that they are a source of ideas. In my case, a classic wired doorbell does not work as it should: sometimes it does not sound at all, and when it does, it cannot be heard from certain rooms. So I thought about replacing the whole thing with a custom doorbell built with an ESP32, a push button, WiFi notifications and two or three buzzers or speakers.
Then I thought I could take pictures of the visitors so I don’t have to rely on their voices. I could even get rid of the push button: Machine Learning makes it possible to detect a face in front of the door.
I recently received a Xiao ESP32S3, a tiny yet complete board from Seeed Studio that is perfect for this job: it comes with a good camera, WiFi and a processor powerful enough for Machine Learning.
Parts required
- Seeed Studio Xiao ESP32S3 Sense with camera $13.99
- 8GB microSD Card $3
- Snap the antenna to the board
- Snap the camera to the board
- Format the microSD card as FAT32 and insert it in the microSD slot
Software configuration
In the Arduino IDE, go to File > Preferences > Additional Boards Manager URLs and add:
https://raw.githubusercontent.com/espressif/arduino-esp32/gh-pages/package_esp32_dev_index.json
Go to Tools > Board > Boards Manager, search for esp32, then select and install the esp32 package.
Connect the board with a USB-C cable and select XIAO ESP32S3 as the board. Also select OPI PSRAM in the Tools menu.
Install the UniversalTelegramBot library through Sketch > Include Library > Manage Libraries.
Download the .ino file and the Machine Learning model. Go to Sketch > Include Library > Add .ZIP Library and select the model's zip file.
Open
Arduino/Documents/libraries/Face_detection_inferencing/src/edge-impulse-sdk/classifier/ei_classifier_config.h
Locate the line with #define EI_CLASSIFIER_TFLITE_ENABLE_ESP_NN 1, and change it from 1 to 0:
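After the edit, the line should read as shown here. Only this value changes; the rest of the file stays untouched.

```cpp
// ei_classifier_config.h: ESP-NN kernels disabled (value changed from 1 to 0)
#define EI_CLASSIFIER_TFLITE_ENABLE_ESP_NN 0
```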
Thanks to Marcelo Rovai for the details about running Edge Impulse on the ESP32S3 Sense.
Model training
How can we recognize one or several faces against a background or other objects? Edge Impulse has an algorithm named FOMO that is perfect for this purpose: “Edge Impulse FOMO (Faster Objects, More Objects) is a novel machine learning algorithm that brings object detection to highly constrained devices. It lets you count objects, find the location of objects in an image, and track multiple objects in real-time.”
There is no need to take your own pictures since there are many face datasets available. I just cloned an existing Edge Impulse project with a faces dataset and that was enough. If you want to, you can take your own pictures and train a new model with these instructions:
- Take around 400 pictures of faces.
- Draw a bounding box around each face and label it as “face”
- Create an Impulse with 96x96 px image data and an Image processing block.
- Add Object Detection as the learning block.
- Try 70 training cycles with 0.00015 learning rate.
- Finally, deploy the model as an Arduino library and save the Zip file.
Enter your WiFi credentials
#define WIFI_SSID ""
#define WIFI_PASSWORD ""
- In the Telegram app, search for BotFather, then send /start and /newbot
- Go to t.me/sandoombot and get your token
- Search for idBot in the Telegram app
- Send /getId
- Find the bot you created through its link and send it a message
- Create a group and add the bot
- Load https://api.telegram.org/botXXXXXXX:YYYYYYY/getUpdates
- Extract the ID, like id: -1234567890
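If you would rather not read the ID by eye, the getUpdates response can also be searched programmatically. This is only a minimal sketch: extractChatId is a hypothetical helper that is not part of the original .ino file, and it assumes the compact JSON that the Telegram Bot API returns, where the chat object starts with its id field. A real JSON parser would be more robust.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not in the original sketch): finds the first
// "chat":{"id":<number> pair in a getUpdates response and stores the
// number in chatId. Returns false when no chat id is found.
// Assumption: compact JSON with "id" as the first field of "chat".
bool extractChatId(const std::string& json, long long& chatId) {
    const std::string key = "\"chat\":{\"id\":";
    const std::string::size_type pos = json.find(key);
    if (pos == std::string::npos) {
        return false;
    }
    const char* start = json.c_str() + pos + key.size();
    char* end = nullptr;
    const long long value = std::strtoll(start, &end, 10);
    if (end == start) {  // no digits followed the key
        return false;
    }
    chatId = value;
    return true;
}
```

Group IDs are negative numbers, so the value is kept in a signed type.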
Enter the token and the chat ID in these settings
#define BOT_TOKEN ""
String chat_id="";
Upload the .ino sketch to the Xiao ESP32S3 Sense board.
How does it work
- The Xiao ESP32S3 Sense takes a picture
- The picture is passed to the Edge Impulse Machine Learning model for inference
- If at least one face is detected by the FOMO algorithm, the on-board LED is turned on, the image is saved to the microSD card and a notification is sent to the family Telegram group with the picture attached and the detection %.
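The decision in the last step can be sketched in plain C++. This is not the code from the .ino file: Detection is a hypothetical stand-in for one FOMO result (only a label and a confidence value), and the 0.6 threshold and the caption wording in the usage note are assumptions chosen for illustration.

```cpp
#include <cmath>
#include <string>
#include <vector>

// Hypothetical stand-in for one FOMO result; only the fields used here.
struct Detection {
    std::string label;  // class name, e.g. "face"
    float value;        // confidence between 0.0 and 1.0
};

// Returns the highest confidence among "face" detections at or above
// minConfidence, or a negative value when no face qualifies.
float bestFaceConfidence(const std::vector<Detection>& detections,
                         float minConfidence) {
    float best = -1.0f;
    for (const Detection& d : detections) {
        if (d.label == "face" && d.value >= minConfidence && d.value > best) {
            best = d.value;
        }
    }
    return best;
}

// Builds the text sent along with the picture, with the confidence
// rounded to a whole percentage.
std::string makeCaption(float confidence) {
    const int percent = static_cast<int>(std::lround(confidence * 100.0f));
    return "Face detected: " + std::to_string(percent) + "%";
}
```

In a loop, a non-negative result from bestFaceConfidence(detections, 0.6f) would be the trigger to turn on the LED, save the image and send makeCaption(...) with the photo.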
I have a Wio Terminal and I wanted to use it as a screen for this doorbell, so that you can check who is at the door even without a smartphone. That was not possible. Why? I really like Seeed products, but some of their hardware is released without tested software or accurate documentation. I have experienced this with the Grove AI Camera and now with the Wio Terminal. To use some basic features, the user has to run complicated firmware updates. For example, out of the box the Wio Terminal cannot use WiFi or Bluetooth. On Windows, the procedure to update the device and enable those features is the following:
- Download https://github.com/Seeed-Studio/ambd_flash_tool
- Navigate through folders that are not correctly described in the documentation (it says “While you are inside the ambd_flash_tool directory”, but there is an ambd_flash_tool-master folder and an ambd_flash_tool-master/tool folder)
- Run: ambd_flash_tool.exe erase
- Run: ambd_flash_tool.exe flash
At this point the Wio Terminal shows “Burn RTL8720 fw” and it is not clear what to do next. Some external pages say that you should press the upper button until “USB to serial” is displayed, then upload the following code to check the firmware version. The code does not work. WiFi still does not work. Bluetooth still does not work :(
#include "rpcWiFi.h"

void setup() {
  Serial.begin(115200);
  while (!Serial); // Wait for the Serial Monitor to open
  Serial.printf("RTL8720 Firmware Version: %s", rpc_system_version());
}

void loop() {
}
Where to go from here
The most complicated parts are solved, and they were not really that complicated. A tiny device assisted by Machine Learning can replace a classic doorbell while also adding useful features.
There is no bell yet, but that can be added easily. Locally, a relay or a buzzer can be connected to the same ESP32S3. Remotely, one or several ESP32 boards can read the same Telegram group and activate a relay with a loudspeaker. Another option is to communicate through Bluetooth.
Other enhancements: counting how many people are in front of the door, recognizing family and friends, and playing an MP3 telling the visitor that the door will soon be opened.
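Counting people is probably the easiest of these, since FOMO already reports one detection per object. A minimal sketch, assuming the results have been copied into a hypothetical Detection struct (label plus confidence) and that 0.5 is an acceptable threshold; neither comes from the original sketch.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for one FOMO result.
struct Detection {
    std::string label;  // class name, e.g. "face"
    float value;        // confidence between 0.0 and 1.0
};

// Counts the detections labelled "face" with confidence >= minConfidence.
int countFaces(const std::vector<Detection>& detections, float minConfidence) {
    return static_cast<int>(std::count_if(
        detections.begin(), detections.end(),
        [minConfidence](const Detection& d) {
            return d.label == "face" && d.value >= minConfidence;
        }));
}
```

The count could then be added to the Telegram message, for example when countFaces(detections, 0.5f) is greater than one.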
View also
Machine Learning projects with Arduino, ESP and Raspberry