It's great to be back sharing a project here. It's been too long.
So, my company just got a free ESP-EYE development board from Espressif (the maker of the ESP32 and ESP8266 chips) itself. It's officially developed by Espressif as a way to quickly get started building image recognition or audio processing applications.
Espressif also develops a sample application and libraries that make the most of the board, namely esp-who. To me, it's an awesome project that shows how to do speech recognition and face recognition, all done at the edge, on the board itself, not in the cloud.
While esp-who is great for making use of the board and embracing edge intelligence, I want to do something else. As a Microsoft Most Valuable Professional (MVP) of Microsoft Azure (my profile), I want to make use of Azure services, specifically Azure Custom Vision, as a cloud-based object detection and classification engine. That is exactly why I created this repo.
I did a live coding session to show, step by step, how to develop the firmware from scratch and how to set up Azure Custom Vision. There are four videos, and they are indeed long ones, 5 hours in total. If you're keen to know the details, go to this YouTube playlist.
To see what we can do with the project, I made a video that shows you how we can do "live" face recognition:
Supported Boards

This project has been tested using the following boards:
- ESP-EYE
- ESP-WROVER-KIT
- TTGO T-Camera
- You should be able to use it with another camera-equipped board. Just adapt the code yourself; PRs are always welcome.
This image shows the architecture of the project:
- Clone the project's repo, recursively:
  git clone --recursive https://github.com/andriyadi/esp32-custom-vision.git
- If you cloned the project without the --recursive flag, go to the esp32-custom-vision directory and run this command to update the submodules it depends on:
  git submodule update --init --recursive
- Create a secrets.h file inside the main folder, as explained below.
- On Terminal/Console, in the root folder, run make menuconfig. Go to App Configuration --> Select Camera Dev Board (ESP-EYE). Here you can select the development board: ESP-EYE, ESP-WROVER-KIT, or TTGO T-Camera. Exit and save the menuconfig.
- Still in the root folder, build first to catch any issues by typing make -j8. Fingers crossed :)
- If all's good, flash the firmware to the board by typing make flash monitor on the Console.
Under the main folder, create a new file named secrets.h with the content:
#ifndef MAIN_SECRETS_H_
#define MAIN_SECRETS_H_
#define SSID_NAME "[YOUR_OWN_WIFI_SSID_NAME]"
#define SSID_PASS "[YOUR_OWN_WIFI_SSID_PASSWORD]"
// Azure Custom Vision-related settings
#define AZURE_CV_PREDICTION_KEY "[YOUR_OWN_AZURE_CUSTOM_VISION_PREDICTION_KEY]"
#define AZURE_CV_HOST "[YOUR_OWN_AZURE_CUSTOM_VISION_HOST]"
#define AZURE_CV_PROJECT_ID "[YOUR_OWN_AZURE_CUSTOM_VISION_PROJECT_ID]"
#define AZURE_CV_ITERATION_ID "[YOUR_OWN_AZURE_CUSTOM_VISION_ITERATION_ID]"
#endif /* MAIN_SECRETS_H_ */
Replace every [...] placeholder inside the quotes with your own value.
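It's easy to forget one of those placeholders, so here is a minimal sketch of a sanity check you could run at startup. It is not part of the project: the function name setting_is_placeholder is hypothetical, and the check simply assumes that an unreplaced value still starts with [ and ends with ], as in the template above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (not in the repo): returns true when a setting
 * still looks like the "[...]" template value from secrets.h. */
bool setting_is_placeholder(const char *value)
{
    assert(value != NULL);
    size_t len = strlen(value);
    /* A placeholder is at least "[]" and is wrapped in square brackets. */
    return len >= 2 && value[0] == '[' && value[len - 1] == ']';
}
```

You could call it on SSID_NAME, AZURE_CV_PREDICTION_KEY and the other macros before connecting to WiFi, and log an error if any of them still returns true.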
Azure Custom Vision

Obviously, you need access to Azure Custom Vision to make this project work. You can try it for free at customvision.ai. If you already have a Microsoft Azure account, you're good to go.
In the live coding videos mentioned above, I explained and showed how to get started with Azure Custom Vision. Watch this video.
Determine Azure Custom Vision Settings

AZURE_CV_PREDICTION_KEY can be determined by clicking "Prediction URL" in the "Performance" tab, which displays this dialog:

You can see there's a Prediction-Key value. Use it.
Still in the dialog above, you'll find a URL like: https://southcentralus.api.cognitive.microsoft.com/customvision/v2.0/Prediction/28bdc115-xxxx-48e5-xxxx-0f627d67137d/image?iterationId=13ebb90a-xxxx-453b-xxxx-3586788451df. From the URL, you can determine:
- AZURE_CV_HOST = southcentralus.api.cognitive.microsoft.com
- AZURE_CV_PROJECT_ID = 28bdc115-xxxx-48e5-xxxx-0f627d67137d
- AZURE_CV_ITERATION_ID = 13ebb90a-xxxx-453b-xxxx-3586788451df
Note that AZURE_CV_ITERATION_ID is quite important, as you can switch between training iterations just by setting a different iteration ID.
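If you'd rather not pick the three values out by eye, the mapping from the Prediction URL to the settings can be sketched in plain C. This is only an illustration under assumptions: cv_settings_t and cv_parse_prediction_url are hypothetical names that don't exist in the repo, and the pattern assumes the v2.0 URL layout shown above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the three values derived from the URL. */
typedef struct {
    char host[128];
    char project_id[64];
    char iteration_id[64];
} cv_settings_t;

/* Splits a Prediction URL of the form
 *   https://<host>/customvision/v2.0/Prediction/<project>/image?iterationId=<iteration>
 * Returns 0 on success, -1 if the URL does not match that layout. */
int cv_parse_prediction_url(const char *url, cv_settings_t *out)
{
    assert(url != NULL && out != NULL);
    int matched = sscanf(url,
        "https://%127[^/]/customvision/v2.0/Prediction/%63[^/]"
        "/image?iterationId=%63[^&# \t\r\n]",
        out->host, out->project_id, out->iteration_id);
    return matched == 3 ? 0 : -1;
}
```

The field widths in the format string (127 and 63) match the buffer sizes minus the terminating NUL, so an unexpectedly long URL gets truncated instead of overflowing the struct.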
After a successful build and flashing the firmware to the board, you'll see the firmware run and print its logs on Terminal/Console, eventually showing these lines:
I (2870) DXWIFI: SYSTEM_EVENT_STA_CONNECTED. Station: 44:79:57:61:72:65 join, AID: 45
I (6130) event: sta ip: 192.168.0.20, mask: 255.255.255.0, gw: 192.168.0.1
I (6130) DXWIFI: SYSREM_EVENT_STA_GOT_IP. IP Address: 192.168.0.20
I (6130) DXWIFI: WiFi connected
I (6130) APP: Starting web server on port: '80'
Notice the log message: IP Address: 192.168.0.20. It's the IP address of the board once it's connected to the specified WiFi access point. It will be different on your network.
How are we going to do the face recognition and see the result? For this project, I'll use a web browser, so the firmware needs to start an HTTP server to serve web page requests.
Now, open your favourite web browser and type http://[BOARD_IP_ADDRESS], where [BOARD_IP_ADDRESS] is the IP address you got above. You should see the hello text.
Now, type the URL http://[BOARD_IP_ADDRESS]/capture. You should see the image captured by the board's camera in the browser.
Then, type the URL http://[BOARD_IP_ADDRESS]/recog. The board will capture an image, send it to Azure Custom Vision for inference, then show the detected face in the browser, as in this image:
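To give a feel for the "send the image to Azure Custom Vision" step, here is a rough sketch of the HTTP request head such a call needs. It is not the firmware's actual code, which would also need a TLS-capable HTTP client on the board. The function name cv_build_request_head is hypothetical. The path follows the Prediction URL shown earlier, the Prediction-Key header is the one from the "Prediction URL" dialog, and the application/octet-stream content type assumes the raw JPEG bytes are sent as the request body.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats the head of the prediction POST request.
 * The image bytes (image_len of them) would follow as the body.
 * Returns the number of characters written, or -1 if buf is too small. */
int cv_build_request_head(char *buf, size_t buf_len,
                          const char *host, const char *project_id,
                          const char *iteration_id, const char *prediction_key,
                          size_t image_len)
{
    assert(buf != NULL && buf_len > 0);
    int n = snprintf(buf, buf_len,
        "POST /customvision/v2.0/Prediction/%s/image?iterationId=%s HTTP/1.1\r\n"
        "Host: %s\r\n"
        "Prediction-Key: %s\r\n"
        "Content-Type: application/octet-stream\r\n"
        "Content-Length: %zu\r\n"
        "Connection: close\r\n"
        "\r\n",
        project_id, iteration_id, host, prediction_key, image_len);
    /* snprintf reports the length it wanted; treat truncation as an error. */
    return (n < 0 || (size_t)n >= buf_len) ? -1 : n;
}
```

In the firmware you would pass AZURE_CV_HOST, AZURE_CV_PROJECT_ID, AZURE_CV_ITERATION_ID and AZURE_CV_PREDICTION_KEY from secrets.h, plus the size of the captured JPEG frame.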
To show live video streaming in the browser and do live recognition, you can use the http://[BOARD_IP_ADDRESS]/stream URL. The demo video is the one mentioned above; you can watch it here.
I recently pushed some commits to add support for the TTGO T-Camera board. Among other things, it has the following features:
- OLED display with SSD1306
- Passive infrared (PIR) sensor: AS312
- Camera: OV2640, similar to ESP-EYE
And here's the result. Detected face info is now displayed on the OLED display.
That's it for now. Enjoy!