I started the Seeed Vision Challenge wanting to build an AI camera for our hummingbird feeder so that I could capture pictures to share with my grandson. The Grove Vision AI kit awarded for the Vision Challenge includes the Grove Vision AI V2, an RPi OV5647-62 FOV camera, and a Xiao ESP32C3. Initially, I did not see an example of transferring images from the Vision AI to the ESP32C3; I only saw examples that passed inference results via the UART. Therefore, I decided to swap a Xiao ESP32S3 Sense module for the ESP32C3. That would provide another camera and SD card storage in addition to WiFi capability. My plan was to use the Vision AI module for inferencing and the ESP32S3 for image capture and storage when an appropriate object was detected. The ESP32S3 would also be a web server for the captured images.
After I had built my hardware setup, I discovered the Seeed SSCMA examples (https://github.com/Seeed-Studio/Seeed_Arduino_SSCMA) that send hex encoded image data via the common serial interfaces (UART, I2C, SPI). The camera_web_server example did exactly what I wanted, so I decided to switch to using that program. The advantage of using a single camera is that I avoid the parallax error (image offset) that must be compensated for when using two cameras. Parallax is mainly a problem at close distances.
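To show the pattern, here is a minimal sketch of how those SSCMA examples drive the Vision AI from the Xiao, based on the library's examples as I understand them (exact signatures may differ across library versions):

```cpp
#include <Seeed_Arduino_SSCMA.h>

SSCMA AI;  // Vision AI V2, on the default I2C (Wire) connection

void setup()
{
    Serial.begin(115200);
    AI.begin();  // a HardwareSerial can be passed instead for the UART interface
}

void loop()
{
    // invoke(times, filter, show): show = true asks the module to send the
    // encoded JPEG frame back along with the inference results
    if (!AI.invoke(1, false, true))
    {
        for (int i = 0; i < (int)AI.boxes().size(); i++)
        {
            Serial.printf("box[%d] target=%d score=%d x=%d y=%d w=%d h=%d\n",
                          i, AI.boxes()[i].target, AI.boxes()[i].score,
                          AI.boxes()[i].x, AI.boxes()[i].y,
                          AI.boxes()[i].w, AI.boxes()[i].h);
        }
        Serial.printf("image: %u encoded bytes\n", AI.last_image().length());
    }
}
```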
I still have the dual camera setup, and since the ESP32S3 is also capable of running AI models, I could possibly have each device running a different model. I'm not sure how well the ESP32S3 would handle that in addition to providing the interface for the Vision AI.
For this project I will use the camera web server example with the Vision AI camera. First I will need to get the hardware working with a ready-to-use model from SenseCraft AI; then I'll need to develop and deploy a custom hummingbird model.
Hardware development

Here are the two cameras and boards connected together. The ESP32S3 plugs into the Vision AI V2 via pre-mounted female headers. Its camera is part of the Sense option and plugs on top. The Vision AI camera attaches via the CSI connector. The camera came with the 10cm FFC cable that is shown; I replaced it later with a 5cm one. The WiFi antenna is plugged onto the ESP32S3.
Here is the assembly in the housing that I printed. This is the front view without the cover.
The back view.
The camera assembly with the cover is in the center. The 2000mAh power bank is on the right. On the left is a 3D printed bracket to hold the battery and attach the assembly to the hummingbird feeder pole.
The assembly from the front and side.
Mounted on a tripod for testing.
Mounted on the hummingbird feeder pole.
Seeed has a nice wiki section for the web server example.
I went to the SenseCraft AI site, connected the Vision AI, and uploaded the Face Detection model; the result showed up immediately in the preview window.
Then I downloaded and installed the SSCMA zip library in the Arduino IDE, modified the camera_web_server example to add my WiFi info, and uploaded the program to the ESP32S3. The program prints the IP address of the web server to the Serial Monitor, so you just need to open that in your browser and hit the Start Stream button. Everything worked great, and I also verified that I could save frames.
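For reference, the credential lines are the only edits needed; the relevant part of the sketch looks roughly like this (a trimmed paraphrase of the stock example, with placeholder SSID and password):

```cpp
#include <WiFi.h>

// Fill in your own network - these two lines are the only required edits
const char *ssid     = "YOUR_SSID";      // placeholder
const char *password = "YOUR_PASSWORD";  // placeholder

void setup()
{
    Serial.begin(115200);
    // ... camera / SSCMA init from the stock example ...
    WiFi.begin(ssid, password);
    while (WiFi.status() != WL_CONNECTED)
    {
        delay(500);
        Serial.print(".");
    }
    startCameraServer();  // HTTP endpoints, defined in the example's app_httpd.cpp
    Serial.print("Camera Ready! Use 'http://");
    Serial.print(WiFi.localIP());
    Serial.println("' to connect");
}
```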
I found a project on Hackster - Computer Vision at the Edge with Grove Vision AI Module V2 by mjrobot - that describes creating a custom model on Edge Impulse and deploying it to the Vision AI V2. Just what I needed.
I also found a dataset on Kaggle - BIRDS 525 SPECIES- IMAGE CLASSIFICATION - with labeled data that included 4 species of hummingbirds. I downloaded the dataset (it took an eternity) and extracted just the hummingbird data.
I created a new project on Edge Impulse.
And uploaded the dataset using the Data acquisition tab.
These are the 4 classes of hummingbirds.
Here is my Impulse design:
Training parameters
Training results for quantized (int8) version
Training results for unoptimized (float32) version
The images in the dataset are actually pretty difficult for me to differentiate, so I'm impressed that the model does so well with a small dataset. I am going to deploy the quantized model; for my purposes it just has to detect any type of hummingbird.
I just need to download the quantized TensorFlow Lite model that I can upload to the Vision AI.
The recommended final step is to use Google Colab and the Vela compiler to generate a model that is optimized to run on the Arm Ethos-U NPU used on the Vision AI V2.
Here are the notebook sections that install Vela and use it to generate the new model from the downloaded tflite model.
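In essence, the cells boil down to installing the Vela package and invoking it on the downloaded model. Something like the following, where the accelerator flag matches the Ethos-U55 in the Vision AI V2 and the model filename is just an example:

```
!pip install ethos-u-vela

!vela --accelerator-config ethos-u55-64 \
      --output-dir ./vela_out \
      quantized-model.tflite
```

Vela should write the optimized model to the output directory with a _vela.tflite suffix; that is the file to upload in the next step.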
Once we have a Vela model, we can use SenseCraft AI to upload it to the Vision AI. There is a button to Upload Custom AI Model below the section for ready-to-use AI models.
And the result looks good in the Preview window.
Well, it almost worked. I'm not sure why, but when I run the model with SSCMA, I don't get the bounding box or inference information. I tried to get help on the Seeed Discord channel but haven't gotten a response. It must be some sort of data formatting issue; I'll have to try to figure it out, maybe by generating the model a different way. I'm not sure why it works with SenseCraft AI and not with SSCMA (a minimal check like the one below is where I'll start). The Vision Challenge ends soon, so I wanted to publish this project since I did receive free hardware for it.
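When I get back to debugging, my plan is to log what SSCMA actually returns after each invoke. A minimal check, assuming the same library calls as the stock example:

```cpp
// Minimal SSCMA sanity check on the ESP32S3: does invoke() succeed,
// and does the reply contain any boxes or classes for the custom model?
if (!AI.invoke())  // 0 = success in the SSCMA library
{
    Serial.printf("perf: prepocess=%d, inference=%d, postprocess=%d (ms)\n",
                  AI.perf().prepocess,  // field spelling as in the library
                  AI.perf().inference,
                  AI.perf().postprocess);
    Serial.printf("boxes=%d, classes=%d\n",
                  (int)AI.boxes().size(), (int)AI.classes().size());
}
else
{
    Serial.println("invoke failed");  // points at a protocol/format problem
}
```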
Sample Images

The hardware is working fine. I may have to tweak the focus a bit on the camera.
Here are some images from the setup. Unfortunately, none of these images are the result of an inference-triggered detection. Once I figure out the custom model issue with the inference data, adding the trigger should be easy; a sketch of the idea follows.
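For what it's worth, here is what the trigger could look like - untested, and it assumes the returned frame decodes as base64 JPEG, that the SD card chip select on the Sense expansion board is pin 21, and an arbitrary confidence threshold of 70:

```cpp
#include <Seeed_Arduino_SSCMA.h>
#include <SD.h>
#include <vector>
#include "mbedtls/base64.h"

SSCMA AI;
int shotCount = 0;

// Decode the frame returned by the Vision AI and write it to the SD card.
// Assumes the frame is base64-encoded JPEG; adjust the decode step if not.
void saveLastImage()
{
    String img = AI.last_image();
    std::vector<uint8_t> jpg(img.length());  // decoded size is always smaller
    size_t jpgLen = 0;
    if (mbedtls_base64_decode(jpg.data(), jpg.size(), &jpgLen,
                              (const uint8_t *)img.c_str(), img.length()) != 0)
        return;  // decode failed

    char path[32];
    snprintf(path, sizeof(path), "/bird_%04d.jpg", shotCount++);
    File f = SD.open(path, FILE_WRITE);
    if (f)
    {
        f.write(jpg.data(), jpgLen);
        f.close();
    }
}

void setup()
{
    Serial.begin(115200);
    AI.begin();
    SD.begin(21);  // chip-select pin for the SD slot on the Sense expansion board
}

void loop()
{
    if (!AI.invoke(1, false, true))  // show = true: return the frame too
    {
        for (auto &box : AI.boxes())
        {
            if (box.score > 70)  // arbitrary threshold (scores are 0-100)
            {
                saveLastImage();
                delay(2000);     // simple debounce between captures
                break;
            }
        }
    }
}
```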