As ecommerce grows, many people prefer to shop online and have their purchases delivered to their homes. Package theft has increased as a result of more parcels being delivered. Although there are ways to prevent package theft, such as having packages delivered to the post office or remotely letting the courier into your home, many people still prefer door deliveries. However, we are not always around to collect packages, and thieves may get to them first!
Monitoring delivered packages with TinyML

There are a couple of techniques to prevent package theft, but this project focuses on parcels delivered to our front porches or mailboxes. Edge Impulse Studio is used to create a Machine Learning model that recognizes parcels. The model is then deployed to a low-cost, low-power device, the ESP-EYE development board. This board has a 2MP camera that we will use to capture a live feed of our doorstep.
For processing the images, I used FOMO (Faster Objects, More Objects). This is an algorithm developed by Edge Impulse that brings real-time object detection, tracking, and counting to microcontrollers. FOMO is 30x faster than MobileNet SSD and runs in <200K of RAM. On the Raspberry Pi 4, live classification with FOMO achieved ~27.7 fps, while SSD MobileNetV2 gave ~1.56 fps.
To train a model to detect parcels, we need a dataset of parcel images. In total, I took 275 images of parcels and split them with an 80/20 ratio between training and testing. Note that more images are recommended so that the model generalizes well, but for this demo I worked with this number.
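If your images are already on disk, they can also be uploaded with the Edge Impulse CLI instead of through the Studio. A minimal sketch, assuming the CLI is installed and you are logged in, and that the images live in a parcels/ folder (a hypothetical path):

edge-impulse-uploader --category split parcels/*.jpg

The --category split option lets the Studio divide the uploads between the training and testing sets automatically.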
The next step was to label the parcels in all the images, which I also did in the Studio. Note that Edge Impulse offers AI-assisted labeling, which can track objects across frames, classify with YOLOv5, or classify with your own pretrained model.
We can now use our dataset to train a model. This requires two important blocks: a processing block and a learning block. Documentation on Impulse design can be found here.
In my Impulse, the image width and height are set to 96x96 and the resize mode to “Squash”. The processing block is set to “Image” and the learning block is “Object Detection (Images)”.
Since the ESP-EYE is resource-constrained (4MB flash and 8MB PSRAM), I used a 96x96 image size.
The next step was to use the processing block (Image) to generate features from our dataset. Since I wanted to use FOMO, I set the color depth to grayscale.
The last step was to train the model. For the neural network architecture, I selected FOMO MobileNetV2 0.35, and I used 60 training cycles with a learning rate of 0.001.
After training, the model achieved an F1 score of 94%. The F1 score combines precision and recall into a single metric.
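Specifically, it is the harmonic mean of the two:

F1 = 2 × (precision × recall) / (precision + recall)

A high F1 therefore means the model is good at both finding parcels (recall) and not flagging false ones (precision).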
When training the model, I used 80% of the data in the dataset. The remaining 20% is used to test the model's accuracy on unseen data, which is how we verify that the model has not overfit. If the model performs poorly on the test set, it has overfit (memorized the training data rather than learning general features). This can be resolved by adding more data, reconfiguring the processing and learning blocks, or adding data augmentation. Tips for increasing performance can be found in this guide.
To deploy the model to the ESP-EYE, first go to the “Deployment” section. Next, under “Build firmware” select Espressif ESP-EYE (ESP32).
To increase performance on the board, I enabled the EON Compiler and chose the “Quantized (int8)” optimization. With these settings, the model uses 243.9K of RAM and 77.5K of flash on the board.
Click “Build” and the firmware will be downloaded after the build ends.
Connect an ESP-EYE board to your computer, extract the downloaded firmware, and run the appropriate script (based on your computer's operating system) to upload it to your board. Great! Now we have a FOMO model running locally on the ESP-EYE.
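The extracted archive contains one flashing script per platform; assuming the usual script names in Edge Impulse firmware bundles (they may differ between versions), on Linux this would be:

./flash_linux.sh

with flash_mac.command and flash_windows.bat as the macOS and Windows equivalents.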
To get a live feed of the camera and classification, run the command:
edge-impulse-run-impulse --debug
Next, enter the provided URL in a browser and you will see a live feed from the ESP-EYE.
The ESP-EYE gives ~1 fps with a 96x96 image size. Using a 48x48 image size gives ~5 fps, but the model is not accurate in that case. This performance comes down to the ESP-EYE being a constrained device with limited flash and RAM. Inference has a latency of ~850ms with a 96x96 image versus ~200ms with a 48x48 image, which roughly accounts for the frame rates above (1/0.85s ≈ 1.2 fps, 1/0.2s = 5 fps). Larger images have more pixels to process, while at lower resolutions the detail useful for object detection is collapsed, hence the poor accuracy.
For this use case of monitoring a front porch, taking one picture per second with the ESP-EYE and analyzing it is acceptable. However, you can also target higher-performance MCUs such as the Arduino Nano 33 BLE Sense with a camera, the Portenta H7 with a Vision Shield, the Himax WE-I Plus, OpenMV, Sony’s Spresense, or Linux-based dev boards.
Taking it one step further

We can use this model to monitor delivered parcels and take action, such as sounding an alarm or sending a text message, when no parcels, or fewer parcels than expected, are detected.
Deploying the model as a library lets us add custom code around the predictions. We can check whether a parcel is detected, save the count, and monitor that count over time. If the count drops, a parcel is missing and we can raise an alarm on our development board, or even send a signal to other home-automation devices such as security cameras or alarms. A sketch of this logic is shown below.
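Here is a minimal sketch of that counting logic, assuming the C++ library exported from the Studio. run_classifier() and ei_impulse_result_t come from the Edge Impulse SDK; the "parcel" label, the 0.5 confidence threshold, and raise_alarm() are assumptions for illustration (raise_alarm() is a hypothetical helper you would implement yourself, e.g. driving a buzzer GPIO):

#include <string.h>
#include "edge-impulse-sdk/classifier/ei_run_classifier.h"

void raise_alarm(); // hypothetical helper, implement for your board

static size_t last_parcel_count = 0;

void check_parcels(signal_t *signal) {
    ei_impulse_result_t result = { 0 };
    if (run_classifier(signal, &result, false) != EI_IMPULSE_OK) {
        return; // classification failed, skip this frame
    }

    // Count the "parcel" detections FOMO found in this frame
    size_t parcel_count = 0;
    for (size_t i = 0; i < result.bounding_boxes_count; i++) {
        ei_impulse_result_bounding_box_t bb = result.bounding_boxes[i];
        if (bb.value == 0) {
            continue; // unused slots have a confidence of 0
        }
        if (strcmp(bb.label, "parcel") == 0 && bb.value > 0.5f) {
            parcel_count++;
        }
    }

    // Fewer parcels than the last frame: one may have been taken
    if (parcel_count < last_parcel_count) {
        raise_alarm(); // hypothetical: buzzer, SMS trigger, etc.
    }
    last_parcel_count = parcel_count;
}

In practice you would want to debounce this over a few consecutive frames, so that a single missed detection does not trigger a false alarm.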
Edge Impulse has also developed a feature that sends an SMS based on the model's inferences. This feature works with development boards that support Edge Impulse for Linux, such as the Raspberry Pi. The repository and documentation can be found in example-linux-with-twilio. Sending the SMS uses the Twilio service.
You can find the Edge Impulse public project here: Parcel Detection - FOMO. To add this project to your own projects, click “Clone” at the top of the page. Alternatively, to create a similar project from scratch, have a look at the project tutorial for more details.
Developing Machine Learning models with Edge Impulse has always been easy and fun! FOMO was chosen for this project so that the model would be small enough to fit on the ESP-EYE and fast enough to give acceptable inference latency.
We can now monitor parcels easily, at low cost and with low power requirements. This demonstrates the massive potential of TinyML to make the world smarter and solve endless problems.