Physical inventory counting is still required in the modern world, but it is an unpopular process: counting large numbers of items is slow and tedious. The labor intensity of physical inventory counting should therefore be reduced.
To automate inventory counting, I trained a Machine Learning model that identifies bottle and box/carton drinks on store shelves; their counts are then displayed on a Web Application.
For the object detection, I used Edge Impulse to train a YOLOv5 model that can detect boxed and bottled products. You can find the public Edge Impulse project here: Inventory tracking in retail with Renesas DRP-AI. To add this project to your Edge Impulse projects, click "Clone this project" at the top of the window. I then deployed this model to the Renesas RZ/V2L Evaluation Board Kit and created a Web Application that uses the camera feed and the model's results.
Dataset preparation 📂

For the dataset, I used the SKU110K image dataset. This dataset focuses on detection in densely packed scenes where images contain many objects.
From the dataset, I chose to detect the bottle and boxed/carton drinks. This is because these objects are visually distinct from one another and are not as densely packed as other items in the dataset.
In total, I had 145 images for training and 36 images for testing. The dataset has two classes: bottle and box_drink.
With 181 images, it would be tiresome to draw bounding boxes around all the objects. Luckily, Edge Impulse offers various AI-assisted labelling methods to automate this process. In my case, I chose YOLOv5. To use this feature, in the Labelling queue select "Classify using YOLOv5" under "Label suggestions".
An Impulse is a machine learning pipeline that takes input data of a given type, extracts features from the data, and finally uses a neural network that trains on those features.
For my YOLOv5 model, I used an image width and height of 320, with "Resize mode" set to "Squash". The processing block was set to "Image" and the learning block to "Object Detection (Images)".
Under "Image" in Impulse design, the color depth of the images is set to RGB and the next step was to extract features.
Here in the features tab we can see the on-device performance for generating features during deployment. These metrics are for the Renesas RZ/V2L (with DRP-AI accelerator). The Renesas RZ/V2L Evaluation Board Kit was recently supported by Edge Impulse. This board is designed for vision AI applications and offers powerful hardware acceleration through its Dynamically Reconfigurable Processor (DRP) and multiply-accumulate unit (AI-MAC).
Currently, all Edge Impulse models can run on the RZ/V2L CPU, a dedicated Cortex-A55. However, to benefit from the DRP-AI hardware acceleration, I chose a YOLOv5 model. Note that on the training page you have to select RZ/V2L (with DRP-AI accelerator) before starting the training, in order to tell the Studio that you are training the model for the RZ/V2L. This can be done at the top right of the training page, or by changing the target device on the Dashboard page.
I used 200 training cycles with a learning rate of 0.001. Note, however, that to produce a robust YOLOv5 model it is advised to train with more than 1,500 images per class and more than 10,000 instances per class.
After the training process, I got a precision score of 89%. Precision is the number of True Positives divided by the sum of the number of True Positives and the number of False Positives.
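Expressed as a formula:

$$\text{Precision} = \frac{TP}{TP + FP}$$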
After training a model, we need to test it with unseen (test) data. In my case, the model had an accuracy of 91%. This accuracy is the percentage of all test samples with a precision score above 90%.
The Renesas RZ/V2L Evaluation Kit comes with the RZ/V2L board and a 5-megapixel Google Coral Camera. To set up the board, Edge Impulse has prepared a guide that shows how to prepare the Linux image, install the Edge Impulse CLI, and finally connect to the Edge Impulse Studio.
To run the model locally on the RZ/V2L, we first run the command:

edge-impulse-linux

which lets us log in to our Edge Impulse account and select the cloned public project. We can then run the model with:

edge-impulse-linux-runner
We can also download an executable version of the model which contains the signal processing and ML code, compiled with optimizations for the processor, plus a very simple IPC layer (over a Unix socket). This executable is called an .eim model.
To follow this approach, create a directory and navigate into it:
mkdir monitoring_retail_inventory && \
cd monitoring_retail_inventory
Next, download the .eim model with the command:

edge-impulse-linux-runner --download modelfile.eim
Now we can run the executable model locally using the command:
edge-impulse-linux-runner --model-file modelfile.eim
In the command, we pass the path and name of the downloaded file, modelfile.eim.
We can go to the provided URL and see the feed being captured by the camera, as well as bounding boxes around any detected objects.
Using the .eim executable and the Edge Impulse Python SDK, I developed a Web Application with Flask that counts the number of bottle and box drinks. The counts are then displayed responsively on the web page.
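Below is a minimal sketch of this counting logic, assuming the edge_impulse_linux Python SDK and a modelfile.eim in the working directory. The /counts route, variable names, and polling design are illustrative, not the project's actual code:

```python
import threading
from collections import Counter

from flask import Flask, jsonify
from edge_impulse_linux.image import ImageImpulseRunner

app = Flask(__name__)
latest_counts = {"bottle": 0, "box_drink": 0}

def run_inference(model_path="modelfile.eim", camera_id=0):
    # ImageImpulseRunner wraps the .eim executable and talks to it
    # over its Unix-socket IPC layer.
    with ImageImpulseRunner(model_path) as runner:
        runner.init()
        # classifier() grabs frames from the camera and yields
        # (result, image) pairs for each inference.
        for res, img in runner.classifier(camera_id):
            boxes = res["result"].get("bounding_boxes", [])
            counts = Counter(bb["label"] for bb in boxes)
            latest_counts["bottle"] = counts.get("bottle", 0)
            latest_counts["box_drink"] = counts.get("box_drink", 0)

@app.route("/counts")
def counts():
    # The web page can poll this endpoint to refresh the counts.
    return jsonify(latest_counts)

if __name__ == "__main__":
    threading.Thread(target=run_inference, daemon=True).start()
    app.run(host="0.0.0.0", port=5000)
```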
Product packaging is influential to consumers, and most people prefer drinks in bottles over carton/box packaging. Some drinks are carbonated, so cardboard packaging is not ideal for them, and no one wants to buy a Coke that can easily be squashed by bread in the shopping bag. To address this, the application shows a red indicator when there are fewer bottle drinks than box drinks.
For the counting process, the application only shows the counts from the object detection. However, in a retail store products are racked in various ways, and normally several rows sit behind the front one. Since the images only show some of the items in the rows behind, I worked with a simpler approach: multiply the counts of the detected items by the number of rows at the back, as sketched below.
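As a rough illustration of this estimate and the indicator rule above (the function names and the rows_deep parameter are mine, not from the project):

```python
def estimate_total(detected_count: int, rows_deep: int) -> int:
    # Assumes each hidden row holds roughly as many items as the
    # visible front row -- a deliberate simplification.
    return detected_count * rows_deep

def bottle_indicator(bottle_count: int, box_count: int) -> str:
    # The article only specifies the red case; "green" for the
    # opposite case is my assumption.
    return "red" if bottle_count < box_count else "green"

# Example: 6 bottles detected on a shelf stacked 3 rows deep.
print(estimate_total(6, rows_deep=3))   # 18
print(bottle_indicator(18, 12))         # green
```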
In a retail store, this application can be used with the surveillance cameras that view the various shelves. Another image source can be cameras mounted on shopping carts, enabling views of areas the surveillance cameras can't see. A manager can then monitor the inventory from the Web Application and take the required actions when some products run low.
You can clone the public GitHub repository to your Renesas RZ/V2L board. The installation steps are in the repository. You can run the Flask application or the binary executable built with PyInstaller for the AARCH64 platform.
Conclusion

For a more detailed explanation of this project, please check the documentation on Edge Impulse. The documentation also includes the inference times I obtained, comparing this model with FOMO running on the RZ/V2L with the DRP-AI accelerator.
The suggested setup can be used in crowded retail establishments. This will help reduce the amount of labor needed, increase profits, and most importantly improve the customer experience.
The Renesas RZ/V2L provides an accelerator designed exclusively for computer vision applications, and rule-based image processing can run concurrently with the neural networks powered by the DRP-AI, all at low power.