Halloween will look a bit different this year due to the COVID-19 pandemic. Trick-or-treating is largely being forgone to maintain distancing, so most people will probably end up buying a bag or two of candy to enjoy at home. And wherever there's a large collection of objects, you can be sure it will end up sorted in some form or another. That's where the idea for this project came from: given a large amount of candy, train a machine learning model to classify it.
For this project, I used a cell phone to collect training and testing data for Edge Impulse via their web client. For classifying items live, I went with the mobile client as well, although a Raspberry Pi 4 running a Python webserver with a webcam attached could also have worked. With more time, I could even have constructed a rig to move candy in front of the camera and push each piece into the correct pile automatically.
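For reference, had I gone the Raspberry Pi route, the webserver side could be quite small. Below is a minimal sketch using Flask and OpenCV to serve snapshots from an attached webcam; the route name and port are placeholder choices of mine, not part of the original build.

```python
# Minimal sketch of the Raspberry Pi alternative: a Flask server that
# grabs a frame from an attached webcam on request. Assumes flask and
# opencv-python are installed; route name and port are arbitrary choices.
import cv2
from flask import Flask, Response

app = Flask(__name__)
camera = cv2.VideoCapture(0)  # first attached webcam

@app.route("/snapshot")
def snapshot():
    ok, frame = camera.read()
    if not ok:
        return "camera read failed", 500
    ok, jpeg = cv2.imencode(".jpg", frame)
    return Response(jpeg.tobytes(), mimetype="image/jpeg")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```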
Setting Up Edge Impulse

To gather training data and produce a TensorFlow model, I went with Edge Impulse. I began by creating a new project called "candySorter" and then heading to the device page. Edge Impulse has a nice feature where you can scan a QR code with a phone and then easily gather data using the phone's sensors. You'll need to give the website access to your phone's camera, however, if you want to upload images.
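As an aside, the phone client isn't the only way to get images into a project: Edge Impulse also provides an ingestion API for scripted uploads. A sketch along the following lines should work, assuming a project API key from the dashboard; the file name and label here are placeholders.

```python
# Sketch of uploading a training image through the Edge Impulse
# ingestion API instead of the phone client. The API key, file name,
# and label below are placeholders for your own project's values.
import requests

API_KEY = "ei_..."  # project API key from the Edge Impulse dashboard

with open("kitkat01.jpg", "rb") as f:
    res = requests.post(
        "https://ingestion.edgeimpulse.com/api/training/files",
        headers={"x-api-key": API_KEY, "x-label": "kitkat"},
        files={"data": ("kitkat01.jpg", f, "image/jpeg")},
    )
print(res.status_code, res.text)
```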
Gathering enough training data is vital to a model that performs accurately. With too little, images even slightly different from the training set will be mislabeled or returned as unknown, which is why around 20-50 images per label is a good target.
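A quick sanity check before training is to count how many images each label has against that target. Here's a small sketch, assuming the dataset is stored with one folder per label; that layout is my assumption, not something Edge Impulse mandates.

```python
# Quick check that each label meets the ~20-50 image target, assuming
# the dataset is stored as one folder per label (layout is an assumption).
from pathlib import Path

dataset = Path("dataset")
for label_dir in sorted(p for p in dataset.iterdir() if p.is_dir()):
    n = sum(1 for f in label_dir.iterdir()
            if f.suffix.lower() in (".jpg", ".jpeg", ".png"))
    status = "ok" if n >= 20 else "needs more"
    print(f"{label_dir.name}: {n} images ({status})")
```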
On the page that pops up on the phone after scanning the QR code, you'll need to click the "Collecting images?" button, which opens a camera preview. While gathering training data, ensure there is adequate lighting, and capture plenty of different angles and lighting conditions so the model can better adapt to varied environments. Labels can be set by clicking on the "Label: " area and typing in a new one. Each type of candy should get its own label, as well as a label for when no candy is visible.
Now that there is a large amount of image data, it's time to train a TensorFlow model on it. For input, I went with an image block that scales the original image down to 96x96 pixels. The processing block simply sets the color depth to RGB instead of monochrome, and the learning block uses transfer learning with the MobileNetV2 model. The output is one of five labels: kitkat, sour patch, twizzler, reeses, or none.
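For anyone curious what that impulse amounts to, here is a rough Keras sketch of the same idea: a frozen MobileNetV2 base on 96x96 RGB input with a small classification head for the five labels. This is an illustration of the technique, not Edge Impulse's actual training code.

```python
# Rough Keras equivalent of the impulse described above: 96x96 RGB input,
# a frozen MobileNetV2 base, and a small softmax head for the five labels.
# Illustrative sketch only, not Edge Impulse's exact training pipeline.
import tensorflow as tf

LABELS = ["kitkat", "sour patch", "twizzler", "reeses", "none"]

base = tf.keras.applications.MobileNetV2(
    input_shape=(96, 96, 3), include_top=False, weights="imagenet"
)
base.trainable = False  # transfer learning: keep pretrained features frozen

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.1),
    tf.keras.layers.Dense(len(LABELS), activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```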
From here, you can switch to the live classification view on your phone, which lets you take a picture and see which label the model assigns, along with its confidence in each label.
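If you later want the same classification running off the phone, Edge Impulse can export the trained model, for example as a TensorFlow Lite file. Below is a hedged sketch of offline inference, assuming an exported file named candy_model.tflite with a float input; both the file name and the label order are assumptions to check against your project's deployment page.

```python
# Sketch of running the classification offline, assuming the model has
# been exported from Edge Impulse as "candy_model.tflite" with a float
# input (file name and label order are assumptions, not confirmed).
import numpy as np
import tensorflow as tf
from PIL import Image

LABELS = ["kitkat", "sour patch", "twizzler", "reeses", "none"]

interpreter = tf.lite.Interpreter(model_path="candy_model.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Load a test photo and scale it to the 96x96 input the impulse expects.
img = Image.open("test_candy.jpg").convert("RGB").resize((96, 96))
x = np.expand_dims(np.asarray(img, dtype=np.float32) / 255.0, axis=0)

interpreter.set_tensor(inp["index"], x)
interpreter.invoke()
scores = interpreter.get_tensor(out["index"])[0]
for label, score in sorted(zip(LABELS, scores), key=lambda p: -p[1]):
    print(f"{label}: {score:.2f}")
```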
This method of categorizing objects from a large number of images is extremely versatile and can be applied to any number of other projects. These could include other object classifiers, such as one that spots a particular Lego piece, or something a bit more advanced, like a robot that automatically moves items to a given location.