Living in a student dorm means dealing with plenty of challenges. One big hassle is getting your laundry done: you pack up your smelly clothes and haul them down to the laundromat, only to find a mess and no washer free for the next hour.
As part of the Seeed Vision Challenge, we wanted to create something to help solve some of these issues. We designed a frame to hold the Vision AI Module v2 with a socketed Xiao ESP32S3 and deployed a custom model to look for washers. Along the way we learned a lot about the new and exciting Vision AI Module v2, the Seeed_Arduino_SSCMA camera_web_server example, and the SenseCraft no-code platform.
We aimed to create a system using the Seeed Vision AI Module v2 to detect and monitor washing machine availability in a dorm laundromat.
To achieve this, we designed a frame to securely hold the Vision AI Module, using a socketed Xiao ESP32S3 to deploy a custom model for washer detection. This setup was intended to make the monitoring process seamless and efficient.
What Worked Out of the Box
The Vision AI Module v2 proved user-friendly and straightforward to set up. Integrating it with the Xiao ESP32S3 was smooth, and the initial setup and deployment of the camera_web_server on the Xiao ESP32S3 worked without major issues. Additionally, the SenseCraft no-code platform was highly intuitive and provided an easy starting point for AI model deployment, which was crucial for our project's rapid development.
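To give a sense of how little code the integration needs, here is a minimal sketch modeled on the Seeed_Arduino_SSCMA library examples; it assumes the Grove Vision AI Module v2 is attached over the Xiao's default I2C pins and already has a model deployed:

```cpp
#include <Seeed_Arduino_SSCMA.h>

SSCMA AI;  // talks to the Vision AI Module v2 over the Xiao's default I2C pins

void setup() {
  AI.begin();            // initialise the link to the module
  Serial.begin(115200);
}

void loop() {
  // invoke() asks the module to run one inference; 0 (CMD_OK) means success
  if (!AI.invoke()) {
    for (size_t i = 0; i < AI.boxes().size(); i++) {
      Serial.print("box ");
      Serial.print(i);
      Serial.print(": target=");
      Serial.print(AI.boxes()[i].target);  // class index from the deployed model
      Serial.print(", score=");
      Serial.print(AI.boxes()[i].score);   // confidence reported by the module
      Serial.print(", x=");
      Serial.print(AI.boxes()[i].x);
      Serial.print(", y=");
      Serial.println(AI.boxes()[i].y);
    }
  }
  delay(100);
}
```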
Additional Tests
To better understand the capabilities of the ESP32S3, we explored various demos, including the Xiao ESP32S3 Sense Web Camera Demo. We also designed a 3D-printed frame to hold the Vision AI Module and Xiao ESP32S3 securely. The frame features adjustable camera angles, optional lens mounts, and a tidy layout, which made data collection easier and kept the components securely in place during operation.
Challenges We Faced
Despite the initial successes, we encountered several challenges:
- The camera_web_server had some initial bugs that required troubleshooting.
- The flashing interface on the SenseCraft website also posed issues, needing multiple attempts to resolve.
- Configuring the model to detect multiple classes effectively was particularly challenging, as it required fine-tuning and experimentation to achieve satisfactory results.
For our dataset, we used the camera_web_server running on the Xiao ESP32S3 with the Vision AI Module v2 attached. By connecting the setup to a mobile hotspot and browsing to it from a smartphone, we saved still images of washers. We aimed for a diverse dataset by capturing images from various angles and under different lighting conditions, ending up with a total of 55 washer images. This diversity was crucial for training a robust AI model.
We employed both Label Studio and Roboflow for dataset labeling and augmentation, and experimented with MobileNet v2 and Swift-YOLO 192 transfer learning to improve the detection capabilities of our system. Although we faced initial difficulties, such as relabeling the dataset multiple times to improve accuracy, we ultimately achieved a functional model capable of detecting washers.
Our Google Colab model-training notebook is based on the Gesture Detection Swift-YOLO 192 example provided by Seeed: our copy of Gesture_Detection_Swift-YOLO_192.ipynb.
After labelling & training on the data, we uploaded the model to sensecraft.seeed.cc, typed in the labels and deployed it using the fancy built-in web serial flasher (use a supported browser) to the Seeed Grove Vision v2 board to get some results at last.
The inference results showed detections only for the washer class, so further optimization is clearly needed.
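To go from raw detections to an availability count, something along these lines works on top of the same library. Note that kWasherTarget and kScoreThreshold are hypothetical placeholders, not our exact firmware; the class index has to match the order of the labels entered in SenseCraft, and the threshold needs tuning:

```cpp
#include <Seeed_Arduino_SSCMA.h>

SSCMA AI;

// Hypothetical values for illustration; adjust to your own deployment.
const uint8_t kWasherTarget   = 0;   // e.g. our "washer" label
const uint8_t kScoreThreshold = 60;  // drop low-confidence boxes

void setup() {
  AI.begin();            // link to the Grove Vision AI Module v2
  Serial.begin(115200);
}

void loop() {
  if (AI.invoke() != 0) {  // non-zero return means the inference failed
    delay(500);
    return;
  }
  int washers = 0;
  for (size_t i = 0; i < AI.boxes().size(); i++) {
    if (AI.boxes()[i].target == kWasherTarget &&
        AI.boxes()[i].score >= kScoreThreshold) {
      washers++;
    }
  }
  Serial.print("washers in view: ");
  Serial.println(washers);
  delay(1000);
}
```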
We also ran into issues flashing the Seeed_Arduino_SSCMA examples to the ESP32C3, but eventually succeeded with the ESP32S3 camera web server using PlatformIO.
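For anyone retracing the PlatformIO route, a minimal platformio.ini for the Xiao ESP32S3 might look like the following; the board identifier comes from the standard platform-espressif32 board list, and the exact library version is an assumption to verify against your setup:

```ini
; minimal sketch of a PlatformIO config for the Xiao ESP32S3 (values assumed)
[env:seeed_xiao_esp32s3]
platform = espressif32
board = seeed_xiao_esp32s3
framework = arduino
monitor_speed = 115200
lib_deps = https://github.com/Seeed-Studio/Seeed_Arduino_SSCMA
```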
While the model achieved fast inference speeds of around 30 fps, it is clear that a larger dataset, and probably a simpler labeling scheme, would be needed for better results; we had figured we would try with all the labels first.
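As a rough way to sanity-check a frame-rate figure like this from the Xiao side, one can time a burst of invocations with millis(); this is only a sketch, not our measurement setup:

```cpp
#include <Seeed_Arduino_SSCMA.h>

SSCMA AI;

void setup() {
  AI.begin();
  Serial.begin(115200);
}

void loop() {
  // Time a burst of back-to-back inferences and derive an FPS estimate.
  const int kFrames = 30;
  int ok = 0;
  unsigned long start = millis();
  for (int i = 0; i < kFrames; i++) {
    if (AI.invoke() == 0) ok++;  // 0 indicates a successful inference
  }
  unsigned long elapsed = millis() - start;
  if (elapsed > 0) {
    Serial.print("approx. fps: ");
    Serial.println(ok * 1000.0f / elapsed);
  }
}
```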
Further Exploration
We compared Label Studio and Roboflow for dataset management and augmentation:
- Label Studio was robust and fast but lacked some advanced features like mosaic augmentation.
- Roboflow offered powerful features and a better user interface but involved additional costs.
We also explored modifying the YOLO 192 person-detector Colab notebook for training and exporting models, but stopped short of committing to it fully.
When deployed, we tested the image delay and measured a wired latency of about 300 ms and a wireless latency of 400-600 ms, though jitter caused by antenna placement and orientation remained an issue.
The 3D-printed frame made data collection significantly easier, keeping the setup secure and tidy. Our project offers a practical approach to monitoring washer availability in dorm laundromats. With improvements in dataset size and model configuration, the system could become even more effective, helping students spend less time waiting for washers.
By digging into Seeed's Vision AI Module v2, we have taken a small step toward making dorm life more convenient, and shown that students can affordably monitor laundry availability on the edge in real time, saving time and frustration.
Links and References
@fb03 prepared a STEP model of the AI Vision v2, available on GrabCAD, which made the adapter kit design possible.
Model files also available on Printables here: printables.com/model/929372-seeed-grove-vision-module-v2-adapter-kit
Optional magnetic lens kit: de.aliexpress.com/item/1005007059986698.html
The adjustable lens mount we use in the adapter kit is a remix of the one RoverXR uses: github.com/mbz4/RoverXR