The pet cat visual recognition and detection project, based on the XIAO ESP32 S3 development board, aims to achieve real-time monitoring and analysis of a pet cat's behavior and characteristics by combining deep learning with an embedded system. The project requirements cover hardware, software algorithms, functional modules, and performance indicators.
In terms of hardware, the project uses the XIAO ESP32 S3 development board as the main controller, equipped with a camera module to collect image data of the pet cat. The camera module needs high definition and a wide field of view to allow all-round observation of the cat. The video stream collected by the camera is used to detect the cat's behavior and characteristics in real time, and the monitoring data are then analyzed and summarized.
In terms of software, the project builds a pet cat visual recognition model based on deep learning. The recognition algorithm must also be reasonably robust, so that it can accurately identify the cat's behavior under different lighting conditions and against different backgrounds. A pre-trained deep learning model is used to recognize cats, enabling real-time monitoring of whether a cat is present in the home.
2 Functions completed and performance achieved
Using the XIAO ESP32 S3 development board to control an OV2640 camera, a pre-trained deep learning model identifies cats, providing instant monitoring of whether there is a cat at home. When a cat is detected, a light signal is sent and the LED flashes.
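In the project itself the model is deployed through SenseCraft, which drives the on-board LED when a detection fires. For readers who want to reproduce the same detect-and-flash behavior in their own Arduino sketch, the snippet below is only a minimal sketch: cat_detected() is a hypothetical placeholder for whatever inference call the deployment exposes, and kLedPin assumes the board package defines LED_BUILTIN for the user LED.
// Minimal detect-and-flash loop (sketch only, not the project's exact code).
const int kLedPin = LED_BUILTIN;   // assumed to map to the board's user LED; use an explicit GPIO if your core does not define it

// Hypothetical placeholder: returns true when the deployed model reports a cat.
// Replace the body with the inference call your deployment actually exposes.
bool cat_detected() {
  return false;   // stub
}

void setup() {
  Serial.begin(115200);
  pinMode(kLedPin, OUTPUT);
}

void loop() {
  if (cat_detected()) {
    Serial.println("Cat detected");
    // Flash the LED three times as the light signal
    for (int i = 0; i < 3; i++) {
      digitalWrite(kLedPin, HIGH);
      delay(200);
      digitalWrite(kLedPin, LOW);
      delay(200);
    }
  }
  delay(100);   // brief pause between checks
}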
The performance is as follows:
1. Real-time performance: the system must process the video stream from the camera in real time so that the pet cat's behavior is monitored and reported promptly.
2. Accuracy: the visual recognition algorithm must be highly accurate and reliably identify the pet cat's characteristics.
3. Stability: the system must run stably and monitor the pet cat continuously over long periods without crashes or other abnormal conditions.
4. Scalability: the system must be reasonably scalable so that its functions and performance can be upgraded and extended later.
3 Implementation Ideas
1. Use the Arduino IDE and an image acquisition program to take photos of cats for later assembly into a data set.
2. Use the Roboflow platform to upload the cat photos, annotate and label the image data, download the data files, and export the pre-processed data.
3. Use the ModelAssistant project on the AI Studio platform to train the model.
4. Upload the custom model to the SenseCraft platform, connect the development board, and start cat target detection and recognition.
4 Implementation Process
1. Flow Chart
2. Assemble XIAO ESP32 S3
3. Run the image capture program in the Arduino IDE
After the development board is initialized, the main loop waits for commands from the serial monitor:
if (camera_sign && sd_sign) {
  String command;

  // Read incoming commands from the serial monitor
  while (Serial.available()) {
    char c = Serial.read();
    if ((c != '\n') && (c != '\r')) {
      command.concat(c);
    }
    else if (c == '\n') {
      commandRecv = true;
      command.toLowerCase();
    }
  }

  // If command = "capture", take a picture and save it to the SD card
  if (commandRecv && command == "capture") {
    commandRecv = false;
    Serial.println("\nPicture Capture Command is sent");
    char filename[32];
    sprintf(filename, "/image%d.jpg", imageCount);
    photo_save(filename);
    Serial.printf("Saved picture:%s\n", filename);
    Serial.println("");
    imageCount++;
  }
}
The camera_sign and sd_sign flags are used to check whether the camera and the SD card were initialized successfully. The serial port is then read for incoming commands, and when the "capture" command is received, a photo is taken and saved to the SD card.
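For context, the capture loop above relies on a few globals and an initialization step roughly like the following. This is only a sketch modelled on the standard ESP32 camera and SD examples, not the project's exact code: init_camera() stands in for the usual esp_camera_init() call with the XIAO ESP32S3 Sense pin map (omitted here), and the SD chip-select pin is the one used in Seeed's examples.
#include "esp_camera.h"
#include "FS.h"
#include "SD.h"
#include "SPI.h"

bool camera_sign = false;   // true once the camera initializes successfully
bool sd_sign = false;       // true once the SD card initializes successfully
bool commandRecv = false;   // true once a full serial command has been received
int  imageCount = 1;        // running number used in the saved file names

// Hypothetical helper: fills a camera_config_t with the XIAO ESP32S3 Sense
// pin map and calls esp_camera_init(); body omitted here.
bool init_camera();

void setup() {
  Serial.begin(115200);
  while (!Serial);             // wait for the serial monitor

  camera_sign = init_camera();

  // 21 is the chip-select used in Seeed's XIAO ESP32S3 Sense SD examples;
  // adjust it if your wiring differs.
  sd_sign = SD.begin(21);
}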
The photo_save() function then saves the picture to the SD card:
// Save a picture to the SD card
void photo_save(const char * fileName) {
  // Take a photo
  camera_fb_t *fb = esp_camera_fb_get();
  if (!fb) {
    Serial.println("Failed to get camera frame buffer");
    return;
  }
  // Save the photo to a file
  writeFile(SD, fileName, fb->buf, fb->len);
  // Release the image buffer
  esp_camera_fb_return(fb);
  Serial.println("Photo saved to file");
}
The writeFile() function used is as follows:
// SD card write file
void writeFile(fs::FS &fs, const char * path, uint8_t * data, size_t len) {
  Serial.printf("Writing file: %s\n", path);
  File file = fs.open(path, FILE_WRITE);
  if (!file) {
    Serial.println("Failed to open file for writing");
    return;
  }
  if (file.write(data, len) == len) {
    Serial.println("File written");
  } else {
    Serial.println("Write failed");
  }
  file.close();
}
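As an optional sanity check before capturing photos, the same writeFile() helper can be used to write a short text file; if "File written" appears on the serial monitor, the card and wiring are working. This test snippet and the file name /sdtest.txt are not part of the project's code.
// Optional SD self-test using the writeFile() helper above
void sd_self_test() {
  const char msg[] = "SD card OK";
  writeFile(SD, "/sdtest.txt", (uint8_t *)msg, sizeof(msg) - 1);
}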
4. Upload the collected cat photos to the Roboflow platform.
5. A total of 1000 cat photos were annotated.
6. Export the final pre-processed data set in COCO format.
7. Use the AI Studio platform to train the model and clone the ModelAssistant project.
!git clone https://github.com/Seeed-Studio/ModelAssistant.git
%cd ModelAssistant
8. Import the data set to be trained.
%env DATA_ROOT="https://universe.roboflow.com/ds/aliqVGka4t?key=JibV8Iog4S"
%env NUM_CLASSES=1
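Here DATA_ROOT points to the Roboflow export of the annotated cat photos, and NUM_CLASSES is set to 1 because the data set contains only the single "cat" label.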
9. Train Swift-YOLO Tiny Model.
!python tools/train.py \
configs/swift_yolo/swift_yolo_tiny_1xb16_300e_coco.py \
--cfg-options \
epochs=10 \
num_classes=${NUM_CLASSES} \
workers=1 \
imgsz=192,192 \
data_root=${DATA_ROOT} \
load_from=https://files.seeedstudio.com/sscma/model_zoo/detection/person/person_detection.pth
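In this command, --cfg-options overrides settings in the config file: epochs=10 keeps training short, num_classes matches the single "cat" label, imgsz=192,192 sets the input resolution, and load_from initializes the network from Seeed's pre-trained person-detection checkpoint instead of training from scratch.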
10. Upload the model to the SenseCraft platform.
5 Problems encountered
1. The recognition accuracy of the model after the first training was low.
When the model was trained for the first time, the data set was too small, only 200 pictures, and the resulting model easily misidentified people or other objects as cats. The model was retrained with a larger data set and a longer training time, which finally improved its recognition accuracy.
2. Pins cannot be controlled. When running the model through SenseCraft, only the single LED operation is available; additional code or pins cannot be added.
3. The chip heats up noticeably during operation.
4. The camera frame rate is low, only about ten frames per second, so high-frame-rate recognition cannot be maintained.
6 Suggestions for future plans
The project has successfully realized cat target recognition and detection and can raise a light-signal alarm, initially achieving the expected goal. By adding more peripherals, its functions can be expanded and extended:
1. The resolution of the OV2640 camera is limited; a higher-resolution camera would make recognition clearer.
2. By adding peripherals and driving motors, smart-home applications can be realized.
3. Replacing the chip with a higher-frequency one would increase the frame rate.
Upgrades that do not require replacing the hardware:
1. Train the model with a larger data set; the more photos used for training, the better the final model's recognition.
2. Add more labels to identify more types of pets.
3. After connecting to the network, integrate with apps for remote control.
4. Use Bluetooth to connect to other devices.