This project uses the Edge Impulse framework to perform pet identification and location detection. The pet's location information is then passed into the Home Assistant smart home framework to trigger automations, which finally control a wireless speaker based on the ESP32-S3 to play voice messages.
Please refer to the following link for a video clip (Chinese only):
https://www.bilibili.com/video/BV1CF4m1c7d
Chapter 1, Project Description. This project mainly completes the following functions:
a) Complete the overall system design and connect the hardware cabling according to that design.
b) Use Edge Impulse to complete pet feature extraction, build the image recognition and detection pipeline, and convert the location information into GPIO level outputs.
c) Use the Home Assistant framework to receive the level information from the ESP32-C3 development board, send audio to the ESP32-S3 wireless speaker, and write the automation script that completes the automatic voice playback.
Chapter 2, Hardware Detail. Overall, according to the design requirements, the system needs to identify the pet's position and issue voice commands to guide its actions. The XIAO ESP32S3 Sense development board is used as the data processing unit to complete pet image recognition, position detection, and output of the detection result. An ESP32-C3 development board is used to transmit the GPIO level information. Another ESP32-S3 development board implements the wireless speaker, while information aggregation, processing, and automatic triggering are handled within the Home Assistant framework. The result is a system that can detect the pet's position and issue voice commands to correct its behavior. The following figure shows the final implementation.
2.1 Hardware Overview
Following the hardware plan of the Winter Vacation training project, a XIAO ESP32S3 Sense development board and the Edge Impulse framework are used to complete pet identification and location detection. The detection results are output through two GPIOs to the ESP32-C3 development board, which transmits the GPIO level information. A separate ESP32-S3 development board implements the wireless speaker.
The following is the hardware block diagram of the implementation:
There is not much hardware work in this project. Photos of the two hardware modifications made for it are shown below:
Here four wires are drawn from the XIAO development board and soldered to the ESP32-C3 development board. The four signals are power, ground, the land-area indication, and the water-area indication. GPIO 8/9 of the XIAO development board are used for the two indications, while the ESP32-C3 side uses GPIO 4/5. This pin mapping needs to be recorded and then used in the corresponding programming steps.
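As a quick reference for the programming steps, here is a minimal Arduino-style sketch of the pin setup on the XIAO side. Which of GPIO 8/9 carries the land signal and which carries the water signal is an assumption here and should be swapped if the actual wiring differs.

// XIAO ESP32S3 side, pin assumptions for the two indication signals:
// GPIO 8 -> land-area indication  (wired to ESP32-C3 GPIO 4, assumed)
// GPIO 9 -> water-area indication (wired to ESP32-C3 GPIO 5, assumed)
const int PIN_LAND  = 8;
const int PIN_WATER = 9;

void setup() {
  pinMode(PIN_LAND, OUTPUT);
  pinMode(PIN_WATER, OUTPUT);
  digitalWrite(PIN_LAND, LOW);   // idle low until a detection is reported
  digitalWrite(PIN_WATER, LOW);
}

void loop() {
  // The detection logic that drives these pins is added in Section 3.1.
}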
Jumper wires are used here to connect the MAX98357 I2S amplifier board and a speaker. The board can then be detected as a media player device recognized by Home Assistant.
Chapter 3, Software Detail. The software mainly uses the Edge Impulse framework to complete pet identification and location detection. Through the Home Assistant smart home framework, the pet's location information is transmitted and automations are triggered, which finally control the wireless speaker based on the ESP32-S3 to play voice messages. The two parts are described in detail below.
3.1 Building and applying the image recognition framework based on Edge Impulse
According to the design requirements of the project, the software uses the Edge Impulse framework to implement the image recognition function. Edge Impulse is an embedded AI framework that can run on a variety of embedded CPU platforms, including the ESP32-S3 processor used in this project. After registering on the website, the pet image recognition task can be completed in the order described in detail below.
First of all, we need to prepare a number of photos of the pet to be identified, taken from different angles and at different scales (here a small toy frog is used instead). The photo library is divided into a training set and a test set at a ratio of about 80%/20%; this example uses around 35 images for training and 10 for testing. At the beginning we used two small toys, a monkey and a frog, but later found that the recognition rate for the monkey toy was relatively low, so only the frog was kept as the object to be identified.
Once the photos are ready, log on to the Edge Impulse website and follow the object detection workflow to create a new project. The basic steps are to upload the training and test image datasets and to extract the image features. After extraction, FOMO V2.0.1 was used for model training.
After training, the recognition rate is evaluated with the test dataset. Once the recognition accuracy is greater than 95%, the model can basically be considered correct. At this point, click the final Deploy step to generate the source package and merge it into the main program. For this project, I chose the simplest option and generated an Arduino library package.
The generated library already contains the trained recognition network and provides an ESP32 camera demo program. What we need to do here is add a judgment statement to this demo program that determines the position of the frog toy and outputs it on the pins of the Seeed XIAO ESP32-S3 development board. Figure 6 shows the judgment and the call to the digitalWrite function.
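For reference, below is a minimal sketch of such a judgment, meant to be called from the demo's loop() after run_classifier() has filled the result structure. The include name, the pin mapping, and the use of the horizontal half of the frame as the boundary between the land and water areas are all assumptions and should be adapted to the real scene.

#include <frog_detection_inferencing.h>  // hypothetical name; use the header generated for your Edge Impulse project

const int PIN_LAND  = 8;   // land-area indication (assumed mapping)
const int PIN_WATER = 9;   // water-area indication (assumed mapping)

// Decide which area the frog is in from the FOMO bounding boxes and
// report it on the two GPIO pins.
void report_frog_position(const ei_impulse_result_t &result) {
  bool land = false, water = false;
  for (size_t ix = 0; ix < result.bounding_boxes_count; ix++) {
    const ei_impulse_result_bounding_box_t &bb = result.bounding_boxes[ix];
    if (bb.value == 0) continue;               // empty slot, no detection
    float cx = bb.x + bb.width / 2.0f;         // horizontal centre of the box
    if (cx < EI_CLASSIFIER_INPUT_WIDTH / 2.0f) {
      land = true;                             // left half of the frame = land (assumed)
    } else {
      water = true;                            // right half of the frame = water (assumed)
    }
  }
  digitalWrite(PIN_LAND,  land  ? HIGH : LOW);
  digitalWrite(PIN_WATER, water ? HIGH : LOW);
}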
At this point, the Seeed XIAO ESP32-S3 development board can complete image recognition and output the result on GPIO 8/9. This part of the work runs completely offline, entirely on the XIAO development board.
3.2 Design and implementation of the Home Assistant home automation system
Home Assistant is a home automation platform that runs on Python 3. It can track and control all the devices in the home and provides an automation platform. There are a variety of installation options; here I ran Home Assistant OS directly in an x86 virtual machine.
This project mainly uses two development boards, the ESP32-C3 and the ESP32-S3, to transmit the recognition results and to output the voice commands, so the first thing to install is the ESPHome add-on.
After installation, the subsequent steps are completed from the ESPHome add-on menu. Here we need to add two devices: the GPIO board based on the ESP32-C3 and the audio player board based on the ESP32-S3.
After adding the devices, their configuration is written first. ESPHome (like Home Assistant) is described using YAML scripts. The GPIO section is listed here, as well as the I2S module section below.
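Since the original listings are only shown as figures, minimal sketches of the two YAML configurations are reproduced here. Only the sensor and audio sections are shown (the standard esphome/board/wifi sections created by the add-on wizard are omitted), and the names and all pins other than GPIO 4/5 are assumptions; the I2S pins in particular depend on how the MAX98357 board is actually wired.

ESP32-C3 node (GPIO board):

binary_sensor:
  - platform: gpio
    pin:
      number: GPIO4          # wired to XIAO GPIO 8, land-area indication (assumed)
      mode: INPUT_PULLDOWN
    name: "Sensor_Land"
  - platform: gpio
    pin:
      number: GPIO5          # wired to XIAO GPIO 9, water-area indication (assumed)
      mode: INPUT_PULLDOWN
    name: "Sensor_Water"

ESP32-S3 node (audio player with the MAX98357 amplifier):

i2s_audio:
  i2s_lrclk_pin: GPIO5       # LRC pin of the MAX98357 (assumed wiring)
  i2s_bclk_pin: GPIO6        # BCLK pin (assumed wiring)

media_player:
  - platform: i2s_audio
    name: "Pet Speaker"
    dac_type: external
    i2s_dout_pin: GPIO7      # DIN pin of the MAX98357 (assumed wiring)
    mode: mono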
After adding these two devices, the status of the media player and the GPIO sensors can be seen on the main dashboard, and the sensor states change with the pet's location.
After that, we need to create an "Automation" that acts on the state of the sensors "Sensor_Land"/"Sensor_Water". When Sensor_Land turns on, a prompt sound representing the land area is played; when Sensor_Water turns on, a prompt sound representing the water area is played. This completes the voice commands to the pet. A sketch of one such automation is shown below.
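A minimal sketch of the land-area automation in Home Assistant YAML, assuming the ESPHome sensors appear as binary_sensor.sensor_land / binary_sensor.sensor_water, that the speaker is exposed as media_player.pet_speaker, and that the prompt sound is a local audio file at /local/land_prompt.mp3 (all hypothetical names). The water-area automation is identical apart from the trigger entity and the media file.

automation:
  - alias: "Pet on land - play land prompt"
    trigger:
      - platform: state
        entity_id: binary_sensor.sensor_land
        to: "on"
    action:
      - service: media_player.play_media
        target:
          entity_id: media_player.pet_speaker         # hypothetical entity id
        data:
          media_content_id: "/local/land_prompt.mp3"  # hypothetical audio file
          media_content_type: music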
At this point, the construction of the whole system is complete. This is also the first time I have worked with Edge Impulse and Home Assistant, so please forgive any improper use of terms throughout the project.
Chapter 4, Summary. Overall, this design can identify the pet's position and issue voice commands to guide its actions. Thanks to Seeed Studio, this design uses its XIAO ESP32S3 Sense development board as the data processing unit to complete pet image recognition, position detection, and output of the detection result. An ESP32-C3 development board transmits the GPIO level information, and an ESP32-S3 development board implements the wireless speaker, while information aggregation, processing, and automatic triggering are handled within the Home Assistant framework. The result is a system that can detect the pet's position and issue voice commands to correct its behavior.
This project will be published simultaneously on: www.eetree.cn
Thanks to Seeed Studio and eetree.cn, I hope to have another opportunity to participate in the event!