The main objective of this project is to detect and track faces in a real-time video stream. Two servos pan and tilt the camera mounted on them, so the camera's position follows the movement of the object or person.
In my previous project, I mounted an ESP32-CAM on an existing Arduino robot, which gave me the opportunity to implement face tracking over a full 360 degrees. Since both the servos and the motors were already connected to the Arduino, the ESP32 only sent commands to the Arduino over a serial connection. That sounds complicated, but because the robot was already built I had only a few communication problems. In this tutorial I will show you how to implement face tracking using only the ESP32-CAM and two servos, which lets the camera follow a face over a range of 180 degrees.
In a design where the ESP32-CAM controls the servos directly, other problems appear, such as interference and the wiring of the servos.
Design and Building
The input video stream is obtained using an ESP32-CAM. With the Espressif ESP-FACE library it is easy to detect a face and find its location in the frame. The library provides a function called draw_face_boxes that is normally used to display a box around a detected face. The coordinates of that bounding box are used to track the face in subsequent frames.
The ESP-WHO framework takes QVGA (320×240) images as input. Face detection is implemented using MTCNN and MobileNet and returns the position of any faces present in the image. Each frame is examined for a face, and detection works only on frontal faces. Once a face is spotted, a bounding box is drawn around it.
In the design where the ESP32 controls the servos directly, I initially had problems with the power supply and with interference caused by the servos. No matter what I did, once I added a servo the device would reboot endlessly and warn about a voltage drop. So I built interference filters, which make it possible to power the ESP32 and the servos from any mobile phone charger.
The ESP32-CAM can be programmed using the Arduino IDE, which supports the ESP32 platform.
If you have little or no experience with the ESP32 camera development boards, you should start with a tutorial for beginners.
Code
In my previous projects I covered the code in more detail, so here I will focus on the few small changes needed to control two servos. There are several servo libraries for the ESP32; I chose Kevin Harrington's ESP32Servo library. You can download the code as a zip file and use the "Add .ZIP Library" function in the Arduino IDE, or find the library in the Arduino IDE's Library Manager under the name ServoESP32Fix. The code shown below was the minimum needed to be able to use the servos.
We need to create two dummy servos so that the ESP32Servo library does not interfere with the PWM channel and timer used by the ESP32 camera.
The variable toNul, introduced here, indicates the direction of servo movement; in my previous project this was handled by the Arduino Uno. A value of zero means movement toward the 0° position.
How to find center of face
My previous project explains how a face is found. We use the function draw_face_boxes, which normally takes a detected face and draws a box around it.
From this function we get the coordinates of the top-left corner of the box.
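As a minimal sketch of the step from box corners to a face center, assuming the box is available as its top-left and bottom-right corners in pixels: the center is simply the midpoint between the two corners. The FaceBox and Point types and the faceCenter function are hypothetical names used only for illustration; they are not part of ESP-WHO. The two results correspond to what the formulas later call face_center_pan and face_center_tilt.

```cpp
// Plain C++ illustration; FaceBox, Point and faceCenter are hypothetical names.
struct FaceBox {
    int left;    // x of the top-left corner, in pixels
    int top;     // y of the top-left corner, in pixels
    int right;   // x of the bottom-right corner, in pixels
    int bottom;  // y of the bottom-right corner, in pixels
};

struct Point {
    int x;  // horizontal center, used as face_center_pan
    int y;  // vertical center, used as face_center_tilt
};

// Midpoint of the box: halfway between opposite corners (integer pixels).
constexpr Point faceCenter(const FaceBox& b) {
    return Point{ (b.left + b.right) / 2, (b.top + b.bottom) / 2 };
}
```

For a box from (100, 60) to (220, 180) in a QVGA frame, this gives the center (160, 120), which is exactly the frame center.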
To convert from pixels to degrees for QVGA (320 × 240 px, diagonal 400 px), divide the diagonal by the camera's field of view. I'm using the OV2640 camera module that shipped with my ESP32 board, which has a 60° field of view.
For my camera, this gives the number of pixels per degree of rotation:
400 / 60 ≈ 6.7
The distance of the face center from the frame center, converted into degrees, is then added to the current pan angle:
posH = posH + (160 - face_center_pan) / 7;
I rounded 6.7 up to 7, but that is not critical: the camera settles on the right position within a few steps. I would say that such iterative positioning of the camera looks more natural.
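The conversion and the pan update can be written out as a small sketch in plain C++. The constant names and the nextPan function are hypothetical. The numbers (320 px width, 400 px diagonal, 60° field of view) are the ones given in the text. Integer division is used on purpose, so small offsets of fewer than 7 pixels produce no movement at all.

```cpp
// Frame and lens constants taken from the text; names are hypothetical.
constexpr int kFrameWidthPx = 320;  // QVGA width in pixels
constexpr int kFrameDiagPx  = 400;  // QVGA diagonal in pixels
constexpr int kFovDeg       = 60;   // camera field of view in degrees

// Pixels per degree, rounded to the nearest integer: 400 / 60 = 6.67 -> 7.
constexpr int kPxPerDeg = (kFrameDiagPx + kFovDeg / 2) / kFovDeg;

// One iterative step for the pan angle, same form as the posH formula.
// The horizontal frame center is kFrameWidthPx / 2 = 160.
constexpr int nextPan(int posH, int face_center_pan) {
    return posH + (kFrameWidthPx / 2 - face_center_pan) / kPxPerDeg;
}
```

With the servo at 90° and the face centered at x = 90, one step moves the pan angle to 100°. A face at x = 230 moves it to 80°.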
What's left is to send the angle value to the pan servo on pin 14.
Serial.printf("Center detected at %d dots\n", face_center_pan);
panServo.write(posH);
Because I designed the servo stand differently here, movement in the tilt direction is reversed. That is why the formula for the vertical position uses the opposite sign.
posV = posV - (120 - face_center_tilt) / 7;
tiltServo.write(posV);
Serial.printf("V%d \n", posV);
Code can be added so that the servos move only when the face is outside a central region of the frame.
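One way to read this is as a dead zone around the frame center. The following is a minimal sketch under that assumption. The function name and the 20-pixel tolerance are hypothetical values chosen for illustration, not taken from the project.

```cpp
// Assumed tolerance around the frame center, in pixels; tune for your setup.
constexpr int kDeadZonePx = 20;

// True when the face center is far enough from the frame center
// that moving the servo is worthwhile.
constexpr bool outsideDeadZone(int facePos, int frameCenter) {
    return (facePos > frameCenter + kDeadZonePx) ||
           (facePos < frameCenter - kDeadZonePx);
}
```

In the tracking code, the pan update would then run only when outsideDeadZone(face_center_pan, 160) is true, and the tilt update only when outsideDeadZone(face_center_tilt, 120) is true. This should also reduce servo jitter when the face is nearly centered.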
Because the tilt servo in this stand can only move between 50° and 130°, the tilt angle must be limited to that range.
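A minimal sketch of that limit, using the 50° and 130° values from the text. The names are hypothetical.

```cpp
// Mechanical limits of the tilt axis, as stated in the text.
constexpr int kTiltMinDeg = 50;
constexpr int kTiltMaxDeg = 130;

// Keep the requested tilt angle inside the allowed range.
constexpr int clampTilt(int posV) {
    return posV < kTiltMinDeg ? kTiltMinDeg
         : posV > kTiltMaxDeg ? kTiltMaxDeg
         : posV;
}
```

In an Arduino sketch the same effect can be had with the built-in constrain(posV, 50, 130) before calling tiltServo.write(posV).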
Testing the code
A box_array_t value contains the face boxes together with the score and landmarks of each box. Each box is given as coordinates: top-left corner, bottom-right corner, and landmarks.
if (boxes != NULL) {}
We just want to know whether any faces were detected in the image, so we simply check that this pointer is not NULL. If it is not, draw_face_boxes(image_matrix, boxes) sends the servos the commands to move the camera.
If boxes is NULL, no face was detected. If no face is recognized for a certain time, a search function has to be activated. As a timer for the search we use the variable noDetection, which is incremented with each unsuccessful detection attempt (each step takes about 300-330 ms). The action within the else branch is split so that the search first runs on the side where the face was last seen, then on the other side. If a face is recognized in the meantime, the entire action within else is cancelled.
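The search behaviour can be sketched as a small state update in plain C++. This is an illustration under stated assumptions, not the project's actual code. The SearchState struct, the two functions, the threshold of 10 missed frames (about 3 s at roughly 300 ms per frame) and the 5° sweep step are all hypothetical. It also assumes that toNul already holds the direction in which the face was last followed, so the sweep starts on that side.

```cpp
// Hypothetical state for the search; field names follow the text where possible.
struct SearchState {
    int  posH;         // current pan angle in degrees
    int  noDetection;  // consecutive frames without a face
    bool toNul;        // true: sweeping toward the 0 degree position
};

constexpr int kSearchAfter = 10;  // assumed: about 3 s at ~300 ms per frame
constexpr int kSearchStep  = 5;   // assumed sweep step in degrees
constexpr int kPanMin = 0, kPanMax = 180;

// Called once for every frame in which no face was found (the else branch).
constexpr SearchState onNoFace(SearchState s) {
    s.noDetection += 1;
    if (s.noDetection < kSearchAfter) return s;  // wait before searching
    s.posH += s.toNul ? -kSearchStep : kSearchStep;
    // At an end stop, clamp and continue the sweep on the other side.
    if (s.posH <= kPanMin) { s.posH = kPanMin; s.toNul = false; }
    if (s.posH >= kPanMax) { s.posH = kPanMax; s.toNul = true; }
    return s;
}

// Called when a face is found: the running search is cancelled.
constexpr SearchState onFace(SearchState s) {
    s.noDetection = 0;
    return s;
}
```

After each call to onNoFace, the resulting posH would be written to the pan servo. Resetting noDetection in onFace is what cancels the search as soon as a face reappears.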
Conclusion
This can be used as a smartphone holder for dynamic video recording. Thanks to intuitive gesture recognition, you can enjoy hands-free filming. With 180° rotation and AI-assisted face tracking, you don't have to miss a moment and can capture professional footage anywhere with one battery for TikTok Live, vlogs and more. Unfortunately, the “Follow Me” feature is limited to face tracking, which is the only option the ESP-WHO framework offers. The OpenCV library would be more suitable and would bring many advantages, but it would be a completely different concept, so we will leave that for future projects.