Ever wondered about a turret that moves both horizontally and vertically to track faces and shoots directly at its target? MechaMachine is the answer. It is a real-time face-tracking turret that detects faces, classifies each one as a friend or an enemy, and shoots at the target.
The main idea of the project is to build an autonomous, real-time face-tracking turret that tracks faces and shoots at them. This involves both the computational feasibility of detecting and locating faces and the mechanical structure needed to shoot projectiles at the target.
STEP 1: The MECHAnical part of the MechaMachine
The intention was to make a turret-like structure inspired by the Turret of the character/agent KillJoy from the video game Valorant, with the constraints of a mount to hold a camera for tracking, a mechanism to shoot bullets, and a place to install a laser to point out the target.
Based on this inspiration, we decided to make a 3D-printed build of the MechaMachine. The following constraints had to be adhered to:
- A place/mount to hold the camera.
- A space to fit a laser module.
- A shooting mechanism which holds a magazine of nerf bullets.
- A pan and tilt setup for the movement of the turret.
With these constraints in mind, the final design, modelled in Fusion 360 and AutoCAD, looked like this:
The attached STL parts were then 3D printed in ABS with the infill set to 40%. We chose ABS because PLA could not cope with the rapid movement of the mechanism and kept breaking down; the strength and durability of ABS at 40% infill was sufficient for our needs.
STEP 2: The Functioning of the MechaMachine
To understand how MechaMachine functions, we must look at its approach (a minimal Python sketch of this loop follows the list):
- Use the camera to capture frames.
- Detect if any faces are present in the frames.
- Classify the detected faces into friends and foes.
- Calculate the coordinates of the face and mark the target.
- Move the setup to make sure the face is being tracked.
- Shoot at the target using the shooting mechanism.
- Once a shot has been fired, stabilise the MechaMachine.
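Below is a minimal sketch of that loop in Python. It is only illustrative: it uses OpenCV's built-in Haar cascade as a stand-in for our quantised YOLOv5 model, and a print-based send_to_mcu() placeholder instead of the real serial link to the Arduino:

# Minimal pipeline sketch (assumptions: Haar cascade instead of the quantised
# YOLOv5 model, print-based placeholder instead of the serial link).
import cv2

def send_to_mcu(x, y, arm, fire):
    # Placeholder: the real project writes a packet to the Arduino over serial.
    print(x, y, arm, fire)

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
cap = cv2.VideoCapture(0)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    frame = cv2.flip(frame, 1)                       # mirror the view
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, 1.3, 5)  # detect faces
    for (x, y, w, h) in faces:
        cx, cy = x + w // 2, y + h // 2              # centre of the face
        send_to_mcu(cx, cy, 0, 0)                    # arm/fire flags decided later
    cv2.imshow("MechaMachine", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()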
To make the computation fast and the movement real-time, we decided to use two computation units: one for detecting and classifying the faces, and the other for controlling the movement and shooting.
The AMD Xilinx KRIA KR260 SOM was used as the computation unit (the CPU), which takes input from the camera, marks the target coordinates, and sends them to the MCU.
The MCU controls the mechanisms that move the turret, track the faces, and shoot the Nerf Darts at the target. The MCU used here is an Arduino UNO R3.
For the connections, the AMD Xilinx KRIA KR260 SOM is connected to the camera and to the Arduino UNO R3. The camera sends video frames to the AMD Xilinx KRIA KR260 SOM over a USB port, and serial communication is established between the AMD Xilinx KRIA KR260 SOM and the Arduino UNO R3 at a 9600 baud rate to transfer the data.
The Arduino UNO R3 controls the movement by driving the pan and tilt servos, which are connected to its digital PWM pins.
The Arduino UNO R3 also controls the shooting mechanism and the laser module.
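Opening the serial link from the KR260 side looks roughly like this (the device name /dev/ttyACM0 is an assumption; on Windows it would be a COM port such as the 'com4' used in the pseudo-code below). The short sleep gives the UNO time to finish its auto-reset after the port opens:

import time
import serial

# Assumed device name; check `ls /dev/tty*` on Linux or Device Manager on Windows.
arduino = serial.Serial('/dev/ttyACM0', 9600, timeout=1)
time.sleep(2)                             # wait out the UNO's auto-reset on connect
arduino.write(b'S999X320Y240A0F0E998')    # example packet (format described in STEP 4)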
STEP 4: The Code
Drafting the code involves the following parts:
- Code for the Arduino UNO R3 to control the MechaMachine.
- Code for the AMD Xilinx KRIA KR260 SOM to detect and map the face using OpenCV.
- Training a custom YOLOv5 model for classification on Google Colab, using RoboFlow as the annotation tool.
- Quantising the model to make it work on the AMD Xilinx KRIA KR260 SOM.
To understand the communication between the AMD Xilinx KRIA KR260 SOM and the Arduino UNO R3, we must look at the data structure that is transferred over serial communication (a small packet helper is sketched after the field descriptions): [startMarker, x_coordinate, y_coordinate, arm_motor, fire, endMarker]
startMarker => indicates the start of the data being sent by the Python code.
x_coordinate => the x coordinate of the face, sent by the Python code.
y_coordinate => the y coordinate of the face, sent by the Python code.
arm_motor => indicates whether the shooting mechanism should be armed or not.
fire => indicates whether MechaMachine should shoot or not.
endMarker => the end marker of the data being sent by the computer.
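As an illustration, the packet can be built and parsed with a couple of small helpers. The format string matches the one used in the pseudo-code below; parse_packet() is our own addition for clarity and is not part of the firmware (the Arduino parses the same fields in C):

import re

START_MARKER, END_MARKER = 999, 998

def build_packet(x, y, arm_motor, fire):
    # Same layout as the pseudo-code: S<start>X<x>Y<y>A<arm>F<fire>E<end>
    return 'S{0:d}X{1:d}Y{2:d}A{3:d}F{4:d}E{5:d}'.format(
        START_MARKER, x, y, arm_motor, fire, END_MARKER)

def parse_packet(packet):
    # Hypothetical helper showing how the fields can be recovered on the other end.
    m = re.fullmatch(r'S(\d+)X(\d+)Y(\d+)A(\d+)F(\d+)E(\d+)', packet)
    if m is None or int(m.group(1)) != START_MARKER or int(m.group(6)) != END_MARKER:
        return None
    return {'x': int(m.group(2)), 'y': int(m.group(3)),
            'arm_motor': int(m.group(4)), 'fire': int(m.group(5))}

# Example: a face centred at (320, 240), armed but not firing.
pkt = build_packet(320, 240, 1, 0)      # 'S999X320Y240A1F0E998'
print(parse_packet(pkt))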
The Python Code for the KRIA KR260 SOM:
/* Please refer to the given GitHub repository for the whole code; this sub-section has only the pseudo-code: */
Initialize:
    Import necessary libraries (cv2, serial, time, torch)
    Load quantized YOLOv5 model weights ('model.pt')
    Initialize Serial connection to Arduino ('com4', 9600)

Define Constants:
    tolerance_x = 640 // 2 - 30
    tolerance_y = 480 // 2 - 30
    tolerance_w = 640 // 2 + 30
    tolerance_h = 480 // 2 + 30
    arming_tolerance_x = tolerance_x - 25
    arming_tolerance_y = tolerance_y - 25
    arming_tolerance_w = tolerance_w + 25
    arming_tolerance_h = tolerance_h + 25
    startMarker = 999
    endMarker = 998

Main Execution Loop:
    Open VideoCapture device (0) as 'cap'
    While True:
        success, img = cap.read()
        Flip image horizontally (cv2.flip(img, 1))

        # YOLOv5 Face Detection
        img0 = Convert 'img' to RGB format (cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
        Resize image to 'img_size' and perform letterboxing (letterbox function)
        Convert image to Torch tensor and move to device (img.to(device).float())
        Perform inference with YOLOv5 model (model(img)[0])
        Apply non-maximum suppression to detections (non_max_suppression function)

        Process Detections:
            If detections are found:
                For each detection:
                    Extract coordinates and confidence (xyxy, conf)
                    Calculate center (x, y) and dimensions (w, h) of bounding box

                    Determine arm_motor and fire flags:
                        If center (x, y) is within arming_tolerance region:
                            Set arm_motor = 1
                        Else:
                            Set arm_motor = 0
                        If center (x, y) is within tolerance region:
                            Update in_box_time
                            If in_box_time > 500ms:
                                Set fire = 1
                        Else:
                            Reset in_box_time and fire

                    Construct serial command string:
                        string = 'S{0:d}X{1:d}Y{2:d}A{3:d}F{4:d}E{5:d}'.format(startMarker, x, y, arm_motor, fire, endMarker)
                    Print string for debugging
                    Encode string as UTF-8 and send to Arduino (ArduinoSerial.write(string.encode('utf-8')))

                    Draw visualizations:
                        Draw circle at (x, y) (cv2.circle(img, (x, y), 2, (255, 255, 255), 2))
                        Draw bounding box around face (cv2.rectangle(img, (x - w // 2, y - h // 2), (x + w // 2, y + h // 2), (0, 0, 255), 3))

        Draw constraint rectangles on image:
            cv2.rectangle(img, (tolerance_x, tolerance_y), (tolerance_w, tolerance_h), (0, 0, 0), 3)
            cv2.rectangle(img, (arming_tolerance_x, arming_tolerance_y), (arming_tolerance_w, arming_tolerance_h), (255, 0, 0), 3)

        Display processed image with annotations (cv2.imshow("MechaMachine", img))
        Exit loop if 'q' key is pressed (cv2.waitKey(1) & 0xFF == ord('q'))

    Release VideoCapture device and close all windows (cap.release(), cv2.destroyAllWindows())
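The arming/firing decision above can be pulled out into a small, testable helper. This is a hedged reconstruction of that logic from the pseudo-code (the tolerance boxes and the 500 ms dwell follow the constants above; the exact implementation lives in the repository):

import time

# Tolerance boxes around the centre of a 640x480 frame (from the constants above).
TOL = (640 // 2 - 30, 480 // 2 - 30, 640 // 2 + 30, 480 // 2 + 30)
ARM_TOL = (TOL[0] - 25, TOL[1] - 25, TOL[2] + 25, TOL[3] + 25)

def in_box(x, y, box):
    x1, y1, x2, y2 = box
    return x1 <= x <= x2 and y1 <= y <= y2

in_box_since = None          # time the face centre entered the firing box

def decide(x, y):
    """Return (arm_motor, fire) for a face centred at (x, y)."""
    global in_box_since
    arm_motor = 1 if in_box(x, y, ARM_TOL) else 0
    fire = 0
    if in_box(x, y, TOL):
        if in_box_since is None:
            in_box_since = time.time()
        elif (time.time() - in_box_since) * 1000 > 500:   # 500 ms dwell before firing
            fire = 1
    else:
        in_box_since = None    # face left the box: reset the timer and hold fire
    return arm_motor, fire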
The Arduino Code:
/* Please refer to the given GitHub repository for the whole code; this sub-section has only the pseudo-code: */
Function Setup
    InitializeServosAndPins()
    InitializeVariables()
    SetupSerialCommunication()

Function Loop
    TurnOnLaser()
    ReceiveData()
    ArmMechaMachine()
    If DataReceived Then
        TrackFace()
        SetTrigger()
        ArmMechaMachine()
        FireIfTriggered()

Function ReceiveData
    If SerialDataAvailable Then
        ParseSerialData()

Function TrackFace
    ReadFaceCoordinatesFromBuffer()
    AdjustServoPositions()
    UpdateServos()

Function SetTrigger
    If ShouldFireRequested Then
        EnableFiring()

Function FireIfTriggered
    If CanFire And IsArmed And NotFiring Then
        StartFiringSequence()

Function ArmMechaMachine
    If ArmRequested Then
        ActivateArmMotor()
    Else
        DeactivateArmMotor()

Function ParseSerialData
    ReadAndStoreDataFromSerial()

Function AdjustServoPositions
    CalculateNewServoPositions()

Function StartFiringSequence
    InitiateFiringMechanism()

Function EnableFiring
    AllowMechaMachineToFire()
To train the face classification model using YOLOv5, refer to the following documentation:
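For reference, a typical pair of Colab cells for training with the standard ultralytics/yolov5 repository looks roughly like this (the dataset YAML path, image size, batch size, epoch count, and run name are assumptions; in our case the dataset was annotated and exported with RoboFlow):

# Colab cell: clone YOLOv5 and install its requirements
!git clone https://github.com/ultralytics/yolov5
%cd yolov5
!pip install -r requirements.txt

# Colab cell: train on the RoboFlow-exported dataset (paths and hyperparameters are examples)
!python train.py --img 640 --batch 16 --epochs 100 \
    --data ../dataset/data.yaml --weights yolov5s.pt --name mechamachine_faces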
Extract the trained model and further quantise it to use it with the AMD Xilinx KRIA KR260 SOM.
To quantise the model (.pt file) so that it is usable with the AMD Xilinx KRIA KR260 SOM, follow the documentation:
Quantising of the custom YOLOv5 model
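Below is a rough sketch of what the quantisation step looks like with the Vitis AI PyTorch flow (pytorch_nndct). The way the model is loaded, the input shape, and the output directory are assumptions; refer to the documentation above and the repository for the exact flow we used:

import torch
from pytorch_nndct.apis import torch_quantizer   # Vitis AI PyTorch quantizer

# Assumption: the trained YOLOv5 checkpoint stores the module under the 'model' key.
model = torch.load('model.pt', map_location='cpu')['model'].float().eval()
dummy_input = torch.randn(1, 3, 640, 640)         # assumed input shape

# Pass 1: 'calib' mode - run representative images through quant_model,
# then export the quantisation configuration.
quantizer = torch_quantizer('calib', model, (dummy_input,), output_dir='quant_out')
quant_model = quantizer.quant_model
quant_model(dummy_input)                          # replace with real calibration frames
quantizer.export_quant_config()

# Pass 2: 'test' mode - evaluate the quantised model and export the xmodel
# that is later compiled for the KR260's DPU.
quantizer = torch_quantizer('test', model, (dummy_input,), output_dir='quant_out')
quant_model = quantizer.quant_model
quant_model(dummy_input)
quantizer.export_xmodel(output_dir='quant_out', deploy_check=False)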
Result: MechaMachine takes 0.3 seconds to shoot without classifying the faces, which is very close to real-time, whereas with classification it takes around 1 second to make an inference and shoot.
MechaMachine is a real-life replica of KillJoy's Turret from the game Valorant, with a real-world implementation that could help law enforcement authorities and aid the military by giving a machine the autonomy to decide when to shoot, for instance to stop infiltrators in a no man's land.
We would like to thank the AMD Pervasive AI contest for providing the hardware needed for the computation. This was a great opportunity to build such a project.