In this project, we introduce a robotic car with a 360° camera and manipulator arm, controlled by the KR260.
We will provide detailed information about the hardware and software configurations.
This project integrates advanced technologies like Object Detection, DPU, PYNQ, Vitis AI and ROS2.
A key feature is the incorporation of a 360° Camera.
By leveraging the KR260, we've developed a robot with 360-degree AI vision.
Using KR260 and PYNQ, 360° object detection is processed on the PL (FPGA).
The PL also handles PWM and GPIO for driving the Robot Car and Arm.
By integrating PL and PS, the robot achieves 360-degree AI vision.
Additionally, we use the latest Vitis AI technology to compare and accelerate object detection (YOLO).
The main output topics are as follows:
- Visualizing ROS2 3D markers from 360° Camera
- Object detection using DPU-PYNQ within 360 Live Streaming
- Control Robotic Car and Arm with PYNQ + Original PCB
- Comparison of Object Detection Acceleration Using the Latest Vitis AI
The video below summarizes the main results. It's all wrapped up in about one minute, so please take a look!
2. BOM
The BOM is composed mainly of generic parts.
This low-cost robot costs about $550 with the 360° camera (and about $200 without it).
The detailed BOM and cost list is also saved as a CSV file on GitHub.
2.1 360° Camera BOM list
We use the Ricoh Theta V as the 360° camera.
Ricoh offers a public API for developers, along with extensive API documentation and SDKs for the Theta series. This enables remote control and integration with various applications.
The Ricoh Theta V is an older model (USB 2.0), which makes it a low-cost 360° camera.
We attach the 360° camera to the robot car. It has a pre-installed 1/4-inch mounting hole for easy attachment.
- 360° Camera --- Ricoh Theta V
- Bolt to fix the 360° Camera --- 1/4-20 x 3"
2.2 Robot Car and Arm BOM list
We use TAMIYA's robot kit as the main frame.
We built additional frames to accommodate the KR260 from TAMIYA's universal plate kit; the TAMIYA kits also include the DC motors and small screws.
- Robot Main Frame --- Tamiya 70162 RC Robot Construction Set
- Robot Car-Gearbox --- Tamiya 70168 Double Gearbox
- Robot Car-Add-plate --- Tamiya 70098 Universal Plate
- Robot Car-Add-Tires --- Tamiya 70096 Off-Road Tires
2.3 Motor-Driver-PCB BOM list
This is our custom KR260-specific motor driver board. We use two boards: one for driving the robot car and one for the arm.
Details will be explained later, and the board data is available on GitHub.
- Motor-Driver-PCB --- Original-PCB
- Motor-Driver-IC(U1) --- DRV8833PWPR
- Resistor_SMD(R1) --- 0603_1608Metric_10k
- Resistor_SMD(R2, R3) --- 0603_1608Metric_0 (or short-bar)
- Capacitor_SMD(C1) --- 0603_1608Metric_2.2u
- Capacitor_SMD(C2) --- 0603_1608Metric_10u
- Capacitor_SMD(C3) --- 0603_1608Metric_0.1u
- PinHeader_2x06(J1) -- 2x06_P2.54mm_Horizontal
- PinHeader_1x02(J2, J3, J4) --- 1x02_P2.54mm_Vertical
2.4 Debug(LED/SW)-PCB BOM list
This is our custom KR260-specific debug (LED/SW) board.
Details will be explained later, and the board data is available on GitHub.
- Debug(LED/SW)-PCB --- Original-PCB
- Resistor_SMD(R1, R2, R3) --- 0603_1k
- Resistor_SMD(R4) --- 0603_10k
- LED_D3.0mm(D1) --- Red_LED
- LED_D3.0mm(D2) --- Green_LED
- LED_D3.0mm(D3) --- Blue_LED
- PinHeader_2x06(J1) --- 2x06_P2.54mm_Horizontal
2.5 Others list
Other items include only wires and spacers.
(Most of the necessary screws and nuts are included in the TAMIYA kit.)
They are used for connections from the motor to the motor driver and for connecting the motor power supply. Additionally, only M3 screws and spacers are used to secure the KR260.
3. Electrical Diagram
This is the main electrical diagram. No special power supply is needed.
These are PMOD diagrams.
We're using the 12V Main Power supply that comes with the Kria starter kit.
For the Motor Power, you can use either the 5V from the KR260 Pi connector or the external battery.
The 360° camera just needs to be connected via USB.
3.1 Motor Driver PCB circuit diagram
This is the circuit diagram for the motor driver.
It is designed to connect directly to a PMOD(J1) connector.
The motor power is supplied through the J4 connector, and the DC motors can be controlled individually via the J2 and J3 connectors.
3.2 Debug (LED/SW) PCB circuit diagram
This is the circuit diagram for the Debug (LED/SW) board.
It is designed to connect directly to a PMOD(J1) connector.
The board features 3 LEDs and 1 switch.
3.3 PMOD pin settings (.xdc)
This is the PMOD pin constraints file (.xdc).
#PMOD1 motor-driver1
set_property PACKAGE_PIN H12 [get_ports PWM_0]
set_property PACKAGE_PIN E10 [get_ports PWM_1]
set_property PACKAGE_PIN D10 [get_ports PWM_2]
set_property PACKAGE_PIN C11 [get_ports PWM_3]
set_property PACKAGE_PIN B10 [get_ports {gpio_rtl_0_tri_o[0]}]
#PMOD2 motor-driver2
set_property PACKAGE_PIN J11 [get_ports {gpio_rtl_0_tri_o[5]}]
set_property PACKAGE_PIN J10 [get_ports {gpio_rtl_0_tri_o[6]}]
set_property PACKAGE_PIN H11 [get_ports {gpio_rtl_0_tri_o[1]}]
#PMOD3 infrared sensor
set_property PACKAGE_PIN AE12 [get_ports {gpio_rtl_1_tri_i[0]}]
set_property PACKAGE_PIN AF12 [get_ports {gpio_rtl_1_tri_i[1]}]
#PMOD4 debug(led-sw)
set_property PACKAGE_PIN AC12 [get_ports {gpio_rtl_0_tri_o[2]}]
set_property PACKAGE_PIN AD12 [get_ports {gpio_rtl_0_tri_o[3]}]
set_property PACKAGE_PIN AE10 [get_ports {gpio_rtl_0_tri_o[4]}]
set_property PACKAGE_PIN AF10 [get_ports {gpio_rtl_1_tri_i[2]}]
set_property IOSTANDARD LVCMOS33 [get_ports PWM_0]
set_property IOSTANDARD LVCMOS33 [get_ports PWM_1]
set_property IOSTANDARD LVCMOS33 [get_ports PWM_2]
set_property IOSTANDARD LVCMOS33 [get_ports PWM_3]
set_property IOSTANDARD LVCMOS33 [get_ports {gpio_rtl_0_tri_o[0]}]
set_property IOSTANDARD LVCMOS33 [get_ports {gpio_rtl_0_tri_o[1]}]
set_property IOSTANDARD LVCMOS33 [get_ports {gpio_rtl_0_tri_o[2]}]
set_property IOSTANDARD LVCMOS33 [get_ports {gpio_rtl_0_tri_o[3]}]
set_property IOSTANDARD LVCMOS33 [get_ports {gpio_rtl_0_tri_o[4]}]
set_property IOSTANDARD LVCMOS33 [get_ports {gpio_rtl_0_tri_o[5]}]
set_property IOSTANDARD LVCMOS33 [get_ports {gpio_rtl_0_tri_o[6]}]
set_property IOSTANDARD LVCMOS33 [get_ports {gpio_rtl_1_tri_i[0]}]
set_property IOSTANDARD LVCMOS33 [get_ports {gpio_rtl_1_tri_i[1]}]
set_property IOSTANDARD LVCMOS33 [get_ports {gpio_rtl_1_tri_i[2]}]
4. Assembly Instructions
Assembling the Robot Car and Arm is straightforward by following the TAMIYA kit instructions.
The custom KR260-specific board data is freely available on GitHub. Soldering can be done at a hobbyist level.
4.1 Robot Car and Arm
The robot car can move forward, backward, and rotate by independently controlling the forward and reverse motion of two motors.
The gearbox increases torque with a gear ratio of 344.2:1. Assemble the gearbox according to the instructions and attach it to the frame.
The robot arm mechanism is simple. It uses a single DC motor to move the arm via a crank gearbox with a gear ratio of approximately 1543:1.
This was also assembled according to the instructions. The arm mechanism is included in the kit.
4.2 Additional frames for KR260
An additional frame was created to mount the KR260 on the robot.
It is designed for easy attachment and removal to facilitate debugging when the KR260 needs to be separated from the robot.
4.3 Mounting the 360° Camera
Drill a hole in the robot car to insert bolts for securing the 360° camera. Using a mini router in addition to nippers will make this easier.
4.4 Motor Driver PCB
The PCB data is available on GitHub. Here is the link.
https://github.com/iotengineer22/PCB-DRV8833-TEST
Zip the "drv8833-pcb" folder and send it to any PCB fabrication company (e.g., PCBGOGO).
The PCB specifications match the default settings of most fabrication companies. Just enter the dimensions and proceed with the order.
- PCB Size: 17.78mm x 40.64mm
- Layers: 2
- Material: FR-4
- FR4-TG: TG150-160
- Thickness: 1.6mm
- Min track/spacing: 6/6mil
- Minimum hole size: 0.3mm
- Solder mask: Green
- Silkscreen: White
Once the PCB arrives, solder the components according to the BOM (Bill of Materials) list.
4.5 Debug (LED/SW) PCB
The PCB data is available on GitHub. Here is the link.
https://github.com/iotengineer22/PCB-KV260-PMOD-TEST
Zip the "kv260-gpio-smd-pcb" folder and send it to any PCB fabrication company.
- PCB Size: 17.78mm x 40.64mm
- The other PCB specifications are the same as the motor driver PCB.
Once the PCB arrives, solder the components according to the BOM (Bill of Materials) list.
This project involves various technologies and has a long series of steps to completion.
To help beginners progress step by step, each step is divided into subprojects.
While this project provides detailed explanations, please refer to the links to each subproject for specific programming and software setup instructions.
Subprojects are organized into the following chapters.
2. PYNQ + PWM(DC-Motor Control)
3. Object Detection(Yolo) with DPU-PYNQ
4. Implementation DPU, GPIO, and PWM
6. GStreamer + OpenCV with 360° Camera
7. 360 Live Streaming + Object Detect(DPU)
8. ROS2 3D Marker from 360 Live Streaming
9. Control 360° Object Detection Robot Car
10. Improve Object Detection Speed with YOLOX
11. Benchmark Architectures of the DPU
12. Power Consumption of 360° Object Detection Robot Car
13. Application to Vitis AI ONNX Runtime Engine (VOE)
14. Appendix: Object Detection Using YOLOX with a Webcam
Please note that before running the above subprojects, the following setup, which is the reference for this AMD contest, is required.
https://github.com/amd/Kria-RoboticsAI
To test as described in the following projects, you can download the repository on the KR260 as follows:
cd /home/ubuntu
git clone https://github.com/iotengineer22/AMD-Pervasive-AI-Developer-Contest.git
5.1 PYNQ + GPIO (LED Blinking)
For program details and specifications, please refer to the Subproject below.
In this Subproject, we experimented with controlling GPIO on the KR260 FPGA board.
Using Python (PYNQ), we managed to perform LED output and switch input via the PMOD connector with a custom-designed board.
We'll share how we designed an original board and tested its functionality.
Introduction
The KR260, an FPGA board from AMD (Xilinx), is equipped with a PMOD connector that can also be utilized for GPIO purposes.
We created a custom board to experiment with LED output and switch input functionalities.
Run GPIO Test
Within the /root/jupyter_notebooks/ directory on the KR260, create a folder to hold the .ipynb file to be executed, alongside the .bit and .hwh files produced by Vivado.
Below is an example of copying to the jupyter_notebooks directory on the KR260.
sudo su
cd $PYNQ_JUPYTER_NOTEBOOKS
cd jupyter_notebooks/
ls
cp -rf /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/jupyter_notebooks/pynq-gpio ./
After KR260 installation, the IP address is confirmed using ifconfig; in my case, it was 192.168.11.7.
Accessing Jupyter Notebook is done by navigating to http://192.168.11.7:9090/ in a web browser (Chrome was used here).
The test .bit, .hwh, and .ipynb files are available on GitHub.
The notebook includes steps to control the AXI-GPIO IP, loaded from the Vivado-generated .bit file, utilizing PYNQ's AXI-GPIO library. Official documentation for AXI-GPIO manipulation can be found here.
Please check the test video to see it in action.
In the notebook, the AxiGPIO is imported, and the GPIO IP and channel1 are specified.
LED outputs are managed through a series of Write commands, cycling through 0x3, 0x2, 0x1, and 0x0 to toggle the LEDs on and off.
Switch input reading is also demonstrated, showing 1 when pressed and 0 when not.
A for loop is used to sequentially light up three LEDs.
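For reference, here is a minimal sketch of that notebook flow. The bitstream and AXI-GPIO instance names ("gpio-test.bit", "axi_gpio_0", "axi_gpio_1") are assumptions for illustration; use the names from your own Vivado design and the .ipynb on GitHub.
import time
from pynq import Overlay
from pynq.lib import AxiGPIO

ol = Overlay("gpio-test.bit")        # load the Vivado-generated bitstream (.bit/.hwh)

led_ip = ol.ip_dict['axi_gpio_0']    # AXI-GPIO instance wired to the LEDs (assumed name)
leds = AxiGPIO(led_ip).channel1
mask = 0xffffffff

# Toggle the LEDs by cycling through the same write values as the notebook
for value in (0x3, 0x2, 0x1, 0x0):
    leds.write(value, mask)
    time.sleep(0.5)

# Light up three LEDs one after another with a for loop
for bit in range(3):
    leds.write(1 << bit, mask)
    time.sleep(0.5)

# Read the push switch: 1 when pressed, 0 when not (assumed second instance)
sw_ip = ol.ip_dict['axi_gpio_1']
switch = AxiGPIO(sw_ip).channel1
print("switch state:", switch.read())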
For program details and specifications, please refer to the Subproject below.
2. PYNQ + PWM(DC-Motor Control)
We tested controlling PWM (Pulse Width Modulation) on the KR260 FPGA board.
Using Python (PYNQ), we output PWM signals to control a motor driver board.
We created an original board to control a DC motor and will introduce the details here.
Introduction
The KR260 FPGA board from AMD (Xilinx) has PMOD connectors, which can be used as PWM pins.
We created a motor driver board for KR260.
This subproject demonstrates the successful PWM control of a DC motor.
(And we also tried controlling an LED with PWM.)
Run PWM Test
Within the /root/jupyter_notebooks/ directory on the KR260, create a folder to hold the .ipynb file to be executed (PWM-test-PCB.ipynb), alongside the .bit and .hwh files produced by Vivado.
Below is an example of copying to the jupyter_notebooks directory on the KR260.
sudo su
cd $PYNQ_JUPYTER_NOTEBOOKS
cd jupyter_notebooks/
ls
cp -rf /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/jupyter_notebooks/pynq-pwm/ ./
After KR260 installation, the IP address is confirmed using ifconfig; in this case, it was 192.168.11.9.
Use the Kria-PYNQ environment via Jupyter Notebook to control the PWM. Connect to the KR260 board using a LAN cable and find the IP address using ifconfig. Access Jupyter Notebook at http://<IP_ADDRESS>:9090/.
ipynb File
The test .bit, .hwh, and .ipynb (PWM-test-PCB.ipynb) files are available on GitHub.
LED PWM Test
Before controlling the motor, I tested PWM control on an LED.
The test video shows the LED brightness changing with PWM values at 10%, 50%, and 99%.
Controlling DC Motor with PWM
Connect the motor driver board and DC motor, and use PWM to control them. After loading the FPGA, control the motor in both forward and reverse directions with PWM values at 10%, 50%, and 99%.
The test video below demonstrates the successful PWM control of a DC motor.
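As a rough sketch of what such a notebook does, the snippet below drives one DC motor through the DRV8833 by putting a PWM signal on one input and holding the other low (swapping the pair reverses the direction). The IP instance name and the PERIOD/DUTY register offsets are hypothetical placeholders, not the project's actual register map; take the real addresses from the Vivado address editor or from PWM-test-PCB.ipynb.
from pynq import Overlay, MMIO

ol = Overlay("pwm-test.bit")                   # assumed bitstream name
base = ol.ip_dict['pwm_0']['phys_addr']        # assumed PWM IP instance name
pwm_in1 = MMIO(base, 0x1000)                   # channel driving DRV8833 IN1
pwm_in2 = MMIO(base + 0x1000, 0x1000)          # channel driving DRV8833 IN2 (assumed offset)

PERIOD_REG, DUTY_REG = 0x00, 0x08              # hypothetical register offsets
PERIOD_COUNTS = 1000                           # counts per PWM period

def drive(duty_percent, forward=True):
    # PWM on one DRV8833 input, the other held low; swap the pair to reverse
    duty = int(PERIOD_COUNTS * duty_percent / 100)
    active, idle = (pwm_in1, pwm_in2) if forward else (pwm_in2, pwm_in1)
    for ch in (active, idle):
        ch.write(PERIOD_REG, PERIOD_COUNTS)
    active.write(DUTY_REG, duty)
    idle.write(DUTY_REG, 0)

drive(10)                  # forward at 10 %
drive(50)                  # forward at 50 %
drive(99, forward=False)   # reverse at 99 %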
For program details and specifications, please refer to the Subproject below.
3. Object Detection(Yolo) with DPU-PYNQ
We tested object detection on camera images using the KR260 and YOLOv3.
Originally, there was a sample program for PYNQ-DPU, which we modified.
Here, we will introduce the model used and the methods of modification.
Introduction
The KR260 can utilize a DPU (Deep Learning Processing Unit).
We have conducted object detection (YOLO) on 360° and normal images with the KR260.
Here is a test video using Jupyter Notebooks with DPU-PYNQ and KR260.
We also tested this in a Python program with similar results.
Object Detection(.ipynb)
In the /root/jupyter_notebooks/ directory of the KR260, copy the .ipynb file, the COCO2017 list, and the JPEG files into the existing pynq-dpu folder.
We have also provided the .xmodel (kr260_yolov3_tf2.xmodel) at the following link. This is the model used for testing the DPU (object detection).
Below is an example of copying to the jupyter_notebooks directory on the KR260.
sudo su
cd $PYNQ_JUPYTER_NOTEBOOKS
cd jupyter_notebooks/
ls
cp -rf /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/jupyter_notebooks/pynq-dpu/ ./
#Please follow the README.md in the folder to download the .xmodel file.
Use the Kria-PYNQ environment via Jupyter Notebook. Connect to the KR260 board using a LAN cable and find the IP address using ifconfig. Access Jupyter Notebook at http://<IP_ADDRESS>:9090/.
Test.ipynb
Here is a test video using Jupyter Notebooks with DPU-PYNQ and KR260.
Open the Jupyter Notebook on KR260 and proceed with the execution.
The default .bit file is used for the DPU on KR260.
Using TensorFlow2 Model with VART for YOLOv3 Detection
We performed object detection on 80 categories from COCO2017 using VART on DPU.
We tested with three photos: two 360° images and one regular camera image.
The 360° images did not perform well in object detection, failing to detect the balls in the foreground.
In contrast, images captured with a regular smartphone camera successfully detected the ball.
Consideration of Test Results (360° images)
We have conducted object detection on 360° images (5376x2688).
While human figures were successfully detected, the yellow ball was not.
360° images are very wide, which makes it difficult for the YOLO model to detect objects.
This indicates that further adjustments are needed.
By splitting the image into two and setting the aspect ratio to 1:1 (2688x2688), detection improves significantly.
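The splitting itself is only a couple of lines of OpenCV/NumPy slicing; a minimal sketch (file names are examples):
import cv2

img = cv2.imread("360_image.jpg")              # 5376x2688 equirectangular photo
h, w = img.shape[:2]                           # h = 2688, w = 5376
left, right = img[:, :w // 2], img[:, w // 2:] # two square 2688x2688 crops
for i, crop in enumerate((left, right)):
    cv2.imwrite(f"crop_{i}.jpg", crop)         # run YOLO on each crop separately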
Object Detection(.py)
We also implemented the program in Python, available on the following GitHub repository:
Below is an example of the program (.py) execution.
ubuntu@kria:~$ sudo su
root@kria:/home/ubuntu# source /etc/profile.d/pynq_venv.sh
(pynq-venv) root@kria:/home/ubuntu# cd $PYNQ_JUPYTER_NOTEBOOKS
(pynq-venv) root@kria:~/jupyter_notebooks# cd pynq-dpu/
(pynq-venv) root@kria:~/jupyter_notebooks/pynq-dpu# python3 app_yolov3_tf2_mymodel-name-test.py
yolov3_test, in TensorFlow2
(1, 416, 416, 3)
Number of detected objects: 2
Details of detected objects: [49 60]
Performance: 2.902664666183155 FPS
Here is a test video using Python with DPU-PYNQ and KR260.
Object detection can be performed using .py files just as effectively as with .ipynb files.
For program details and specifications, please refer to the Subproject below.
4. Implementation DPU, GPIO, and PWM
Using Vivado and Vitis, we created a project to synthesize the DPU IP.
We utilized the DPU created on PYNQ with KR260 to perform object detection using Vitis AI (Yolo).
In this post, we will introduce the process of running GPIO (PWM) alongside the DPU on KR260.
Creation Process
There are various methods to create a project that includes the DPU IP.
This example follows a specific process.
The author created this project in the Vivado 2023.1 and Vitis 2023.1 environment, so adjust accordingly if following along.
- Synthesize the MPSoC, clock, and reset required for the DPU using Vivado.
- Synthesize the DPU using Vitis.
- Synthesize other IPs, such as GPIO and PWM, using Vivado.
This Creation Process involves very lengthy steps. Therefore, the details are omitted here; please refer to the Subproject for more information.
Running the DPU on KR260 with PYNQ
Within the /root/jupyter_notebooks/ directory on the KR260, create a folder to hold the .ipynb file to be executed, alongside the .bit and .hwh files produced by Vivado.
Below is an example of copying to the jupyter_notebooks directory on the KR260.
sudo su
cd $PYNQ_JUPYTER_NOTEBOOKS
cd jupyter_notebooks/
ls
cp -rf /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/jupyter_notebooks/pynq-original-dpu-model/ ./
#Please follow the README.md in the folder to download the .xmodel file.
After KR260 installation, the IP address is confirmed using ifconfig; in my case, it was 192.168.11.9.
Use the Kria-PYNQ environment via Jupyter Notebook to control the PWM. Connect to the KR260 board using a LAN cable and find the IP address using ifconfig. Access Jupyter Notebook at http://<IP_ADDRESS>:9090/.
ipynb File
The test .bit, .hwh, .xclbin, and .ipynb (my-dpu-gpio-test.ipynb) files are available on GitHub.
We have also provided the .xmodel (kr260_yolov3_tf2.xmodel) at the following link. This is the model used for testing the DPU (object detection).
Run the modified Jupyter notebook on PYNQ, which includes additional configurations for GPIO and PWM.
The main modifications include treating the DPU overlay as a standard PYNQ overlay and configuring GPIO (PWM) outputs based on detected objects.
Below are the key points of this program.
from pynq_dpu import DpuOverlay
# Load the custom bitstream that contains the DPU together with the GPIO/PWM IPs
overlay = DpuOverlay("/root/jupyter_notebooks/pynq-original-dpu-model/dpu.bit")
from pynq import Overlay
from pynq.lib import AxiGPIO
# Reuse the DPU overlay as a standard PYNQ overlay to reach the other IPs
ol = overlay
# LED(GPIO) setup: channel 1 of axi_gpio_0 drives the LED outputs
gpio_0_ip = ol.ip_dict['axi_gpio_0']
gpio_out = AxiGPIO(gpio_0_ip).channel1
mask = 0xffffffff
Test Results
Here is the test video demonstrating the setup.
We have successfully performed object detection (Yolo) using the DPU.
Additionally, we have managed to integrate and operate GPIO (PWM) alongside the DPU.
If the specified objects (Bus, Ball) are detected in the photos within the img folder, specific GPIO and PWM (LED) outputs are triggered.
For program details and specifications, please refer to the Subproject below.
We tried controlling the RICOH THETA V 360° camera from the KR260 using PYNQ.
In this project, we will introduce the installation method and provide examples of how to execute the control commands.
Introduction
We successfully controlled the RICOH THETA V 360° camera from the KR260 via USB.
Here are the photos taken by the 360° camera:
Installation Steps
The 360° camera used is the RICOH THETA V. It is a user-friendly 360° camera with various APIs and libraries available.
Connecting the THETA (360° camera) to KR260
Connect the THETA (360° camera) to the KR260 using a USB cable.
Checking the logs with dmesg shows the successful connection.
ubuntu@kria:~$ sudo su
root@kria:/home/ubuntu# dmesg | grep usb
[ 8.214013] usb 1-1.2: New USB device found, idVendor=05ca, idProduct=0368, bcdDevice= 1.00
[ 8.222380] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 8.229697] usb 1-1.2: Product: RICOH THETA V
[ 8.234055] usb 1-1.2: Manufacturer: Ricoh Company, Ltd.
[ 8.239367] usb 1-1.2: SerialNumber: 00119628
Building and Installing the Library
Follow the steps below to build and install the library from GitHub.
https://github.com/codetricity/libptp2-theta
sudo apt install build-essential libtool automake pkg-config subversion libusb-dev
git clone https://github.com/codetricity/libptp2-theta
cd libptp2-theta/
./configure
make
autoreconf -i
./configure
automake
make
sudo make install
sudo ldconfig -v
Run theta Test
Using the Kria-PYNQ Jupyter Notebook, you can control the camera from Python.
This example shows one possible method, but various operations are possible using the official RICOH USB API information.
https://github.com/ricohapi/theta-api-specs/tree/main/theta-usb-api
Below is the test .ipynb file. It has been uploaded to GitHub.
Below is an example of copying to the jupyter_notebooks directory on the KR260.
sudo su
cd $PYNQ_JUPYTER_NOTEBOOKS
cd jupyter_notebooks/
ls
cp -rf /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/jupyter_notebooks/theta-check/ ./
Below is the test video.
You can see that the 360° camera is controlled from an .ipynb file.
Here are some examples of commands:
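The snippets below call the theta command-line tool through getoutput from Python's standard subprocess module (and time is used for the capture timing), so the notebook needs imports along these lines:
import time
from subprocess import getoutput   # runs the "theta" CLI and returns its text output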
Wake up the THETA from power-saving mode; the camera's LED will turn blue.
theta_wakeup = getoutput('theta --set-property=0xD80E --val=0x00')
print(theta_wakeup)
Camera: RICOH THETA V
'UNKNOWN' is set to: 0
Changing property value to 0x00 [(null)] succeeded.
Switch to camera shooting mode. The camera icon will light up blue.
theta_camera_mode = getoutput('theta --set-property=0x5013 --val=0x0001')
print(theta_camera_mode)
Camera: RICOH THETA V
'Still Capture Mode' is set to: [Normal]
Changing property value to 0x0001 [(null)] succeeded.
All that's left is to take the photo (capture).
time1 = time.time()
theta_capture = getoutput('theta --capture')
time2 = time.time()
capture_time = time2 - time1
print("Performance: {} (s)".format(capture_time))
print(theta_capture)
Performance: 3.308269500732422 (s)
Initiating capture...
Object added 0x000000f2
Capture completed successfully!
5.6 GStreamer + OpenCV with 360° Camera
For program details and specifications, please refer to the Subproject below.
6. GStreamer + OpenCV with 360° Camera
In this project, we'll walk you through how we achieved real-time image processing using a KR260 and a 360° camera (RICOH THETA).
We'll cover the installation methods and the programs used.
Introduction
We connected the pipeline using GStreamer and processed it with OpenCV.
This setup allows us to obtain 360° live streaming data via USB from KR260.
The following image was captured in real-time from the RICOH THETA on the KR260.
Installation Methods and Programs
libuvc
We use libuvc to obtain video streams from the USB camera (RICOH THETA).
You need to download and install the necessary UVC (USB Video Class) libraries.
Refer to the following GitHub repository:
https://github.com/nickel110/libuvc.git
sudo apt update
sudo apt install cmake
sudo apt install libusb-1.0-0-dev
sudo su
git clone https://github.com/nickel110/libuvc.git
cd libuvc/
mkdir build
cd build/
cmake ..
make && sudo make install
GStreamer
We use GStreamer pipelines to encode the video stream.
Refer to the official GStreamer documentation:
https://gstreamer.freedesktop.org/documentation/installing/on-linux.html?gi-language=c
Install GStreamer with the following command:
sudo apt install libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev libgstreamer-plugins-bad1.0-dev gstreamer1.0-plugins-base gstreamer1.0-plugins-good gstreamer1.0-plugins-bad gstreamer1.0-plugins-ugly gstreamer1.0-libav gstreamer1.0-tools gstreamer1.0-x gstreamer1.0-alsa gstreamer1.0-gl gstreamer1.0-gtk3 gstreamer1.0-qt5 gstreamer1.0-pulseaudio
v4l2loopback-dkms
Installing v4l2loopback-dkms creates virtual video devices.
sudo apt install v4l2loopback-dkms
gstthetauvc
Download and install the GStreamer THETA UVC plugin (gstthetauvc) for RICOH THETA. Refer to the following GitHub repository:
https://github.com/nickel110/gstthetauvc.git
git clone https://github.com/nickel110/gstthetauvc.git
cd gstthetauvc/thetauvc/
make
Move the created gstthetauvc.so file to the GStreamer plugins folder. Locate gstreamer-1.0 and copy the file:
sudo find / -type d -name 'gstreamer-1.0'
ls /usr/lib/aarch64-linux-gnu/gstreamer-1.0
sudo cp gstthetauvc.so /usr/lib/aarch64-linux-gnu/gstreamer-1.0
ls /usr/lib/aarch64-linux-gnu/gstreamer-1.0
Update the library links and cache:
sudo /sbin/ldconfig -v
Check if the gstthetauvc plugin is available:
gst-inspect-1.0 thetauvcsrc
GStreamer with RICOH THETA
In this subproject, we ran the tests in a regular root environment rather than the PYNQ virtual environment.
We'll demonstrate how to perform image processing using the 360° camera (RICOH THETA) with GStreamer and OpenCV.
We will process 2K (1920x960) 360° video streams using GStreamer.
First, we capture 360° images directly from the RICOH THETA on the KR260.
This setup allows for real-time 360° live streaming on the screen.
The program is written in Python and is available on GitHub.
https://github.com/iotengineer22/AMD-Pervasive-AI-Developer-Contest/blob/main/src/gst-test/
We tested the setup by throwing a ball around the 360° camera.
sudo su
cd /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/src/gst-test/
python3 gst-test-360-no-divide.py
The test video is below:
The KR260 captures the 360° video stream from the RICOH THETA in real-time.
The pipeline configuration uses OpenCV only for display purposes with imshow. We capture and display 2K (1920x960) image data.
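For reference, a minimal sketch of this capture-and-display loop is shown below. The pipeline string is a commonly used form for the gstthetauvc plugin (mode=2K gives the 1920x960 stream); the exact pipeline in gst-test-360-no-divide.py may differ slightly.
import cv2

pipeline = (
    "thetauvcsrc mode=2K ! decodebin ! autovideoconvert ! "
    "video/x-raw,format=BGRx ! queue ! videoconvert ! "
    "video/x-raw,format=BGR ! queue ! appsink"
)
cap = cv2.VideoCapture(pipeline, cv2.CAP_GSTREAMER)

while cap.isOpened():
    ok, frame = cap.read()            # 1920x960 BGR frame from the THETA
    if not ok:
        break
    cv2.imshow("360 live", frame)     # OpenCV is used only to display the frame
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()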
We tested by throwing a ball around the 360° camera. Viewing the image from above provides a clear perspective.
For program details and specifications, please refer to the Subproject below.
7. 360 Live Streaming + Object Detect(DPU)
We conducted real-time object detection on 360 live streaming image data.
In this subproject, we'll introduce how we used the KR260 and PYNQ-DPU for object detection, and how we controlled GPIO and PWM.
Introduction
We conducted real-time object detection on 360 live streaming image data using the RICOH THETA 360° camera.
Initially, we performed object detection (Yolo) using the DPU of KR260.
GStreamer Not Included in PYNQ's OpenCV
To run the DPU, it was necessary to execute it in PYNQ's virtual environment.
However, upon checking the build information of PYNQ's OpenCV, we found that GStreamer was not included.
Here is the check program for my environment:
(pynq-venv) root@kria:/home/ubuntu/gst-dpu-test# python3 gst-back-check.py
GStreamer:
GStreamer: NO
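The check itself only needs to print the GStreamer line from OpenCV's build information; a likely minimal version of gst-back-check.py is:
import cv2

# Print the GStreamer entry of OpenCV's build info ("YES (version)" or "NO")
for line in cv2.getBuildInformation().splitlines():
    if "GStreamer" in line:
        print(line.strip())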
GStreamer is essential for processing 360° video streams in real-time.
Therefore, I uninstalled opencv-python within the PYNQ virtual environment:
sudo su
source /etc/profile.d/pynq_venv.sh
pip uninstall opencv-python
After uninstalling it, the PYNQ environment falls back to the KR260's system (Ubuntu) OpenCV, and GStreamer is now available:
(pynq-venv) root@kria:/home/ubuntu/gst-dpu-test# python3 gst-back-check.py
GStreamer:
GStreamer: YES (1.19.90)
360° Object Detect(DPU)
The test .bit, .hwh, .xclbin, and .py (app_gst-real-360-yolov3_tf2.py) files are available on GitHub.
We have also provided the .xmodel (kr260_yolov3_tf2.xmodel) at the following link. This is the model used for testing the DPU (object detection).
https://github.com/iotengineer22/AMD-Pervasive-AI-Developer-Contest/blob/main/src/gst-dpu-test/
sudo su
source /etc/profile.d/pynq_venv.sh
cd /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/src/gst-dpu-test/
python3 app_gst-real-360-yolov3_tf2.py
#Please follow the README.md in the folder to download the .xmodel file.
Here is the test video:
Initially, the Python program splits the 2K (1920x960) 360° image data into four 480x480 images.
This is because applying a standard YOLO object detection model to a very wide 360° image makes detection difficult.
By dividing and cropping into four 480x480 sections (or two 960x960 sections), object detection can be performed effectively in each section.
360° Object Detect(DPU) + GPIO
We used the KR260 and PYNQ-DPU for object detection and controlled LEDs (GPIO). To verify GPIO functionality, we connected a debug (LED/SW) PCB to the PMOD connector.
The test .py file (app_gst-real-360-yolov3_tf2.py) is the same as before.
python3 app_gst-real-360-yolov3_tf2.py
Here is the test video:
When a ball is detected in the 360° image, specific sections light up the red, green, or blue LEDs, as sketched below.
When a ball is detected in section 1 (front side), all LEDs turn on.
If detected in section 2 (right side), the red LED turns on.
If detected in section 3 (back side), the green LED turns on.
If detected in section 4 (left side), the blue LED turns on.
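A minimal sketch of this section-to-LED mapping follows. The bit positions assumed for the red, green, and blue LEDs are illustrative; they depend on how the AXI-GPIO outputs are wired to the debug PCB in the bitstream.
RED, GREEN, BLUE = 1 << 2, 1 << 3, 1 << 4    # assumed GPIO output bits for the LEDs
SECTION_TO_LEDS = {
    1: RED | GREEN | BLUE,   # section 1 (front): all LEDs on
    2: RED,                  # section 2 (right): red LED
    3: GREEN,                # section 3 (back): green LED
    4: BLUE,                 # section 4 (left): blue LED
}

def update_leds(detected_sections, gpio_out, mask=0xffffffff):
    # gpio_out is an AxiGPIO channel, set up as in the earlier PYNQ snippets
    value = 0
    for section in detected_sections:
        value |= SECTION_TO_LEDS.get(section, 0)
    gpio_out.write(value, mask)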
For program details and specifications, please refer to the Subproject below.
8. ROS2 3D Marker from 360 Live Streaming
We experimented with processing 360° camera images using ROS2 Rviz2.
We managed real-time object detection from 360° live streaming data.
In this project, we'll introduce the program and share test videos.
Introduction
Using ROS2's Rviz2 and KR260, we processed 360° camera images.
We used real-time object detection data from 360° live streaming.
We placed the detected objects' bounding boxes as markers in Rviz2.
Installing ROS2 Rviz2
Rviz2 is a 3D visualization tool for ROS2 (Robot Operating System 2).
This time, we'll display object detection data from 360° live streaming images in Rviz2.
Install it with the following steps:
sudo apt update
sudo apt install ros-humble-rviz2
sudo su
source /opt/ros/humble/setup.bash
rviz2
Installing OpenCV Related Packages for ROS2
We are acquiring and processing images from the 360° camera via OpenCV.
To use it with ROS2, we'll also install related libraries.
sudo apt install ros-humble-image-transport
sudo apt install ros-humble-cv-bridge
Operating Rviz2
Start Rviz2 with the following command:
sudo su
source /opt/ros/humble/setup.bash
rviz2
ROS2 3D Marker from 360 Live Streaming
We conducted real-time object detection on 360 live streaming image data using the RICOH THETA 360° camera and DPU.
Although it's not related to the current subproject, we will go ahead and install the game controller library below.
sudo su
source /etc/profile.d/pynq_venv.sh
pip install inputs
The test .bit, .hwh, .xclbin, and .py (gst-ros2-360-detect-car.py) files are available on GitHub.
We have also provided the .xmodel (kr260_yolov3_tf2.xmodel) at the following link. This is the model used for testing the DPU (object detection).
https://github.com/iotengineer22/AMD-Pervasive-AI-Developer-Contest/tree/main/src/gst-ros2
sudo su
source /etc/profile.d/pynq_venv.sh
source /opt/ros/humble/setup.bash
cd /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/src/gst-ros2/
python3 gst-ros2-360-detect-car.py
#Please follow the README.md in the folder to download the .xmodel file.
Here is the test video:
We display images and markers published from KR260 using ROS2's Rviz2.
The KR260 splits a 360° image into four sections and performs object detection and marker output in each section.
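A minimal rclpy sketch of publishing one detection as an Rviz2 marker is shown below. The topic name, frame_id, and the scaling of bounding-box coordinates to metres are assumptions for illustration, not the exact values used in gst-ros2-360-detect-car.py.
import rclpy
from rclpy.node import Node
from visualization_msgs.msg import Marker

class DetectionMarkerPublisher(Node):
    def __init__(self):
        super().__init__('detection_marker_publisher')
        self.pub = self.create_publisher(Marker, 'detection_marker', 10)

    def publish_box(self, marker_id, x, y):
        m = Marker()
        m.header.frame_id = 'map'                   # frame shown in Rviz2
        m.header.stamp = self.get_clock().now().to_msg()
        m.id = marker_id
        m.type = Marker.CUBE
        m.action = Marker.ADD
        m.pose.position.x = x                       # position derived from the bounding box
        m.pose.position.y = y
        m.scale.x = m.scale.y = m.scale.z = 0.1
        m.color.r, m.color.a = 1.0, 1.0
        self.pub.publish(m)

rclpy.init()
node = DetectionMarkerPublisher()
node.publish_box(0, 1.0, 0.5)                       # e.g. one detected ball in front
node.destroy_node()
rclpy.shutdown()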
We are conducting tests to pick up and transport balls around the 360° camera. The real-time changes, including the ROS2 markers, can be observed.
For program details and specifications, please refer to the Subproject below.
9. Control 360° Object Detection Robot Car
We control the 360° Object Detection Robot Car with the KR260.
The object detection is performed using DPU, and marker output is executed with ROS2 while the Robot Car is in motion.
Debugging Robot actuators
We used a game controller to move the KR260 robot. Using a wireless game controller enables remote control.
We used an ELECOM Wireless Gamepad JC-U4113SBK, which is designed for PC but worked seamlessly with the KR260.
The KR260 is controlled using PYNQ, which means using Python for control. The game controller library inputs is used because it can operate without a display or GUI (a short reading-loop sketch follows the install commands). Install it with:
sudo su
source /etc/profile.d/pynq_venv.sh
pip install inputs
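The reading loop is short; this sketch prints the raw events and is where the stick and button values would be mapped to the wheel and arm PWM. The event codes (ABS_Y for the left stick, BTN_SOUTH for a face button) are the usual Linux codes but may differ for your controller.
from inputs import get_gamepad

while True:
    for event in get_gamepad():                       # blocks until events arrive
        if event.ev_type == 'Absolute' and event.code == 'ABS_Y':
            print("left stick:", event.state)         # map to wheel-motor PWM duty
        elif event.ev_type == 'Key' and event.code == 'BTN_SOUTH':
            print("button:", event.state)             # map to the arm motor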
Debug Test 1
The debug test .bit, .hwh, .xclbin, and .ipynb (controller-pwm-gpio-test.ipynb) files are available on GitHub.
The test program for this run should already be copied when executing the subproject below.
4. Implementation DPU, GPIO, and PWM
Here is one of the debug test videos below.
Currently, we are not moving the robot car; this is just actuator debugging.
By moving the stick up and down, the robot wheel motors are controlled via PWM.
It is also evident that pressing the buttons controls the DC motor of the robot arm.
Debug Test 2
Here is another debug test video, this time with the robot car actually moving.
The robot car drives its wheels to transport the ball. Additionally, the arm is activated again to lift and lower the ball.
360° Object Detect Test
We conducted real-time object detection on 360 live streaming image data using the RICOH THETA 360° camera and KR260 DPU.
The test .bit, .hwh, .xclbin, and .py (gst-ros2-360-detect-car.py) files are available on GitHub.
We have also provided the .xmodel (kr260_yolov3_tf2.xmodel) at the following link. This is the model used for testing the DPU (object detection).
https://github.com/iotengineer22/AMD-Pervasive-AI-Developer-Contest/tree/main/src/gst-ros2
sudo su
source /etc/profile.d/pynq_venv.sh
source /opt/ros/humble/setup.bash
cd /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/src/gst-ros2/
python3 gst-ros2-360-detect-car.py
#Please follow the README.md in the folder to download the .xmodel file.
Start Rviz2 with the following command:
sudo su
source /opt/ros/humble/setup.bash
rviz2
Here is the test video:
This involves performing object detection with the DPU and executing marker output with ROS2 while the robot car is in motion.
We display images published from KR260 using ROS2's Rviz2.
The KR260 splits a 360° image into four sections and performs object detection in each section.
ROS2 Marker Output Test
The next test uses the program below.
python3 gst-ros2-360-2divide.py
The program content is almost the same as before, but this time it includes a demonstration with ROS2 marker output.
Here is the test video:
In this test, the KR260 splits the 360° image into two sections, front and back.
During the test, when objects are moved, we can see the marker outputs changing in real-time.
After the object in front of the camera is picked up, both the marker and the image detection ultimately show only the ball in front.
Additionally, the operation of lifting the ball is also confirmed.
Robot Action Test
The next test uses the same program as in Test 1.
python3 gst-ros2-360-detect-car.py
However, the KR260 unit and the robot body are separated; this is made possible by extending the wiring from the motor driver to the motors.
By reducing the weight of the robot body, even smoother movements are possible.
Here is the test video:
It can be confirmed that the robot car moves smoothly forward, backward, and rotates, as well as operates its arm.
Human Detect Test
The next test uses the program below.
python3 gst-ros2-360-human-trace.py
This program detects a person (or a person's hand) and automatically rotates and moves forward in that direction.
Here is the test video:
Initially, the program detects a person's hand behind or beside the camera, causing the robot to rotate automatically.
Finally, the program detects a person's hand in front of the camera, prompting the robot to move forward.
For program details and specifications, please refer to the Subproject below.
10. Improve Object Detection Speed with YOLOX
In this Subproject, we conducted object detection using the KR260's DPU with the lightweight model "YOLOX-nano" and PyTorch.
We created a program (.ipynb and .py) that runs on PYNQ and confirmed its operation, and compared the detection speed with the old YOLOv3 program.
Below is the test video with the execution and speed comparison. We confirmed that the new YOLOX program is approximately five times faster.
Creating the YOLOX-nano Model with Vitis AI
First, we created (compiled) the YOLOX model for KR260 in a Linux environment.
We downloaded and extracted the sample model of YOLOX.
wget https://www.xilinx.com/bin/public/openDownload?filename=pt_yolox-nano_3.5.zip
unzip openDownload?filename=pt_yolox-nano_3.5.zip
Compilation with Vitis AI
We used Vitis AI for compilation, launching the CPU version of Vitis AI for PyTorch.
cd Vitis-AI/
./docker_run.sh xilinx/vitis-ai-pytorch-cpu:latest
Using the arch.json created earlier as an argument, we compiled the model.
The .xmodel file is created in the folder after compilation.
cd pt_yolox-nano_3.5/
conda activate vitis-ai-pytorch
echo '{' > arch.json
echo ' "fingerprint": "0x101000016010407"' >> arch.json
echo '}' >> arch.json
vai_c_xir -x quantized/YOLOX_0_int.xmodel -a arch.json -n yolox_nano_pt -o ./yolox_nano_pt
This time, it's an example of the fingerprint of B4096 on KR260.
If you want to try different architectures like B512 or B1024 and need to check the file where the fingerprint is written (arch.json), it is located in the following folder when you synthesize the DPU with Vitis:
~/***_hw_link/Hardware/dpu.build/link/vivado/vpl/prj/prj.gen/sources_1/bd/design_1/ip/design_1_DPUCZDX8G_1_0/arch.json
Program Creation on PYNQ
We created a program that runs on PYNQ, in .ipynb and .py format.
Since the algorithm differs from the YOLOv3 sample program, some modifications were necessary. The actual program can be found on the following GitHub repository:
Testing on KR260
Below is an example of copying to the jupyter_notebooks directory on the KR260.
sudo su
cd $PYNQ_JUPYTER_NOTEBOOKS
cd jupyter_notebooks/
ls
cp -rf /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/jupyter_notebooks/pynq-yolox/ ./
We opened the .ipynb file (dpu_yolox-nano_pt_coco2017.ipynb) on the KR260 via a web browser.
We checked the input/output tensors of the model converted with PyTorch, ensuring (1, 416, 416, 3) → ((1, 52, 52, 85) (1, 26, 26, 85) (1, 13, 13, 85)).
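The shapes can be checked in a few lines with the VART runner exposed by DPU-PYNQ; the bitstream and .xmodel paths below are assumptions for illustration.
from pynq_dpu import DpuOverlay

overlay = DpuOverlay("dpu.bit")
overlay.load_model("yolox_nano_pt.xmodel")
dpu = overlay.runner

print([tuple(t.dims) for t in dpu.get_input_tensors()])   # [(1, 416, 416, 3)]
print([tuple(t.dims) for t in dpu.get_output_tensors()])  # the three YOLOX output maps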
The YOLOX model detected 80 categories of COCO objects.
YOLOX-nano and YOLOv3-tiny Detection Speed Comparison
We compared the detection speed between YOLOX-nano and the old YOLOv3-tiny.
The execution environment was the same DPU (B4096).
- DPU execution time: 0.1168 s → 0.0154 s (approximately 1/8 of the detection time)
- CPU post-processing time: 0.1303 s → 0.0303 s (approximately 1/4 of the detection time)
- Total processing: 3.30 fps → 18.6 fps (approximately 5 times the speed)
YOLOX-nano Results
Details of detected objects: [49, 60]
Pre-processing time: 0.0080 seconds
DPU execution time: 0.0154 seconds
Post-process time: 0.0303 seconds
Total run time: 0.0537 seconds
Performance: 18.63 FPS
(array([[ 458.1155, 125.8079, 821.8845, 489.5768],
[ 40.2464, 0. , 1239.7537, 720. ]]),
array([0.5618, 0.1179]),
array([49, 60]))
YOLOv3-tiny Results
Details of detected objects: [49, 60]
Pre-processing time: 0.0560 seconds
DPU execution time: 0.1168 seconds
Post-process time: 0.1303 seconds
Total run time: 0.3030 seconds
Performance: 3.30 FPS
(array([[ 157.7307, 455.4164, 434.6538, 812.3395],
[ 49.6795, 66.1538, 658.0765, 1213.8462]], dtype=float32),
array([0.2461, 0.7143], dtype=float32),
array([49, 60], dtype=int32))
Applying YOLOX to 360° Object Detection
We also applied YOLOX to 360° live-streaming object detection. The actual program can be found on the following GitHub repository:
sudo su
source /etc/profile.d/pynq_venv.sh
cd /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/src/yolox-test/
python3 app_gst-yolox-real-360-2divide.py
Below is the test video.
The 360° camera (RICOH THETA V) used is an older USB 2.0 model, so plain live streaming runs at about 6 fps.
We split the 1920x960 image into two 960x960 images for display.
Implementing object detection with the slower YOLOv3 reduced the frame rate to about 1.5 fps.
Changing to YOLOX improved it to about 3.5 fps. (Further optimization might bring it closer to 6 fps.)
By applying YOLOX, we were also able to speed up object detection in 360° live streaming.
For program details and specifications, please refer to the Subproject below.
11. Benchmark Architectures of the DPU
In this Subproject, we ran object detection on the KR260's DPU using the lightweight "YOLOX-nano" PyTorch model.
We measured the object detection speed across various architectures of the DPU (DPUCZDX8G).
The tests primarily used a 150MHz clock, with some checks at 300MHz.
Generally, larger sizes resulted in shorter inference times.
Creating Models for Each Architecture
Refer to the article below for details on how to create the files (".bit", ".xclbin", ".hwh") for each DPU architecture.
4. Implementation DPU, GPIO, and PWM
To run YOLOX object detection, you need the following models(".xmodel"). Please refer to the following article.
10. Improve Object Detection Speed with YOLOX
Generated Files for Each Architecture
The created files for each architecture ("B512, B800, B1024, B1600, B2304, B3136, B4096") are listed below.
For B512, B3136, and B4096, files were also generated with the DPU clock set to 300MHz instead of 150MHz.
YOLOX Inference Results for Each Architecture
The inference time for object detection on a single image using YOLOX was measured. (Testing object detection with an orange ball on a table.)
The times below exclude preprocessing and postprocessing.
# Fetch data to DPU and trigger it
dpu_start = time.time()
job_id = dpu.execute_async(input_data, output_data)
dpu.wait(job_id)
dpu_end = time.time()
The actual program can be found on the following GitHub repository:
Testing on KR260
Below is an example of copying to the jupyter_notebooks directory on the KR260.
sudo su
cd $PYNQ_JUPYTER_NOTEBOOKS
cd jupyter_notebooks/
ls
cp -rf /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/jupyter_notebooks/pynq-benchmark/ ./
We opened the .ipynb file (benchmark_dpu_yolox-nano_pt_coco2017.ipynb) on the KR260 via a web browser.
Summary of Key Findings
- Larger architecture sizes generally result in faster performance.
- Increasing the DPU clock speed also generally results in faster performance.
For B512, increasing the clock speed from 150MHz to 300MHz significantly improved the execution speed. For B3136 and B4096, the impact was less noticeable compared to B512.
In conclusion, the B4096 at 300MHz is the fastest for object detection with YOLOX. Therefore, we are implementing the B4096 at 300MHz in this main project as well.
For program details and specifications, please refer to the Subproject below.
12. Power Consumption of 360° Object Detection Robot Car
In this Subproject, we measured the power consumption of the robot car with the KR260.
When trying to power the KR260 from a mobile battery, a power shortage occurred during program startup.
Initially, our plan was to operate the robot equipped with the KR260 from a battery, because we wanted to run the robot without being hindered by power cables.
We used a commercially available PD-compatible mobile battery (20W).
Here is a test video showing the actual power shortage:
Operating KR260 with a PD20W Mobile Battery
We prepared a PD-compatible mobile battery capable of 12V output. It’s a Philips DLP7721C, with the following specs:
- Capacity: 20000mAh/3.7V
Output:
- USB-A1/A2: DC 5V/3A, 9V/2A, 12V/1.5A
- USB-C: DC 5V/3A, 9V/2.2A, 12V/1.67A
Since lightweight mobile batteries with outputs above 20W are quite expensive, we opted for the 20W version (as of 2024).
We also purchased a general-purpose current checker and a cable compatible with PD12V output to input 12V power into the KR260’s DC jack.
Power Consumption During Idle
The KR260 startup worked fine with the mobile battery. Linux booted normally, and during idle after plugging in the DC jack, the consumption was about 8.4W (12V_700mA).
Power Consumption with Device Connection + GUI
With a USB keyboard, mouse, and display connected, the consumption was about 10.2W (12V_850mA).
Power Consumption with 360° Camera Connection
Connecting a 360° camera resulted in a power consumption of about 12.0W (12V_1000mA).
Power Consumption During 360° Object Detection
Running the DPU for 360° object detection, the power consumption was about 16.2W (12V_1350mA).
Up to this point, the 20W mobile battery could still handle the operation.
Motor Power Consumption
The robot uses three DC motors: one for arm control and two for robot car control.
The motors use 5V power, controlled by PWM from the KR260, and were estimated at 100% operation for measurement purposes.
- Arm DC Motor: about 2.5W (5V_500mA)
- Robot Car DC Motor: about 4W each (5V_800mA)
Thus, with all motors running, the estimated power consumption was 2.5W + 4W*2 = 10.5W (though it's rare for all three to operate simultaneously).
Testing Power Deficit with KR260
Adding up the total power consumption, it reached 26.7W, exceeding the 20W capacity of the battery. Hence, the power shortage and subsequent reboot during program startup were expected.
No Issues with the 36W AC Adapter
The 36W (12V_3A) AC adapter included in the KR260 starter kit had no issues, comfortably handling the estimated 26.7W power consumption.
Therefore, for the final robot car demonstration, we had to use the AC adapter.
5.13 Application to Vitis AI ONNX Runtime Engine (VOE)
For program details and specifications, please refer to the Subproject below.
13. Application to Vitis AI ONNX Runtime Engine (VOE)
Note:
In this subproject, we will conduct tests in a different environment from the main project as part of the benchmarking process.
This subproject involved setting up a dedicated ONNX environment on the KR260. We will also introduce the use of the Vitis AI ONNX Runtime Engine (VOE).
Using ONNX Runtime makes Vitis AI even more user-friendly. This time, we are testing the comparison between CPU and DPU using ONNX.
Vitis AI 3.5 ONNX
Vitis AI 3.5 ONNX supports both C++ and Python. Refer to the official documentation here:
https://docs.amd.com/r/en-US/ug1414-vitis-ai/Programming-with-VOE
In this subproject, we will write and test YOLOX in Python code.
Building the Environment with PetaLinux from BSP
In this subproject, we used PetaLinux to create the OS environment on the KR260. Download the BSP file from the link below and build it with PetaLinux:
source /opt/petalinux/2023.1/settings.sh
petalinux-create -t project -s xilinx-kr260-starterkit-v2023.1-05080224.bsp
cd xilinx-kr260-starterkit-2023.1/
petalinux-build
petalinux-package --boot --u-boot --force
petalinux-package --wic --images-dir images/linux/ --bootfiles "ramdisk.cpio.gz.u-boot,boot.scr,Image,system.dtb,system-zynqmp-sck-kr-g-revB.dtb" --disk-name "sda"
Writing the Image to the SD Card
Write the SD card image (.wic) created with PetaLinux, found in the following folder:
~/xilinx-kr260-starterkit-2023.1/images/linux/
We use balenaEtcher to write it to the SD card.
Installing onnxruntime
The initial login name for KR260 is "petalinux". Follow the official documentation to install Vitis AI and the ONNX runtime on the KR260:
https://docs.amd.com/r/en-US/ug1414-vitis-ai/Programming-with-VOE
wget https://www.xilinx.com/bin/public/openDownload?filename=vitis_ai_2023.1-r3.5.0.tar.gz
sudo tar -xzvf openDownload\?filename\=vitis_ai_2023.1-r3.5.0.tar.gz -C /
ls
wget https://www.xilinx.com/bin/public/openDownload?filename=voe-0.1.0-py3-none-any.whl -O voe-0.1.0-py3-none-any.whl
pip3 install voe*.whl
wget https://www.xilinx.com/bin/public/openDownload?filename=onnxruntime_vitisai-1.16.0-py3-none-any.whl -O onnxruntime_vitisai-1.16.0-py3-none-any.whl
pip3 install onnxruntime_vitisai*.whl
It seems that XRT is not installed by default, so we added it with the following:
sudo dnf install xrt packagegroup-petalinux-opencv
Creating the DPU Environment with PetaLinux
In Ubuntu, the DPU environment of the FPGA could be overlaid via the PYNQ-DPU library.
In PetaLinux, we used xmutil to prepare the DPU environment.
We prepared the necessary files, including those needed for xmutil, and placed them on GitHub.
https://github.com/iotengineer22/AMD-Pervasive-AI-Developer-Contest/tree/main/src/onnx-test
This Creation Process involves very lengthy steps. Therefore, the details are omitted here; please refer to the Subproject for more information.
Preparing the ONNX YOLOX Model
This time, we used the models provided by Vitis AI. Pre-trained and quantized models are provided as samples by Xilinx (AMD), converted from PyTorch models to ONNX:
https://github.com/Xilinx/Vitis-AI/tree/master/model_zoo/model-list/pt_yolox-nano_3.5
Download and unzip the YOLOX sample model:
wget https://www.xilinx.com/bin/public/openDownload?filename=pt_yolox-nano_3.5.zip
unzip openDownload\?filename\=pt_yolox-nano_3.5.zip
The "yolox_nano_onnx_pt.onnx" file is in the "quantized" folder. Transfer this file to the KR260 without further compilation.
Python Program for ONNX YOLOX (.py)
The actual program we ran is available on GitHub below.
This program is executed on the KR260.
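The core of such a program is just an ONNX Runtime session pointed at the Vitis AI execution provider (mirroring the CPU variant shown later in this section); a minimal sketch, with the model path assumed to be in the working directory:
import onnxruntime

session = onnxruntime.InferenceSession(
    'yolox_nano_onnx_pt.onnx',
    providers=["VitisAIExecutionProvider"],
    provider_options=[{"config_file": "/usr/bin/vaip_config.json"}])

# Inspect what the model expects before wiring up the YOLOX pre/post-processing
print([(i.name, i.shape) for i in session.get_inputs()])
print([(o.name, o.shape) for o in session.get_outputs()])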
Testing ONNX on KR260
We transfer the files created for the KR260 to the board.
First, we configure the DPU so that it can be loaded with xmutil.
We created an application called b4096_300m.
ls /lib/firmware/xilinx/
sudo mkdir /lib/firmware/xilinx/b4096_300m
sudo cp pl.dtbo shell.json /lib/firmware/xilinx/b4096_300m/
sudo cp dpu.xclbin /lib/firmware/xilinx/b4096_300m/binary_container_1.bin
ls /lib/firmware/xilinx/b4096_300m/
We also replaced the existing vart.conf with the newly created one.
sudo mv /etc/vart.conf /etc/old_vart.conf
sudo cp vart.conf /etc/
sudo reboot
From here, the steps follow the flow presented in the demo video.
First, load the DPU application (b4096_300m).
xilinx-kr260-starterkit-20231:~$ sudo xmutil listapps
xilinx-kr260-starterkit-20231:~$ sudo xmutil unloadapp
xilinx-kr260-starterkit-20231:~$ sudo xmutil loadapp b4096_300m
xilinx-kr260-starterkit-20231:~$ sudo xmutil listapps
We run a Python program(onnx-yolox.py) within the onnx-test directory on the KR260.
xilinx-kr260-starterkit-20231:~$ cd onnx-test/
xilinx-kr260-starterkit-20231:~/onnx-test$ python3 onnx-yolox.py
When the program runs, it begins compiling to match the DPU specifications.
Note: The initial compilation takes a few minutes. Subsequent runs will skip the compilation step, making the process faster.
During the compilation, the log shows that the program reads the loaded DPU and compiles in DPU mode:
Compile mode: dpu
Debug mode: performance
Target architecture: DPUCZDX8G_ISA1_B4096_0101000016010407
Graph name: torch_jit, with op num: 815
Begin to compile...
Once the compilation is complete, the Python program will execute.
We're using YOLOX for object detection on a single image.
The results of the image recognition showed that the orange ball was successfully detected without any issues.
bboxes of detected objects: [[ 473.17449951 137.78985596 812.97937012 477.59475708]
[ 0. 5.46184874 1280. 720. ]]
scores of detected objects: [0.73033565 0.20149007]
Details of detected objects: [49. 60.]
Pre-processing time: 0.0108 seconds
DPU execution time: 0.0129 seconds
Post-process time: 0.0360 seconds
Total run time: 0.0597 seconds
Performance: 16.740788045213616 FPS
Comparison with YOLOv3 and YOLOX
Although not a precise comparison, we will compare the content and speed tested in an Ubuntu environment as described in the article below.
10. Improve Object Detection Speed with YOLOX
We conducted similar tests with TensorFlow2's YOLOv3 and PyTorch's YOLOX.
When comparing the pre-processing, DPU inference, and post-processing, the results were as follows:
For the ONNX YOLOX, no particular speed optimizations were made from the original PyTorch. Both versions of YOLOX yielded almost identical results, which was as expected.
Comparing YOLOX on CPU and DPU with ONNX
With the capability to use ONNX, comparing the CPU and DPU on the KR260 becomes straightforward.
By modifying a single line in the same program, you can switch from DPU to CPU for inference:
providers=["CPUExecutionProvider"]
session = onnxruntime.InferenceSession(
'yolox_nano_onnx_pt.onnx',
# providers=["VitisAIExecutionProvider"],
providers=["CPUExecutionProvider"],
provider_options=[{"config_file":"/usr/bin/vaip_config.json"}])
The actual program we ran is available on GitHub below.
xilinx-kr260-starterkit-20231:~/onnx-test$ python3 onnx-cpu-yolox.py
Here is the test video comparing CPU and DPU with ONNX:
The DPU inference was over 20 times faster than the CPU.
When comparing the previous YOLOv3 and PyTorch YOLOX, the graph below summarizes the results.
It clearly shows how using the DPU can significantly speed up the process.
For program details and specifications, please refer to the Subproject below.
14. Appendix: Object Detection Using YOLOX with a Webcam
Many people may not have the 360° camera required for the main project.
Therefore, as a reference, we introduce a subproject using a generic webcam, covering DPU control, GPIO, and ROS2 output.
Introduction
We tried object detection with a regular USB-connected webcam using the KR260.
By using the FPGA’s DPU for YOLOX inference, we can achieve fast, real-time detection.
We will introduce the program along with the test results.
Test Program for GStreamer and YOLOX with a Webcam
The program is saved in the following GitHub repository:
https://github.com/iotengineer22/AMD-Pervasive-AI-Developer-Contest/tree/main/src/usb-camera
Here is the test video:
Perform object detection using live streaming from the webcam.
sudo su
cd /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/src/usb-camera/
source /etc/profile.d/pynq_venv.sh
python3 app_gst-yolox-real-normal-camera-gpio.py
When the program starts, you can see the webcam image with added object detection.
In the actual live streaming with YOLOX, about 17 fps is achieved.
When a yellow ball (sports ball) is detected, you can see that the LED (GPIO) is turned on.
Testing Marker and Image Output with ROS2
We will test outputting Marker and Image data to ROS2 using data from a webcam.
To visualize ROS2, start rviz2.
sudo su
source /opt/ros/humble/setup.bash
rviz2
Once rviz2 is set up, start the program (gst-yolox-ros2-normal-camera.py).
sudo su
source /etc/profile.d/pynq_venv.sh
source /opt/ros/humble/setup.bash
cd /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/src/usb-camera/
python3 gst-yolox-ros2-normal-camera.py
The test video is as follows:
Using the information detected by YOLOX from the webcam, the output to ROS2 is executed at about 17 fps.
Markers and images are published to ROS2 without any issues.
This has been a great, fun challenge. Thanks to AMD and Hackster for the opportunity and the hardware.
By leveraging the KR260, we've developed a robot with 360-degree AI vision. Although we used an older, inexpensive 360° camera with USB 2.0, we successfully achieved 360° object detection and visualization using ROS2.
Potential future work includes:
- Developing AI vision with high-speed communication using a 360° camera with USB 3.0.
- Implementing high-speed transmission of 360° images using H264/H265 hardware encoders/decoders.
- Migrating CPU processing to DPU processing for further speed improvements in image recognition.
We plan to try these improvements as soon as we have the time.
Thank you for reading.