In this project, we introduce a robotic car with a 360° camera and manipulator arm, controlled by the KR260.
We will provide detailed information about the hardware and software configurations.
This project integrates advanced technologies like Object Detection, DPU, PYNQ, Vitis AI and ROS2.
A key feature is the incorporation of a 360° Camera.
By leveraging the KR260, we've developed a robot with 360-degree AI vision.
Using KR260 and PYNQ, 360° object detection is processed on the PL (FPGA).
The PL also handles PWM and GPIO for driving the Robot Car and Arm.
By integrating PL and PS, the robot achieves 360-degree AI vision.
Additionally, we use the latest Vitis AI technology to compare and accelerate object detection (YOLO).
The main output topics are as follows:
- Visualizing ROS2 3D markers from 360° Camera
- Object detection using DPU-PYNQ within 360 Live Streaming
- Control Robotic Car and Arm with PYNQ + Original PCB
- Comparison of Object Detection Acceleration Using the Latest Vitis AI
The video below summarizes the main results. It's all wrapped up in about one minute, so please take a look!
2. BOM
The BOM is composed mainly of generic parts.
This low-cost robot costs about $550 with the 360° camera (and about $200 without it).
The detailed BOM and cost list is also saved as a CSV file on GitHub.
2.1 360° Camera BOM list
We use the Ricoh Theta V as the 360° camera.
Ricoh offers a public API for developers, along with extensive API documentation and SDKs for the Theta series. This enables remote control and integration with various applications.
The Ricoh Theta V is an older model (USB 2.0), which makes it a low-cost 360° camera.
We attach the 360° camera to the robot car. It has a pre-installed 1/4-inch mounting hole for easy attachment.
- 360° Camera --- Ricoh Theta V
- Bolt to fix the 360° Camera --- 1/4-20 x 3"
2.2 Robot Car and Arm BOM list
We use TAMIYA's robot kit as the main frame.
We built additional frames to accommodate the KR260 from TAMIYA's universal plate kit; the TAMIYA kits also include the DC motors and small screws.
- Robot Main Frame --- Tamiya 70162 RC Robot Construction Set
- Robot Car-Gearbox --- Tamiya 70168 Double Gearbox
- Robot Car-Add-plate --- Tamiya 70098 Universal Plate
- Robot Car-Add-Tires --- Tamiya 70096 Off-Road Tires
2.3 Motor-Driver-PCB BOM list
This is our custom KR260-specific motor driver board. We use two boards: one for driving the robot car and one for the arm.
Details will be explained later, and the board data is available on GitHub.
- Motor-Driver-PCB --- Original-PCB
- Motor-Driver-IC(U1) --- DRV8833PWPR
- Resistor_SMD(R1) --- 0603_1608Metric_10k
- Resistor_SMD(R2, R3) --- 0603_1608Metric_0 (or short-bar)
- Capacitor_SMD(C1) --- 0603_1608Metric_2.2u
- Capacitor_SMD(C2) --- 0603_1608Metric_10u
- Capacitor_SMD(C3) --- 0603_1608Metric_0.1u
- PinHeader_2x06(J1) -- 2x06_P2.54mm_Horizontal
- PinHeader_1x02(J2, J3, J4) --- 1x02_P2.54mm_Vertical
2.4 Debug(LED/SW)-PCB BOM list
This is our custom KR260-specific debug (LED/SW) board.
Details will be explained later, and the board data is available on GitHub.
- Debug(LED/SW)-PCB --- Original-PCB
- Resistor_SMD(R1, R2, R3) --- 0603_1k
- Resistor_SMD(R4) --- 0603_10k
- LED_D3.0mm(D1) --- Red_LED
- LED_D3.0mm(D2) --- Green_LED
- LED_D3.0mm(D3) --- Blue_LED
- PinHeader_2x06(J1) --- 2x06_P2.54mm_Horizontal
2.5 Others list
Other items include only wires and spacers.
(Most of the necessary screws and nuts are included in the TAMIYA kit.)
They are used for connections from the motor to the motor driver and for connecting the motor power supply. Additionally, only M3 screws and spacers are used to secure the KR260.
3. Electrical Diagram
This is the main electrical diagram. No special power supply is needed.
These are PMOD diagrams.
We're using the 12V Main Power supply that comes with the Kria starter kit.
For the Motor Power, you can use either the 5V from the KR260 Pi connector or the external battery.
The 360° camera just needs to be connected via USB.
3.1 Motor Driver PCB circuit diagram
This is the circuit diagram for the motor driver.
It is designed to connect directly to a PMOD(J1) connector.
The motor power is supplied through the J4 connector, and the DC motors can be controlled individually via the J2 and J3 connectors.
3.2 Debug (LED/SW) PCB circuit diagram
This is the circuit diagram for the Debug (LED/SW) board.
It is designed to connect directly to a PMOD(J1) connector.
The board features 3 LEDs and 1 switch.
3.3 PMOD pin settings (.xdc)
This is the PMOD pin constraints file (.xdc).
#PMOD1 motor-driver1
set_property PACKAGE_PIN H12 [get_ports PWM_0]
set_property PACKAGE_PIN E10 [get_ports PWM_1]
set_property PACKAGE_PIN D10 [get_ports PWM_2]
set_property PACKAGE_PIN C11 [get_ports PWM_3]
set_property PACKAGE_PIN B10 [get_ports {gpio_rtl_0_tri_o[0]}]
#PMOD2 motor-driver2
set_property PACKAGE_PIN J11 [get_ports {gpio_rtl_0_tri_o[5]}]
set_property PACKAGE_PIN J10 [get_ports {gpio_rtl_0_tri_o[6]}]
set_property PACKAGE_PIN H11 [get_ports {gpio_rtl_0_tri_o[1]}]
#PMOD3 infrared sensor
set_property PACKAGE_PIN AE12 [get_ports {gpio_rtl_1_tri_i[0]}]
set_property PACKAGE_PIN AF12 [get_ports {gpio_rtl_1_tri_i[1]}]
#PMOD4 debug(led-sw)
set_property PACKAGE_PIN AC12 [get_ports {gpio_rtl_0_tri_o[2]}]
set_property PACKAGE_PIN AD12 [get_ports {gpio_rtl_0_tri_o[3]}]
set_property PACKAGE_PIN AE10 [get_ports {gpio_rtl_0_tri_o[4]}]
set_property PACKAGE_PIN AF10 [get_ports {gpio_rtl_1_tri_i[2]}]
set_property IOSTANDARD LVCMOS33 [get_ports PWM_0]
set_property IOSTANDARD LVCMOS33 [get_ports PWM_1]
set_property IOSTANDARD LVCMOS33 [get_ports PWM_2]
set_property IOSTANDARD LVCMOS33 [get_ports PWM_3]
set_property IOSTANDARD LVCMOS33 [get_ports {gpio_rtl_0_tri_o[0]}]
set_property IOSTANDARD LVCMOS33 [get_ports {gpio_rtl_0_tri_o[1]}]
set_property IOSTANDARD LVCMOS33 [get_ports {gpio_rtl_0_tri_o[2]}]
set_property IOSTANDARD LVCMOS33 [get_ports {gpio_rtl_0_tri_o[3]}]
set_property IOSTANDARD LVCMOS33 [get_ports {gpio_rtl_0_tri_o[4]}]
set_property IOSTANDARD LVCMOS33 [get_ports {gpio_rtl_0_tri_o[5]}]
set_property IOSTANDARD LVCMOS33 [get_ports {gpio_rtl_0_tri_o[6]}]
set_property IOSTANDARD LVCMOS33 [get_ports {gpio_rtl_1_tri_i[0]}]
set_property IOSTANDARD LVCMOS33 [get_ports {gpio_rtl_1_tri_i[1]}]
set_property IOSTANDARD LVCMOS33 [get_ports {gpio_rtl_1_tri_i[2]}]
4. Assembly Instructions
Assembling the Robot Car and Arm is straightforward by following the TAMIYA kit instructions.
The custom KR260-specific board data is freely available on GitHub. Soldering can be done at a hobbyist level.
4.1 Robot Car and Arm
The robot car can move forward, backward, and rotate by independently controlling the forward and reverse motion of two motors.
The gearbox increases torque with a gear ratio of 344.2:1. Assemble the gearbox according to the instructions and attach it to the frame.
The robot arm mechanism is simple. It uses a single DC motor to move the arm via a crank gearbox with a gear ratio of approximately 1543:1.
This was also assembled according to the instructions. The arm mechanism is included in the kit.
4.2 Additional frames for KR260
An additional frame was created to mount the KR260 on the robot.
It is designed for easy attachment and removal to facilitate debugging when the KR260 needs to be separated from the robot.
4.3 Mounting the 360° Camera
Drill a hole in the robot car to insert bolts for securing the 360° camera. Using a mini router in addition to nippers will make this easier.
4.4 Motor Driver PCB
The PCB data is available on GitHub. Here is the link.
https://github.com/iotengineer22/PCB-DRV8833-TEST
Zip the "drv8833-pcb" folder and send it to any PCB fabrication company (e.g., PCBGOGO).
The PCB specifications match the default settings of most fabrication companies. Just enter the dimensions and proceed with the order.
- PCB Size: 17.78mm x 40.64mm
- Layers: 2
- Material: FR-4
- FR4-TG: TG150-160
- Thickness: 1.6mm
- Min track/spacing: 6/6mil
- Minimum hole size: 0.3mm
- Solder mask: Green
- Silkscreen: White
Once the PCB arrives, solder the components according to the BOM (Bill of Materials) list.
4.5 Debug (LED/SW) PCB
The PCB data is available on GitHub. Here is the link.
https://github.com/iotengineer22/PCB-KV260-PMOD-TEST
Zip the "kv260-gpio-smd-pcb" folder and send it to any PCB fabrication company.
- PCB Size: 17.78mm x 40.64mm
- The other PCB specifications are the same as the motor driver PCB.
Once the PCB arrives, solder the components according to the BOM (Bill of Materials) list.
This project involves various technologies and has a long series of steps to completion.
To help beginners progress step by step, each step is divided into subprojects.
While this project provides detailed explanations, please refer to the links to each subproject for specific programming and software setup instructions.
Subprojects are organized into the following chapters.
2. PYNQ + PWM(DC-Motor Control)
3. Object Detection(Yolo) with DPU-PYNQ
4. Implementation DPU, GPIO, and PWM
6. GStreamer + OpenCV with 360° Camera
7. 360 Live Streaming + Object Detect(DPU)
8. ROS2 3D Marker from 360 Live Streaming
9. Control 360° Object Detection Robot Car
10. Improve Object Detection Speed with YOLOX
11. Benchmark Architectures of the DPU
12. Power Consumption of 360° Object Detection Robot Car
13. Application to Vitis AI ONNX Runtime Engine (VOE)
14. Appendix: Object Detection Using YOLOX with a Webcam
Please note that before running the above subprojects, the following setup, which is the reference for this AMD contest, is required.
https://github.com/amd/Kria-RoboticsAI
To test as described in the following projects, you can download the repository on the KR260 as follows:
cd /home/ubuntu
git clone https://github.com/iotengineer22/AMD-Pervasive-AI-Developer-Contest.git
5.1 PYNQ + GPIO (LED Blinking)
For program details and specifications, please refer to the Subproject below.
In this Subproject, we experimented with controlling GPIO on the KR260 FPGA board.
Using Python (PYNQ), we managed to perform LED output and switch input via the PMOD connector with a custom-designed board.
We'll share how we designed an original board and tested its functionality.
Introduction
The KR260, an FPGA board from AMD (Xilinx), is equipped with a PMOD connector that can also be utilized for GPIO purposes.
We created a custom board to experiment with LED output and switch input functionalities.
Run GPIO Test
Within the /root/jupyter_notebooks/ directory on the KR260, create a folder to hold the .ipynb file to be executed, alongside the .bit and .hwh files produced by Vivado.
Below is an example of copying to the jupyter_notebooks directory on the KR260.
sudo su
cd $PYNQ_JUPYTER_NOTEBOOKS
cd jupyter_notebooks/
ls
cp -rf /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/jupyter_notebooks/pynq-gpio ./
After KR260 installation, the IP address is confirmed using ifconfig; in my case, it was 192.168.11.7.
Accessing Jupyter Notebook is done by navigating to http://192.168.11.7:9090/ in a web browser (Chrome was used here).
The test .bit, .hwh, and .ipynb files are available on GitHub.
The notebook includes steps to control the AXI-GPIO IP, loaded from the Vivado-generated .bit file, utilizing PYNQ's AXI-GPIO library. Official documentation for AXI-GPIO manipulation can be found here.
Please check the test video to see it in action.
In the notebook, the AxiGPIO is imported, and the GPIO IP and channel1 are specified.
LED outputs are managed through a series of Write commands, cycling through 0x3, 0x2, 0x1, and 0x0 to toggle the LEDs on and off.
Switch input reading is also demonstrated, showing 1 when pressed and 0 when not.
A for loop is used to sequentially light up three LEDs.
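For reference, here is a minimal sketch of that notebook flow. The bitstream and AXI-GPIO instance names ("gpio-test.bit", "axi_gpio_0", "axi_gpio_1") are assumptions for illustration; use the names from your own Vivado design and the .ipynb on GitHub.
import time
from pynq import Overlay
from pynq.lib import AxiGPIO

ol = Overlay("gpio-test.bit")        # load the Vivado-generated bitstream (.bit/.hwh)

led_ip = ol.ip_dict['axi_gpio_0']    # AXI-GPIO instance wired to the LEDs (assumed name)
leds = AxiGPIO(led_ip).channel1
mask = 0xffffffff

# Toggle the LEDs by cycling through the same write values as the notebook
for value in (0x3, 0x2, 0x1, 0x0):
    leds.write(value, mask)
    time.sleep(0.5)

# Light up three LEDs one after another with a for loop
for bit in range(3):
    leds.write(1 << bit, mask)
    time.sleep(0.5)

# Read the push switch: 1 when pressed, 0 when not (assumed second instance)
sw_ip = ol.ip_dict['axi_gpio_1']
switch = AxiGPIO(sw_ip).channel1
print("switch state:", switch.read())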
For program details and specifications, please refer to the Subproject below.
2. PYNQ + PWM(DC-Motor Control)
We tested controlling PWM (Pulse Width Modulation) on the KR260 FPGA board.
Using Python (PYNQ), we output PWM signals to control a motor driver board.
We created an original board to control a DC motor and will introduce the details here.
Introduction
The KR260 FPGA board from AMD (Xilinx) has PMOD connectors, which can be used as PWM pins.
We created a motor driver board for KR260.
This subproject demonstrates the successful PWM control of a DC motor.
(And we also tried controlling an LED with PWM.)
Run PWM Test
Within the /root/jupyter_notebooks/ directory on the KR260, create a folder to hold the .ipynb file to be executed (PWM-test-PCB.ipynb), alongside the .bit and .hwh files produced by Vivado.
Below is an example of copying to the jupyter_notebooks directory on the KR260.
sudo su
cd $PYNQ_JUPYTER_NOTEBOOKS
cd jupyter_notebooks/
ls
cp -rf /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/jupyter_notebooks/pynq-pwm/ ./
After KR260 installation, the IP address is confirmed using ifconfig; in this case, it was 192.168.11.9.
Use the Kria-PYNQ environment via Jupyter Notebook to control the PWM. Connect to the KR260 board using a LAN cable and find the IP address using ifconfig. Access Jupyter Notebook at http://<IP_ADDRESS>:9090/.
ipynb File
The test .bit, .hwh, and .ipynb (PWM-test-PCB.ipynb) files are available on GitHub.
LED PWM Test
Before controlling the motor, I tested PWM control on an LED.
The test video shows the LED brightness changing with PWM values at 10%, 50%, and 99%.
Controlling DC Motor with PWM
Connect the motor driver board and DC motor, and use PWM to control them. After loading the FPGA, control the motor in both forward and reverse directions with PWM values at 10%, 50%, and 99%.
The test video below demonstrates the successful PWM control of a DC motor.
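As a rough sketch of what such a notebook does, the snippet below drives one DC motor through the DRV8833 by putting a PWM signal on one input and holding the other low (swapping the pair reverses the direction). The IP instance name and the PERIOD/DUTY register offsets are hypothetical placeholders, not the project's actual register map; take the real addresses from the Vivado address editor or from PWM-test-PCB.ipynb.
from pynq import Overlay, MMIO

ol = Overlay("pwm-test.bit")                   # assumed bitstream name
base = ol.ip_dict['pwm_0']['phys_addr']        # assumed PWM IP instance name
pwm_in1 = MMIO(base, 0x1000)                   # channel driving DRV8833 IN1
pwm_in2 = MMIO(base + 0x1000, 0x1000)          # channel driving DRV8833 IN2 (assumed offset)

PERIOD_REG, DUTY_REG = 0x00, 0x08              # hypothetical register offsets
PERIOD_COUNTS = 1000                           # counts per PWM period

def drive(duty_percent, forward=True):
    # PWM on one DRV8833 input, the other held low; swap the pair to reverse
    duty = int(PERIOD_COUNTS * duty_percent / 100)
    active, idle = (pwm_in1, pwm_in2) if forward else (pwm_in2, pwm_in1)
    for ch in (active, idle):
        ch.write(PERIOD_REG, PERIOD_COUNTS)
    active.write(DUTY_REG, duty)
    idle.write(DUTY_REG, 0)

drive(10)                  # forward at 10 %
drive(50)                  # forward at 50 %
drive(99, forward=False)   # reverse at 99 %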
For program details and specifications, please refer to the Subproject below.
3. Object Detection(Yolo) with DPU-PYNQ
We tested object detection on camera images using the KR260 and YOLOv3.
Originally, there was a sample program for PYNQ-DPU, which we modified.
Here, we will introduce the model used and the methods of modification.
Introduction
The KR260 can utilize a DPU (Deep Learning Processing Unit).
We have conducted object detection (YOLO) on 360° and normal images with the KR260.
Here is a test video using Jupyter Notebooks with DPU-PYNQ and KR260.
We also tested this in a Python program with similar results.
Object Detection(.ipynb)
In the /root/jupyter_notebooks/ directory of the KR260, copy the .ipynb file, the COCO2017 list, and the JPEG files into the existing pynq-dpu folder.
We have also provided the .xmodel (kr260_yolov3_tf2.xmodel) at the following link. This is the model used for testing the DPU (object detection).
Below is an example of copying to the jupyter_notebooks directory on the KR260.
sudo su
cd $PYNQ_JUPYTER_NOTEBOOKS
cd jupyter_notebooks/
ls
cp -rf /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/jupyter_notebooks/pynq-dpu/ ./
#Please follow the README.md in the folder to download the .xmodel file.
Use the Kria-PYNQ environment via Jupyter Notebook. Connect to the KR260 board using a LAN cable and find the IP address using ifconfig. Access Jupyter Notebook at http://<IP_ADDRESS>:9090/.
Test.ipynb
Here is a test video using Jupyter Notebooks with DPU-PYNQ and KR260.
Open the Jupyter Notebook on KR260 and proceed with the execution.
The default .bit file is used for the DPU on KR260.
Using TensorFlow2 Model with VART for YOLOv3 Detection
We performed object detection on 80 categories from COCO2017 using VART on DPU.
We tested with three photos: two 360° images and one regular camera image.
The 360° images did not perform well in object detection, failing to detect the balls in the foreground.
In contrast, images captured with a regular smartphone camera successfully detected the ball.
Consideration of Test Results (360° images)
We have conducted object detection on 360° images (5376x2688).
While human figures were successfully detected, the yellow ball was not.
360° images are very wide, which makes it difficult for the YOLO model to detect objects.
This indicates that further adjustments are needed.
By splitting the image into two and setting the aspect ratio to 1:1 (2688x2688), detection improves significantly.
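The splitting itself is only a couple of lines of OpenCV/NumPy slicing; a minimal sketch (file names are examples):
import cv2

img = cv2.imread("360_image.jpg")              # 5376x2688 equirectangular photo
h, w = img.shape[:2]                           # h = 2688, w = 5376
left, right = img[:, :w // 2], img[:, w // 2:] # two square 2688x2688 crops
for i, crop in enumerate((left, right)):
    cv2.imwrite(f"crop_{i}.jpg", crop)         # run YOLO on each crop separately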
Object Detection(.py)
We also implemented the program in Python, available on the following GitHub repository:
Below is an example of the program (.py) execution.
ubuntu@kria:~$ sudo su
root@kria:/home/ubuntu# source /etc/profile.d/pynq_venv.sh
(pynq-venv) root@kria:/home/ubuntu# cd $PYNQ_JUPYTER_NOTEBOOKS
(pynq-venv) root@kria:~/jupyter_notebooks# cd pynq-dpu/
(pynq-venv) root@kria:~/jupyter_notebooks/pynq-dpu# python3 app_yolov3_tf2_mymodel-name-test.py
yolov3_test, in TensorFlow2
(1, 416, 416, 3)
Number of detected objects: 2
Details of detected objects: [49 60]
Performance: 2.902664666183155 FPS
Here is a test video using Python with DPU-PYNQ and KR260.
Object detection can be performed using .py files just as effectively as with .ipynb files.
For program details and specifications, please refer to the Subproject below.
4. Implementation DPU, GPIO, and PWM
Using Vivado and Vitis, we created a project to synthesize the DPU IP.
We utilized the DPU created on PYNQ with KR260 to perform object detection using Vitis AI (Yolo).
In this post, we will introduce the process of running GPIO (PWM) alongside the DPU on KR260.
Creation Process
There are various methods to create a project that includes the DPU IP.
This example follows a specific process.
The author created this project in the Vivado 2023.1 and Vitis 2023.1 environment, so adjust accordingly if following along.
- Synthesize the MPSoC, clock, and reset required for the DPU using Vivado.
- Synthesize the DPU using Vitis.
- Synthesize other IPs, such as GPIO and PWM, using Vivado.
This Creation Process involves very lengthy steps. Therefore, the details are omitted here; please refer to the Subproject for more information.
Running the DPU on KR260 with PYNQ
Within the /root/jupyter_notebooks/ directory on the KR260, create a folder to hold the .ipynb file to be executed, alongside the .bit and .hwh files produced by Vivado.
Below is an example of copying to the jupyter_notebooks directory on the KR260.
sudo su
cd $PYNQ_JUPYTER_NOTEBOOKS
cd jupyter_notebooks/
ls
cp -rf /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/jupyter_notebooks/pynq-original-dpu-model/ ./
#Please follow the README.md in the folder to download the .xmodel file.
After KR260 installation, the IP address is confirmed using ifconfig; in my case, it was 192.168.11.9.
Use the Kria-PYNQ environment via Jupyter Notebook to control the PWM. Connect to the KR260 board using a LAN cable and find the IP address using ifconfig. Access Jupyter Notebook at http://<IP_ADDRESS>:9090/.
ipynb File
The test .bit, .hwh, .xclbin, and .ipynb (my-dpu-gpio-test.ipynb) files are available on GitHub.
We have also provided the .xmodel (kr260_yolov3_tf2.xmodel) at the following link. This is the model used for testing the DPU (object detection).
Run the modified Jupyter notebook on PYNQ, which includes additional configurations for GPIO and PWM.
The main modifications include treating the DPU overlay as a standard PYNQ overlay and configuring GPIO (PWM) outputs based on detected objects.
Below are the key points of this program.
from pynq_dpu import DpuOverlay
# Load the custom bitstream that contains the DPU together with the GPIO/PWM IPs
overlay = DpuOverlay("/root/jupyter_notebooks/pynq-original-dpu-model/dpu.bit")
from pynq import Overlay
from pynq.lib import AxiGPIO
# Reuse the DPU overlay as a standard PYNQ overlay to reach the other IPs
ol = overlay
# LED(GPIO) setup: channel 1 of axi_gpio_0 drives the LED outputs
gpio_0_ip = ol.ip_dict['axi_gpio_0']
gpio_out = AxiGPIO(gpio_0_ip).channel1
mask = 0xffffffff
Test Results
Here is the test video demonstrating the setup.
We have successfully performed object detection (Yolo) using the DPU.
Additionally, we have managed to integrate and operate GPIO (PWM) alongside the DPU.
If the specified objects (Bus, Ball) are detected in the photos within the img folder, specific GPIO and PWM (LED) outputs are triggered.
For program details and specifications, please refer to the Subproject below.
We tried controlling the RICOH THETA V 360° camera from the KR260 using PYNQ.
In this project, we will introduce the installation method and provide examples of how to execute the control commands.
Introduction
We successfully controlled the RICOH THETA V 360° camera from the KR260 via USB.
Here are the photos taken by the 360° camera:
Installation Steps
The 360° camera used is the RICOH THETA V. It is a user-friendly 360° camera with various APIs and libraries available.
Connecting the THETA (360° camera) to KR260
Connect the THETA (360° camera) to the KR260 using a USB cable.
Checking the logs with dmesg shows the successful connection.
ubuntu@kria:~$ sudo su
root@kria:/home/ubuntu# dmesg | grep usb
[ 8.214013] usb 1-1.2: New USB device found, idVendor=05ca, idProduct=0368, bcdDevice= 1.00
[ 8.222380] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 8.229697] usb 1-1.2: Product: RICOH THETA V
[ 8.234055] usb 1-1.2: Manufacturer: Ricoh Company, Ltd.
[ 8.239367] usb 1-1.2: SerialNumber: 00119628
Building and Installing the Library
Follow the steps below to build and install the library from GitHub.
https://github.com/codetricity/libptp2-theta
sudo apt install build-essential libtool automake pkg-config subversion libusb-dev
git clone https://github.com/codetricity/libptp2-theta
cd libptp2-theta/
./configure
make
autoreconf -i
./configure
automake
make
sudo make install
sudo ldconfig -v
Run theta Test
Using the Kria-PYNQ Jupyter Notebook, you can control the camera from Python.
This example shows one possible method, but various operations are possible using the official RICOH USB API information.
https://github.com/ricohapi/theta-api-specs/tree/main/theta-usb-api
Below is the test .ipynb file. It has been uploaded to GitHub.
Below is an example of copying to the jupyter_notebooks directory on the KR260.
sudo su
cd $PYNQ_JUPYTER_NOTEBOOKS
cd jupyter_notebooks/
ls
cp -rf /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/jupyter_notebooks/theta-check/ ./
Below is the test video.
You can see that the 360° camera is controlled from an .ipynb file.
Here are some examples of commands:
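The snippets below call the theta command-line tool through getoutput from Python's standard subprocess module (and time is used for the capture timing), so the notebook needs imports along these lines:
import time
from subprocess import getoutput   # runs the "theta" CLI and returns its text output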
Wake up the THETA from power-saving mode; the camera's LED will turn blue.
theta_wakeup = getoutput('theta --set-property=0xD80E --val=0x00')
print(theta_wakeup)
Camera: RICOH THETA V
'UNKNOWN' is set to: 0
Changing property value to 0x00 [(null)] succeeded.
Switch to camera shooting mode. The camera icon will light up blue.
theta_camera_mode = getoutput('theta --set-property=0x5013 --val=0x0001')
print(theta_camera_mode)
Camera: RICOH THETA V
'Still Capture Mode' is set to: [Normal]
Changing property value to 0x0001 [(null)] succeeded.
All that's left is to take the photo (capture).
time1 = time.time()
theta_capture = getoutput('theta --capture')
time2 = time.time()
capture_time = time2 - time1
print("Performance: {} (s)".format(capture_time))
print(theta_capture)
Performance: 3.308269500732422 (s)
Initiating capture...
Object added 0x000000f2
Capture completed successfully!
5.6 GStreamer + OpenCV with 360° Camera
For program details and specifications, please refer to the Subproject below.
6. GStreamer + OpenCV with 360° Camera
In this project, we'll walk you through how we achieved real-time image processing using a KR260 and a 360° camera (RICOH THETA).
We'll cover the installation methods and the programs used.
Introduction
We connected the pipeline using GStreamer and processed it with OpenCV.
This setup allows us to obtain 360° live streaming data via USB from KR260.
The following image was captured in real-time from the RICOH THETA on the KR260.
Installation Methods and Programs
libuvc
We use libuvc to obtain video streams from the USB camera (RICOH THETA).
You need to download and install the necessary UVC (USB Video Class) libraries.
Refer to the following GitHub repository:
https://github.com/nickel110/libuvc.git
sudo apt update
sudo apt install cmake
sudo apt install libusb-1.0-0-dev
sudo su
git clone https://github.com/nickel110/libuvc.git
cd libuvc/
mkdir build
cd build/
cmake ..
make && sudo make install
GStreamer
We use GStreamer pipelines to encode the video stream.
Refer to the official GStreamer documentation:
https://gstreamer.freedesktop.org/documentation/installing/on-linux.html?gi-language=c
Install GStreamer with the following command:
sudo apt install libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev libgstreamer-plugins-bad1.0-dev gstreamer1.0-plugins-base gstreamer1.0-plugins-good gstreamer1.0-plugins-bad gstreamer1.0-plugins-ugly gstreamer1.0-libav gstreamer1.0-tools gstreamer1.0-x gstreamer1.0-alsa gstreamer1.0-gl gstreamer1.0-gtk3 gstreamer1.0-qt5 gstreamer1.0-pulseaudio
v4l2loopback-dkms
Installing v4l2loopback-dkms creates virtual video devices.
sudo apt install v4l2loopback-dkms
gstthetauvc
Download and install the GStreamer THETA UVC plugin (gstthetauvc) for RICOH THETA. Refer to the following GitHub repository:
https://github.com/nickel110/gstthetauvc.git
git clone https://github.com/nickel110/gstthetauvc.git
cd gstthetauvc/thetauvc/
make
Move the created gstthetauvc.so file to the GStreamer plugins folder. Locate gstreamer-1.0 and copy the file:
sudo find / -type d -name 'gstreamer-1.0'
ls /usr/lib/aarch64-linux-gnu/gstreamer-1.0
sudo cp gstthetauvc.so /usr/lib/aarch64-linux-gnu/gstreamer-1.0
ls /usr/lib/aarch64-linux-gnu/gstreamer-1.0
Update the library links and cache:
sudo /sbin/ldconfig -v
Check if the gstthetauvc plugin is available:
gst-inspect-1.0 thetauvcsrc
GStreamer with RICOH THETA
In this subproject, we ran the tests in a regular root environment rather than the PYNQ virtual environment.
We'll demonstrate how to perform image processing using the 360° camera (RICOH THETA) with GStreamer and OpenCV.
We will process 2K (1920x960) 360° video streams using GStreamer.
First, we capture 360° images directly from the RICOH THETA on the KR260.
This setup allows for real-time 360° live streaming on the screen.
The program is written in Python and is available on GitHub.
https://github.com/iotengineer22/AMD-Pervasive-AI-Developer-Contest/blob/main/src/gst-test/
We tested the setup by throwing a ball around the 360° camera.
sudo su
cd /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/src/gst-test/
python3 gst-test-360-no-divide.py
The test video is below:
The KR260 captures the 360° video stream from the RICOH THETA in real-time.
The pipeline configuration uses OpenCV only for display purposes with imshow. We capture and display 2K (1920x960) image data.
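For reference, a minimal sketch of this capture-and-display loop is shown below. The pipeline string is a commonly used form for the gstthetauvc plugin (mode=2K gives the 1920x960 stream); the exact pipeline in gst-test-360-no-divide.py may differ slightly.
import cv2

pipeline = (
    "thetauvcsrc mode=2K ! decodebin ! autovideoconvert ! "
    "video/x-raw,format=BGRx ! queue ! videoconvert ! "
    "video/x-raw,format=BGR ! queue ! appsink"
)
cap = cv2.VideoCapture(pipeline, cv2.CAP_GSTREAMER)

while cap.isOpened():
    ok, frame = cap.read()            # 1920x960 BGR frame from the THETA
    if not ok:
        break
    cv2.imshow("360 live", frame)     # OpenCV is used only to display the frame
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()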
We tested by throwing a ball around the 360° camera. Viewing the image from above provides a clear perspective.
For program details and specifications, please refer to the Subproject below.
7. 360 Live Streaming + Object Detect(DPU)
We conducted real-time object detection on 360 live streaming image data.
In this subproject, we'll introduce how we used the KR260 and PYNQ-DPU for object detection, and how we controlled GPIO and PWM.
Introduction
We conducted real-time object detection on 360 live streaming image data using the RICOH THETA 360° camera.
Initially, we performed object detection (Yolo) using the DPU of KR260.
GStreamer Not Included in PYNQ's OpenCV
To run the DPU, it was necessary to execute it in PYNQ's virtual environment.
However, upon checking the build information of PYNQ's OpenCV, we found that GStreamer was not included.
Here is the check program for my environment:
(pynq-venv) root@kria:/home/ubuntu/gst-dpu-test# python3 gst-back-check.py
GStreamer:
GStreamer: NO
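The check itself only needs to print the GStreamer line from OpenCV's build information; a likely minimal version of gst-back-check.py is:
import cv2

# Print the GStreamer entry of OpenCV's build info ("YES (version)" or "NO")
for line in cv2.getBuildInformation().splitlines():
    if "GStreamer" in line:
        print(line.strip())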
GStreamer is essential for processing 360° video streams in real-time.
Therefore, I uninstalled opencv-python within the PYNQ virtual environment:
sudo su
source /etc/profile.d/pynq_venv.sh
pip uninstall opencv-python
After uninstalling it, the PYNQ environment falls back to the KR260's system (Ubuntu) OpenCV, and GStreamer is now available:
(pynq-venv) root@kria:/home/ubuntu/gst-dpu-test# python3 gst-back-check.py
GStreamer:
GStreamer: YES (1.19.90)
360° Object Detect(DPU)
The test .bit, .hwh, .xclbin, and .py (app_gst-real-360-yolov3_tf2.py) files are available on GitHub.
We have also provided the .xmodel (kr260_yolov3_tf2.xmodel) at the following link. This is the model used for testing the DPU (object detection).
https://github.com/iotengineer22/AMD-Pervasive-AI-Developer-Contest/blob/main/src/gst-dpu-test/
sudo su
source /etc/profile.d/pynq_venv.sh
cd /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/src/gst-dpu-test/
python3 app_gst-real-360-yolov3_tf2.py
#Please follow the README.md in the folder to download the .xmodel file.
Here is the test video:
Initially, the Python program splits the 2K (1920x960) 360° image data into four 480x480 images.
This is because applying a standard YOLO object detection model to a very wide 360° image makes detection difficult.
By dividing and cropping into four 480x480 sections (or two 960x960 sections), object detection can be performed effectively in each section.
360° Object Detect(DPU) + GPIO
We used the KR260 and PYNQ-DPU for object detection and controlled LEDs (GPIO). To verify GPIO functionality, we connected a debug (LED/SW) PCB to the PMOD connector.
The test .py file (app_gst-real-360-yolov3_tf2.py) is the same as before.
python3 app_gst-real-360-yolov3_tf2.py
Here is the test video:
When a ball is detected in the 360° image, specific sections light up the red, green, or blue LEDs, as sketched below.
When a ball is detected in section 1 (front side), all LEDs turn on.
If detected in section 2 (right side), the red LED turns on.
If detected in section 3 (back side), the green LED turns on.
If detected in section 4 (left side), the blue LED turns on.
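A minimal sketch of this section-to-LED mapping follows. The bit positions assumed for the red, green, and blue LEDs are illustrative; they depend on how the AXI-GPIO outputs are wired to the debug PCB in the bitstream.
RED, GREEN, BLUE = 1 << 2, 1 << 3, 1 << 4    # assumed GPIO output bits for the LEDs
SECTION_TO_LEDS = {
    1: RED | GREEN | BLUE,   # section 1 (front): all LEDs on
    2: RED,                  # section 2 (right): red LED
    3: GREEN,                # section 3 (back): green LED
    4: BLUE,                 # section 4 (left): blue LED
}

def update_leds(detected_sections, gpio_out, mask=0xffffffff):
    # gpio_out is an AxiGPIO channel, set up as in the earlier PYNQ snippets
    value = 0
    for section in detected_sections:
        value |= SECTION_TO_LEDS.get(section, 0)
    gpio_out.write(value, mask)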
For program details and specifications, please refer to the Subproject below.
8. ROS2 3D Marker from 360 Live Streaming
We experimented with processing 360° camera images using ROS2 Rviz2.
We managed real-time object detection from 360° live streaming data.
In this project, we'll introduce the program and share test videos.
Introduction
Using ROS2's Rviz2 and KR260, we processed 360° camera images.
We used real-time object detection data from 360° live streaming.
We placed the detected objects' bounding boxes as markers in Rviz2.
Installing ROS2 Rviz2
Rviz2 is a 3D visualization tool for ROS2 (Robot Operating System 2).
This time, we'll display object detection data from 360° live streaming images in Rviz2.
Install it with the following steps:
sudo apt update
sudo apt install ros-humble-rviz2
sudo su
source /opt/ros/humble/setup.bash
rviz2
Installing OpenCV Related Packages for ROS2
We are acquiring and processing images from the 360° camera via OpenCV.
To use it with ROS2, we'll also install related libraries.
sudo apt install ros-humble-image-transport
sudo apt install ros-humble-cv-bridge
Operating Rviz2
Start Rviz2 with the following command:
sudo su
source /opt/ros/humble/setup.bash
rviz2
ROS2 3D Marker from 360 Live Streaming
We conducted real-time object detection on 360 live streaming image data using the RICOH THETA 360° camera and DPU.
Although it's not related to the current subproject, we will go ahead and install the game controller library below.
sudo su
source /etc/profile.d/pynq_venv.sh
pip install inputs
The test .bit, .hwh, .xclbin, and .py (gst-ros2-360-detect-car.py) files are available on GitHub.
We have also provided the .xmodel (kr260_yolov3_tf2.xmodel) at the following link. This is the model used for testing the DPU (object detection).
https://github.com/iotengineer22/AMD-Pervasive-AI-Developer-Contest/tree/main/src/gst-ros2
sudo su
source /etc/profile.d/pynq_venv.sh
source /opt/ros/humble/setup.bash
cd /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/src/gst-ros2/
python3 gst-ros2-360-detect-car.py
#Please follow the README.md in the folder to download the .xmodel file.
Here is the test video:
We display images and markers published from KR260 using ROS2's Rviz2.
The KR260 splits a 360° image into four sections and performs object detection and marker output in each section.
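A minimal rclpy sketch of publishing one detection as an Rviz2 marker is shown below. The topic name, frame_id, and the scaling of bounding-box coordinates to metres are assumptions for illustration, not the exact values used in gst-ros2-360-detect-car.py.
import rclpy
from rclpy.node import Node
from visualization_msgs.msg import Marker

class DetectionMarkerPublisher(Node):
    def __init__(self):
        super().__init__('detection_marker_publisher')
        self.pub = self.create_publisher(Marker, 'detection_marker', 10)

    def publish_box(self, marker_id, x, y):
        m = Marker()
        m.header.frame_id = 'map'                   # frame shown in Rviz2
        m.header.stamp = self.get_clock().now().to_msg()
        m.id = marker_id
        m.type = Marker.CUBE
        m.action = Marker.ADD
        m.pose.position.x = x                       # position derived from the bounding box
        m.pose.position.y = y
        m.scale.x = m.scale.y = m.scale.z = 0.1
        m.color.r, m.color.a = 1.0, 1.0
        self.pub.publish(m)

rclpy.init()
node = DetectionMarkerPublisher()
node.publish_box(0, 1.0, 0.5)                       # e.g. one detected ball in front
node.destroy_node()
rclpy.shutdown()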
We are conducting tests to pick up and transport balls around the 360° camera. The real-time changes, including the ROS2 markers, can be observed.
For program details and specifications, please refer to the Subproject below.
9. Control 360° Object Detection Robot Car
We control the 360° Object Detection Robot Car with the KR260.
The object detection is performed using DPU, and marker output is executed with ROS2 while the Robot Car is in motion.
Debugging Robot actuators
We used a game controller to move the KR260 robot. Using a wireless game controller enables remote control.
We used an ELECOM Wireless Gamepad JC-U4113SBK, which is designed for PC but worked seamlessly with the KR260.
The KR260 is controlled using PYNQ, which means using Python for control. The game controller library inputs is used because it can operate without a display or GUI (a short reading-loop sketch follows the install commands). Install it with:
sudo su
source /etc/profile.d/pynq_venv.sh
pip install inputs
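The reading loop is short; this sketch prints the raw events and is where the stick and button values would be mapped to the wheel and arm PWM. The event codes (ABS_Y for the left stick, BTN_SOUTH for a face button) are the usual Linux codes but may differ for your controller.
from inputs import get_gamepad

while True:
    for event in get_gamepad():                       # blocks until events arrive
        if event.ev_type == 'Absolute' and event.code == 'ABS_Y':
            print("left stick:", event.state)         # map to wheel-motor PWM duty
        elif event.ev_type == 'Key' and event.code == 'BTN_SOUTH':
            print("button:", event.state)             # map to the arm motor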
Debug Test 1
The debug test .bit, .hwh, .xclbin, and .ipynb (controller-pwm-gpio-test.ipynb) files are available on GitHub.
The test program for this run should already be copied when executing the subproject below.
4. Implementation DPU, GPIO, and PWM
Here is one of the debug test videos below.
Currently, we are not moving the robot car; this is just actuator debugging.
By moving the stick up and down, the robot wheel motors are controlled via PWM.
It is also evident that pressing the buttons controls the DC motor of the robot arm.
Debug Test 2
Here is another debug test video, this time with the robot car actually moving.
The robot car drives its wheels to transport the ball. Additionally, the arm is activated again to lift and lower the ball.
360° Object Detect Test
We conducted real-time object detection on 360 live streaming image data using the RICOH THETA 360° camera and KR260 DPU.
The test .bit, .hwh, .xclbin, and .py (gst-ros2-360-detect-car.py) files are available on GitHub.
We have also provided the .xmodel (kr260_yolov3_tf2.xmodel) at the following link. This is the model used for testing the DPU (object detection).
https://github.com/iotengineer22/AMD-Pervasive-AI-Developer-Contest/tree/main/src/gst-ros2
sudo su
source /etc/profile.d/pynq_venv.sh
source /opt/ros/humble/setup.bash
cd /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/src/gst-ros2/
python3 gst-ros2-360-detect-car.py
#Please follow the README.md in the folder to download the .xmodel file.
Start Rviz2 with the following command:
sudo su
source /opt/ros/humble/setup.bash
rviz2
Here is the test video:
This involves performing object detection with the DPU and executing marker output with ROS2 while the robot car is in motion.
We display images published from KR260 using ROS2's Rviz2.
The KR260 splits a 360° image into four sections and performs object detection in each section.
ROS2 Marker Output Test
The next test uses the program below.
python3 gst-ros2-360-2divide.py
The program content is almost the same as before, but this time it includes a demonstration with ROS2 marker output.
Here is the test video:
In this test, the KR260 splits the 360° image into two sections, front and back.
During the test, when objects are moved, we can see the marker outputs changing in real-time.
After the object in front of the camera is picked up, both the marker and the image detection ultimately show only the ball in front.
Additionally, the operation of lifting the ball is also confirmed.
Robot Action Test
The next test uses the same program as in Test 1.
python3 gst-ros2-360-detect-car.py
However, the KR260 unit and the robot body are separated; this is made possible by extending the wiring from the motor driver to the motors.
By reducing the weight of the robot body, even smoother movements are possible.
Here is the test video:
It can be confirmed that the robot car moves smoothly forward, backward, and rotates, as well as operates its arm.
Human Detect Test
The next test uses the program below.
python3 gst-ros2-360-human-trace.py
This program detects a person (or a person's hand) and automatically rotates and moves forward in that direction.
Here is the test video:
Initially, the program detects a person's hand behind or beside the camera, causing the robot to rotate automatically.
Finally, the program detects a person's hand in front of the camera, prompting the robot to move forward.
For program details and specifications, please refer to the Subproject below.
10. Improve Object Detection Speed with YOLOX
In this Subproject, we conducted object detection using the KR260's DPU with the lightweight model "YOLOX-nano" and PyTorch.
We created a program (.ipynb and .py) that runs on PYNQ and confirmed its operation, and compared the detection speed with the old YOLOv3 program.
Below is the test video with the execution and speed comparison. We confirmed that the new YOLOX program is approximately five times faster.
Creating the YOLOX-nano Model with Vitis AI
First, we created (compiled) the YOLOX model for KR260 in a Linux environment.
We downloaded and extracted the sample model of YOLOX.
wget https://www.xilinx.com/bin/public/openDownload?filename=pt_yolox-nano_3.5.zip
unzip openDownload?filename=pt_yolox-nano_3.5.zip
Compilation with Vitis AI
We used Vitis AI for compilation, launching the CPU version of Vitis AI for PyTorch.
cd Vitis-AI/
./docker_run.sh xilinx/vitis-ai-pytorch-cpu:latest
Using the arch.json created earlier as an argument, we compiled the model.
The .xmodel file is created in the folder after compilation.
cd pt_yolox-nano_3.5/
conda activate vitis-ai-pytorch
echo '{' > arch.json
echo ' "fingerprint": "0x101000016010407"' >> arch.json
echo '}' >> arch.json
vai_c_xir -x quantized/YOLOX_0_int.xmodel -a arch.json -n yolox_nano_pt -o ./yolox_nano_pt
This time, it's an example of the fingerprint of B4096 on KR260.
If you want to try different architectures like B512 or B1024 and need to check the file where the fingerprint is written (arch.json), it is located in the following folder when you synthesize the DPU with Vitis:
~/***_hw_link/Hardware/dpu.build/link/vivado/vpl/prj/prj.gen/sources_1/bd/design_1/ip/design_1_DPUCZDX8G_1_0/arch.json
Program Creation on PYNQ
We created a program that runs on PYNQ, in .ipynb and .py format.
Since the algorithm differs from the YOLOv3 sample program, some modifications were necessary. The actual program can be found on the following GitHub repository:
Testing on KR260
Below is an example of copying to the jupyter_notebooks directory on the KR260.
sudo su
cd $PYNQ_JUPYTER_NOTEBOOKS
cd jupyter_notebooks/
ls
cp -rf /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/jupyter_notebooks/pynq-yolox/ ./
We opened the .ipynb file (dpu_yolox-nano_pt_coco2017.ipynb) on the KR260 via a web browser.
We checked the input/output tensors of the model converted with PyTorch, ensuring (1, 416, 416, 3) → ((1, 52, 52, 85) (1, 26, 26, 85) (1, 13, 13, 85)).
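The shapes can be checked in a few lines with the VART runner exposed by DPU-PYNQ; the bitstream and .xmodel paths below are assumptions for illustration.
from pynq_dpu import DpuOverlay

overlay = DpuOverlay("dpu.bit")
overlay.load_model("yolox_nano_pt.xmodel")
dpu = overlay.runner

print([tuple(t.dims) for t in dpu.get_input_tensors()])   # [(1, 416, 416, 3)]
print([tuple(t.dims) for t in dpu.get_output_tensors()])  # the three YOLOX output maps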
The YOLOX model detected 80 categories of COCO objects.
YOLOX-nano and YOLOv3-tiny Detection Speed Comparison
We compared the detection speed between YOLOX-nano and the old YOLOv3-tiny.
The execution environment was the same DPU (B4096).
- DPU execution time: 0.1168 s → 0.0154 s (approximately 1/8 of the detection time)
- CPU post-processing time: 0.1303 s → 0.0303 s (approximately 1/4 of the detection time)
- Total processing: 3.30 fps → 18.6 fps (approximately 5 times the speed)
YOLOX-nano Results
Details of detected objects: [49, 60]
Pre-processing time: 0.0080 seconds
DPU execution time: 0.0154 seconds
Post-process time: 0.0303 seconds
Total run time: 0.0537 seconds
Performance: 18.63 FPS
(array([[ 458.1155, 125.8079, 821.8845, 489.5768],
[ 40.2464, 0. , 1239.7537, 720. ]]),
array([0.5618, 0.1179]),
array([49, 60]))
YOLOv3-tiny Results
Details of detected objects: [49, 60]
Pre-processing time: 0.0560 seconds
DPU execution time: 0.1168 seconds
Post-process time: 0.1303 seconds
Total run time: 0.3030 seconds
Performance: 3.30 FPS
(array([[ 157.7307, 455.4164, 434.6538, 812.3395],
[ 49.6795, 66.1538, 658.0765, 1213.8462]], dtype=float32),
array([0.2461, 0.7143], dtype=float32),
array([49, 60], dtype=int32))
Applying YOLOX to 360° Object Detection
We also applied YOLOX to 360° live-streaming object detection. The actual program can be found on the following GitHub repository:
sudo su
source /etc/profile.d/pynq_venv.sh
cd /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/src/yolox-test/
python3 app_gst-yolox-real-360-2divide.py
Below is the test video.
The 360° camera (RICOH THETA V) used is an older USB 2.0 model, so plain live streaming runs at about 6 fps.
We split the 1920x960 image into two 960x960 images for display.
Implementing object detection with the slower YOLOv3 reduced the frame rate to about 1.5 fps.
Changing to YOLOX improved it to about 3.5 fps. (Further optimization might bring it closer to 6 fps.)
By applying YOLOX, we were also able to speed up object detection in 360° live streaming.
For program details and specifications, please refer to the Subproject below.
11. Benchmark Architectures of the DPU
In this Subproject, we ran object detection on the KR260's DPU using the lightweight "YOLOX-nano" PyTorch model.
We measured the object detection speed across various architectures of the DPU (DPUCZDX8G).
The tests primarily used a 150MHz clock, with some checks at 300MHz.
Generally, larger sizes resulted in shorter inference times.
Creating Models for Each Architecture
Refer to the article below for details on how to create the files (".bit", ".xclbin", ".hwh") for each DPU architecture.
4. Implementation DPU, GPIO, and PWM
To run YOLOX object detection, you need the following models(".xmodel"). Please refer to the following article.
10. Improve Object Detection Speed with YOLOX
Generated Files for Each Architecture
The created files for each architecture ("B512, B800, B1024, B1600, B2304, B3136, B4096") are listed below.
For B512, B3136, and B4096, files were also generated with the DPU clock set to 300MHz instead of 150MHz.
YOLOX Inference Results for Each Architecture
The inference time for object detection on a single image using YOLOX was measured. (Testing object detection with an orange ball on a table.)
The times below exclude preprocessing and postprocessing.
# Fetch data to DPU and trigger it
dpu_start = time.time()
job_id = dpu.execute_async(input_data, output_data)
dpu.wait(job_id)
dpu_end = time.time()
The actual program can be found on the following GitHub repository:
Testing on KR260
Below is an example of copying to the jupyter_notebooks directory on the KR260.
sudo su
cd $PYNQ_JUPYTER_NOTEBOOKS
cd jupyter_notebooks/
ls
cp -rf /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/jupyter_notebooks/pynq-benchmark/ ./
We opened the .ipynb file (benchmark_dpu_yolox-nano_pt_coco2017.ipynb) on the KR260 via a web browser.
Summary of Key Findings
- Larger architecture sizes generally result in faster performance.
- Increasing the DPU clock speed also generally results in faster performance.
For B512, increasing the clock speed from 150MHz to 300MHz significantly improved the execution speed. For B3136 and B4096, the impact was less noticeable compared to B512.
In conclusion, the B4096 at 300MHz is the fastest for object detection with YOLOX. Therefore, we are implementing the B4096 at 300MHz in this main project as well.
For program details and specifications, please refer to the Subproject below.
12. Power Consumption of 360° Object Detection Robot Car
In this Subproject, we measured the power consumption of the robot car with the KR260.
When trying to power the KR260 from a mobile battery, a power shortage occurred during program startup.
Initially, our plan was to operate the robot equipped with the KR260 from a battery, because we wanted to run the robot without being hindered by power cables.
We used a commercially available PD-compatible mobile battery (20W).
Here is a test video showing the actual power shortage:
Operating KR260 with a PD20W Mobile Battery
We prepared a PD-compatible mobile battery capable of 12V output. It’s a Philips DLP7721C, with the following specs:
- Capacity: 20000mAh/3.7V
Output:
- USB-A1/A2: DC 5V/3A, 9V/2A, 12V/1.5A
- USB-C: DC 5V/3A, 9V/2.2A, 12V/1.67A
Since lightweight mobile batteries with outputs above 20W are quite expensive, we opted for the 20W version (as of 2024).
We also purchased a general-purpose current checker and a cable compatible with PD12V output to input 12V power into the KR260’s DC jack.
Power Consumption During Idle
The KR260 startup worked fine with the mobile battery. Linux booted normally, and during idle after plugging in the DC jack, the consumption was about 8.4W (12V_700mA).
Power Consumption with Device Connection + GUI
With a USB keyboard, mouse, and display connected, the consumption was about 10.2W (12V_850mA).
Power Consumption with 360° Camera Connection
Connecting a 360° camera resulted in a power consumption of about 12.0W (12V_1000mA).
Power Consumption During 360° Object Detection
Running the DPU for 360° object detection, the power consumption was about 16.2W (12V_1350mA).
Up to this point, the 20W mobile battery could still handle the operation.
Motor Power Consumption
The robot uses three DC motors: one for arm control and two for robot car control.
The motors use 5V power, controlled by PWM from the KR260, and were estimated at 100% operation for measurement purposes.
- Arm DC Motor: about 2.5W (5V_500mA)
- Robot Car DC Motor: about 4W each (5V_800mA)
Thus, with all motors running, the estimated power consumption was 2.5W + 4W*2 = 10.5W (though it's rare for all three to operate simultaneously).
Testing Power Deficit with KR260
Adding up the total power consumption, it reached 26.7W, exceeding the 20W capacity of the battery. Hence, the power shortage and subsequent reboot during program startup were expected.
No Issues with the 36W AC Adapter
The 36W (12V_3A) AC adapter included in the KR260 starter kit had no issues, comfortably handling the estimated 26.7W power consumption.
Therefore, for the final robot car demonstration, we had to use the AC adapter.
5.13 Application to Vitis AI ONNX Runtime Engine (VOE)
For program details and specifications, please refer to the Subproject below.
13. Application to Vitis AI ONNX Runtime Engine (VOE)
Note:
In this subproject, we will conduct tests in a different environment from the main project as part of the benchmarking process.
This subproject involved setting up a dedicated ONNX environment on the KR260. We will also introduce the use of the Vitis AI ONNX Runtime Engine (VOE).
Using ONNX Runtime makes Vitis AI even more user-friendly. This time, we are testing the comparison between CPU and DPU using ONNX.
Vitis AI 3.5 ONNX
Vitis AI 3.5 ONNX supports both C++ and Python. Refer to the official documentation here:
https://docs.amd.com/r/en-US/ug1414-vitis-ai/Programming-with-VOE
In this subproject, we will write and test YOLOX in Python code.
Building the Environment with PetaLinux from BSP
In this subproject, we used PetaLinux to create the OS environment on the KR260. Download the BSP file from the link below and build it with PetaLinux:
source /opt/petalinux/2023.1/settings.sh
petalinux-create -t project -s xilinx-kr260-starterkit-v2023.1-05080224.bsp
cd xilinx-kr260-starterkit-2023.1/
petalinux-build
petalinux-package --boot --u-boot --force
petalinux-package --wic --images-dir images/linux/ --bootfiles "ramdisk.cpio.gz.u-boot,boot.scr,Image,system.dtb,system-zynqmp-sck-kr-g-revB.dtb" --disk-name "sda"
Writing the Image to the SD Card
Write the SD card image (.wic) created with PetaLinux, found in the following folder:
~/xilinx-kr260-starterkit-2023.1/images/linux/
We use balenaEtcher to write it to the SD card.
Installing onnxruntime
The initial login name for KR260 is "petalinux". Follow the official documentation to install Vitis AI and the ONNX runtime on the KR260:
https://docs.amd.com/r/en-US/ug1414-vitis-ai/Programming-with-VOE
wget https://www.xilinx.com/bin/public/openDownload?filename=vitis_ai_2023.1-r3.5.0.tar.gz
sudo tar -xzvf openDownload\?filename\=vitis_ai_2023.1-r3.5.0.tar.gz -C /
ls
wget https://www.xilinx.com/bin/public/openDownload?filename=voe-0.1.0-py3-none-any.whl -O voe-0.1.0-py3-none-any.whl
pip3 install voe*.whl
wget https://www.xilinx.com/bin/public/openDownload?filename=onnxruntime_vitisai-1.16.0-py3-none-any.whl -O onnxruntime_vitisai-1.16.0-py3-none-any.whl
pip3 install onnxruntime_vitisai*.whl
It seems that XRT is not installed by default, so we added it with the following:
sudo dnf install xrt packagegroup-petalinux-opencv
Creating the DPU Environment with PetaLinux
In Ubuntu, the DPU environment of the FPGA could be overlaid via the PYNQ-DPU library.
In PetaLinux, we used xmutil to prepare the DPU environment.
We prepared the necessary files, including those needed for xmutil, and placed them on GitHub.
https://github.com/iotengineer22/AMD-Pervasive-AI-Developer-Contest/tree/main/src/onnx-test
This Creation Process involves very lengthy steps. Therefore, the details are omitted here; please refer to the Subproject for more information.
Preparing the ONNX YOLOX Model
This time, we used the models provided by Vitis AI. Pre-trained and quantized models are provided as samples by Xilinx (AMD), converted from PyTorch models to ONNX:
https://github.com/Xilinx/Vitis-AI/tree/master/model_zoo/model-list/pt_yolox-nano_3.5
Download and unzip the YOLOX sample model:
wget https://www.xilinx.com/bin/public/openDownload?filename=pt_yolox-nano_3.5.zip
unzip openDownload\?filename\=pt_yolox-nano_3.5.zip
The "yolox_nano_onnx_pt.onnx" file is in the "quantized" folder. Transfer this file to the KR260 without further compilation.
Python Program for ONNX YOLOX (.py)
The actual program we ran is available on GitHub below.
This program is executed on the KR260.
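The core of such a program is just an ONNX Runtime session pointed at the Vitis AI execution provider (mirroring the CPU variant shown later in this section); a minimal sketch, with the model path assumed to be in the working directory:
import onnxruntime

session = onnxruntime.InferenceSession(
    'yolox_nano_onnx_pt.onnx',
    providers=["VitisAIExecutionProvider"],
    provider_options=[{"config_file": "/usr/bin/vaip_config.json"}])

# Inspect what the model expects before wiring up the YOLOX pre/post-processing
print([(i.name, i.shape) for i in session.get_inputs()])
print([(o.name, o.shape) for o in session.get_outputs()])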
Testing ONNX on KR260
We transfer the files created for the KR260 to the board.
First, we configure the DPU so that it can be loaded with xmutil.
We created an application called b4096_300m.
ls /lib/firmware/xilinx/
sudo mkdir /lib/firmware/xilinx/b4096_300m
sudo cp pl.dtbo shell.json /lib/firmware/xilinx/b4096_300m/
sudo cp dpu.xclbin /lib/firmware/xilinx/b4096_300m/binary_container_1.bin
ls /lib/firmware/xilinx/b4096_300m/
We also replaced the existing vart.conf with the newly created one.
sudo mv /etc/vart.conf /etc/old_vart.conf
sudo cp vart.conf /etc/
sudo reboot
From here, the steps follow the flow presented in the demo video.
First, load the DPU application (b4096_300m).
xilinx-kr260-starterkit-20231:~$ sudo xmutil listapps
xilinx-kr260-starterkit-20231:~$ sudo xmutil unloadapp
xilinx-kr260-starterkit-20231:~$ sudo xmutil loadapp b4096_300m
xilinx-kr260-starterkit-20231:~$ sudo xmutil listapps
We run a Python program(onnx-yolox.py) within the onnx-test directory on the KR260.
xilinx-kr260-starterkit-20231:~$ cd onnx-test/
xilinx-kr260-starterkit-20231:~/onnx-test$ python3 onnx-yolox.py
When the program runs, it begins compiling to match the DPU specifications.
Note: The initial compilation takes a few minutes. Subsequent runs will skip the compilation step, making the process faster.
During the compilation, the log shows that the program reads the loaded DPU and compiles in DPU mode:
Compile mode: dpu
Debug mode: performance
Target architecture: DPUCZDX8G_ISA1_B4096_0101000016010407
Graph name: torch_jit, with op num: 815
Begin to compile...
Once the compilation is complete, the Python program will execute.
We're using YOLOX for object detection on a single image.
The results of the image recognition showed that the orange ball was successfully detected without any issues.
bboxes of detected objects: [[ 473.17449951 137.78985596 812.97937012 477.59475708]
[ 0. 5.46184874 1280. 720. ]]
scores of detected objects: [0.73033565 0.20149007]
Details of detected objects: [49. 60.]
Pre-processing time: 0.0108 seconds
DPU execution time: 0.0129 seconds
Post-process time: 0.0360 seconds
Total run time: 0.0597 seconds
Performance: 16.740788045213616 FPS
Comparison with YOLOv3 and YOLOX
Although not a precise comparison, we will compare the content and speed tested in an Ubuntu environment as described in the article below.
10. Improve Object Detection Speed with YOLOX
We conducted similar tests with TensorFlow2's YOLOv3 and PyTorch's YOLOX.
When comparing the pre-processing, DPU inference, and post-processing, the results were as follows:
For the ONNX YOLOX, no particular speed optimizations were made from the original PyTorch. Both versions of YOLOX yielded almost identical results, which was as expected.
Comparing YOLOX on CPU and DPU with ONNX
With the capability to use ONNX, comparing the CPU and DPU on the KR260 becomes straightforward.
By modifying a single line in the same program, you can switch from DPU to CPU for inference:
providers=["CPUExecutionProvider"]
session = onnxruntime.InferenceSession(
'yolox_nano_onnx_pt.onnx',
# providers=["VitisAIExecutionProvider"],
providers=["CPUExecutionProvider"],
provider_options=[{"config_file":"/usr/bin/vaip_config.json"}])
The actual program we ran is available on GitHub below.
xilinx-kr260-starterkit-20231:~/onnx-test$ python3 onnx-cpu-yolox.py
Here is the test video comparing CPU and DPU with ONNX:
The DPU inference was over 20 times faster than the CPU.
When comparing the previous YOLOv3 and PyTorch YOLOX, the graph below summarizes the results.
It clearly shows how using the DPU can significantly speed up the process.
For program details and specifications, please refer to the Subproject below.
14. Appendix: Object Detection Using YOLOX with a Webcam
Many people may not have the 360° camera required for the main project.
Therefore, as a reference, we introduce a subproject using a generic webcam, covering DPU control, GPIO, and ROS2 output.
Introduction
We tried object detection with a regular USB-connected webcam using the KR260.
By using the FPGA’s DPU for YOLOX inference, we can achieve fast, real-time detection.
We will introduce the program along with the test results.
Test Program for GStreamer and YOLOX with a Webcam
The program is saved in the following GitHub repository:
https://github.com/iotengineer22/AMD-Pervasive-AI-Developer-Contest/tree/main/src/usb-camera
Here is the test video:
Perform object detection using live streaming from the webcam.
sudo su
cd /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/src/usb-camera/
source /etc/profile.d/pynq_venv.sh
python3 app_gst-yolox-real-normal-camera-gpio.py
When the program starts, you can see the webcam image with added object detection.
In the actual live streaming with YOLOX, about 17 fps is achieved.
When a yellow ball (sports ball) is detected, you can see that the LED (GPIO) is turned on.
Testing Marker and Image Output with ROS2
We will test outputting Marker and Image data to ROS2 using data from a webcam.
To visualize ROS2, start rviz2.
sudo su
source /opt/ros/humble/setup.bash
rviz2
Once rviz2 is set up, start the program (gst-yolox-ros2-normal-camera.py).
sudo su
source /etc/profile.d/pynq_venv.sh
source /opt/ros/humble/setup.bash
cd /home/ubuntu/AMD-Pervasive-AI-Developer-Contest/src/usb-camera/
python3 gst-yolox-ros2-normal-camera.py
The test video is as follows:
Using the information detected by YOLOX from the webcam, the output to ROS2 is executed at about 17 fps.
Markers and images are published to ROS2 without any issues.
This has been a great, fun challenge. Thanks to AMD and Hackster for the opportunity and the hardware.
By leveraging the KR260, we've developed a robot with 360-degree AI vision. Although we used an older, inexpensive 360° camera with USB 2.0, we successfully achieved 360° object detection and visualization using ROS2.
Potential future work includes:
- Developing AI vision with high-speed communication using a 360° camera with USB 3.0.
- Implementing high-speed transmission of 360° images using H264/H265 hardware encoders/decoders.
- Migrating CPU processing to DPU processing for further speed improvements in image recognition.
We plan to try these improvements as soon as we have the time.
Thank you for reading.