We conducted object detection on the KR260's DPU using the lightweight "YOLOX-nano" model with PyTorch.
We also compared the execution speed of YOLOX and YOLOv3.
This project is part of a subproject for the AMD Pervasive AI Developer Contest.
Be sure to check out the other projects as well.
***The main project is currently under submission.***
0. Main project << under submission
2. PYNQ + PWM(DC-Motor Control)
3. Object Detection(Yolo) with DPU-PYNQ
4. Implementation DPU, GPIO, and PWM
6. GStreamer + OpenCV with 360° Camera
7. 360 Live Streaming + Object Detect(DPU)
8. ROS2 3D Marker from 360 Live Streaming
9. Control 360° Object Detection Robot Car
10. Improve Object Detection Speed with YOLOX << this project
11. Benchmark Architectures of the DPU
12. Power Consumption of 360° Object Detection Robot Car
13. Application to Vitis AI ONNX Runtime Engine (VOE)
14. Appendix: Object Detection Using YOLOX with a Webcam
Please note that before running the above subprojects, the following setup, which is the reference design for this AMD contest, is required:
https://github.com/amd/Kria-RoboticsAI
Introduction
We conducted object detection using the KR260's DPU with the lightweight "YOLOX-nano" model and PyTorch.
We created a program (.ipynb and .py) that runs on PYNQ, confirmed its operation, and compared its detection speed with the old YOLOv3 program.
Below is the test video with the execution and speed comparison. We confirmed that the new YOLOX program is approximately five times faster.
YOLOX
A pre-trained, quantized model is provided as a sample by Xilinx (AMD).
The sample used is available here:
https://github.com/Xilinx/Vitis-AI/tree/master/model_zoo/model-list/pt_yolox-nano_3.5
YOLOv3
There is a YOLOv3-tiny sample program as part of the DPU-PYNQ samples. The method to execute it in the KR260+DPU environment is introduced in the following article:
3. Object Detection(Yolo) with DPU-PYNQ
However, YOLOv3 is quite old and proved slow in detection speed during actual use.
Therefore, we used the newer, lightweight YOLOX-nano model and ran a benchmark comparison for speed.
Creating the YOLOX-nano Model with Vitis AI
First, we created (compiled) the YOLOX model for the KR260 in a Linux environment.
We downloaded and extracted the sample model of YOLOX.
wget https://www.xilinx.com/bin/public/openDownload?filename=pt_yolox-nano_3.5.zip
unzip openDownload?filename=pt_yolox-nano_3.5.zip
Compilation with Vitis AI
We used Vitis AI for compilation, launching the CPU version of the Vitis AI PyTorch container.
cd Vitis-AI/
./docker_run.sh xilinx/vitis-ai-pytorch-cpu:latest
Using an arch.json containing the DPU fingerprint as an argument, we compiled the model.
The .xmodel file is created in the output folder after compilation.
cd pt_yolox-nano_3.5/
conda activate vitis-ai-pytorch
echo '{' > arch.json
echo ' "fingerprint": "0x101000016010407"' >> arch.json
echo '}' >> arch.json
vai_c_xir -x quantized/YOLOX_0_int.xmodel -a arch.json -n yolox_nano_pt -o ./yolox_nano_pt
This arch.json uses the fingerprint of the B4096 DPU on the KR260 as an example.
If you want to try different architectures such as B512 or B1024, you can find the fingerprint in the arch.json generated when the DPU is synthesized with Vitis; it is located in the following folder:
~/***_hw_link/Hardware/dpu.build/link/vivado/vpl/prj/prj.gen/sources_1/bd/design_1/ip/design_1_DPUCZDX8G_1_0/arch.json
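If you only need to confirm which fingerprint your synthesized DPU uses, you can read it out of that file directly. Below is a minimal sketch; the path is just an example, so point it at your own project's arch.json:

import json

# Read the DPU fingerprint from the arch.json generated during synthesis.
# Replace the path with the arch.json from your own hardware project.
with open("arch.json") as f:
    arch = json.load(f)

print(arch["fingerprint"])  # e.g. "0x101000016010407" for B4096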
We created a program that runs on PYNQ, in both .ipynb and .py formats.
Since the algorithm differs from the YOLOv3 sample program, some modifications were necessary. The actual program can be found on the following GitHub repository:
The process involves the typical YOLOX-nano pipeline: preprocessing, DPU input/output, post-processing, and drawing bounding boxes (BBOX). DPU inference replaces the usual CPU or GPU inference, and the execution time of each stage is measured.
Key Part of the Program
def run(image_index, display=False):
    # Pre-processing
    input_shape = (416, 416)
    input_image = cv2.imread(os.path.join(image_folder, original_images[image_index]))
    image_data, ratio = preprocess(input_image, input_shape)

    # DPU inference
    image[0, ...] = image_data.reshape(shapeIn[1:])
    job_id = dpu.execute_async(input_data, output_data)
    dpu.wait(job_id)

    # Post-processing
    outputs = np.concatenate([output.reshape(1, -1, output.shape[-1]) for output in output_data], axis=1)
    bboxes, scores, class_ids = postprocess(outputs, input_shape, ratio, nms_th, nms_score_th, image_width, image_height)

    if display:
        # Combine boxes, scores, and class IDs for draw_bbox (assumed layout)
        bboxes_with_scores_and_classes = np.column_stack((bboxes, scores, class_ids))
        result_image = draw_bbox(input_image, bboxes_with_scores_and_classes, class_names)
        cv2.imwrite(os.path.join("img/", "result.jpg"), result_image)

    return bboxes, scores, class_ids
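For reference, run() assumes the DPU runner and its I/O buffers were prepared beforehand. Here is a minimal sketch of that setup following the standard DPU-PYNQ flow; the bitstream and model file names are examples:

import numpy as np
from pynq_dpu import DpuOverlay

# Load the DPU bitstream and the compiled YOLOX-nano model (example names).
overlay = DpuOverlay("dpu.bit")
overlay.load_model("yolox_nano_pt.xmodel")
dpu = overlay.runner

# Query tensor shapes and allocate the buffers that run() uses.
shapeIn = tuple(dpu.get_input_tensors()[0].dims)
shapeOut = [tuple(t.dims) for t in dpu.get_output_tensors()]
input_data = [np.empty(shapeIn, dtype=np.float32, order="C")]
output_data = [np.empty(s, dtype=np.float32, order="C") for s in shapeOut]
image = input_data[0]  # written by run() before each inference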
Testing on KR260
Below is the test video mentioned at the beginning.
We opened the .ipynb file on the KR260 via a web browser.
We checked the input/output tensors of the model converted with PyTorch, confirming (1, 416, 416, 3) → (1, 52, 52, 85), (1, 26, 26, 85), (1, 13, 13, 85).
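These shapes can also be printed directly from the DPU runner, for example:

# Check the model's input and output tensor shapes on the DPU.
print(tuple(dpu.get_input_tensors()[0].dims))    # (1, 416, 416, 3)
for t in dpu.get_output_tensors():
    print(tuple(t.dims))                         # (1, 52, 52, 85) etc.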
The YOLOX model detected 80 categories of COCO objects.
We compared the detection speed between YOLOX-nano and the old YOLOv3-tiny. The execution environment was the same DPU (B4096).
- DPU execution time: 0.1168 s → 0.0154 s, approximately 1/8 the time
- CPU post-processing time: 0.1303 s → 0.0303 s, approximately 1/4 the time
- Total processing speed: 3.30 fps → 18.6 fps, approximately 5 times faster
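The fps figures are simply the reciprocal of the summed per-frame stage times; for example, for YOLOX-nano:

# fps = 1 / (pre-processing + DPU execution + post-processing)
pre, dpu_time, post = 0.0080, 0.0154, 0.0303  # seconds (YOLOX-nano)
total = pre + dpu_time + post                 # 0.0537 s
print(f"{1 / total:.2f} fps")                 # ~18.6 fps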
YOLOX-nano Results
Details of detected objects: [49, 60]
Pre-processing time: 0.0080 seconds
DPU execution time: 0.0154 seconds
Post-process time: 0.0303 seconds
Total run time: 0.0537 seconds
Performance: 18.63 FPS
(array([[ 458.1155, 125.8079, 821.8845, 489.5768],
[ 40.2464, 0. , 1239.7537, 720. ]]),
array([0.5618, 0.1179]),
array([49, 60]))
YOLOv3-tiny Results
Details of detected objects: [49, 60]
Pre-processing time: 0.0560 seconds
DPU execution time: 0.1168 seconds
Post-process time: 0.1303 seconds
Total run time: 0.3030 seconds
Performance: 3.30 FPS
(array([[ 157.7307, 455.4164, 434.6538, 812.3395],
[ 49.6795, 66.1538, 658.0765, 1213.8462]], dtype=float32),
array([0.2461, 0.7143], dtype=float32),
array([49, 60], dtype=int32))
Applying YOLOX to 360° Object Detection
We also applied YOLOX to 360° live-streaming object detection. The actual program can be found on the following GitHub repository:
sudo su
source /etc/profile.d/pynq_venv.sh
cd /src/yolox-test/
python3 app_gst-yolox-real-360-2divide.py
Below is the test video.
The 360° camera (RICOH THETA V) we used is an older USB 2.0 model, so plain live streaming runs at only about 6 fps.
We split the 1920x960 image into two 960x960 images for display, as sketched below.
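A minimal sketch of that split (the frame source here is illustrative):

import cv2

# One 1920x960 equirectangular frame from the 360° camera (example file).
frame = cv2.imread("frame_1920x960.jpg")

# Split the panorama down the middle into two square 960x960 halves.
left_half = frame[:, :960]
right_half = frame[:, 960:]

Each half can then be handled like an ordinary camera image.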
Implementing object detection with the slow YOLOv3 reduced the frame rate to about 1.5 fps.
Changing to YOLOX improved it to about 3.5 fps; further optimization might bring it closer to 6 fps.
Summary
We conducted object detection using the KR260's DPU with the lightweight "YOLOX-nano" model and PyTorch, and compared the execution speed of YOLOX and YOLOv3.
In the next project, we measure the speed of object detection using various architectures of the DPU (DPUCZDX8G).