This project implements a dense pedestrian detection system based on YOLOv3-tiny. The work consists of five parts: algorithm improvement and simulation, training of the YOLO model, model conversion from the Darknet framework to the Caffe framework, quantization and compilation of the Caffe model in Vitis-AI, and deployment of the model on the KV260.
## YOLOv3-tiny model training

### Dataset production

Images containing pedestrians are extracted from the PASCAL VOC2007/VOC2012 datasets.
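A minimal sketch of how pedestrian images might be filtered out of the VOC annotations, assuming the standard VOC layout where each image has an XML file in `Annotations/`; function names and paths here are illustrative, not from the project's code:

```python
import xml.etree.ElementTree as ET
from pathlib import Path

def contains_person(xml_path):
    """Return True if a VOC annotation file lists at least one 'person' object."""
    root = ET.parse(xml_path).getroot()
    return any(obj.findtext("name") == "person" for obj in root.iter("object"))

def collect_person_images(annotation_dir):
    """Collect the image IDs whose annotations contain pedestrians."""
    return sorted(
        p.stem for p in Path(annotation_dir).glob("*.xml") if contains_person(p)
    )
```

The resulting ID list can then be used to copy the matching JPEGs and annotations into the training set.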
- Configure Makefile, data/voc.names, cfg/voc.data, cfg/yolov3-tiny.cfg and other files.
- Download the pre-training weights file
- Train the model
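For a single-class pedestrian detector, the key edits to cfg/yolov3-tiny.cfg are the class count in each [yolo] section and the filter count of the convolution layer directly preceding it, where filters = (classes + 5) × 3. A sketch, assuming one class:

```
[convolutional]
size=1
stride=1
pad=1
filters=18      # (classes + 5) * 3 = (1 + 5) * 3

[yolo]
classes=1
```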
These figures show the process of model training; the training error converges effectively after a number of iterations.
### Model validation

Use the trained model to run detection on sample images.
Since the YOLO model needs to be converted to another framework's format to complete deployment in Vitis-AI, Caffe, which deploys well, is chosen as the target. Because the two frameworks differ, the conversion is performed from Darknet to Caffe.
### Installing the Caffe environment

Before performing the conversion, the Caffe environment must be installed.
Compared with YOLOv2, the YOLOv3 network adds, alongside its convolution layers, a shortcut layer, an upsample layer, and a route layer that fuses the upsampled features. So if we find or construct the corresponding layers in Caffe, we can implement YOLOv3 in Caffe.
The shortcut layer in YOLOv3 can be replaced by Caffe's Eltwise layer and the route layer by its Concat layer, while the upsample and YOLO layers need to be implemented by ourselves and added to Caffe. The upsample layer mainly performs feature-map upsampling.
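As an illustration, the Darknet shortcut and route layers map onto Caffe prototxt layers like these (the layer and blob names are hypothetical):

```
# shortcut -> Eltwise (element-wise sum of two feature maps)
layer {
  name: "shortcut1"
  type: "Eltwise"
  bottom: "conv4"
  bottom: "conv6"
  top: "shortcut1"
  eltwise_param { operation: SUM }
}

# route -> Concat (channel-wise concatenation)
layer {
  name: "route1"
  type: "Concat"
  bottom: "conv8"
  bottom: "upsample1"
  top: "route1"
  concat_param { axis: 1 }
}
```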
### Implementation of the model transformation

This step is completed thanks to the code provided by chenyingpeng.
This part requires adding the upsample layer: adding its .hpp and .cpp files, registering it in the caffe.proto file, and recompiling Caffe.
The next step is to convert the trained model configuration (.cfg) and weights (.weights) files into the corresponding Caffe .prototxt and .caffemodel files. Here again, thanks to chenyingpeng for providing the tools and operating guidelines.
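With chenyingpeng's darknet2caffe tool, the conversion is typically a single script invocation; the file names below are illustrative:

```shell
# Convert the Darknet cfg/weights pair into a Caffe prototxt/caffemodel pair
python darknet2caffe.py yolov3-tiny.cfg yolov3-tiny.weights \
       yolov3-tiny.prototxt yolov3-tiny.caffemodel
```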
## Quantization

The quantization operation is performed according to the ug1414-vitis-ai.pdf manual, which gives the operation instructions.
- Prepare the input files.
- Generate the quantized model and the model files for compilation.
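Following UG1414, Caffe quantization is driven by the vai_q_caffe tool; a sketch of the invocation, with illustrative paths and calibration iteration count:

```shell
# Quantize the float Caffe model using a calibration image set
vai_q_caffe quantize -model float.prototxt \
                     -weights float.caffemodel \
                     -calib_iter 100 \
                     -output_dir quantize_results
```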
## Compile

Again, complete the compilation of the model according to the manual.
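Compilation for the KV260's DPU then uses vai_c_caffe on the quantizer's deploy outputs. The arch.json location varies by Vitis-AI version, so the path below is an assumption:

```shell
# Compile the quantized model for the KV260 DPU target
vai_c_caffe --prototxt   quantize_results/deploy.prototxt \
            --caffemodel quantize_results/deploy.caffemodel \
            --arch /opt/vitis_ai/compiler/arch/DPUCZDX8G/KV260/arch.json \
            --output_dir compiled \
            --net_name new_yolov3tiny
```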
Put the generated model into the /opt/xilinx/share/vitis_ai_library/models/kv260-smartcam/ directory, and configure files such as preprocess.json, aiinference.json, and drawresult.json.
Run the following commands to complete the model deployment.
- sudo xmutil listapps
- sudo xmutil unloadapp
- sudo xmutil loadapp kv260-smartcam
- sudo smartcam -f ~/paris.nv12.h264 -i h264 -W 1920 -H 1080 -t dp -r 30 -a new_yolov3tiny
For the dense pedestrian detection model, the project studies the YOLOv3 algorithm and improves it in three aspects: data processing, network structure, and loss function, optimizing detection performance on small targets and across multiple scales.
In the training phase, the project enriches sample features through data augmentation, and increases the robustness of the model by re-clustering the anchor boxes to match dense pedestrians according to the characteristics of dense crowds. To address the high miss rate on small targets, the project designs a feature pyramid structure with location feature enhancement, which conveys strong localization features and richer detail information, creating more effective features and improving the detection accuracy of small targets. In the loss function, a center-distance-based loss is used to optimize bounding-box regression and improve the accuracy of the predicted boxes.
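The center-distance idea can be illustrated with the DIoU formulation, which penalizes the normalized distance between box centers in addition to the IoU overlap. This is a sketch of the standard DIoU metric; the project's exact loss may differ:

```python
def diou(box_a, box_b):
    """Distance-IoU between two boxes given as (x1, y1, x2, y2).

    DIoU = IoU - d^2 / c^2, where d is the distance between the box centers
    and c is the diagonal of the smallest box enclosing both.
    """
    # Intersection area
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    iou = inter / (area_a + area_b - inter)

    # Squared distance between the two box centers
    d2 = ((box_a[0] + box_a[2]) / 2 - (box_b[0] + box_b[2]) / 2) ** 2 + \
         ((box_a[1] + box_a[3]) / 2 - (box_b[1] + box_b[3]) / 2) ** 2

    # Squared diagonal of the smallest enclosing box
    cx1, cy1 = min(box_a[0], box_b[0]), min(box_a[1], box_b[1])
    cx2, cy2 = max(box_a[2], box_b[2]), max(box_a[3], box_b[3])
    c2 = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2

    return iou - d2 / c2
```

The corresponding regression loss is 1 - DIoU, so well-aligned centers are rewarded even when two boxes do not overlap at all.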
The dense pedestrian detection model proposed in the project achieves improved detection accuracy on dense crowds, demonstrating both effectiveness and generalization.