This project implements a dense pedestrian detection system based on YOLOv3-tiny. The work consists of five parts: algorithm improvement and simulation, training of the YOLO model, model conversion from the Darknet framework to the Caffe framework, quantization and compilation of the Caffe model in Vitis-AI, and deployment of the model on the KV260.
## YOLOv3-tiny model training

### Dataset production

Images containing pedestrians are extracted from the PASCAL VOC2007/VOC2012 datasets.
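A minimal sketch of how pedestrian images might be filtered out of the VOC annotations, assuming the standard VOC layout where each image has an XML file in `Annotations/`; function names and paths here are illustrative, not from the project's code:

```python
import xml.etree.ElementTree as ET
from pathlib import Path

def contains_person(xml_path):
    """Return True if a VOC annotation file lists at least one 'person' object."""
    root = ET.parse(xml_path).getroot()
    return any(obj.findtext("name") == "person" for obj in root.iter("object"))

def collect_person_images(annotation_dir):
    """Collect the image IDs whose annotations contain pedestrians."""
    return sorted(
        p.stem for p in Path(annotation_dir).glob("*.xml") if contains_person(p)
    )
```

The resulting ID list can then be used to copy the matching JPEGs and annotations into the training set.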
- Configure Makefile, data/voc.names, cfg/voc.data, cfg/yolov3-tiny.cfg and other files.
- Download the pre-training weights file
- Train the model
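For a single-class pedestrian detector, the key edits to cfg/yolov3-tiny.cfg are the class count in each [yolo] section and the filter count of the convolution layer directly preceding it, where filters = (classes + 5) × 3. A sketch, assuming one class:

```
[convolutional]
size=1
stride=1
pad=1
filters=18      # (classes + 5) * 3 = (1 + 5) * 3

[yolo]
classes=1
```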
These figures show the process of model training; the training error converges effectively after a number of iterations.
### Model validation

Use the trained model to run detection on sample images.
Since the YOLO model needs to be converted to another framework's format to complete deployment in Vitis-AI, Caffe, which deploys well, is chosen as the target. Because the two frameworks differ, the conversion is performed from Darknet to Caffe.
### Installing the Caffe environment

Before performing the conversion, the Caffe environment must be installed.
Compared with YOLOv2, the YOLOv3 network adds, alongside its convolution layers, a shortcut layer, an upsample layer, and a route layer that fuses the upsampled features. So if we find or construct the corresponding layers in Caffe, we can implement YOLOv3 in Caffe.
The shortcut layer in YOLOv3 can be replaced by Caffe's Eltwise layer and the route layer by its Concat layer, while the upsample and YOLO layers need to be implemented by ourselves and added to Caffe. The upsample layer mainly performs feature-map upsampling.
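As an illustration, the Darknet shortcut and route layers map onto Caffe prototxt layers like these (the layer and blob names are hypothetical):

```
# shortcut -> Eltwise (element-wise sum of two feature maps)
layer {
  name: "shortcut1"
  type: "Eltwise"
  bottom: "conv4"
  bottom: "conv6"
  top: "shortcut1"
  eltwise_param { operation: SUM }
}

# route -> Concat (channel-wise concatenation)
layer {
  name: "route1"
  type: "Concat"
  bottom: "conv8"
  bottom: "upsample1"
  top: "route1"
  concat_param { axis: 1 }
}
```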
### Implementation of the model transformation

This step is completed thanks to the code provided by chenyingpeng.
This part requires adding the upsample layer: adding its .hpp and .cpp files, registering it in the caffe.proto file, and recompiling Caffe.
The next step is to convert the trained model configuration (.cfg) and weights (.weights) files into the corresponding Caffe .prototxt and .caffemodel files. Here again, thanks to chenyingpeng for providing the tools and operating guidelines.
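With chenyingpeng's darknet2caffe tool, the conversion is typically a single script invocation; the file names below are illustrative:

```shell
# Convert the Darknet cfg/weights pair into a Caffe prototxt/caffemodel pair
python darknet2caffe.py yolov3-tiny.cfg yolov3-tiny.weights \
       yolov3-tiny.prototxt yolov3-tiny.caffemodel
```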
## Quantization

The quantization operation is performed according to the ug1414-vitis-ai.pdf manual, which gives the operation instructions.
- Prepare the input files.
- Generate the quantized model and the model files for compilation.
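Following UG1414, Caffe quantization is driven by the vai_q_caffe tool; a sketch of the invocation, with illustrative paths and calibration iteration count:

```shell
# Quantize the float Caffe model using a calibration image set
vai_q_caffe quantize -model float.prototxt \
                     -weights float.caffemodel \
                     -calib_iter 100 \
                     -output_dir quantize_results
```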
## Compile

Again, complete the compilation of the model according to the manual.
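Compilation for the KV260's DPU then uses vai_c_caffe on the quantizer's deploy outputs. The arch.json location varies by Vitis-AI version, so the path below is an assumption:

```shell
# Compile the quantized model for the KV260 DPU target
vai_c_caffe --prototxt   quantize_results/deploy.prototxt \
            --caffemodel quantize_results/deploy.caffemodel \
            --arch /opt/vitis_ai/compiler/arch/DPUCZDX8G/KV260/arch.json \
            --output_dir compiled \
            --net_name new_yolov3tiny
```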
Put the generated model into the /opt/xilinx/share/vitis_ai_library/models/kv260-smartcam/ directory, and configure files such as preprocess.json, aiinference.json, and drawresult.json.
Run the following commands to complete the model deployment.
- sudo xmutil listapps
- sudo xmutil unloadapp
- sudo xmutil loadapp kv260-smartcam
- sudo smartcam -f ~/paris.nv12.h264 -i h264 -W 1920 -H 1080 -t dp -r 30 -a new_yolov3tiny
For the dense pedestrian detection model, the project studies the YOLOv3 algorithm and improves it in three aspects: data processing, network structure, and loss function, optimizing detection performance on small targets and across multiple scales.
In the training phase, the project enriches sample features through data augmentation, and increases the robustness of the model by re-clustering the anchor boxes to match dense pedestrians according to the characteristics of dense crowds. To address the high miss rate on small targets, the project designs a feature pyramid structure with location feature enhancement, which conveys strong localization features and richer detail information, creating more effective features and improving the detection accuracy of small targets. In the loss function, a center-distance-based loss is used to optimize bounding-box regression and improve the accuracy of the predicted boxes.
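The center-distance idea can be illustrated with the DIoU formulation, which penalizes the normalized distance between box centers in addition to the IoU overlap. This is a sketch of the standard DIoU metric; the project's exact loss may differ:

```python
def diou(box_a, box_b):
    """Distance-IoU between two boxes given as (x1, y1, x2, y2).

    DIoU = IoU - d^2 / c^2, where d is the distance between the box centers
    and c is the diagonal of the smallest box enclosing both.
    """
    # Intersection area
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    iou = inter / (area_a + area_b - inter)

    # Squared distance between the two box centers
    d2 = ((box_a[0] + box_a[2]) / 2 - (box_b[0] + box_b[2]) / 2) ** 2 + \
         ((box_a[1] + box_a[3]) / 2 - (box_b[1] + box_b[3]) / 2) ** 2

    # Squared diagonal of the smallest enclosing box
    cx1, cy1 = min(box_a[0], box_b[0]), min(box_a[1], box_b[1])
    cx2, cy2 = max(box_a[2], box_b[2]), max(box_a[3], box_b[3])
    c2 = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2

    return iou - d2 / c2
```

The corresponding regression loss is 1 - DIoU, so well-aligned centers are rewarded even when two boxes do not overlap at all.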
The dense pedestrian detection model proposed in the project achieves improved detection accuracy on dense crowds, demonstrating both effectiveness and generalization.