This project implements the identification and localization of smoking behavior using computer vision. It can quickly identify whether someone is smoking and accurately locate the smoker.
We improve detection speed by reducing the complexity of the YOLO v4 model. We then build a smoking detection dataset by collecting and manually labeling images, and use the TensorFlow framework to train the improved YOLO v4 model. Finally, we quantize and compile the trained model with Vitis AI to implement real-time detection on the KV260 board.
Detection Model
The YOLO v4 model has a backbone network named CSP-Darknet53, which is too heavy to be deployed on the FPGA, so we need to modify the network structure.
Because the DPU does not support acceleration of the Mish activation function, we use LeakyReLU instead. We then reduce the number of residual blocks to simplify the model structure and achieve a balance between detection speed and detection accuracy.
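To make the activation swap concrete, the following minimal sketch compares the two functions on a few inputs using only the Python standard library. The negative slope `alpha=0.1` is an assumption (a value commonly used in Darknet-style models); the source does not state which slope the project uses.

```python
import math


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x)), where softplus(x) = ln(1 + e^x)."""
    # For large x, softplus(x) is numerically equal to x; this also avoids overflow in exp().
    softplus = math.log1p(math.exp(x)) if x < 30.0 else x
    return x * math.tanh(softplus)


def leaky_relu(x: float, alpha: float = 0.1) -> float:
    """LeakyReLU: identity for positive inputs, slope `alpha` otherwise.

    alpha=0.1 is an assumed value, not one taken from the project.
    """
    return x if x > 0 else alpha * x


if __name__ == "__main__":
    # Print both activations side by side to see where they differ.
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(f"x={x:+.1f}  mish={mish(x):+.4f}  leaky_relu={leaky_relu(x):+.4f}")
```

The two functions agree closely for large positive inputs and differ mainly around zero and for negative inputs, which is why the swap usually requires retraining rather than reusing Mish-trained weights.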
Dataset
We build a dataset to train the improved YOLO model. First, we collect about 1000 smoking-related images with a search engine, then use the labelImg software to label the smoking objects manually.
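The source does not say which annotation format was exported or what the training code expects. As an illustration only, the sketch below assumes labelImg's Pascal VOC XML output and converts each box to a YOLO-style line (`class x_center y_center width height`, normalized to the image size). The class name `smoking` and the sample annotation are hypothetical.

```python
import xml.etree.ElementTree as ET


def voc_to_yolo(xml_text: str, class_names: list) -> list:
    """Convert one Pascal VOC annotation (as XML text) to YOLO-format label lines."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.findtext("width"))
    img_h = float(size.findtext("height"))

    lines = []
    for obj in root.iter("object"):
        # Class index is the position of the label in class_names.
        cls_id = class_names.index(obj.findtext("name"))
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        # YOLO format stores the box center and size, normalized to [0, 1].
        x_center = (xmin + xmax) / 2.0 / img_w
        y_center = (ymin + ymax) / 2.0 / img_h
        width = (xmax - xmin) / img_w
        height = (ymax - ymin) / img_h
        lines.append(f"{cls_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}")
    return lines


# Hypothetical annotation for a 400x200 image with one labeled object.
SAMPLE_XML = """
<annotation>
  <size><width>400</width><height>200</height><depth>3</depth></size>
  <object>
    <name>smoking</name>
    <bndbox><xmin>100</xmin><ymin>50</ymin><xmax>300</xmax><ymax>150</ymax></bndbox>
  </object>
</annotation>
"""

if __name__ == "__main__":
    for line in voc_to_yolo(SAMPLE_XML, ["smoking"]):
        print(line)
```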
To deploy our model on the KV260 board, we use Vitis AI to quantize and compile it.
Use the vai_q_tensorflow command to quantize the float model. After quantization, a quantize_eval_model.pb file is generated; this is the quantized TensorFlow model file. Then use the vai_c_tensorflow command to compile the quantized model generated in the previous step. After compilation completes, an xmodel file is generated, which contains the binaries for DPU deployment.
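The two steps can be sketched as a small Python helper that only assembles the command lines, so they can be reviewed before being run inside the Vitis AI environment. Every file name, node name, input shape, calibration module, and the arch.json path below is a placeholder assumption rather than a value from this project, and the available flags should be checked against `--help` for the installed Vitis AI version.

```python
import shlex


def quantize_cmd(frozen_pb, input_node, output_nodes, input_shape,
                 input_fn, calib_iter=100, output_dir="quantize_results"):
    """Build the vai_q_tensorflow command that quantizes a frozen float graph."""
    return [
        "vai_q_tensorflow", "quantize",
        "--input_frozen_graph", frozen_pb,
        "--input_nodes", input_node,
        "--input_shapes", input_shape,
        "--output_nodes", ",".join(output_nodes),
        "--input_fn", input_fn,            # module.function that feeds calibration images
        "--calib_iter", str(calib_iter),
        "--output_dir", output_dir,
    ]


def compile_cmd(quantized_pb, arch_json, net_name, output_dir="compiled"):
    """Build the vai_c_tensorflow command that compiles the quantized graph to an xmodel."""
    return [
        "vai_c_tensorflow",
        "--frozen_pb", quantized_pb,
        "--arch", arch_json,
        "--output_dir", output_dir,
        "--net_name", net_name,
    ]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    q = quantize_cmd(
        frozen_pb="yolov4_smoking_float.pb",
        input_node="input_1",
        output_nodes=["output_0", "output_1", "output_2"],
        input_shape="?,416,416,3",
        input_fn="calib_input.calib_fn",
    )
    c = compile_cmd(
        quantized_pb="quantize_results/quantize_eval_model.pb",
        arch_json="/opt/vitis_ai/compiler/arch/DPUCZDX8G/KV260/arch.json",
        net_name="yolov4_smoking",
    )
    print(shlex.join(q))
    print(shlex.join(c))
```

Keeping the commands as argument lists makes it straightforward to pass them to `subprocess.run` later, and `shlex.join` quotes values such as the `?` in the input shape so the printed line is safe to paste into a shell.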