The objective of this project is to help ensure safety in work environments such as building sites, laboratories, and other places that require wearing protective equipment. Since the Covid pandemic, real-time detection algorithms have been in high demand from public and private institutions looking to enforce safety rules. At first, we wanted to build a simple face-mask detection application, but we decided to focus on personal protective equipment (PPE) instead, in order to fully test the performance of a Xilinx FPGA platform: the Kria KV260. Our system automatically detects whether a worker is wearing a helmet and/or a vest. The output can be used to verify that every person present on a building site, for example, is following safety regulations.
1. Pretrained custom YOLOv3 model
Due to the lack of time and resources, we chose to use a publicly available model with pre-trained weights for the live detection of PPE. We used the model developed by Nipun D. Nath et al. for real-time detection of PPE. It is a custom YOLOv3; in their article, the authors present three different detection approaches, each with its own advantages and disadvantages.
In this project, we used the second approach, which consists of training the YOLO model to detect each person in a frame and classify him/her into one of the following four classes: "Worker", "Worker + Helmet", "Worker + Vest" and "Worker + Helmet + Vest". Note that the number of classes can grow as more equipment types are added. A small label map we used for interpreting detections is sketched below.
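For reference, the mapping between class indices and labels can be kept in a small dictionary when post-processing detections. The index order below is only an assumption for illustration; the actual order depends on how the original model was trained.
# Hypothetical class-index-to-label map for the four PPE classes;
# verify the index order against the original training configuration.
PPE_CLASSES = {
    0: 'Worker',                  # person detected, no helmet or vest
    1: 'Worker + Helmet',
    2: 'Worker + Vest',
    3: 'Worker + Helmet + Vest',  # fully equipped
}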
Pretrained weights can be found on Google Drive. To be able to use this model on the Kria KV260, we first had to quantize and compile it using Vitis-AI.
We used Vitis-AI 2.0, which was the latest release at the time of writing.
docker pull xilinx/vitis-ai-cpu:latest
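The container is then launched with the helper script that ships with the Vitis-AI repository. The commands below assume the repository has been cloned and the script is run from its root:
git clone https://github.com/Xilinx/Vitis-AI.git
cd Vitis-AI
./docker_run.sh xilinx/vitis-ai-cpu:latest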
2. Model Quantization
After installing Docker, we copied the model file into the container and ran the following script for quantization. The quantization process requires not only the uploaded model but also a calibration dataset. For consistency, we used images from the original training dataset, also available here.
import os
import numpy as np
import cv2
import tensorflow as tf
from tensorflow_model_optimization.quantization.keras import vitis_quantize


def letterbox_image(image, size):
    '''
    Resize an image with unchanged aspect ratio, using grey padding,
    to fit the YOLOv3 input specifications.
    '''
    ih, iw, ic = image.shape
    h, w = size
    scale = min(w/iw, h/ih)
    nw = int(iw*scale)
    nh = int(ih*scale)
    new_image = np.zeros((h, w, ic), dtype='uint8') + 128
    top, left = (h - nh)//2, (w - nw)//2
    new_image[top:top+nh, left:left+nw, :] = cv2.resize(image, (nw, nh))
    return new_image


if __name__ == '__main__':
    # Load the floating-point model and create the Vitis-AI quantizer
    float_model = tf.keras.models.load_model('/workspace/pictor_ppe_a2.h5')
    float_model.get_config()
    quantizer = vitis_quantize.VitisQuantizer(float_model)

    # Get calibration images
    filenames = []
    path = './Images/'
    print('Reading files ...')
    for fname in os.listdir(path):
        f = os.path.join(path, fname)
        if os.path.isfile(f):
            filenames.append(f)

    # Reshape the images with the letterbox function.
    # We only used the first 100 of the 784 available images.
    print('Reshaping images ...')
    image_data = []
    for fname in filenames[:100]:
        act_img = cv2.imread(fname)
        image_data.append(letterbox_image(act_img, (416, 416))/255.)
    image_data = np.array(image_data)

    print('Starting Quantization ...')
    quantized_model = quantizer.quantize_model(calib_dataset=image_data, calib_batch_size=10)
    quantized_model.save('quantized_model.h5')  # Save the quantized model
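The script is meant to be run inside the Vitis-AI container, in the TensorFlow2 conda environment. Assuming it is saved as quantize_ppe.py (a file name we chose here for illustration), it can be launched as follows:
conda activate vitis-ai-tensorflow2
python quantize_ppe.py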
3. Model Compilation
The quantized model is saved in the workspace, and the next step was to compile it. The compilation was done by running the following command:
vai_c_tensorflow2 -m quantized_model.h5 -a arch.json -n yolov3_ppe
When the compilation finished, the output was written to "yolov3_ppe.xmodel".
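The arch.json file describes the target DPU. The Vitis-AI docker ships architecture files for the supported boards; assuming the stock KV260 DPU configuration, an equivalent, fully-specified invocation could look like this (the output directory name is just an example):
vai_c_tensorflow2 \
    -m quantized_model.h5 \
    -a /opt/vitis_ai/compiler/arch/DPUCZDX8G/KV260/arch.json \
    -o ./compiled \
    -n yolov3_ppe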
4. Kria Configuration
The compiled model is then copied to the FPGA (Kria KV260) over ssh. The model is stored under "/usr/share/vitis_ai_library/models/yolov3_ppe/" along with a new file, "yolov3_custom.prototxt" (shown below), which describes the model and the values of its anchors. The mean and scale fields define the input preprocessing (0.00390625 = 1/256, approximating the 1/255 normalization used in the quantization script), and the biases list the widths and heights of the nine anchor boxes.
model {
  name: "yolov3_416x416_ppe"
  kernel {
    name: "yolov3_416x416_ppe"
    mean: 0.0
    mean: 0.0
    mean: 0.0
    scale: 0.00390625
    scale: 0.00390625
    scale: 0.00390625
  }
  model_type : YOLOv3
  yolo_v3_param {
    num_classes: 4
    anchorCnt: 3
    layer_name: "58"
    layer_name: "66"
    layer_name: "74"
    conf_threshold: 0.5
    nms_threshold: 0.45
    biases: 6
    biases: 11
    biases: 11
    biases: 23
    biases: 19
    biases: 36
    biases: 32
    biases: 50
    biases: 40
    biases: 104
    biases: 76
    biases: 73
    biases: 73
    biases: 158
    biases: 128
    biases: 209
    biases: 224
    biases: 246
    test_mAP: false
  }
  is_tf : true
}
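As a sketch of the transfer step mentioned above, the compiled .xmodel and the prototxt can be pushed to the board with scp. The user name and IP address below are hypothetical; adjust them for your board image:
# Create the model folder on the Kria and copy the files (address is hypothetical)
ssh root@192.168.1.100 "mkdir -p /usr/share/vitis_ai_library/models/yolov3_ppe"
scp yolov3_ppe.xmodel yolov3_custom.prototxt \
    root@192.168.1.100:/usr/share/vitis_ai_library/models/yolov3_ppe/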
5. Results
To run tests, we used a modified version of the samples/yolov3 demo available in the Vitis-AI-Library. (Files available on Github.)
This project can be expanded to include other protective equipment such as face masks, gloves, headphones, lab coats... Many researchers have worked on improving the accuracy of real-time PPE detection models, and several interesting approaches have come out of that work; these could be used to further test the limits of the Kria KV260.
The use of pre-trained models was especially helpful, as it allowed us to focus on the hardware and worry less about the deep-learning side (i.e., data collection, data preparation, training, ...). It also meant we could get started with affordable hardware and still obtain remarkable results.