The objective of this project is to help ensure safety in work environments such as building sites, laboratories, and other places that require wearing protective equipment. Since the Covid pandemic, real-time detection algorithms have been in high demand from public and private institutions looking to enforce safety rules. At first, we wanted to build a simple face-mask detection application, but we decided to focus on personal protective equipment (PPE) instead, in order to fully test the performance of a Xilinx FPGA platform: the Kria KV260. Our system automatically detects whether a worker is wearing a helmet and/or a vest. The output can be used to verify that every person present on a building site, for example, is following safety regulations.
1. Pretrained custom YOLOv3 model
Due to the lack of time and resources, we chose to use a publicly available model with pre-trained weights for the live detection of PPE. We used the model developed by Nipun D. Nath et al. for real-time detection of PPE. It is a custom YOLOv3; in their article, the authors present three different detection approaches, each with its own advantages and disadvantages.
In this project, we used the second approach, which consists of training the YOLO model to detect each person in a frame and classify him/her into one of the following four classes: "Worker", "Worker + Helmet", "Worker + Vest" and "Worker + Helmet + Vest". Note that the number of classes can grow as more equipment types are added. A small label map we used for interpreting detections is sketched below.
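For reference, the mapping between class indices and labels can be kept in a small dictionary when post-processing detections. The index order below is only an assumption for illustration; the actual order depends on how the original model was trained.
# Hypothetical class-index-to-label map for the four PPE classes;
# verify the index order against the original training configuration.
PPE_CLASSES = {
    0: 'Worker',                  # person detected, no helmet or vest
    1: 'Worker + Helmet',
    2: 'Worker + Vest',
    3: 'Worker + Helmet + Vest',  # fully equipped
}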
Pretrained weights can be found on Google Drive. To be able to use this model on the Kria KV260, we first had to quantize and compile it using Vitis-AI.
We used Vitis-AI 2.0, which was the latest release at the time of writing.
docker pull xilinx/vitis-ai-cpu:latest
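The container is then launched with the helper script that ships with the Vitis-AI repository. The commands below assume the repository has been cloned and the script is run from its root:
git clone https://github.com/Xilinx/Vitis-AI.git
cd Vitis-AI
./docker_run.sh xilinx/vitis-ai-cpu:latest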
2. Model Quantization
After installing Docker, we copied the model file into the container and ran the following script for quantization. The quantization process requires not only the uploaded model but also a calibration dataset. For consistency, we used images from the original training dataset, also available here.
import os
import numpy as np
import cv2
import tensorflow as tf
from tensorflow_model_optimization.quantization.keras import vitis_quantize


def letterbox_image(image, size):
    '''
    Resize an image with unchanged aspect ratio, using grey padding,
    to fit the YOLOv3 input specifications.
    '''
    ih, iw, ic = image.shape
    h, w = size
    scale = min(w/iw, h/ih)
    nw = int(iw*scale)
    nh = int(ih*scale)
    new_image = np.zeros((h, w, ic), dtype='uint8') + 128
    top, left = (h - nh)//2, (w - nw)//2
    new_image[top:top+nh, left:left+nw, :] = cv2.resize(image, (nw, nh))
    return new_image


if __name__ == '__main__':
    # Load the floating-point model and create the Vitis-AI quantizer
    float_model = tf.keras.models.load_model('/workspace/pictor_ppe_a2.h5')
    float_model.get_config()
    quantizer = vitis_quantize.VitisQuantizer(float_model)

    # Get calibration images
    filenames = []
    path = './Images/'
    print('Reading files ...')
    for fname in os.listdir(path):
        f = os.path.join(path, fname)
        if os.path.isfile(f):
            filenames.append(f)

    # Reshape the images with the letterbox function.
    # We only used the first 100 of the 784 available images.
    print('Reshaping images ...')
    image_data = []
    for fname in filenames[:100]:
        act_img = cv2.imread(fname)
        image_data.append(letterbox_image(act_img, (416, 416))/255.)
    image_data = np.array(image_data)

    print('Starting Quantization ...')
    quantized_model = quantizer.quantize_model(calib_dataset=image_data, calib_batch_size=10)
    quantized_model.save('quantized_model.h5')  # Save the quantized model
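The script is meant to be run inside the Vitis-AI container, in the TensorFlow2 conda environment. Assuming it is saved as quantize_ppe.py (a file name we chose here for illustration), it can be launched as follows:
conda activate vitis-ai-tensorflow2
python quantize_ppe.py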
3. Model Compilation
The quantized model is saved in the workspace, and the next step was to compile it. The compilation was done by running the following command:
vai_c_tensorflow2 -m quantized_model.h5 -a arch.json -n yolov3_ppe
When the compilation finished, the output was written to "yolov3_ppe.xmodel".
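The arch.json file describes the target DPU. The Vitis-AI docker ships architecture files for the supported boards; assuming the stock KV260 DPU configuration, an equivalent, fully-specified invocation could look like this (the output directory name is just an example):
vai_c_tensorflow2 \
    -m quantized_model.h5 \
    -a /opt/vitis_ai/compiler/arch/DPUCZDX8G/KV260/arch.json \
    -o ./compiled \
    -n yolov3_ppe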
4. Kria Configuration
The compiled model is then copied to the FPGA (Kria KV260) over ssh. The model is stored under "/usr/share/vitis_ai_library/models/yolov3_ppe/" along with a new file, "yolov3_custom.prototxt" (shown below), which describes the model and the values of its anchors. The mean and scale fields define the input preprocessing (0.00390625 = 1/256, approximating the 1/255 normalization used in the quantization script), and the biases list the widths and heights of the nine anchor boxes.
model {
  name: "yolov3_416x416_ppe"
  kernel {
    name: "yolov3_416x416_ppe"
    mean: 0.0
    mean: 0.0
    mean: 0.0
    scale: 0.00390625
    scale: 0.00390625
    scale: 0.00390625
  }
  model_type : YOLOv3
  yolo_v3_param {
    num_classes: 4
    anchorCnt: 3
    layer_name: "58"
    layer_name: "66"
    layer_name: "74"
    conf_threshold: 0.5
    nms_threshold: 0.45
    biases: 6
    biases: 11
    biases: 11
    biases: 23
    biases: 19
    biases: 36
    biases: 32
    biases: 50
    biases: 40
    biases: 104
    biases: 76
    biases: 73
    biases: 73
    biases: 158
    biases: 128
    biases: 209
    biases: 224
    biases: 246
    test_mAP: false
  }
  is_tf : true
}
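As a sketch of the transfer step mentioned above, the compiled .xmodel and the prototxt can be pushed to the board with scp. The user name and IP address below are hypothetical; adjust them for your board image:
# Create the model folder on the Kria and copy the files (address is hypothetical)
ssh root@192.168.1.100 "mkdir -p /usr/share/vitis_ai_library/models/yolov3_ppe"
scp yolov3_ppe.xmodel yolov3_custom.prototxt \
    root@192.168.1.100:/usr/share/vitis_ai_library/models/yolov3_ppe/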
5. Results
To run tests, we used a modified version of the samples/yolov3 demo available in the Vitis-AI-Library. (Files available on Github.)
This project can be expanded to include other protective equipment such as face masks, gloves, headphones, lab coats... Many researchers have worked on improving the accuracy of real-time PPE detection models, and several interesting approaches have come out of that work; these could be used to further test the limits of the Kria KV260.
The use of pre-trained models was especially helpful, as it allowed us to focus on the hardware and worry less about the deep-learning side (i.e., data collection, data preparation, training, ...). It also meant we could get started with affordable hardware and still obtain remarkable results.