With advancing technology, video surveillance systems are becoming a cheaper, more affordable security solution, and commercial, non-commercial, office, household and government sectors are widely adopting them. Retrieving a person's information from these surveillance video storage systems is still manual, inefficient and expensive. This is the problem space we focus on.
Our goal is to detect, identify and store person details from a video surveillance system. To solve this, we use Xilinx Vitis-AI models on the VCK5000, neo4j (a graph database) and Spring Boot. From the Vitis-AI model zoo we use the face-detection and face-re-identification models, which are highly efficient and optimized for this use case. The Xilinx VCK5000 Versal development card contains 400+ AI Engines, making it well suited to running neural networks efficiently. To store and query person data we chose neo4j, which stores data as a graph of nodes and links, helping with visualization and fast retrieval. Spring Boot interacts with the back-end pipeline and provides the user interface.
Project Planning- Started this project by creating a basic block diagram.
- Identified and collected all software and hardware requirements for building this project. These are covered briefly under software setup and hardware setup.
- Then implemented each component one by one as per the block diagram; this is described in detail under project implementation.
Block diagram:
Software Setup- Install OpenJDK 11
- Install the latest version of Docker, if it is not already installed on your machine
- Ensure your Linux user is in the docker group
- Download the Vitis-AI docker image
docker pull xilinx/vitis-ai:1.4.1.978
- Clone this repository:
git clone --recurse-submodules https://github.com/durgabhavaniv/Face_Reid_Surveillance_System.git
cd Face_Reid_Surveillance_System
Hardware Setup- Download the following files from https://www.xilinx.com/member/vck5000.html#vitis and run the commands below.
- xrt_202020.2.9.317_20.04-amd64-xrt.deb
- xilinx-vck5000-es1-gen3x16-platform-2-1_all.deb.tar.gz
- xilinx-vck5000-es1-gen3x16-2-202020-1-dev_1-3123623_all.deb
sudo apt-get install ./xrt_202020.2.9.317_20.04-amd64-xrt.deb
tar -xzvf xilinx-vck5000-es1-gen3x16-platform-2-1_all.deb.tar.gz
cd ./xilinx-vck5000-es1-gen3x16-platform-2-1_all.deb/
sudo apt-get install ./xilinx-*
cd ..
sudo apt-get install ./xilinx-vck5000-es1-gen3x16-2-202020-1-dev_1-3123623_all.deb
cd ~/Face_Reid_Surveillance_System/setup/vck5000
source ./install.sh
Project Implementation- Each block in the project pipeline is explained below.
- Go to this link to start the application.
- Once the application is up and running at http://localhost:8082/, the home page appears as shown above.
- On this page, provide the camera location, date-time (yyyy-MM-dd hh:mm:ss) and camera input, and select all next or previous camera locations relative to the current camera location.
- The sample input video used in this project is shown below.
- Once you are done with all camera inputs and their connections, click the "next" button, which takes you to the video page.
- Add the faces of the persons to be re-identified to the stored_faces directory (Face_Reid_Surveillance_System/tree/master/demo/Vitis-AI-Library/stored_faces), for example:
sai_0.jpg sai_1.jpg sai_2.jpg ....
padma_0.jpg padma_1.jpg padma_2.jpg ....
vijaya_0.jpg vijaya_1.jpg vijaya_2.jpg ....
satya_0.jpg satya_1.jpg satya_2.jpg ....
Note:
Multiple images of the same person can be added with underscore suffixes (_*), which improves re-identification results.
- The video page previews the camera inputs. Once you are satisfied with all inputs, click the "process" button to start the back-end pipeline.
- The back-end pipeline starts with face detection; the application logs are shown here.
- Face detection consists of the video decode, scale and detection steps, explained as follows.
Video decode and scale:
- Decode the input video into frames using the video decoder and scale the frames to 360p.
Code :
cv::VideoCapture camera(video_file_);     // capture the input video
video_fps = camera.get(CV_CAP_PROP_FPS);  // get the video FPS

DecodeThread(int channel_id, const std::string& video_file, queue_t* queue)
    : MyThread{},
      channel_id_{channel_id},
      video_file_{video_file},
      frame_id_{0},
      video_stream_{},
      queue_{queue} {
  open_stream();
  auto& cap = *video_stream_.get();
  if (is_camera_) {
    // scale live camera frames down to 640x360
    cap.set(cv::CAP_PROP_FRAME_WIDTH, 640);
    cap.set(cv::CAP_PROP_FRAME_HEIGHT, 360);
  }
}
Face detection on decoded frames :
- Using the Xilinx Vitis-AI densebox_640_360 neural network model, we detect faces and draw a bounding box around each face in every frame.
Code :
int main(int argc, char* argv[]) {
  std::string model = argv[1];
  // run the multi-channel video demo with a FaceDetect instance per channel
  return vitis::ai::main_for_video_demo(
      argc, argv,
      [model] { return vitis::ai::FaceDetect::create(model); },
      process_result, 2);
}
- Face detection output for 5 channels is shown below.
- Here, on average, we achieve 30 FPS for face detection across 5 parallel input videos, where each input video is 1080p@30fps.
Cropping faces :
- Crop the faces from the detected frames and resize them to obtain evenly sized crops.
Code :
image_save = cv::Rect{cv::Point(r.x * image.cols, r.y * image.rows),
                      cv::Size{(int)(r.width * image.cols),
                               (int)(r.height * image.rows)}};
// keep only boxes that lie fully within the image plane
if (0 <= image_save.x && 0 < image_save.width &&
    image_save.x + image_save.width < image.cols &&
    image_save.width < image.cols &&
    image_save.height < image.rows &&
    image.cols > 0 &&
    0 <= image_save.y &&
    image_save.y + image_save.height < image.rows &&
    image.rows > 0) {
  croppedFaceImage = image(image_save).clone();
  cv::resize(croppedFaceImage, croppedFaceImage, cv::Size(100, 100));
}
Remove redundancy :
- Consecutive cropped images are compared with each other and scored against a threshold value.
- If the comparison score exceeds the threshold, the cropped image is saved; otherwise it is discarded.
Code :
if (toggle == 0) {
  toggle = 1;
  old_image = croppedFaceImage.clone();  // first crop: nothing to compare yet
}
if (toggle == 1) {
  // per-pixel equality mask between the previous and current crop
  cv::compare(old_image, croppedFaceImage, result_cmp, cv::CMP_EQ);
  cv::cvtColor(result_cmp, result_cmp, CV_BGR2GRAY);
  diff = cv::countNonZero(result_cmp);
  old_image = croppedFaceImage.clone();
  if (diff > 3000) {
    bool check = cv::imwrite(path, croppedFaceImage);
    count_num++;
  }
}
- Discarding redundant cropped images reduces processing time and complexity.
Feature extraction :
- Using the Xilinx Vitis-AI reid neural network model, we extract face features from both the cropped faces and the stored faces.
Code :
int main(int argc, char* argv[]) {
auto model_name = argv[1];
auto det = vitis::ai::Reid::create(model_name);
...
}
Face Recognition- Calculate the cosine distance between the cropped-face features and the stored-face features.
Code :
double cosine_distance(Mat feat1, Mat feat2) { return 1 - feat1.dot(feat2); }
Mat featx = det->run(imgx).feat;
Mat featy = det->run(imgy).feat;
double dismat = cosine_distance(featx, featy);
- If the cosine distance is below the threshold, the cropped face is considered a match with the stored face; otherwise it is treated as unmatched.
- Unmatched faces, with their timestamps, are stored in a text file (output_rem.txt), which can be further processed for manual face recognition.
- Matched faces are stored with name and time in a text file (output.txt).
Code :
if (dismat < 0.05) {
  // match: record "<camera_time>,<person>" in output.txt
  cout << file_without_extension_2 << "," << file_without_extension_3 << endl;
  std::string input = file_without_extension_2 + "," + file_without_extension_3;
  std::ofstream outfile;
  outfile.open("/workspace/demo/Vitis-AI-Library/output/output.txt", std::ios_base::app);
  outfile << input << endl;
} else {
  // no match: record the face in output_rem.txt for manual review
  std::string input2 = file_without_extension_2 + ",not," + file_without_extension_3;
  std::ofstream outfile2;
  outfile2.open("/workspace/demo/Vitis-AI-Library/output/output_rem.txt", std::ios_base::app);
  outfile2 << input2 << endl;
}
- The output text file includes the camera name, time and person name, as shown below.
1_Gate,6,sai
- Here 1_Gate indicates the camera name, 6 the time point in the video file, and sai the recognized person's name.
- Once face re-identification is complete and the output file is generated, the back-end process reads the output file and updates the person-to-camera relationships in GraphDB.
- We can validate this on the neo4j web interface at localhost:7474 (username: neo4j, password: test).
- Once processing is done, the message "Processing is done and ready for search" appears; we can then search by either person or camera name.
- The following image shows a person-based search; the person name is the same as the stored-faces file name, excluding the underscore suffix (_*).
- The table provides information such as the person's start and end date-time at each camera location, so we can eventually track the person.
- Along with the table, we generate a graphical representation to analyze the person's location with respect to each camera.
- The following image shows a camera-based search.
- Here the table lists all persons' start and end date-times for the selected camera, and the graphical representation shows all persons with respect to that camera.
- To start a fresh process, press the back button, which redirects to the home page; the delete-all button cleans up the DB.
- A step-by-step guide for working with the Face-re-identification Surveillance System is provided in the GitHub repository.
- Check the README for detailed instructions.
- To work on the full application, use the master branch - https://github.com/durgabhavaniv/Face_Reid_Surveillance_System/tree/master
- To work on only the hardware part, use the master-hardware branch - https://github.com/durgabhavaniv/Face_Reid_Surveillance_System/tree/master-hardware
- Enjoy your own Face-re-identification Surveillance System.
- Using the Xilinx face detection model, we achieve an average of 30 FPS while processing five parallel video inputs, where each input is 1080p@30fps, bringing processing close to real time.
- Thanks to the powerful Vitis-AI toolchain, the Xilinx-optimized models and the highly efficient VCK5000 FPGA with AI Engines, the project implementation went smoothly.
- We generate better visualizations using GraphDB and D3.js, which makes complex person tracking easier to understand.
- Comparing sequential CPU processing with parallel FPGA (VCK5000) processing of face re-identification, the FPGA gives a 30x speedup.