This project is part of a series on the subject of deploying the MediaPipe models to the edge on embedded platforms.
If you have not already seen the previous projects in this series, I recommend starting with the following projects:
These two projects describe the challenges that can be expected when deploying the MediaPipe models to embedded platforms, specifically for the Hailo-8 acceleration modules.
In this project, I will describe how to reproduce our results on the Raspberry Pi 5 AI Kit, which contains a Hailo-8L acceleration module, and report the profiling results.
Hailo Flow Overview

Hailo's AI Software Suite allows users to deploy models to the Hailo AI accelerators.
In addition to the Hailo AI accelerator devices, Hailo offers a scalable range of PCIe Gen 3.0 compatible M.2 AI accelerator modules:
This project will only cover the following Hailo AI acceleration modules:
- Hailo-8L : M.2 B+M Key (PCIe Gen 3.0, 2 lanes), 13 TOPS
For more details on the Hailo AI SW Suite, and how the MediaPipe models were compiled for Hailo-8 and Hailo-8L acceleration modules, please refer to the following project:
Setting up the Raspberry Pi 5 AI Kit

In order to set up the RPI5, please refer to the following documentation:
I purchased a pre-assembled kit from CanaKit, and followed their getting started instructions:
The first step, after configuring the Raspberry Pi OS, was to open a terminal, update the system, and install the Hailo package:
sudo apt update
sudo apt full-upgrade
sudo apt install hailo-all
sudo reboot
After rebooting, I was able to identify the Hailo-8L acceleration module:
lspci
hailortcli fw-control identify
Next, the Hailo example repository was cloned, and a virtual environment created:
git clone https://github.com/hailo-ai/hailo-rpi5-examples.git
cd hailo-rpi5-examples
source setup_env.sh
Inside the virtual environment, the python requirements were installed, and the demo resources downloaded:
pip install -r requirements.txt
./download_resources.sh
Finally, the detection demo was launched:
python basic_pipelines/detection.py --input resources/detection0.mp4
Now that we have verified that the Hailo-8L acceleration module is working, we can move on to our accelerated MediaPipe demo application.
A Note on the pre-compiled models for Hailo-8L

My understanding is that the RPI5 Hailo-8L integration was performed with Hailo AI SW Suite v2024-04, with models compiled with DFC v3.27.0. In preparation for this, I have compiled the Hailo-8L models using DFC v3.27.0:
- Hailo-8L models : blaze_models_hailo8l_dfc_v3.27.0.zip (compiled with DFC v3.27.0)
It turns out that the models compiled with the previous version of the tools also work fine, which is a testament to the forward compatibility of models in the Hailo flow.
- Hailo-8L models : blaze_hailo8l_models.zip (compiled with DFC v3.25.0)
The python application can be accessed from the following github repository:
git clone https://github.com/AlbertaBeef/blaze_app_python
cd blaze_app_python
The python demo application requires certain packages which can be installed as follows:
pip3 install tflite_runtime matplotlib plotly kaleido numpy==1.24
In order to use the python demo with the original TFLite models, they must first be downloaded from the Google web site:
cd blaze_tflite/models
source ./get_tflite_models.sh
cd ../..
In order to use the python demo with the Hailo-8L models, they must first be downloaded as follows:
cd blaze_hailo/models
source ./get_hailo8l_models.sh
unzip -o blaze_hailo8l_models.zip
cp hailo8l/*.hef .
cd ..
Although I also provide pre-compiled models for face detection, pose detection, and their landmark models, only the palm detection and hand landmark models currently work with the python demo application.
You are all set!
Launching the python application on the Raspberry Pi 5 AI Kit

The python application can launch many variations of the dual-inference pipeline, which can be filtered with the following arguments:
- --blaze : hand | face | pose
- --target : blaze_tflite | ... | blaze_hailo
- --pipeline : specific name of pipeline (can be queried with --list argument)
In order to display the complete list of supported pipelines, launch the python script as follows:
rpi5aikit@raspberrypi:~/blaze_app_python# python3 blaze_detect_live.py --list
[INFO] user@hostname : rpi5aikit@raspberrypi
[INFO] blaze_tflite supported ...
...
[INFO] blaze_hailo supported ...
...
Command line options:
--input :
--image : False
--blaze : hand,face,pose
--target : blaze_tflite,blaze_pytorch,blaze_vitisai,blaze_hailo
--pipeline : all
--list : True
--debug : False
--withoutview : False
--profilelog : False
--profileview : False
--fps : False
List of target pipelines:
...
07 hai_hand_v0_10_lite blaze_hailo/models/palm_detection_lite.hef
blaze_hailo/models/hand_landmark_lite.hef
08 hai_hand_v0_10_full blaze_hailo/models/palm_detection_full.hef
blaze_hailo/models/hand_landmark_lite.hef
...
In order to launch the Hailo-8L pipeline for hand detection and landmarks on the RPi's desktop, use the python script as follows:
python3 blaze_detect_live.py --pipeline=hai_hand_v0_10_lite
This will launch the 0.10 (lite) version of the model, compiled for Hailo-8L, as shown below:
The previous video plays in real time (it has not been sped up). It shows the frame rate at the maximum 30 fps when no hands are detected (one model running: palm detection), approximately 26-28 fps when one hand has been detected (two models running: palm detection and hand landmarks), and approximately 22-25 fps when two hands have been detected (three models running: palm detection and two hand landmarks).
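The frame-rate behaviour above can be approximated with simple arithmetic: the palm detection model always runs, and one landmark inference is added per detected hand. The sketch below is illustrative only; the per-stage latencies are hypothetical placeholders, not measured values.

```python
# Hypothetical sketch: estimating pipeline frame rate from per-model
# latencies, assuming the models run serially in a single thread.
# All latency numbers used with this function are illustrative.

CAMERA_FRAME_MS = 1000.0 / 30.0  # capture is capped at 30 fps


def estimate_fps(overhead_ms, palm_ms, landmark_ms, num_hands):
    """Estimate fps: one palm detection plus one landmark model per hand."""
    total_ms = overhead_ms + palm_ms + num_hands * landmark_ms
    # The camera caps the loop at 30 fps even when inference is faster.
    return 1000.0 / max(CAMERA_FRAME_MS, total_ms)
```

With hypothetical latencies of 25 ms overhead, 8 ms palm detection, and 7 ms per landmark model, this reproduces the observed trend: 30 fps with no hands, then decreasing fps as each detected hand adds a landmark inference.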
It is worth noting that this is running with a single-threaded python script. There is an opportunity for increased performance with a multi-threaded implementation. While the graph runner is waiting for transfers from one model's sub-graphs, another (or several other) model(s) could be launched in parallel...
There is also an opportunity to accelerate the rest of the pipeline with C++ code...
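As a rough illustration of the multi-threaded idea, the per-hand landmark inferences could be dispatched to worker threads so that one inference can proceed while another waits on device transfers. In this sketch, `infer_landmarks` is a hypothetical placeholder for the real landmark inference call, not part of the actual application.

```python
from concurrent.futures import ThreadPoolExecutor


def infer_landmarks(hand_roi):
    # Placeholder for a real hand-landmark inference call (hypothetical):
    # here it just pairs up each input value so the sketch is runnable.
    return [(v, v) for v in hand_roi]


def infer_all_hands(hand_rois):
    """Run one landmark inference per detected hand in parallel threads.

    Instead of serializing all inferences as the single-threaded script
    does, each detected hand's ROI is handed to its own worker thread.
    """
    if not hand_rois:
        return []
    with ThreadPoolExecutor(max_workers=len(hand_rois)) as pool:
        return list(pool.map(infer_landmarks, hand_rois))
```

The same pattern would also allow the detection and landmark stages of consecutive frames to overlap, at the cost of one frame of latency.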
Benchmarking the models on the Raspberry Pi 5 AI Kit

For reasons I have not yet resolved, the "--profileview" argument does not work well on the Raspberry Pi, so we will use the "--profilelog" argument instead.
The profiling functionality uses a test image that can be downloaded from Google as follows:
source ./get_test_images.sh
The following commands can be used to generate profile results for the hai_hand_v0_10_lite pipeline using the Hailo-8L models, and the test image:
rm blaze_detect_live.csv
python3 blaze_detect_live.py --pipeline=hai_hand_v0_10_lite --image --withoutview --profilelog
mv blaze_detect_live.csv blaze_detect_live_rpi5aikit_hai_hand_v0_10_lite.csv
The following commands can be used to generate profile results for the tfl_hand_v0_10_lite pipeline using the TFLite models, and the test image:
rm blaze_detect_live.csv
python3 blaze_detect_live.py --pipeline=tfl_hand_v0_10_lite --image --withoutview --profilelog
mv blaze_detect_live.csv blaze_detect_live_rpi5aikit_tfl_hand_v0_10_lite.csv
The same is done for the hai_hand_v0_10_full and tfl_hand_v0_10_full pipelines.
The results of all .csv files were averaged, then plotted using Excel.
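The averaging and speedup calculation can also be sketched in Python instead of a spreadsheet. The column name `total_s` below is an assumption for illustration; inspect the header of the generated .csv file for the actual per-stage latency columns.

```python
import csv
from statistics import mean


def average_latency(csv_path, column):
    """Average one latency column from a blaze_detect_live profile log.

    The column name is a hypothetical placeholder; check the .csv
    header for the real column names.
    """
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    return mean(float(row[column]) for row in rows)


def speedup(tflite_latency_s, hailo_latency_s):
    # Acceleration ratio of the Hailo-8L pipeline relative to the
    # TFLite reference: larger is faster.
    return tflite_latency_s / hailo_latency_s
```

For example, averaging the same column from the Hailo-8L and TFLite logs and dividing the two gives the acceleration ratio directly.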
Here are the profiling results for the 0.07 and 0.10 versions of the models deployed with Hailo-8L, in comparison to the reference TFLite models:
If we plot the acceleration ratios of the execution times for the Hailo-8L models with respect to the TFLite models, we get the following results:
The models were accelerated by a factor of 5.6x, and the full palm+hand pipeline was accelerated by a factor of 3.8x.
Again, it is worth noting that these benchmarks have been taken with a single-threaded python script. There is additional opportunity for acceleration with a multi-threaded implementation. While the graph runner is waiting for transfers from one model's sub-graphs, another (or several other) model(s) could be launched in parallel...
There is also an opportunity to accelerate the rest of the pipeline with C++ code...
Conclusion

I hope this project will inspire you to implement your own custom application.
What applications would you like to see built on top of these foundational MediaPipe models?
Let me know in the comments...
Acknowledgements

I want to thank Gianluca Filippini (EBV) for his pioneering work with the Hailo-8 AI Accelerator module, and for bringing this marvel to my attention. His feedback, guidance, and insight have been invaluable.
Version History

- 2024/09/16 - Initial Version
References

- [Google] MediaPipe Solutions Guide : https://ai.google.dev/edge/mediapipe/solutions/guide
- [Hailo] Hailo AI SW Suite Documentation : https://hailo.ai/products/hailo-software/hailo-ai-software-suite
- [Hailo] Hailo Developer Zone : https://hailo.ai/developer-zone
- [AlbertaBeef] blaze_app_python : https://github.com/AlbertaBeef/blaze_app_python
- [Hackster] Blazing Fast Models
- [Hackster] Accelerating the MediaPipe models with Hailo-8