Please note that there is a more recent version of this project:
Vitis-AI 2.0 Flow for Avnet VITIS Platforms
Introduction

This guide provides detailed instructions for targeting the Xilinx Vitis-AI 1.4 flow to the following Avnet Vitis 2021.1 platforms:
- Ultra96-V2 Development Board
- UltraZed-EV SOM (7EV) + FMC Carrier Card
- UltraZed-EG SOM (3EG) + IO Carrier Card
This guide will describe how to download and install the pre-built SD card images, and execute the AI applications on the hardware.
IMPORTANT NOTE : The Ultra96-V2 Development Board requires a PMIC firmware update. See section "Known Issues - Ultra96-V2 PMIC firmware update" below for more details.
Design Overview

The following block diagram illustrates the hardware designs included in the pre-built images.
These designs were built using the Vitis flow with the following DPU configurations:
- u96v2_sbc_base : 1 x B2304 (low RAM usage), 200MHz/400MHz
- uz7ev_evcc_base : 2 x B4096 (low RAM usage), 300MHz/600MHz
- uz3eg_iocc_base : 1 x B2304 (low RAM usage), 150MHz/300MHz
The pre-built images include compiled models for the following two distinct configurations:
- B2304_LR : B2304 DPU with low RAM usage
- B4096_LR : B4096 DPU with low RAM usage
Note that the B4096_LR configuration is the same as on the ZCU102 & ZCU104 pre-built images from Xilinx.
The following images capture the resource utilization, with and without the DPU, for each of the platforms: first as utilization tables, then as resource placement views.
Pre-built SD card images have been provided for the following Avnet platforms:
- u96v2_sbc_base : Ultra96-V2 Development Board
- uz7ev_evcc_base : UltraZed-EV SOM (7EV) + FMC Carrier Card
- uz3eg_iocc_base : UltraZed-EG SOM (3EG) + IO Carrier Card
You will need to download one of the following pre-built SD card images:
- u96v2_sbc_base :
http://avnet.me/avnet-u96v2_sbc_base-vitis-ai-1.4-image
(2021-10-14 - MD5SUM = af763fa2d43cbef75bd93d0be173951f)
- uz7ev_evcc_base :
http://avnet.me/avnet-uz7ev_evcc_base-vitis-ai-1.4-image
(2021-10-14 - MD5SUM = 9332be7effcf500cff304ac27a8746cd)
- uz3eg_iocc_base :
http://avnet.me/avnet-uz3eg_iocc_base-vitis-ai-1.4-image
(2021-10-14 - MD5SUM = 3763cf017478791c3323425a5d20ca75)
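Once a download completes, it is worth checking the image against its published MD5SUM before flashing. A minimal sketch of the check is shown below against a stand-in file so it can be run anywhere; in practice, substitute the downloaded image file and the matching checksum from the list above.

```shell
# Verify a file against a known MD5 checksum; "sample.img" stands in for the
# downloaded SD card image, and its checksum is computed here for illustration.
printf 'sample image data\n' > sample.img
sum=$(md5sum sample.img | cut -d' ' -f1)
echo "${sum}  sample.img" | md5sum -c -   # prints "sample.img: OK" when it matches
rm sample.img
```

Note that `md5sum -c` expects two spaces between the checksum and the filename.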
Each board specific SD card image contains the hardware design (BOOT.BIN, dpu.xclbin), as well as the petalinux images (boot.scr, image.ub, rootfs.tar.gz). It is provided in image (IMG) format, and contains two partitions:
- BOOT – partition of type FAT (size=400MB)
- ROOTFS – partition of type EXT4
The first BOOT partition was created with a size of 400MB, and contains the following files:
- BOOT.BIN
- boot.scr
- image.ub
- init.sh
- platform_desc.txt
- dpu.xclbin
- arch.json
The second ROOTFS partition contains the rootfs.tar.gz content, and is pre-installed with the Vitis-AI runtime packages, as well as the following directories:
- /home/root/install/vitis-ai-runtime-1.4.0, which includes:
  - Vitis-AI 1.4 runtime installation packages
- /home/root/dpu_sw_optimize
- /home/root/Vitis-AI, which includes:
  - pre-built VART samples
  - pre-built Vitis-AI-Library samples
Once downloaded and extracted, the .img file can be programmed to a 16GB micro SD card.
0. Extract the archive to obtain the Avnet-{platform}-Vitis-AI-1-4-{date}.img file
1. Program the board specific SD card image to a 16GB (or larger) micro SD card using Balena Etcher (available for Windows and Linux).
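On a Linux host, `dd` can be used as an alternative to Balena Etcher. The sketch below exercises the same copy-and-verify flow against a file-backed target so it is safe to run anywhere; in practice, `if=` would be the extracted .img file, `of=` would be the SD card device (found with lsblk), and the command would be run with sudo.

```shell
# Stand-in for flashing: copy an image to a target file and verify the copy.
dd if=/dev/zero of=demo.img bs=1M count=4 2>/dev/null           # stand-in .img file
dd if=demo.img of=demo_target.img bs=1M conv=fsync 2>/dev/null  # the "flash" step
cmp demo.img demo_target.img && echo "copy verified"
rm demo.img demo_target.img
```

The `conv=fsync` option forces the data to be flushed to the target before `dd` exits, which matters when the target is a real SD card.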
Step 2 - Installing the Vitis-AI runtime packages

Some of the configuration steps only need to be performed once (after the first boot), including the following:
2. Boot the target board with the micro SD card that was created in the previous section
3. After boot, copy the dpu.xclbin file to the /usr/lib directory
$ cp /media/sd-mmcblk0p1/dpu.xclbin /usr/lib/.
4. Next, install the Vitis-AI 1.4 runtime packages
$ cd ~/install/vitis-ai-runtime-1.4.0
$ source ./setup.sh
This script will perform the following steps:
- install the packages (taken from https://github.com/Xilinx/Vitis-AI/tree/v1.4/setup/mpsoc/VART/2021.1 )
- modify the /etc/vart.conf to point to the /usr/lib/dpu.xclbin
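For reference, after setup.sh completes, /etc/vart.conf should contain a single firmware line pointing at the copied overlay (shown here as a sketch, based on the format the VART runtime expects):

```
firmware: /usr/lib/dpu.xclbin
```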
5. Launch the dpu_sw_optimize.sh script
$ cd ~/dpu_sw_optimize/zynqmp
$ source ./zynqmp_dpu_optimize.sh
This script will perform the following steps:
- Auto resize SD card’s second (EXT4) partition
- QoS configuration for DDR memory
6. Validate the Vitis-AI runtime with the xdputil utility.
For the u96v2_sbc_base target, this should correspond to the following output:
$ xdputil query
{
"DPU IP Spec":{
"DPU Core Count":1,
"DPU Target Version":"v1.4.1",
"IP version":"v3.3.0",
"generation timestamp":"2021-06-07 19-15-00",
"git commit id":"df4d0c7",
"git commit time":2106071910,
"regmap":"1to1 version"
},
"VAI Version":{
"libvart-runner.so":"Xilinx vart-runner Version: 1.4.0-fa49b842f283242091476cf8e1ae4d242a2a838e 585 2021-07-13-18:50:52",
"libvitis_ai_library-dpu_task.so":"Xilinx vitis_ai_library dpu_task Version: 1.4.0-93d2e0097889ccebb5f1256e0afe8409de81f
482 585 2021-07-13 10:54:30 [UTC] ",
"libxir.so":"Xilinx xir Version: xir-ff89b11dcabb00eef6d148fcf660c8e6d02eb184 2021-07-13-18:49:08",
"target_factory":"target-factory.1.4.0 ce1b39e329cc06cb7545e8aa39174fb8b9969f0b"
},
"kernels":[
{
"DPU Arch":"DPUCZDX8G_ISA0_B2304_MAX_BG2",
"DPU Frequency (MHz)":200,
"IP Type":"DPU",
"Load Parallel":2,
"Load augmentation":"enable",
"Load minus mean":"disable",
"Save Parallel":2,
"XRT Frequency (MHz)":200,
"cu_addr":"0xb0000000",
"cu_handle":"0xaaaafbf9ac50",
"cu_idx":0,
"cu_mask":1,
"cu_name":"DPUCZDX8G:DPUCZDX8G_1",
"device_id":0,
"fingerprint":"0x1000020f6014405",
"name":"DPU Core 0"
}
]
}
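One useful field in this output is the kernel fingerprint, which must match the arch.json used when compiling models (see Appendix 1). A sketch of pulling it out of the JSON is shown below against a captured sample file so it runs off-target; on the board, you would pipe the output of `xdputil query` instead of reading a file.

```shell
# Extract the DPU fingerprint from xdputil-style JSON output.
cat > query.json <<'EOF'
{"kernels":[{"fingerprint":"0x1000020f6014405"}]}
EOF
python3 -c 'import json; print(json.load(open("query.json"))["kernels"][0]["fingerprint"])'
rm query.json
```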
Step 3 - Execute the AI applications on hardware

The steps described in this section need to be done after each boot in order to run the AI examples.
7. If running on the Ultra96-V2 development board, re-launch the dpu_sw_optimize.sh script in order to optimize the DDR's QoS configuration for the DisplayPort monitor (otherwise, you may experience glitches on the monitor).
$ cd ~/dpu_sw_optimize/zynqmp
$ source ./zynqmp_dpu_optimize.sh
This script will perform the following steps:
- Auto resize SD card’s second (EXT4) partition
- QoS configuration for DDR memory
8. [Optional] Disable the dmesg verbose output:
$ dmesg -D
This can be re-enabled with the following:
$ dmesg -E
9. Define the DISPLAY environment variable
$ export DISPLAY=:0.0
10. Change the resolution of the DP monitor to a lower resolution, such as 640x480
$ xrandr --output DP-1 --mode 640x480
11. Launch the C++ based VART sample applications
a. Launch the adas_detection application
$ cd ~/Vitis-AI/demo/VART/adas_detection
$ ./adas_detection ./video/adas.avi /usr/share/vitis_ai_library/models/yolov3_adas_pruned_0_9/yolov3_adas_pruned_0_9.xmodel
b. Launch the pose_detection application
$ cd ~/Vitis-AI/demo/VART/pose_detection
$ ./pose_detection ./video/pose.mp4 /usr/share/vitis_ai_library/models/sp_net/sp_net.xmodel /usr/share/vitis_ai_library/models/ssd_pedestrian_pruned_0_97/ssd_pedestrian_pruned_0_97.xmodel
c. Launch the caffe version of the resnet50 application
$ cd ~/Vitis-AI/demo/VART/resnet50
$ ./resnet50 /usr/share/vitis_ai_library/models/resnet50/resnet50.xmodel
d. Launch the segmentation application
$ cd ~/Vitis-AI/demo/VART/segmentation
$ ./segmentation ./video/traffic.mp4 /usr/share/vitis_ai_library/models/fpn/fpn.xmodel
e. Launch the video_analysis application
$ cd ~/Vitis-AI/demo/VART/video_analysis
$ ./video_analysis ./video/structure.mp4 /usr/share/vitis_ai_library/models/ssd_traffic_pruned_0_9/ssd_traffic_pruned_0_9.xmodel
For the Vitis-AI-Library applications, refer to each sample directory’s “readme” file for details on how to execute the applications.
12. Launch the python based VART sample applications
a. Launch the resnet50_mt_py application
$ cd ~/Vitis-AI/demo/VART/resnet50_mt_py
$ python3 ./resnet50.py 8 /usr/share/vitis_ai_library/models/resnet50/resnet50.xmodel
The python script will apply the resnet50 to a batch of images. When done, you will see a result similar to the following (for Ultra96-V2):
FPS=27.00, total frames = 2880.00 , time=106.653120 seconds
b. Launch the inception_v1_mt_py application
$ cd ~/Vitis-AI/demo/VART/inception_v1_mt_py
$ python3 ./inception_v1.py 8 /usr/share/vitis_ai_library/models/inception_v1/inception_v1.xmodel
The python script will apply the inception_v1 model to a batch of images. When done, you will see a result similar to the following (for Ultra96-V2):
71.98 FPS
13. Launch the Vitis-AI-Library based sample applications
a. Launch the face_detect application with both variants of the densebox model (pass “0” as the second argument to select the USB camera)
$ cd ~/Vitis-AI/demo/Vitis-AI-Library/samples/facedetect
$ ./test_video_facedetect densebox_640_360 0
$ ./test_video_facedetect densebox_320_320 0
b. Compare the performance of each variant of the densebox models
$ ./test_performance_facedetect densebox_640_360 ./test_performance_facedetect.list
$ ./test_performance_facedetect densebox_320_320 ./test_performance_facedetect.list
Known Issues – Using the e-Con USB3 camera with Ultra96-V2

It has been observed that the Ultra96-V2 will have issues with some USB3 cameras, such as certain e-con Systems models.
The solution is to connect these cameras in USB2 mode by connecting them via a USB2 hub/extender.
Known Issues – Ultra96-V2 PMIC firmware update

For the Ultra96-V2 Development Board, an important PMIC firmware update is required to run all of the AI applications.
Without the PMIC firmware update, the following AI applications will cause periodic peak current that exceeds the default 4A fault threshold, causing the power on reset to assert, and thus the board to reboot.
- adas_detection
- inception_v1_mt
- resnet50_mt
- segmentation
- video_analysis
The PMIC firmware update increases this fault threshold and prevents the reboot from occurring.
In order to update the PMIC firmware of your Ultra96-V2 development board, refer to the Ultra96-V2’s Getting Started Guide:
Ultra96-V2 Getting Started Guide
Chapter “14 - PMIC Version Check and Update” describes how to update the PMIC firmware, if needed.
Appendix 1 - Compile the Models from the Xilinx Model Zoo

The Xilinx Model Zoo is a repository of free pre-trained deep learning models, optimized for inference deployment on Xilinx™ platforms.
This appendix will describe how to compile models from the Xilinx AI-Model-Zoo using two approaches:
- Manually compiling one model at a time
- Automatically compiling all models (scripted)
Manually compiling one model at a time
It is important to know the correlation between model and application. The table below includes a non-exhaustive list of applications that were verified with corresponding models from the model zoo.
1. The first step, if not done so already, is to clone the “v1.4” branch of the Vitis-AI repository:
$ git clone -b v1.4 https://github.com/Xilinx/Vitis-AI
$ cd Vitis-AI
$ export VITIS_AI_HOME=$PWD
2. The second step is to inspect the model.yaml file for the specific model from the Xilinx Model Zoo. For example, for the 640x360 version of the densebox model:
$ cd $VITIS_AI_HOME/models/AI-Model-Zoo
$ cat model-list/cf_densebox_wider_360_640_1.11G_1.4/model.yaml
...
description: face detection model.
input size: 360*640
float ops: 1.11G
task: face detection
framework: caffe
prune: 'no'
version: 1.4
files:
- name: cf_densebox_wider_360_640_1.11G_1.4
type: float & quantized
board: GPU
download link: https://www.xilinx.com/bin/public/openDownload?filename=cf_densebox_wider_360_640_1.11G_1.4.zip
checksum: e7a2fb60638909db368ab6bb6fa8283e
- name: densebox_640_360
type: xmodel
board: zcu102 & zcu104 & kv260
download link: https://www.xilinx.com/bin/public/openDownload?filename=densebox_640_360-zcu102_zcu104_kv260-r1.4.0.tar.gz
checksum: 101bce699b9dada0e97fdf0c95aa809f
...
license: https://github.com/Xilinx/Vitis-AI/blob/master/LICENSE
We can see that Xilinx provides several versions of the model, including:
- float & quantized : pre-quantized model, used as source for compilation
- zcu102 & zcu104 & kv260 : pre-built model binaries for zcu102/zcu104 boards
- etc…
3. The third step is to download the source archive for the model, and extract it
$ wget https://www.xilinx.com/bin/public/openDownload?filename=cf_densebox_wider_360_640_1.11G_1.4.zip -O cf_densebox_wider_360_640_1.11G_1.4.zip
$ unzip cf_densebox_wider_360_640_1.11G_1.4.zip
4. Do the same for the other models that you want to compile
$ wget https://www.xilinx.com/bin/public/openDownload?filename=cf_resnet50_imagenet_224_224_7.7G_1.4.zip -O cf_resnet50_imagenet_224_224_7.7G_1.4.zip
$ unzip cf_resnet50_imagenet_224_224_7.7G_1.4.zip
$ wget https://www.xilinx.com/bin/public/openDownload?filename=tf_resnetv1_50_imagenet_224_224_6.97G_1.4.zip -O tf_resnetv1_50_imagenet_224_224_6.97G_1.4.zip
$ unzip tf_resnetv1_50_imagenet_224_224_6.97G_1.4.zip
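The repeated wget/unzip pairs above can be scripted as a loop. The sketch below echoes the commands rather than running them (a dry run, so it works without network access); remove the `echo` prefixes to perform the actual downloads.

```shell
# Dry-run loop over the model-zoo source archives downloaded in steps 3 and 4.
for m in cf_densebox_wider_360_640_1.11G_1.4 \
         cf_resnet50_imagenet_224_224_7.7G_1.4 \
         tf_resnetv1_50_imagenet_224_224_6.97G_1.4; do
  echo wget "https://www.xilinx.com/bin/public/openDownload?filename=${m}.zip" -O "${m}.zip"
  echo unzip "${m}.zip"
done
```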
5. Copy the architecture file (arch.json) for your hardware platform. For the pre-built images, this file can be found in the BOOT partition of the design’s SD card.
$ cp {path_to_arch_json}/arch.json .
This file should contain content similar to the following:
{"fingerprint":"..."}
The fingerprint is an encoded value that uniquely identifies the DPU configuration for the design.
For the “uz7ev_evcc” design (which has the B4096_LR DPU configuration), the following arch.json can also be used:
{"target":"DPUCZDX8G_ISA0_B4096_MAX_BG2"}
For the “u96v2_sbc” and “uz3eg_iocc” designs (which have the B2304_LR DPU configuration), the following arch.json can also be used:
{"target":"DPUCZDX8G_ISA0_B2304_MAX_BG2"}
6. Launch the Vitis-AI docker container
6.1 If not done so already, pull version 1.4.916 of the docker container with the following command:
docker pull xilinx/vitis-ai:1.4.916
6.2 Launch version 1.4.916 of the Vitis-AI docker from the Vitis-AI directory:
$ cd $VITIS_AI_HOME
$ sh -x docker_run.sh xilinx/vitis-ai:1.4.916
7. When prompted, read all the license notification messages, and press ENTER to accept the license terms.
8. Navigate to the AI-Model-Zoo directory
$ cd models/AI-Model-Zoo
9. Create a directory for the compiled models
$ mkdir compiled_output
10. Create a generic recipe for compiling a caffe model, by creating a script named “compile_cf_model.sh” with the following content:
# compile_cf_model.sh : compiles a quantized caffe model from the model zoo
# usage : source ./compile_cf_model.sh <model_name> <modelzoo_name>
model_name=$1
modelzoo_name=$2
vai_c_caffe \
--prototxt ./${modelzoo_name}/quantized/deploy.prototxt \
--caffemodel ./${modelzoo_name}/quantized/deploy.caffemodel \
--arch ./arch.json \
--output_dir ./compiled_output/${model_name} \
--net_name ${model_name}
11. To compile the caffe model used by the face_detection application, invoke the generic script we just created as follows:
$ conda activate vitis-ai-caffe
(vitis-ai-caffe) $
source ./compile_cf_model.sh densebox_640_360 cf_densebox_wider_360_640_1.11G_1.4
12. To compile the caffe model used by the resnet50 application, invoke the generic script we just created as follows:
$ conda activate vitis-ai-caffe
(vitis-ai-caffe) $
source ./compile_cf_model.sh resnet50 cf_resnet50_imagenet_224_224_7.7G_1.4
13. Create a generic recipe for compiling a tensorflow model, by creating a script called “compile_tf_model.sh” with the following content:
# compile_tf_model.sh : compiles a quantized tensorflow model from the model zoo
# usage : source ./compile_tf_model.sh <model_name> <modelzoo_name>
model_name=$1
modelzoo_name=$2
vai_c_tensorflow \
--frozen_pb ./${modelzoo_name}/quantized/quantize_eval_model.pb \
--arch ./arch.json \
--output_dir ./compiled_output/${model_name} \
--net_name ${model_name}
14. To compile the tensorflow model used by the resnet50 application, invoke the generic script we just created as follows:
$ conda activate vitis-ai-tensorflow
(vitis-ai-tensorflow) $
source ./compile_tf_model.sh resnet_v1_50_tf tf_resnetv1_50_imagenet_224_224_6.97G_1.4
15. Verify the contents of the directory with the tree utility:
$ tree compiled_output
compiled_output/
├── densebox_640_360
│ ├── densebox_640_360_org.xmodel
│ ├── densebox_640_360.xmodel
│ ├── meta.json
│ └── md5sum.txt
├── resnet50
│ ├── meta.json
│ ├── md5sum.txt
│ ├── resnet50_org.xmodel
│ └── resnet50.xmodel
└── resnet_v1_50_tf
├── meta.json
├── md5sum.txt
├── resnet_v1_50_tf_org.xmodel
└── resnet_v1_50_tf.xmodel
3 directories, 12 files
16. Exit the tools docker
$ exit
Automatically compiling all models
The previous instructions can be tedious to perform for the entire model zoo, which includes over 100 pre-quantized models. Support has also been added for PyTorch and TensorFlow 2 models.
In order to compile the entire model zoo, the following automated script can be used:
https://github.com/Avnet/vitis/blob/2021.1/app/zoo/compile_modelzoo.sh
This script will scan all of the model.yaml files in the model-list sub-directories, and perform the following automatically:
- download source (float & quantized) archive
- download target (zcu102 & zcu104 & kv260) archive
- compile model from source archive
- copy {model}.prototxt file from target archive (required for Vitis-AI-Library models)
- copy {model}_officialcfg.prototxt file from target archive
This script will be able to compile most of the 100+ models.
The compiled models will be output to the following directory:
vitis_ai_library/models
This content should be copied to the following location on a custom embedded platform
/usr/share/vitis_ai_library/models
Appendix 2 – Rebuilding the Design

This section describes how to re-build this design.
The DPU-enabled designs were built with Vitis.
With this in mind, the first step is to create a Vitis platform, which requires a Linux machine with the Vitis 2021.1 tools correctly installed.
The following commands will clone the Avnet “bdf”, “hdl”, “petalinux”, and “vitis” repositories, all needed to re-build the Vitis platforms:
git clone https://github.com/Avnet/bdf
git clone -b 2021.1 https://github.com/Avnet/hdl
git clone -b 2021.1 https://github.com/Avnet/petalinux
git clone -b 2021.1 https://github.com/Avnet/vitis
Then, from the “vitis” directory, run make and specify one of the following targets
- u96v2_sbc : will re-build the Vitis platform for the Ultra96-V2 Development Board
- uz7ev_evcc : will re-build the Vitis platform for the UltraZed-EV SOM (7EV) + FMC Carrier Card
- uz3eg_iocc : will re-build the Vitis platform for the UltraZed-EG SOM (3EG) + IO Carrier Card
Also specify which build steps you want to perform, in order:
- xsa : will re-build the Vivado project for the hardware design
- plnx : will re-build the petalinux project for the software
- sysroot : will re-build the root file system, used for cross-compilation on the host
- pfm : will re-build the Vitis platform
As an example, to rebuild the Vitis platform for the Ultra96-V2, use the following commands:
cd vitis
make u96v2_sbc step=xsa
make u96v2_sbc step=plnx
make u96v2_sbc step=sysroot
make u96v2_sbc step=pfm
With the Vitis platform built, you can build the DPU-TRD, as follows:
make u96v2_sbc step=dpu
For reference, this build step performs the following:
- clone branch v1.4 of the Vitis-AI repository (if not done so already)
- copy the DPU-TRD to the projects directory, and rename it to {platform}_dpu
- copy the following three files from the vitis/app/dpu directory:
  - Makefile : modified Makefile
  - dpu_conf.vh : modified DPU configuration file specifying DPU architecture, etc.
  - config_file/prj_config : modified configuration file specifying DPU clocks & connectivity
- build design with make
This will create a SD card image in the following directory:
vitis/projects/{platform}_dpu/prj/Vitis/binary_container_1/sd_card.img
Where {platform} will be something like “u96v2_sbc_base_2021_1”.
This SD card image can be programmed to the SD card, as described previously in this tutorial. However, it does not yet contain all the installed runtime packages and pre-compiled applications.
In order to complete the full installation, you will need to follow the instructions in the following sections of the Vitis-AI repository:
- Installing the Vitis AI runtime v1.4 (for Edge)
https://github.com/Xilinx/Vitis-AI/blob/v1.4/setup/mpsoc/VART/README.md
- Installing the VART examples
https://github.com/Xilinx/Vitis-AI/tree/v1.4/demo/VART
as well as the images/video files:
vitis_ai_runtime_r1.4.0_image_video.tar.gz
- Installing the Vitis AI Library examples
https://github.com/Xilinx/Vitis-AI/tree/v1.4/demo/Vitis-AI-Library
as well as the image/video files:
vitis_ai_library_r1.4.0_images.tar.gz
vitis_ai_library_r1.4.0_video.tar.gz
- Install the compiled models to the /usr/share/vitis_ai_library/models directory, as described below
With the DPU-TRD design built, you can compile the AI-Model-Zoo for this design, as follows:
make u96v2_sbc step=zoo
For reference, this build step performs the following:
- clone branch v1.4 of the Vitis-AI repository (if not done so already)
- copy the models/AI-Model-Zoo to the projects directory, and rename it to {platform}_zoo
- copy the following file from the vitis/app/zoo directory:
  - compile_modelzoo.sh : script to compile all models
In order to perform the actual compilation (e.g. for u96v2_sbc), perform the steps described below:
==================================================================
Instructions to build AI-Model-Zoo for {platform} platform:
==================================================================
cd projects/{platform}_zoo/.
./docker_run.sh xilinx/vitis-ai:1.4.916
source ./compile_modelzoo.sh
==================================================================
Additional Information:
- to compile only one (or a few) models,
remove unwanted model sub-directories from model-list directory
==================================================================
This will create compiled models in the following directory:
vitis/projects/{platform}_zoo/vitis_ai_library/models
Conclusion

I hope this tutorial, with its pre-built SD card image, will help you to get started quickly with Vitis-AI 1.4 on the Avnet platforms.
If there is any other related content that you would like to see, please share your thoughts in the comments below.
Known Issues and Next Steps

The following models from the model zoo are not provided in the pre-built images, due to unresolved compilation issues:
- pt_pointpainting_nuscenes_1.4 (pointpainting_nuscenes_40000_64_[0|1]_pt, semanticfpn_nuimage_576_320_pt)
- pt_pointpillars_nuscenes_1.4 (pointpillars_nuscenes_40000_64_[0|1]_pt)
- pt_sa-gate_NYUv2_360_360_178G_1.4 (SA_gate_pt)
- tf_rcan_DIV2K_360_640_0.98_86.95G_1.4 (rcan_pruned_tf)
The following torchvision models are not provided in the pre-built images, since they cannot be re-compiled (no source provided by Xilinx):
- torchvision_inception_v3 (inception_v3_pt)
- torchvision_resnet50 (resnet50_pt)
- torchvision_squeezenet (squeezenet_pt)
2021/09/13 - Preliminary Version - u96v2_sbc_base only
2021/10/14 - Major Update
Updated the cover animation to "Point Cloud 3D detection" demo.
Added support for following platforms:
- uz7ev_evcc_base
- uz3eg_iocc_base
Added following compiled models from model zoo:
- centerpoint_[0|1]_pt
- FADNet_[0|1|2]_pt
- pointpillars_kitti_12000_[0|1]_pt
- salsanext_pt
- salsanext_v2_pt
Thank you Ali Falahati and Xinyu Chen for asking how the cover animation was created :)