Please note that there is a more recent version of this project:
Vitis-AI 1.4 Flow for Avnet VITIS Platforms
Introduction
Avnet recently released Vitis 2019.2 platforms for several of their hardware platforms. These platforms also support the Vitis-AI flow from Xilinx.
The Xilinx Vitis-AI repository ( github.com/Xilinx/Vitis-AI ) provides an excellent tutorial called DPU-TRD on targeting the DPU AI engine to a custom Vitis platform.
This guide is part 1 of 2 which provides detailed instructions for targeting the Xilinx Vitis-AI 1.1 flow to the following Avnet Vitis 2019.2 platforms:
- Ultra96-V2 Development Board
- UltraZed-EV SOM (7EV) + FMC Carrier Card
- UltraZed-EG SOM (3EG) + IO Carrier Card
- UltraZed-EG SOM (3EG) + PCIEC Carrier Card
Part 1 covers the use of the existing Avnet Vitis platforms to target the Vitis-AI 1.1 DNNDK API based application examples.
Part 2 will cover how to modify the Avnet Vitis platforms in order to target the Vitis-AI 1.1 VART based application examples.
Once the tools have been set up, there are five (5) main steps to targeting an AI application to one of the Avnet platforms:
- 1 - Build the Hardware Design
- 2 - Compile the Model from the Xilinx AI Model Zoo
- 3 - Build the AI applications
- 4 - Create the SD card content
- 5 - Execute the AI applications on hardware
IMPORTANT NOTE : The Ultra96-V2 Development Board requires a PMIC firmware update. See section "Known Issues - Ultra96-V2 PMIC firmware update" below for more details.
Setup - Install the Xilinx Tools
This project requires the following tools:
- Vitis 2019.2 Unified Software Platform
- Docker
- Vitis-AI v1.1
Refer to Xilinx Vitis Unified Software Platform for instructions on installing Vitis 2019.2 on your Linux machine.
Refer to Install Docker for instructions on installing Docker on your Linux machine.
Next, clone v1.1 of the Vitis-AI git repository.
1. Clone Xilinx’s Vitis-AI github repository:
$ git clone https://github.com/Xilinx/Vitis-AI
$ cd Vitis-AI
$ git checkout v1.1
$ export VITIS_AI_HOME="$PWD"
Setup - Install the Avnet Vitis platforms
This guide can be used for any Vitis platform, which will be denoted by {platform}.
For example:
- ULTRA96V2 : Ultra96-V2 Development Board
- UZ7EV_EVCC : UltraZed-EV SOM (7EV) + FMC Carrier Card
- UZ3EG_IOCC : UltraZed-EG SOM (3EG) + IO Carrier Card
- UZ3EG_PCIEC : UltraZed-EG SOM (3EG) + PCIEC Carrier Card
1. Download the Vitis platform for the appropriate board using one of the links below, and extract it to the hard drive of your Linux machine:
- ULTRA96V2 : http://avnet.me/ultra96v2-vitis-2019.2
- UZ7EV_EVCC : http://avnet.me/uz7ev-evcc-vitis-2019.2
- UZ3EG_IOCC : http://avnet.me/uz3eg-iocc-vitis-2019.2
- UZ3EG_PCIEC : http://avnet.me/uz3eg-pciec-vitis-2019.2
2. Specify the location of the Vitis platform by creating the SDX_PLATFORM environment variable, which points to the location of the .xpfm file.
For the ULTRA96V2 platform, this should look similar to the following:
$ export SDX_PLATFORM=/home/Avnet/vitis/platform_repo/ULTRA96V2/ULTRA96V2.xpfm
For the UZ7EV_EVCC platform, this should look similar to the following:
$ export SDX_PLATFORM=/home/Avnet/vitis/platform_repo/UZ7EV_EVCC/UZ7EV_EVCC.xpfm
For the rest of this document, the platform will be denoted by {platform}.
Replace all instances of “{platform}” with the appropriate platform name, such as “ULTRA96V2”.
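Since every later command uses the {platform} placeholder, it can help to capture the name once in a shell variable and derive SDX_PLATFORM from it. This is a convenience sketch, not part of the official flow; the PLATFORM and PLATFORM_REPO names below are assumptions, so adjust the path to wherever you actually extracted the platform archive.

```shell
# Hypothetical convenience: set the platform name once, then derive the
# SDX_PLATFORM path from it. PLATFORM_REPO is an example location only.
PLATFORM=ULTRA96V2
PLATFORM_REPO=${HOME}/Avnet/vitis/platform_repo
export SDX_PLATFORM=${PLATFORM_REPO}/${PLATFORM}/${PLATFORM}.xpfm
echo "${SDX_PLATFORM}"
```

Changing PLATFORM to UZ7EV_EVCC (or another board name) then updates the exported path in one place.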
Step 1 - Build the Hardware Project
The creation of the Hardware Project is well documented on Xilinx’s Vitis-AI github repository, specifically the DPU-TRD section.
https://github.com/Xilinx/Vitis-AI/tree/v1.1/DPU-TRD
DPU TRD Vitis Flow
https://github.com/Xilinx/Vitis-AI/blob/v1.1/DPU-TRD/prj/Vitis/README.md
1. Make a copy of the DPU-TRD directory for your platform:
$ cd $VITIS_AI_HOME
$ cp -r DPU-TRD DPU-TRD-{platform}
$ export TRD_HOME=$VITIS_AI_HOME/DPU-TRD-{platform}
$ cd $TRD_HOME/prj/Vitis
Depending on which platform you are targeting, follow the instructions in one of the next two sections.
Step 1.1 – Build the Hardware Project for the ULTRA96V2, UZ3EG_IOCC, and UZ3EG_PCIEC platforms
1. Edit the dpu_conf.vh file to specify the architecture and configuration of the DPU, according to the available resources on the Vitis platform.
$ cd $TRD_HOME/prj/Vitis
$ vi dpu_conf.vh
Target a DPU with B2304 architecture, by making the following changes to the dpu_conf.vh file.
//`define B4096
`define B2304
Leave all other parameters the same
2. Edit the config_file/prj_config file, to specify the connectivity of the DPU core
$ vi config_file/prj_config
First, specify the number of DPU cores to instantiate in the design as 1.
[connectivity]
...
nk=dpu_xrt_top:1
Specify which frequencies to use for the 1x and 2x clocks
The Avnet Vitis platforms (ULTRA96V2, UZ3EG_IOCC, and UZ3EG_PCIEC) have the following clocks defined in their hardware design:
We will use the 150MHz & 300MHz clocks to connect the DPU.
[clock]
id=0:dpu_xrt_top_1.aclk
id=1:dpu_xrt_top_1.ap_clk_2
NOTE : the dpu_xrt_top_1.ap_clk_2 must be 2X the frequency of dpu_xrt_top_1.aclk
In order to connect up the DPU core, we also need to specify which AXI interconnects to use.
[connectivity]
sp=dpu_xrt_top_1.M_AXI_GP0:HPC0
sp=dpu_xrt_top_1.M_AXI_HP0:HP0
sp=dpu_xrt_top_1.M_AXI_HP2:HP1
NOTE : the same port can be specified twice, in which case an additional AXI interconnect will be added if needed.
Leave the other settings the same.
3. Build the DPU enabled hardware design
$ make KERNEL=DPU DEVICE={platform}
The make will build the individual DPU core, then build the complete hardware project.
The Vivado project will be located in the following directory:
DPU-TRD-{platform}/prj/Vitis/binary_container_1/link/vivado/vpl/prj/prj.xpr
4. The output binaries will be located in the following directory:
$ tree binary_container_1/sd_card
├── BOOT.BIN
├── dpu.xclbin
├── image.ub
├── README.txt
└── {platform}.hwh
NOTE : The .hwh file contains details about the hardware implementation, and will be used during model compilation.
Step 1.2 – Build the Hardware Project for the UZ7EV_EVCC platform
1. Edit the dpu_conf.vh file to specify the architecture and configuration of the DPU, according to the available resources on the Vitis platform.
$ cd $TRD_HOME/prj/Vitis
$ vi dpu_conf.vh
Keep the DPU architecture set to B4096.
`define B4096
Target usage of Ultra-RAM (specific to EV devices), by making the following changes to the dpu_conf.vh file.
//`define URAM_DISABLE
`define URAM_ENABLE
Target high RAM usage, by making the following changes to the dpu_conf.vh file.
//`define RAM_USAGE_LOW
`define RAM_USAGE_HIGH
Target high DSP48 usage, by making the following changes to the dpu_conf.vh file.
//`define DSP48_USAGE_LOW
`define DSP48_USAGE_HIGH
Leave the other parameters the same
2. Edit the config_file/prj_config file, to specify the connectivity of the DPU (and optionally SFM) cores
$ vi config_file/prj_config
First, specify the number of DPU cores to instantiate in the design as 1.
[connectivity]
...
nk=dpu_xrt_top:1
Specify which frequencies to use for the 1x and 2x clocks
The Avnet Vitis platforms (UZ7EV_EVCC) have the following clocks defined in their hardware design:
We will use the 200MHz & 400MHz clocks to connect the DPU core.
[clock]
id=4:dpu_xrt_top_1.aclk
id=5:dpu_xrt_top_1.ap_clk_2
NOTE : the dpu_xrt_top_1.ap_clk_2 must be 2X the frequency of dpu_xrt_top_1.aclk
In addition, specify the 200MHz clock to connect the SFM.
id=4:sfm_xrt_top_1.aclk
In order to connect up the DPU core (and SFM core), we also need to specify which AXI interconnects to use.
[connectivity]
sp=dpu_xrt_top_1.M_AXI_GP0:HPC0
sp=dpu_xrt_top_1.M_AXI_HP0:HP0
sp=dpu_xrt_top_1.M_AXI_HP2:HP1
NOTE : the same port can be specified twice, in which case an additional AXI interconnect will be added if needed.
Leave the other settings the same.
3. Build the DPU core + SFM core
$ make KERNEL=DPU_SM DEVICE={platform}
The make will build the individual DPU (and SFM) core(s), then build the complete hardware project.
The Vivado project will be located in the following directory:
DPU-TRD-{platform}/prj/Vitis/binary_container_1/link/vivado/vpl/prj/prj.xpr
4. The output binaries will be located in the following directory:
$ tree binary_container_1/sd_card
├── BOOT.BIN
├── dpu.xclbin
├── image.ub
├── README.txt
└── {platform}.hwh
NOTE : The .hwh file contains details about the hardware implementation, and will be used during model compilation.
Step 2 - Compile the Models from the Xilinx Model Zoo
The Xilinx Model Zoo is a repository of free pre-trained deep learning models, optimized for inference deployment on Xilinx™ platforms.
This project will concentrate on the models for which example applications have been provided. It is important to know the correlation between model and application. This table includes a non-exhaustive list of applications that were verified with corresponding models from the model zoo.
1. The first step is to download the pre-trained models from the Xilinx Model Zoo:
$ cd $VITIS_AI_HOME/AI-Model-Zoo
$ source ./get_model.sh
This will download version 1.1 of the model zoo ( all_models_1.1.zip )
2. Launch the tools docker from the Vitis-AI directory
$ cd $VITIS_AI_HOME
$ sh -x docker_run.sh xilinx/vitis-ai:latest-cpu
3. When prompted, read all the license notification messages, and press ENTER to accept the license terms.
4. Within the docker session, launch the "vitis-ai-caffe" Conda environment
$ conda activate vitis-ai-caffe
(vitis-ai-caffe) $
5. Create a modelzoo directory, and copy the hardware handoff (.hwh) file
(vitis-ai-caffe) $ cd DPU-TRD-{platform}
(vitis-ai-caffe) $ mkdir modelzoo
(vitis-ai-caffe) $ cd modelzoo
(vitis-ai-caffe) $ cp ../prj/Vitis/binary_container_1/sd_card/{platform}.hwh .
6. Use the dlet tool to generate your .dcf file
(vitis-ai-caffe) $ dlet -f {platform}.hwh
7. The previous step will generate a dcf with a name similar to dpu-11-18-2019-18-45.dcf. Rename this file to {platform}.dcf
(vitis-ai-caffe) $ mv dpu*.dcf {platform}.dcf
8. Create a file named “custom.json” with the following content
{"target": "dpuv2", "dcf": "./{platform}.dcf", "cpu_arch": "arm64"}
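Rather than typing the JSON by hand, the file can be generated with a here-document and checked for syntax before it is handed to the compiler. This is only a sketch; the ULTRA96V2 name below stands in for your actual {platform}, and the json.tool check is an optional sanity step (a stray quote here otherwise surfaces later as a confusing vai_c_caffe error).

```shell
# Sketch: generate custom.json for a given platform name, then verify that it
# parses as valid JSON before using it with --arch.
platform=ULTRA96V2
cat > custom.json <<EOF
{"target": "dpuv2", "dcf": "./${platform}.dcf", "cpu_arch": "arm64"}
EOF
python3 -m json.tool custom.json
```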
9. Create a directory for the compiled models
(vitis-ai-caffe) $ mkdir compiled_output
10. Create a generic recipe for compiling a caffe model, by creating a script named “compile_cf_model.sh” with the following content
model_name=$1
modelzoo_name=$2
vai_c_caffe \
--prototxt ../../AI-Model-Zoo/models/${modelzoo_name}/quantized/deploy.prototxt \
--caffemodel ../../AI-Model-Zoo/models/${modelzoo_name}/quantized/deploy.caffemodel \
--arch ./custom.json \
--output_dir ./compiled_output/${modelzoo_name} \
--net_name ${model_name} \
--options "{'mode': 'normal'}"
11. Compile the caffe model for the resnet50 application, using the generic script we just created:
$ conda activate vitis-ai-caffe
(vitis-ai-caffe) $ source ./compile_cf_model.sh resnet50 cf_resnet50_imagenet_224_224_7.7G
12. Compile the caffe model for the face_detection application, using the generic script we just created:
$ conda activate vitis-ai-caffe
(vitis-ai-caffe) $ source ./compile_cf_model.sh densebox cf_densebox_wider_360_640_1.11G
13. Create a generic recipe for compiling a tensorflow model, by creating a script called “compile_tf_model.sh” with the following content
model_name=$1
modelzoo_name=$2
vai_c_tensorflow \
--frozen_pb ../../AI-Model-Zoo/models/${modelzoo_name}/quantized/deploy_model.pb \
--arch ./custom.json \
--output_dir ./compiled_output/${modelzoo_name} \
--net_name ${model_name}
14. Compile the tensorflow models, using the generic script we just created:
$ conda activate vitis-ai-tensorflow
(vitis-ai-tensorflow) $ source ./compile_tf_model.sh tf_resnet50 tf_resnetv1_50_imagenet_224_224_6.97G
15. Verify the contents of the directory with the tree utility:
(vitis-ai-caffe) $ tree
├── compiled_output
│   ├── cf_densebox_wider_360_640_1.11G
│   │   ├── densebox_kernel_graph.gv
│   │   └── dpu_densebox.elf
│   ├── cf_resnet50_imagenet_224_224_7.7G
│   │   ├── dpu_resnet50_0.elf
│   │   └── resnet50_kernel_graph.gv
│   └── tf_resnetv1_50_imagenet_224_224_6.97G
│       ├── dpu_tf_resnet50_0.elf
│       └── tf_resnet50_kernel_graph.gv
├── compile_cf_model.sh
├── compile_tf_model.sh
├── custom.json
├── {platform}.dcf
└── {platform}.hwh
6 directories, 15 files
16. Exit the tools docker
(vitis-ai-caffe) $ exit
Step 3 - Compile the AI Applications
Vitis-AI 1.1 provides two different APIs: the DNNDK API and the VART API.
The DNNDK API is the low-level API used to communicate with the AI engine (DPU). This API is the recommended API for users that will be creating their own custom neural networks, targeted to the Xilinx devices.
The Vitis-AI RunTime (VART) API, and Vitis-AI-Library, provide a higher level of abstraction that simplifies development of AI applications. This API is recommended for users wishing to leverage the existing pre-trained models from the Xilinx Model Zoo in their custom applications.
Step 3.1 - Compile the DNNDK based AI Applications
This version of the tutorial only covers the DNNDK based examples, and is based on the documentation available on the Vitis-AI github repository, specifically the “mpsoc” section.
https://github.com/Xilinx/Vitis-AI/tree/v1.1/mpsoc
1. Change to the DPU-TRD-{platform} work directory.
$ cd DPU-TRD-{platform}
2. Download and install the SDK for cross-compilation, specifying a unique and meaningful installation destination (knowing that this SDK will be specific to the Vitis-AI 1.1 DNNDK samples)
$ wget -O sdk.sh https://www.xilinx.com/bin/public/openDownload?filename=sdk.sh
$ chmod +x sdk.sh
$ ./sdk.sh -d ~/petalinux_sdk_vai_1_1_dnndk
3. Setup the environment for cross-compilation
$ unset LD_LIBRARY_PATH
$ source ~/petalinux_sdk_vai_1_1_dnndk/environment-setup-aarch64-xilinx-linux
4. Download and extract the additional DNNDK runtime content to the previously installed SDK
$ wget -O vitis-ai_v1.1_dnndk.tar.gz https://www.xilinx.com/bin/public/openDownload?filename=vitis-ai_v1.1_dnndk.tar.gz
$ tar -xvzf vitis-ai_v1.1_dnndk.tar.gz
5. Install the additional DNNDK runtime content to the previously installed SDK
$ cd vitis-ai_v1.1_dnndk
$ ./install.sh $SDKTARGETSYSROOT
6. Make a working copy of the “vitis_ai_dnndk_samples” directory.
$ cp -r ../mpsoc/vitis_ai_dnndk_samples .
7. Download and extract the additional content (images and video files) for the DNNDK samples.
$ wget -O vitis-ai_v1.1_dnndk_sample_img.tar.gz https://www.xilinx.com/bin/public/openDownload?filename=vitis-ai_v1.1_dnndk_sample_img.tar.gz
$ tar -xvzf vitis-ai_v1.1_dnndk_sample_img.tar.gz
FACE_DETECTION
8. For the face_detection application, create a model directory and copy the dpu_*.elf model files we previously built
$ cd $TRD_HOME/vitis_ai_dnndk_samples/face_detection
$ mkdir model_for_{platform}
$ cp ../../modelzoo/compiled_output/cf_densebox_wider_360_640_1.11G/dpu_*.elf model_for_{platform}/.
9. For the face_detection application, edit the src/main.cc file to add the following two camera.set() calls after the VideoCapture initialization:
VideoCapture camera(0);
if (!camera.isOpened()) {
cerr << "Open camera error!" << endl;
exit(-1);
}
camera.set(CV_CAP_PROP_FRAME_WIDTH, 640);
camera.set(CV_CAP_PROP_FRAME_HEIGHT, 480);
10. For the face_detection application, copy the “model_for_{platform}” directory to “model”, then run the “make” command
$ cp -r model_for_{platform} model
$ make
NOTE : You could also edit the build.sh script to add support for the new {platform}. This is left as an optional exercise for the user. If you prefer to modify the build.sh script to add your platform, refer to the pre-built solutions for an example of how this was done.
RESNET50
11. For the resnet50 application, create a model directory and copy the dpu_*.elf model files we previously built
$ cd $TRD_HOME/vitis_ai_dnndk_samples/resnet50
$ mkdir model_for_{platform}
$ cp ../../modelzoo/compiled_output/cf_resnet50_imagenet_224_224_7.7G/dpu_*.elf model_for_{platform}/.
12. For the resnet50 application, copy the “model_for_{platform}” directory to “model”, then run the “make” command
$ cp -r model_for_{platform} model
$ make
TF_RESNET50
13. For the tf_resnet50 application, create a model directory and copy the dpu_*.elf model files we previously built
$ cd $TRD_HOME/vitis_ai_dnndk_samples/tf_resnet50
$ mkdir model_for_{platform}
$ cp ../../modelzoo/compiled_output/tf_resnetv1_50_imagenet_224_224_6.97G/dpu_*.elf model_for_{platform}/.
14. For the tf_resnet50 application, copy the “model_for_{platform}” directory to “model”, then run the “make” command
$ cp -r model_for_{platform} model
$ make
Step 4 - Create the SD card
1. Create a “sdcard” directory
$ cd DPU-TRD-{platform}
$ mkdir sdcard
2. Copy the design files (hardware + petalinux) for the DPU design to the “sdcard” directory.
$ cp prj/Vitis/binary_container_1/sd_card/* sdcard/.
3. Copy the applications to the “sdcard” directory
$ cp -r vitis_ai_dnndk_samples sdcard/.
4. Copy the Vitis-AI runtime for DNNDK to the “sdcard/runtime” directory
$ mkdir sdcard/runtime
$ cp -r vitis-ai_v1.1_dnndk sdcard/runtime/.
5. At this point, your “sdcard” directory should have the following contents
$ tree sdcard
6. Copy the contents of the “sdcard” directory to the boot partition of the SD card
7. If applicable (i.e. ULTRA96V2), extract the “rootfs.tar.gz” to the second partition of the SD card
Step 5 - Execute the AI applications on hardware
1. Boot the target board with the SD card that was created in the previous section
2. If prompted for a login, specify “root” as login and password.
3. Navigate to the sdcard folder
a. For the ULTRA96V2, this can be done as follows:
$ cd /run/media/mmcblk0p1
b. For the UZ7EV_EVCC, UZ3EG_IOCC, and UZ3EG_PCIEC, this can be done as follows:
$ cd /run/media/mmcblk1p1
4. Copy the dpu.xclbin file to the /usr/lib directory
$ cp dpu.xclbin /usr/lib/.
5. Install the Vitis-AI embedded package
$ cd runtime/vitis-ai_v1.1_dnndk
$ source ./install.sh
If the dpu.xclbin file is not manually copied to the /usr/lib directory, the install.sh script will generate an error message, since it will attempt to copy it from the /mnt directory.
cp: cannot stat ‘/mnt/dpu.xclbin’: No such file or directory
The install.sh script may also fail to install the python support, which is not critical for this tutorial
Warning: pip3 command not found, skip install python support
6. If prompted for a login, again, specify “root” as login and password
7. Re-navigate to the sdcard directory
8. Validate the Vitis-AI board package with the dexplorer utility
$ dexplorer --whoami
[DPU IP Spec]
IP Timestamp : 2020-03-26 13:30:00
DPU Core Count : 1
[DPU Core Configuration List]
DPU Core : #0
DPU Enabled : Yes
DPU Arch : B2304
DPU Target Version : v1.4.1
DPU Frequency : 300 MHz
Ram Usage : Low
DepthwiseConv : Enabled
DepthwiseConv+Relu6 : Enabled
Conv+Leakyrelu : Enabled
Conv+Relu6 : Enabled
Channel Augmentation : Enabled
Average Pool : Enabled
NOTE : Even if you have built the design for frequencies other than 150MHz/300MHz for the DPU, the dexplorer utility will still report 300MHz.
9. Define the DISPLAY environment variable
$ export DISPLAY=:0.0
10. Change the resolution of the DP monitor to 640x480
$ xrandr --output DP-1 --mode 640x480
11. Launch the DNNDK API based sample applications
$ cd vitis_ai_dnndk_samples
a. Launch the face_detection application
$ cd face_detection
$ ./face_detection
b. Press <CTRL-C> to exit the application
<CTRL-C>
$ cd ..
c. Launch the caffe version of the resnet50 application
$ cd resnet50
$ ./resnet50
d. Wait for application to finish, or Press <CTRL-C> to exit
<CTRL-C>
$ cd ..
e. Launch the tensorflow version of the resnet50 application
$ cd tf_resnet50
$ ./tf_resnet50
f. Wait for application to finish, or Press <CTRL-C> to exit
<CTRL-C>
$ cd ..
Solution – Pre-built SD card images
For convenience, pre-built SD card images have been created for the following Avnet platforms:
- ULTRA96V2 : http://avnet.me/ultra96v2-vitis-ai-1.1-image (MD5SUM = 24abc163ea04874f97826d4d28e7ce2e)
- UZ7EV_EVCC : http://avnet.me/uz7ev-evcc-vitis-ai-1.1-image (MD5SUM = c9fff42ef9954252ec2f9eb59b88ac3c)
- UZ3EG_IOCC : http://avnet.me/uz3eg-iocc-vitis-ai-1.1-image (MD5SUM = d4448130a5ea7950552bc41d72e94651)
- UZ3EG_PCIEC : http://avnet.me/uz3eg-pciec-vitis-ai-1.1-image (MD5SUM = b3aa17a4304c554b60bf5e87220352a4)
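Before flashing an image it is worth confirming the download against its published MD5 sum, since a truncated download produces a board that fails to boot in confusing ways. The sketch below uses a small stand-in file so it runs anywhere; for a real image, the checksum file would contain the published MD5SUM value followed by two spaces and the downloaded filename.

```shell
# Sketch: verify a download against a known MD5 sum before writing it to SD card.
# demo.img is a stand-in file; substitute the published sum and real filename.
printf 'example image contents' > demo.img
md5sum demo.img > demo.md5     # normally you would paste the published sum here
md5sum -c demo.md5             # reports "demo.img: OK" only when the sums match
```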
The following table describes the applications that are provided on the pre-built SD card images, as well as the command used to launch each of them:
NOTE : mmcblk#p1 denotes one of either mmcblk0p1 or mmcblk1p1, depending on which platform is being tested.
1. For the DNNDK API based AI applications, navigate to the application’s directory, and execute the provided command.
The pre-built solutions have been built with the following DPU configurations:
- UZ7EV_EVCC : B4096 (high RAM usage, high DSP48 usage), 200MHz/400MHz
- ULTRA96-V2 : B2304 (low RAM usage, low DSP48 usage), 200MHz/400MHz
- UZ3EG_IOCC : B2304 (low RAM usage, low DSP48 usage), 150MHz/300MHz
- UZ3EG_PCIEC : B2304 (low RAM usage, low DSP48 usage), 150MHz/300MHz
The pre-built solutions provide compiled models for the following two (2) DPU configurations:
- B2304_lr : B2304 DPU with low RAM usage
- B4096_hr : B4096 DPU with high RAM usage
The build.sh scripts have been modified as follows to support the four (4) Avnet platforms.
...
elif [ "$TestBoard" = "ULTRA96V2" ] || [ "$TestBoard" = "UZ3EG_IOCC" ] || [ "$TestBoard" = "UZ3EG_PCIEC" ]; then
if [ -e ./model_for_B2304_lr ]; then
echo "copy B2304 (low RAM usage) model file..."
cp -r model_for_B2304_lr ./model
else
echo "The folder named 'model_for_B2304_lr' does not exist!"
exit 1
fi
elif [ "$TestBoard" = "UZ7EV_EVCC" ]; then
if [ -e ./model_for_B4096_hr ]; then
echo "copy B4096 (high RAM usage) model file..."
cp -r model_for_B4096_hr ./model
else
echo "The folder named 'model_for_B4096_hr' does not exist!"
exit 1
fi
else
...
Known Issues – Ultra96-V2 PMIC firmware update
For the case of the Ultra96-V2 Development Board, an important PMIC firmware update is required to run all of the AI applications.
Without the PMIC firmware update, the following AI applications will cause periodic peak current that exceeds the default 4A fault threshold, causing the power on reset to assert, and thus the board to reboot.
- adas_detection
- inception_v1_mt
- resnet50_mt
- segmentation
- video_analysis
The PMIC firmware update increases this fault threshold, and prevents the reboot from occurring.
In order to update the PMIC firmware of your Ultra96-V2 development board, refer to the following instructions:
If you are unable to update the PMIC firmware on your Ultra96-V2, but still want to run all of the AI applications, you can make use of the following script (from Xilinx) to reduce the frequency of the DPU:
This script will reduce the frequency of the PL_CLK0 (100MHz) clock source that feeds the Clock Wizard, which generates the multiple clock frequencies available to Vitis.
The recommended setting for the Ultra96-V2 without PMIC firmware update is to reduce the DPU frequencies down to 125MHz/250MHz.
If you have built the design for 150MHz/300MHz, you should use a value of 83%.
If you have built the design for 200MHz/400MHz (i.e. the pre-built solution), you should use a value of 65%.
1. Execute the following commands after boot:
$ dpu_clk
Real PL0_CLK 100000000
DPU Performance 100.0%
$ dpu_clk 65
$ dpu_clk
Real PL0_CLK 65217391
DPU Performance 65.2%
This will set the PL_CLK0 frequency to approximately 65MHz, and thus the DPU frequencies to approximately 65% of their original values.
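As a worked example of the arithmetic behind these percentages: the dpu_clk readout of 65217391 Hz suggests that PL_CLK0 is divided down from a 1.5 GHz PLL clock with an integer divider (this is an assumption, but it is consistent with the readout, since 1.5 GHz / 23 = 65217391 Hz). Under that assumption, the achievable frequency for a given percentage can be sketched as:

```shell
# Sketch of the dpu_clk percentage math, assuming an integer divider from a
# 1.5 GHz PLL clock (consistent with the 65217391 Hz readout shown above).
pct=65
awk -v pct="$pct" 'BEGIN {
  target = 100000000 * pct / 100            # requested PL_CLK0, in Hz
  div    = int(1500000000 / target + 0.5)   # nearest integer divider
  actual = int(1500000000 / div)            # achievable PL_CLK0
  printf "divider=%d actual=%d Hz (%.1f%%)\n", div, actual, actual / 1000000
}'
```

With pct=65 this prints divider=23 actual=65217391 Hz (65.2%), matching the dpu_clk output above; the same math explains why the reported percentage is 65.2% rather than exactly 65%.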
Known Issues – adas_detection
Attempting to compile the “yolo” model will result in errors.
(vitis-ai-caffe) $ source ./compile_cf_model.sh yolo dk_yolov3_cityscapes_256_512_0.9_5.46G
NOTE : In version 1.0 of the Xilinx AI Model Zoo, the models had an additional directory called “compiler”, where these edits were already provided. In version 1.1 of the Xilinx AI Model Zoo, this “compiler” directory was removed, so the edits need to be done manually.
dk_yolov3_cityscapes_256_512_0.9_5.46G
When this model is converted from Darknet, during quantization, two additional terms, “yolo_height” and “yolo_width”, are automatically added in the first “Input” layer of the model.
1. These two lines need to be deleted from “quantized/deploy.prototxt”, for model compilation.
yolo_height: 256
yolo_width: 512
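The two lines can be removed with a single sed command instead of hand-editing. The sketch below creates a small stand-in prototxt so it is self-contained; against the real model, run the same sed command on quantized/deploy.prototxt (the stand-in content is illustrative, not the actual file layout).

```shell
# Sketch: strip the extra yolo_* lines with sed, shown on a stand-in file.
cat > deploy.prototxt <<'EOF'
input: "data"
yolo_height: 256
yolo_width: 512
input_shape { dim: 1 dim: 3 dim: 256 dim: 512 }
EOF
sed -i '/yolo_height:/d; /yolo_width:/d' deploy.prototxt
cat deploy.prototxt   # the yolo_* lines are gone, the rest is untouched
```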
Known Issues – mobilenet
Attempting to compile the “mobilenet” model will result in errors.
(vitis-ai-caffe) $ source ./compile_cf_model.sh mobilenet cf_mobilenetv2_imagenet_224_224_0.59G
NOTE : In version 1.0 of the Xilinx AI Model Zoo, the models had an additional directory called “compiler”, where these edits were already provided. In version 1.1 of the Xilinx AI Model Zoo, this “compiler” directory was removed, so the edits need to be done manually.
cf_mobilenetv2_imagenet_224_224_0.59G
For the mobilenet model, the "Softmax" layer must be deleted.
1. The following lines need to be deleted from “quantized/deploy.prototxt”, for model compilation.
layer {
name: "prob"
type: "Softmax"
bottom: "417"
top: "prob"
}
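Deleting a whole layer block by hand is error-prone, so a small awk filter can drop any top-level layer whose name matches. This is a sketch on a stand-in file, and it assumes each top-level layer block ends with an unindented closing brace at the start of a line, as the model-zoo deploy.prototxt files do; verify the result before compiling.

```shell
# Sketch: drop the layer named "prob" (the Softmax) from a prototxt file.
# Assumes each top-level "layer {" block ends with an unindented "}".
cat > deploy.prototxt <<'EOF'
layer {
  name: "conv1"
  type: "Convolution"
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "417"
  top: "prob"
}
EOF
awk '
/^layer [{]/ { buf = $0; drop = 0; next }      # start buffering a layer block
buf != ""    {
  buf = buf "\n" $0
  if ($0 ~ /name: "prob"/) drop = 1            # mark the block for deletion
  if ($0 == "}") { if (!drop) print buf; buf = "" }
  next
}
{ print }                                      # lines outside any layer block
' deploy.prototxt > deploy.pruned.prototxt
cat deploy.pruned.prototxt
```

The same filter works for the SSD edits below by changing the matched layer names.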
Known Issues – pose_detection
Attempting to compile the “ssd_person” model will result in errors.
(vitis-ai-caffe) $ source ./compile_cf_model.sh ssd_person cf_ssdpedestrian_coco_360_640_0.97_5.9G
NOTE : In version 1.0 of the Xilinx AI Model Zoo, the models had an additional directory called “compiler”, where these edits were already provided. In version 1.1 of the Xilinx AI Model Zoo, this “compiler” directory was removed, so the edits need to be done manually.
cf_ssdpedestrian_coco_360_640_0.97_5.9G
For the SSD models, the last “Reshape”, “Softmax”, “Flatten”, and “DetectionOutput” layers need to be deleted.
1. The following lines need to be deleted from “quantized/deploy.prototxt”, for model compilation.
layer {
name: "mbox_conf_reshape"
type: "Reshape"
bottom: "mbox_conf"
top: "mbox_conf_reshape"
include {
phase: TEST
}
reshape_param {
shape {
dim: 0
dim: -1
dim: 2
}
}
}
layer {
name: "mbox_conf_softmax"
type: "Softmax"
bottom: "mbox_conf_reshape"
top: "mbox_conf_softmax"
include {
phase: TEST
}
softmax_param {
axis: 2
}
}
layer {
name: "mbox_conf_flatten"
type: "Flatten"
bottom: "mbox_conf_softmax"
top: "mbox_conf_flatten"
include {
phase: TEST
}
flatten_param {
axis: 1
}
}
layer {
name: "detection_out"
type: "DetectionOutput"
bottom: "mbox_loc"
bottom: "mbox_conf_flatten"
bottom: "mbox_priorbox"
top: "detection_out"
include {
phase: TEST
}
detection_output_param {
num_classes: 2
share_location: true
background_label_id: 0
nms_param {
nms_threshold: 0.5
top_k: 400
}
code_type: CENTER_SIZE
keep_top_k: 200
confidence_threshold: 0.01
}
}
SSD application code
Furthermore, the SSD application code for the DNNDK based application will not work correctly with the SSD model from v1.1 of the Xilinx AI model zoo.
This issue has been fixed with the following update:
https://github.com/Xilinx/Vitis-AI/commit/7285d5f78cbe4add65e864b46f53ae120d04b6c5
Update pose_detection of dnndk sample to use the same model with VART sample
The following image illustrates the output from the original code (top image) and corrected code (bottom image).
Known Issues – video_analysis
Attempting to compile the “ssd” model will result in errors.
(vitis-ai-caffe) $ source ./compile_cf_model.sh ssd cf_ssdtraffic_360_480_0.9_11.6G
NOTE : In version 1.0 of the Xilinx AI Model Zoo, the models had an additional directory called “compiler”, where these edits were already provided. In version 1.1 of the Xilinx AI Model Zoo, this “compiler” directory was removed, so the edits need to be done manually.
cf_ssdtraffic_360_480_0.9_11.6G
For the SSD models, the last “Reshape”, “Softmax”, “Flatten”, and “DetectionOutput” layers need to be deleted.
1. The following lines need to be deleted from “quantized/deploy.prototxt”, for model compilation.
layer {
name: "mbox_conf_reshape"
type: "Reshape"
bottom: "mbox_conf"
top: "mbox_conf_reshape"
reshape_param {
shape {
dim: 0
dim: -1
dim: 4
}
}
}
layer {
name: "mbox_conf_softmax"
type: "Softmax"
bottom: "mbox_conf_reshape"
top: "mbox_conf_softmax"
softmax_param {
axis: 2
}
}
layer {
name: "mbox_conf_flatten"
type: "Flatten"
bottom: "mbox_conf_softmax"
top: "mbox_conf_flatten"
flatten_param {
axis: 1
}
}
layer {
name: "detection_out"
type: "DetectionOutput"
bottom: "mbox_loc"
bottom: "mbox_conf_flatten"
bottom: "mbox_priorbox"
top: "detection_out"
include {
phase: TEST
}
detection_output_param {
num_classes: 4
share_location: true
background_label_id: 0
nms_param {
nms_threshold: 0.45
top_k: 400
}
code_type: CENTER_SIZE
keep_top_k: 200
confidence_threshold: 0.01
}
}
2. Furthermore, the original input definition (named “test”) must be deleted from “quantized/deploy.prototxt”
name:"test"
input:"data"
input_shape{
dim:1
dim:3
dim:360
dim:480
}
The previously deleted lines need to be replaced by the same “Input” layer as the cf_ssdpedestrian_coco_360_640_0.97_5.9G model, replacing the “640” width with “480”.
3. Add the following lines to the start of “quantized/deploy.prototxt”
layer {
name: "data"
type: "Input"
top: "data"
transform_param {
mean_value: 104
mean_value: 117
mean_value: 123
force_color: true
resize_param {
prob: 1
resize_mode: WARP
height: 360
width: 480
interp_mode: LINEAR
}
}
input_param {
shape {
dim: 1
dim: 3
dim: 360
dim: 480
}
}
}
Known Issues – python applications
The following python applications were successfully verified on the ULTRA96V2 platform:
- inception_v1_mt_py
- miniresnet_py
- resnet50_mt_py
However, they will not run on the UZ3EG_IOCC, UZ3EG_PCIEC, and UZ7EV_EVCC, since the pip3 utility is not included in their petalinux project. This issue will be addressed in Part 2 of this tutorial.