The introduction of the Kria SOM from Xilinx is exciting! The KV260 Vision AI Starter Kit is a great platform for developing and prototyping accelerated algorithms including, but not limited to, machine learning, computer vision, and signal processing. There are several pre-built accelerated applications available in the Kria App Store that can be downloaded and run on the KV260. The Smart Camera and NLP-SmartVision apps from Xilinx use the AR1335 image sensor along with the AP1302 ISP (the AR1335 sensor comes in the KV260 accessory pack and is a great add-on for image processing applications). After testing out some of the pre-built applications you may be asking, "How do I create my own application?" This project describes the steps used to create a basic design for accelerating ML inference tasks.
Before we get started, it's helpful to review some KV260 terminology:
- Platform: The Vitis platform used as the base design. Defines physical interfaces to off-chip components such as image sensors. Also defines accelerator clock and memory interfaces.
- Overlay: The accelerated application's secret sauce. This is where we add our ML accelerator (i.e. the DPU) and any additional accelerators that we need. The term "overlay" is used because we are overlaying these accelerators on the Platform.
This project will create a custom overlay for a platform that supports the AR1335 sensor.
Requirements
- KV260 Starter Kit with power supply and basic accessory pack
- Linux build machine
- Vitis 2020.2.2
- PetaLinux 2020.2.2
For this project we will be starting with the KV260 BSP and some reference designs. From there we will modify the BSP & reference designs to create our custom accelerated ML inference application.
- Download the KV260 2020.2.2 BSP from here, and save it to the ~/Downloads directory.
- Create a project directory on your Linux build machine. The following commands will create a project directory named kv260_ml_accel:
mkdir ~/kv260_ml_accel
export PROJ_DIR=~/kv260_ml_accel
- Clone the KV260-Vitis example projects from GitHub using the following commands:
cd $PROJ_DIR
git clone https://github.com/xilinx/kv260-vitis
cd kv260-vitis
git checkout release-2020.2.2_k26
- Clone the Vitis-AI repository from GitHub. The Vitis-AI repository will be used to add the deep-learning processing unit (DPU) to the design:
cd $PROJ_DIR
git clone https://github.com/xilinx/Vitis-AI
cd Vitis-AI
git checkout v1.3
Note: The DPU is a soft CNN co-processor to the Arm Cortex-A53 processor complex in the Xilinx Zynq UltraScale+ MPSoC chip that resides on the KV260 SOM.
- Clone the Vitis_Libraries repository from GitHub. The Vitis Vision library will be used to add an ML pre-processing accelerator (image resizing):
cd $PROJ_DIR
git clone https://github.com/Xilinx/Vitis_Libraries
cd Vitis_Libraries
git checkout 2020.2
- Source the Vitis & PetaLinux environment setup scripts. If you close the terminal from which these scripts are sourced, you will need to repeat this step in a new terminal:
source <Vitis install directory>/2020.2/settings64.sh
source <PetaLinux install directory>/2020.2.2/settings.sh
Note: The setup scripts shown in the previous commands are located in the tool install directories; for example, Vitis may be installed in the /tools/Xilinx/Vitis/2020.2 directory on your machine.
- Download the KV260 board files (if not already done). The script shown in the following commands will install the board files in the $XILINX_VIVADO/data/boards/board_files directory.
cd $PROJ_DIR
wget https://www.hackster.io/code_files/543211/download -O get_kv260_boards.sh
dos2unix get_kv260_boards.sh
sh ./get_kv260_boards.sh
Create the platform
The NLP-SmartVision platform supplies basic clocks (100, 300, and 600 MHz) and memory connections for the addition of accelerators to the PL. The platform also provides the capture pipeline necessary for interfacing with the AR1335 and AP1302. This will be the base platform for this project.
Modify the platform to remove the scaler IP in the capture pipeline. If scaling of the sensor data is desired, it can be performed by the AP1302 ISP. This modification is necessary to fit within the URAM resources of the KV260 device.
cd $PROJ_DIR/kv260-vitis/platforms/vivado/kv260_ispMipiRx_DP/scripts
cp config_bd.tcl config_bd.tcl.orig
sed -i 's/C_TOPOLOGY {0}/C_TOPOLOGY {3}/g' config_bd.tcl
sed -i '132i\ \ \ CONFIG.C_CSC_ENABLE_WINDOW {false} \\' config_bd.tcl
sed -i 's/v_proc_ss_0\/aclk_axis/v_proc_ss_0\/aclk/g' config_bd.tcl
sed -i 's/\[get_bd_pins v_proc_ss_0\/aclk_ctrl\]//g' config_bd.tcl
sed -i 's/v_proc_ss_0\/aresetn_ctrl/v_proc_ss_0\/aresetn/g' config_bd.tcl
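These edits are plain stream substitutions on the generated block-design Tcl, so each one can be sanity-checked in isolation. A minimal sketch of the first substitution on a sample line (the file content here is illustrative, not the real config_bd.tcl; per the surrounding edits, topology 3 is assumed to select the Video Processing Subsystem's CSC-only configuration):

```shell
# Verify the C_TOPOLOGY edit behaves as expected on a sample line
# mimicking the generated config_bd.tcl (illustrative content only).
printf 'CONFIG.C_TOPOLOGY {0} \\\n' > /tmp/config_bd_sample.tcl
sed -i 's/C_TOPOLOGY {0}/C_TOPOLOGY {3}/g' /tmp/config_bd_sample.tcl
grep 'C_TOPOLOGY' /tmp/config_bd_sample.tcl
```

If the edit applied, the grep reports the topology value as {3}.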
The following commands will build the platform:
cd $PROJ_DIR/kv260-vitis
make platform PFM=kv260_ispMipiRx_DP
Once the platform build is complete, the output will be located in the $PROJ_DIR/kv260-vitis/platforms/xilinx_kv260_ispMipiRx_DP_202022_1 directory.
Compile the ML pre-processing accelerator
In most cases, sensor data will need a resolution reduction before ML inference can be performed. The input capture pipeline is set up to capture image sizes up to 4K, but most ML networks do not support such a large input size. In order to scale images prior to ML inference, we will use the Vitis Vision library to add an image scaler IP to the PL as an accelerator.
- Modify the default resize accelerator configuration to support color (RGB) images:
cd $PROJ_DIR/Vitis_Libraries/vision/L2/examples/resize
sed -i 's/RGB 0/RGB 1/g' build/xf_config_params.h
sed -i 's/GRAY 1/GRAY 0/g' build/xf_config_params.h
- Compile the Vitis Vision library resize function using the Vitis v++ command:
v++ -c -t hw xf_resize_accel.cpp \
--platform $PROJ_DIR/kv260-vitis/platforms/xilinx_kv260_ispMipiRx_DP_202022_1/kv260_ispMipiRx_DP.xpfm \
--kernel_frequency 300 \
-I../../../L1/include \
-I./build \
--save-temps \
-k resize_accel \
-o resize_accel.xo
- The output of the compilation process is the Xilinx object file (resize_accel.xo), which is what we need in order to add the accelerator to our PL overlay.
We will use the Vitis-AI DPU-TRD to add the DPU IP to the design. The DPU is the IP used to accelerate CNN inference tasks. For this project we will use the largest DPU size - the B4096 DPU.
- Navigate to the DPU-TRD directory:
cd $PROJ_DIR/Vitis-AI/dsa/DPU-TRD/prj/Vitis
- Update the dpu_conf.vh file to use UltraRAM. The following sed command will update the dpu_conf.vh file to enable UltraRAM:
sed -i 's/^`define URAM_DISABLE/`define URAM_ENABLE/' dpu_conf.vh
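The leading ^ anchors the match to the start of the line, so a commented-out copy of the define would be left untouched. A quick sketch on a sample file (contents are illustrative, not the full dpu_conf.vh):

```shell
# Sample mimicking the relevant dpu_conf.vh lines: one commented define and
# one active define; only the active (line-start) define should be flipped.
printf '//`define URAM_DISABLE\n`define URAM_DISABLE\n' > /tmp/dpu_conf_sample.vh
sed -i 's/^`define URAM_DISABLE/`define URAM_ENABLE/' /tmp/dpu_conf_sample.vh
cat /tmp/dpu_conf_sample.vh
```

After the edit, the commented line is unchanged and the active define reads URAM_ENABLE.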
- Define the DPU clock and memory connections using a configuration file. A configuration file is provided with this project. The following commands will download the configuration file and place it in the DPU-TRD project.
cd config_file
mv prj_config prj_config.orig
wget https://www.hackster.io/code_files/542906/download -O prj_config
cd ..
- Modify the DPU-TRD project Makefile to include the pre-processing accelerator (resize_accel.xo):
sed -i '53i kernel_xo += ${PROJ_DIR}/Vitis_Libraries/vision/L2/examples/resize/resize_accel.xo' Makefile
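The '53i' address tells GNU sed to insert the new text before line 53 of the Makefile. That behavior can be sketched on a plain numbered stream (the line number here simply mirrors the command above):

```shell
# Insert a marker before line 53 of a 60-line stream, then print the
# neighborhood to confirm where it landed.
seq 1 60 | sed '53i INSERTED' | sed -n '52,54p'
# prints: 52, INSERTED, 53
```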
- Build the DPU-TRD to add the DPU & resize accelerator to the platform using the following commands:
export SDX_PLATFORM=$PROJ_DIR/kv260-vitis/platforms/xilinx_kv260_ispMipiRx_DP_202022_1/kv260_ispMipiRx_DP.xpfm
make binary_container_1/dpu.xclbin KERNEL=DPU DEVICE=kv260
When the build completes, you will see a dpu.xclbin file located in the $PROJ_DIR/Vitis-AI/dsa/DPU-TRD/prj/Vitis/binary_container_1 directory, and the bitfile located in the $PROJ_DIR/Vitis-AI/dsa/DPU-TRD/prj/Vitis/binary_container_1/link/vivado/vpl/prj/prj.runs/impl_1 directory.
- Copy the build files to a working directory:
mkdir -p $PROJ_DIR/overlay_files
cd binary_container_1
cp dpu.xclbin $PROJ_DIR/overlay_files
cp link/vivado/vpl/prj/prj.runs/impl_1/*.bit $PROJ_DIR/overlay_files/kv260-ml-accel.bit
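Before moving on to the PetaLinux steps, it can be worth sanity-checking that both overlay artifacts exist (a minimal sketch; the file names are the ones used in the copy commands above, and PROJ_DIR is assumed to still be exported from the setup steps):

```shell
# Report whether each expected overlay artifact is present in the
# working directory.
for f in dpu.xclbin kv260-ml-accel.bit; do
  [ -f "$PROJ_DIR/overlay_files/$f" ] && echo "$f OK" || echo "$f missing"
done
```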
Create the PetaLinux project from BSP
In the project setup section we downloaded the BSP from the Xilinx downloads site. We will use that downloaded BSP to create the PetaLinux project. The following commands will create the project from the BSP:
cd $PROJ_DIR
petalinux-create -t project -s ~/Downloads/xilinx-k26-starterkit-v2020.2.2-final.bsp
cd xilinx-k26-starterkit-2020.2.2
echo 'BOARD_VARIANT = "kv"' >> project-spec/meta-user/conf/petalinuxbsp.conf
petalinux-config --silentconfig
Add the custom PL overlay that includes the DPU to the PetaLinux project
The custom PL overlay will be packaged as an application in the PetaLinux project and added to the target root file system. This allows the xmutil utility to load the custom overlay as an "accelerated application" after Linux has booted on the KV260.
- Download the platform device tree definition for the kv260_ispMipiRx_DP platform from GitHub:
wget https://raw.githubusercontent.com/Xilinx/kv260-firmware/release-2020.2.2_k26/nlp-smartvision/kv260-nlp-smartvision.dtsi -O $PROJ_DIR/overlay_files/kv260-ml-accel.dtsi
- Modify the device tree to change the driver for the color space conversion block (needed since we modified the platform to remove scaling capabilities):
cd $PROJ_DIR/overlay_files
sed -i 's/scaler-2.2/csc/g' kv260-ml-accel.dtsi
sed -i 's/clock-names = "aclk_axis", "aclk_ctrl"/clock-names = "aclk"/g' kv260-ml-accel.dtsi
- Remove scaler-specific properties from the device tree, since we removed scaling capabilities from the capture pipeline (Note: the AP1302 ISP is still capable of performing scaling if desired):
sed -i 's/clocks = <\&misc_clk_2>, <\&misc_clk_2>/clocks = <\&misc_clk_2>/g' kv260-ml-accel.dtsi
sed -i '/xlnx,num-hori-taps = <6>;/d' kv260-ml-accel.dtsi
sed -i '/xlnx,num-vert-taps = <6>;/d' kv260-ml-accel.dtsi
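One subtlety in the clocks edit above: '&' is special in sed replacement text (it expands to the whole match), which is why the command escapes it as \&; the escape is harmless on the pattern side. The edit can be sketched on a sample device tree line (illustrative content, not the full dtsi):

```shell
# Sample line mimicking the node's clocks property; after the edit it
# should reference misc_clk_2 only once.
printf 'clocks = <&misc_clk_2>, <&misc_clk_2>;\n' > /tmp/node_sample.dtsi
sed -i 's/clocks = <\&misc_clk_2>, <\&misc_clk_2>/clocks = <\&misc_clk_2>/g' /tmp/node_sample.dtsi
cat /tmp/node_sample.dtsi
# prints: clocks = <&misc_clk_2>;
```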
- Create the PetaLinux application recipe. This will create the files and directory necessary to add our custom overlay to the PetaLinux project:
cd $PROJ_DIR/xilinx-k26-starterkit-2020.2.2
petalinux-create -t apps --template fpgamanager --name kv260-ml-accel --enable --srcuri "$PROJ_DIR/overlay_files/kv260-ml-accel.bit $PROJ_DIR/overlay_files/kv260-ml-accel.dtsi $PROJ_DIR/overlay_files/dpu.xclbin"
The app will be created in $PROJ_DIR/xilinx-k26-starterkit-2020.2.2/project-spec/meta-user/recipes-apps/kv260-ml-accel.
If you need to update the bitstream, device tree, or xclbin, you can simply replace the files located in that directory. However, make sure to keep the same file names, because the .bb file looks for those specific names.
Adding software packages to the project will create a target root file system with the necessary libraries for ML inference. These libraries include Vitis-AI and OpenCV, among other utilities. Additionally, firmware files for programming the AP1302 ISP need to be added to the project as well.
The following commands will add the packages to the project:
mkdir -p project-spec/meta-user/recipes-core/packagegroups
echo '
DESCRIPTION = "KV260 ML inference app related packages"
inherit packagegroup
KV260_ML_ACCEL_PACKAGES = " \
ap1302-ar1335-single-firmware \
dnf \
e2fsprogs-resize2fs \
parted \
resize-part \
packagegroup-petalinux-vitisai \
packagegroup-petalinux-vitisai-dev \
packagegroup-petalinux-gstreamer \
cmake \
libgcc \
gcc-symlinks \
g++-symlinks \
binutils \
xrt \
xrt-dev \
zocl \
opencl-clhpp-dev \
opencl-headers-dev \
packagegroup-petalinux-opencv \
packagegroup-petalinux-opencv-dev \
packagegroup-petalinux-v4lutils \
"
RDEPENDS_${PN} = "${KV260_ML_ACCEL_PACKAGES}"
COMPATIBLE_MACHINE = "^$"
COMPATIBLE_MACHINE_k26-kv = "${MACHINE}"
PACKAGE_ARCH = "${BOARDVARIANT_ARCH}"
' > project-spec/meta-user/recipes-core/packagegroups/packagegroup-kv260-ml-accel.bb
- Add the custom packagegroup to the root file system configuration
echo "CONFIG_packagegroup-kv260-ml-accel" >> project-spec/meta-user/conf/user-rootfsconfig
echo "CONFIG_packagegroup-kv260-ml-accel=y" >> project-spec/configs/rootfs_config
Build the PetaLinux project and create the SD card wic image
The following commands will build the PetaLinux project and then package the output files as a wic image, which can be written to an SD card.
- Build the project
petalinux-build
- Generate the wic image file
petalinux-package --wic --bootfiles "ramdisk.cpio.gz.u-boot boot.scr Image system.dtb"
Write the wic image to an SD card
The wic image file can be written to an SD card using an imaging utility such as BalenaEtcher, or on Linux with the dd command. The following commands can be used on a Linux machine to write the wic image to an SD card. Please make sure to read the following disclaimer before proceeding:
- Write the SD card image to a blank SD card. The following command can be used to write the SD card image using Linux:
sudo dd if=images/linux/petalinux-sdimage.wic of=/dev/sd<X> status=progress
NOTE: The SD card /dev/sd<X> referenced above will be unique to your system. You will need to replace sd<X> with the appropriate drive mapping for your system. For example, sd<X> could be sda, sdb, sdc, etc., depending on how your system enumerates the SD card device. Make sure the name specified in the of= argument above is the device name, and not just a partition (i.e. of=/dev/sd<X> is correct, but of=/dev/sd<X>1 would be incorrect).
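One way to identify the right device is to list removable disks before and after inserting the SD card and see which entry appears. A sketch, assuming util-linux's lsblk is available (the awk filter keys on lsblk's RM column, which is 1 for removable media):

```shell
# List removable disk devices; the SD card should show up here with a
# size matching your card. Device names are system-specific.
lsblk -d -n -o NAME,SIZE,RM,TYPE | awk '$3 == 1 && $4 == "disk" {print "/dev/" $1}'
```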
- When the SD card write process completes you can eject the device with the following command
sudo eject /dev/sd<X>
Note: The note about SD card device enumeration in the previous step applies to this step as well.
Set up the KV260
The KV260 should be connected as described in the getting started guide located at https://www.xilinx.com/products/som/kria/kv260-vision-starter-kit/kv260-getting-started/connecting-everything.html
Insert the micro-SD card imaged in the last section into the KV260 micro-SD card slot, and apply power to the board. The board should start to boot, and when complete you will see a login prompt. Log in using the username "petalinux", and then change the password as directed by the prompt. See the figure below for a serial console example.
The Xilinx Vitis-AI Model Zoo has pre-compiled models for the B4096 DPU architecture. The example application we are creating uses the DenseBox face detection model. The following commands will download the pre-trained & pre-compiled model, then install it in the KV260 root file system. Please execute the following commands on the KV260.
cd /home/petalinux
wget https://www.xilinx.com/bin/public/openDownload?filename=densebox_640_360-zcu102_zcu104-r1.3.0.tar.gz -O densebox_640_360-zcu102_zcu104-r1.3.0.tar.gz
sudo mkdir -p /usr/share/vitis_ai_library/models
Extract the model:
sudo tar -xvzf densebox_640_360-zcu102_zcu104-r1.3.0.tar.gz -C /usr/share/vitis_ai_library/models
Download the sample application code
The following commands will download and extract the sample application code:
wget https://hacksterio.s3.amazonaws.com/uploads/attachments/1393631/face_detect_px1rBmdftj.zip -O face_detect.zip
unzip face_detect.zip
The sample application code performs image rescaling using the hardware accelerator created in the section "Compile the ML pre-processing accelerator". The input layer size for the DenseBox face detection model is 640x360, but the capture resolution is 1920x1080. The hardware accelerator is called to resize the image using the OpenCL API. The hardware accelerator management is handled by a class defined in the resize_accel.hpp header file.
Before we can compile the application, we must load the kv260-ml-accel Kria application. The following commands, executed on the KV260, will unload the default application and then load the kv260-ml-accel application:
sudo xmutil unloadapp
sudo xmutil loadapp kv260-ml-accel
After the kv260-ml-accel application loads, you should see the following in the terminal.
After the application loads, you may need to press Return/Enter on your keyboard to get back to a prompt.
The following commands will compile the application:
cd /home/petalinux/face_detect
make -j4
When compilation completes, you will see an executable named facedetect.exe.
Run the application
Before running the application we need to set up the media pipeline for the MIPI camera and AP1302 ISP. If you are using a USB camera or another source such as a video file, you can skip this step. Execute the following commands to set up the MIPI capture pipeline:
cd /home/petalinux/face_detect
./setup_media_pipe.sh
After running the script you should see output similar to the following if you have the MIPI camera attached to connector J7.
Make note of the video device (/dev/video*) output by the setup script. This value will need to be passed as an argument to the sample application.
Execute the following commands to run the application. You should see the 1920x1080 captured image with face detections overlaid on the HDMI monitor.
cd /home/petalinux/face_detect
./facedetect.exe /dev/video2
Please note that this is a very simple sample application and is not optimized for performance (i.e. it is single-threaded).
This project created a custom machine-learning accelerated application for the Kria KV260 Vision AI Starter Kit. The accelerated application included an overlay with the B4096 DPU CNN accelerator and an image-scaling accelerator.
I hope you enjoyed this project. Please follow me to stay up to date on my latest projects. I'm working on a project that describes how to add this accelerated application to the certified Ubuntu image for the Xilinx Kria KV260 Vision AI Starter Kit.
Update 2/9/2022 - Please see my additional project that uses the official Canonical Ubuntu image for the KV260 - Easy Machine Learning on Ubuntu with the Xilinx Kria KV260