The introduction of the Kria SOM from Xilinx is exciting! The KV260 Vision AI Starter Kit is a great platform for developing and prototyping accelerated algorithms including, but not limited to, machine learning, computer vision, and signal processing. There are several pre-built accelerated applications available in the Kria App Store that can be downloaded and run on the KV260. The Smart Camera and NLP-SmartVision apps from Xilinx use the AR1335 image sensor along with the AP1302 ISP (the AR1335 sensor comes in the KV260 accessory pack and is a great add-on for image processing applications). After testing out some of the pre-built applications you may be asking, "How do I create my own application?" This project describes the steps used to create a basic design for accelerating ML inference tasks.
Before we get started, it's helpful to review some KV260 terminology:
- Platform: The Vitis platform used as the base design. Defines physical interfaces to off-chip components such as image sensors. Also defines accelerator clock and memory interfaces.
- Overlay: The accelerated application's secret sauce. This is where we add our ML accelerator (i.e. the DPU) and any additional accelerators that we need. The term "overlay" is used because we are overlaying these accelerators on the Platform.
This project will create a custom overlay for a platform that supports the AR1335 sensor.
Requirements
- KV260 Starter Kit with power supply and basic accessory pack
- Linux build machine
- Vitis 2020.2.2
- PetaLinux 2020.2.2
For this project we will be starting with the KV260 BSP and some reference designs. From there we will modify the BSP & reference designs to create our custom accelerated ML inference application.
- Download the KV260 2020.2.2 BSP from here, and save it to the ~/Downloads directory.
- Create a project directory on your Linux build machine. The following commands will create a project directory named kv260_ml_accel:
mkdir ~/kv260_ml_accel
export PROJ_DIR=~/kv260_ml_accel
- Clone the KV260-Vitis example projects from GitHub using the following commands:
cd $PROJ_DIR
git clone https://github.com/xilinx/kv260-vitis
cd kv260-vitis
git checkout release-2020.2.2_k26
- Clone the Vitis-AI repository from GitHub. The Vitis-AI repository will be used to add the deep-learning processing unit (DPU) to the design:
cd $PROJ_DIR
git clone https://github.com/xilinx/Vitis-AI
cd Vitis-AI
git checkout v1.3
Note: The DPU is a soft CNN co-processor to the Arm Cortex-A53 processor complex in the Xilinx Zynq UltraScale+ MPSoC chip that resides on the KV260 SOM.
- Clone the Vitis_Libraries repository from GitHub. The Vitis Vision library will be used to add an ML pre-processing accelerator (image resizing):
cd $PROJ_DIR
git clone https://github.com/Xilinx/Vitis_Libraries
cd Vitis_Libraries
git checkout 2020.2
- Source the Vitis & PetaLinux environment setup scripts. If you close the terminal from which these scripts are sourced, you will need to repeat this step in a new terminal:
source <Vitis install directory>/2020.2/settings64.sh
source <PetaLinux install directory>/2020.2.2/settings.sh
Note: The setup scripts shown in the previous commands are located in the tool install directories; for example, Vitis may be installed in the /tools/Xilinx/Vitis/2020.2 directory on your machine.
- Download the KV260 board files (if not already done). The script shown in the following commands will install the board files in the $XILINX_VIVADO/data/boards/board_files directory.
cd $PROJ_DIR
wget https://www.hackster.io/code_files/543211/download -O get_kv260_boards.sh
dos2unix get_kv260_boards.sh
sh ./get_kv260_boards.sh
Create the platform
The NLP-SmartVision platform supplies basic clocks (100, 300, and 600 MHz) and memory connections for the addition of accelerators to the PL. The platform also provides the capture pipeline necessary for interfacing with the AR1335 and AP1302. This will be the base platform for this project.
Modify the platform to remove the scaler IP in the capture pipeline. If scaling of the sensor data is desired, it can be performed by the AP1302 ISP. This modification is necessary to fit within the URAM resources of the KV260 device.
cd $PROJ_DIR/kv260-vitis/platforms/vivado/kv260_ispMipiRx_DP/scripts
cp config_bd.tcl config_bd.tcl.orig
sed -i 's/C_TOPOLOGY {0}/C_TOPOLOGY {3}/g' config_bd.tcl
sed -i '132i\ \ \ CONFIG.C_CSC_ENABLE_WINDOW {false} \\' config_bd.tcl
sed -i 's/v_proc_ss_0\/aclk_axis/v_proc_ss_0\/aclk/g' config_bd.tcl
sed -i 's/\[get_bd_pins v_proc_ss_0\/aclk_ctrl\]//g' config_bd.tcl
sed -i 's/v_proc_ss_0\/aresetn_ctrl/v_proc_ss_0\/aresetn/g' config_bd.tcl
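These edits are plain stream substitutions on the generated block-design Tcl, so each one can be sanity-checked in isolation. A minimal sketch of the first substitution on a sample line (the file content here is illustrative, not the real config_bd.tcl; per the surrounding edits, topology 3 is assumed to select the Video Processing Subsystem's CSC-only configuration):

```shell
# Verify the C_TOPOLOGY edit behaves as expected on a sample line
# mimicking the generated config_bd.tcl (illustrative content only).
printf 'CONFIG.C_TOPOLOGY {0} \\\n' > /tmp/config_bd_sample.tcl
sed -i 's/C_TOPOLOGY {0}/C_TOPOLOGY {3}/g' /tmp/config_bd_sample.tcl
grep 'C_TOPOLOGY' /tmp/config_bd_sample.tcl
```

If the edit applied, the grep reports the topology value as {3}.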
The following commands will build the platform:
cd $PROJ_DIR/kv260-vitis
make platform PFM=kv260_ispMipiRx_DP
Once the platform build is complete, the output will be located in the $PROJ_DIR/kv260-vitis/platforms/xilinx_kv260_ispMipiRx_DP_202022_1 directory.
Compile the ML pre-processing accelerator
In most cases, sensor data will need a resolution reduction before ML inference can be performed. The input capture pipeline is set up to capture image sizes up to 4K, but most ML networks do not support such a large input size. In order to scale images prior to ML inference, we will use the Vitis Vision library to add an image scaler IP to the PL as an accelerator.
- Modify the default resize accelerator configuration to support color (RGB) images:
cd $PROJ_DIR/Vitis_Libraries/vision/L2/examples/resize
sed -i 's/RGB 0/RGB 1/g' build/xf_config_params.h
sed -i 's/GRAY 1/GRAY 0/g' build/xf_config_params.h
- Compile the Vitis Vision library resize function using the Vitis v++ command:
v++ -c -t hw xf_resize_accel.cpp \
--platform $PROJ_DIR/kv260-vitis/platforms/xilinx_kv260_ispMipiRx_DP_202022_1/kv260_ispMipiRx_DP.xpfm \
--kernel_frequency 300 \
-I../../../L1/include \
-I./build \
--save-temps \
-k resize_accel \
-o resize_accel.xo
- The output of the compilation process is the Xilinx object file (resize_accel.xo), which is what we need in order to add the accelerator to our PL overlay.
We will use the Vitis-AI DPU-TRD to add the DPU IP to the design. The DPU is the IP used to accelerate CNN inference tasks. For this project we will use the largest DPU size - the B4096 DPU.
- Navigate to the DPU-TRD directory:
cd $PROJ_DIR/Vitis-AI/dsa/DPU-TRD/prj/Vitis
- Update the dpu_conf.vh file to use UltraRAM. The following sed command will update the dpu_conf.vh file to enable UltraRAM:
sed -i 's/^`define URAM_DISABLE/`define URAM_ENABLE/' dpu_conf.vh
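The leading ^ anchors the match to the start of the line, so a commented-out copy of the define would be left untouched. A quick sketch on a sample file (contents are illustrative, not the full dpu_conf.vh):

```shell
# Sample mimicking the relevant dpu_conf.vh lines: one commented define and
# one active define; only the active (line-start) define should be flipped.
printf '//`define URAM_DISABLE\n`define URAM_DISABLE\n' > /tmp/dpu_conf_sample.vh
sed -i 's/^`define URAM_DISABLE/`define URAM_ENABLE/' /tmp/dpu_conf_sample.vh
cat /tmp/dpu_conf_sample.vh
```

After the edit, the commented line is unchanged and the active define reads URAM_ENABLE.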
- Define the DPU clock and memory connections using a configuration file. A configuration file is provided with this project. The following commands will download the configuration file and place it in the DPU-TRD project.
cd config_file
mv prj_config prj_config.orig
wget https://www.hackster.io/code_files/542906/download -O prj_config
cd ..
- Modify the DPU-TRD project Makefile to include the pre-processing accelerator (resize_accel.xo):
sed -i '53i kernel_xo += ${PROJ_DIR}/Vitis_Libraries/vision/L2/examples/resize/resize_accel.xo' Makefile
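The '53i' address tells GNU sed to insert the new text before line 53 of the Makefile. That behavior can be sketched on a plain numbered stream (the line number here simply mirrors the command above):

```shell
# Insert a marker before line 53 of a 60-line stream, then print the
# neighborhood to confirm where it landed.
seq 1 60 | sed '53i INSERTED' | sed -n '52,54p'
# prints: 52, INSERTED, 53
```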
- Build the DPU-TRD to add the DPU & resize accelerator to the platform using the following commands:
export SDX_PLATFORM=$PROJ_DIR/kv260-vitis/platforms/xilinx_kv260_ispMipiRx_DP_202022_1/kv260_ispMipiRx_DP.xpfm
make binary_container_1/dpu.xclbin KERNEL=DPU DEVICE=kv260
When the build completes, you will see a dpu.xclbin file located in the $PROJ_DIR/Vitis-AI/dsa/DPU-TRD/prj/Vitis/binary_container_1 directory, and the bitfile located in the $PROJ_DIR/Vitis-AI/dsa/DPU-TRD/prj/Vitis/binary_container_1/link/vivado/vpl/prj/prj.runs/impl_1 directory.
- Copy the build files to a working directory:
mkdir -p $PROJ_DIR/overlay_files
cd binary_container_1
cp dpu.xclbin $PROJ_DIR/overlay_files
cp link/vivado/vpl/prj/prj.runs/impl_1/*.bit $PROJ_DIR/overlay_files/kv260-ml-accel.bit
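Before moving on to the PetaLinux steps, it can be worth sanity-checking that both overlay artifacts exist (a minimal sketch; the file names are the ones used in the copy commands above, and PROJ_DIR is assumed to still be exported from the setup steps):

```shell
# Report whether each expected overlay artifact is present in the
# working directory.
for f in dpu.xclbin kv260-ml-accel.bit; do
  [ -f "$PROJ_DIR/overlay_files/$f" ] && echo "$f OK" || echo "$f missing"
done
```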
Create the PetaLinux project from BSP
In the project setup section we downloaded the BSP from the Xilinx downloads site. We will use that downloaded BSP to create the PetaLinux project. The following commands will create the project from the BSP:
cd $PROJ_DIR
petalinux-create -t project -s ~/Downloads/xilinx-k26-starterkit-v2020.2.2-final.bsp
cd xilinx-k26-starterkit-2020.2.2
echo 'BOARD_VARIANT = "kv"' >> project-spec/meta-user/conf/petalinuxbsp.conf
petalinux-config --silentconfig
Add the custom PL overlay that includes the DPU to the PetaLinux project
The custom PL overlay will be packaged as an application in the PetaLinux project and added to the target root file system. This allows the xmutil utility to load the custom overlay as an "accelerated application" after Linux has booted on the KV260.
- Download the platform device tree definition for the kv260_ispMipiRx_DP platform from GitHub:
wget https://raw.githubusercontent.com/Xilinx/kv260-firmware/release-2020.2.2_k26/nlp-smartvision/kv260-nlp-smartvision.dtsi -O $PROJ_DIR/overlay_files/kv260-ml-accel.dtsi
- Modify the device tree to change the driver for the color space conversion block (needed since we modified the platform to remove scaling capabilities):
cd $PROJ_DIR/overlay_files
sed -i 's/scaler-2.2/csc/g' kv260-ml-accel.dtsi
sed -i 's/clock-names = "aclk_axis", "aclk_ctrl"/clock-names = "aclk"/g' kv260-ml-accel.dtsi
- Remove scaler-specific properties from the device tree, since we removed scaling capabilities from the capture pipeline (Note: the AP1302 ISP is still capable of performing scaling if desired):
sed -i 's/clocks = <\&misc_clk_2>, <\&misc_clk_2>/clocks = <\&misc_clk_2>/g' kv260-ml-accel.dtsi
sed -i '/xlnx,num-hori-taps = <6>;/d' kv260-ml-accel.dtsi
sed -i '/xlnx,num-vert-taps = <6>;/d' kv260-ml-accel.dtsi
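One subtlety in the clocks edit above: '&' is special in sed replacement text (it expands to the whole match), which is why the command escapes it as \&; the escape is harmless on the pattern side. The edit can be sketched on a sample device tree line (illustrative content, not the full dtsi):

```shell
# Sample line mimicking the node's clocks property; after the edit it
# should reference misc_clk_2 only once.
printf 'clocks = <&misc_clk_2>, <&misc_clk_2>;\n' > /tmp/node_sample.dtsi
sed -i 's/clocks = <\&misc_clk_2>, <\&misc_clk_2>/clocks = <\&misc_clk_2>/g' /tmp/node_sample.dtsi
cat /tmp/node_sample.dtsi
# prints: clocks = <&misc_clk_2>;
```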
- Create the PetaLinux application recipe. This will create the files and directory necessary to add our custom overlay to the PetaLinux project:
cd $PROJ_DIR/xilinx-k26-starterkit-2020.2.2
petalinux-create -t apps --template fpgamanager --name kv260-ml-accel --enable --srcuri "$PROJ_DIR/overlay_files/kv260-ml-accel.bit $PROJ_DIR/overlay_files/kv260-ml-accel.dtsi $PROJ_DIR/overlay_files/dpu.xclbin"
The app will be created in $PROJ_DIR/xilinx-k26-starterkit-2020.2.2/project-spec/meta-user/recipes-apps/kv260-ml-accel.
If you need to update the bitstream, device tree, or xclbin, you can simply replace the files located in that directory. However, make sure to keep the same file names, because the .bb file looks for those specific names.
Adding software packages to the project will create a target root file system with the necessary libraries for ML inference. These libraries include Vitis-AI and OpenCV, among other utilities. Additionally, firmware files for programming the AP1302 ISP need to be added to the project as well.
The following commands will add the packages to the project:
mkdir -p project-spec/meta-user/recipes-core/packagegroups
echo '
DESCRIPTION = "KV260 ML inference app related packages"
inherit packagegroup
KV260_ML_ACCEL_PACKAGES = " \
ap1302-ar1335-single-firmware \
dnf \
e2fsprogs-resize2fs \
parted \
resize-part \
packagegroup-petalinux-vitisai \
packagegroup-petalinux-vitisai-dev \
packagegroup-petalinux-gstreamer \
cmake \
libgcc \
gcc-symlinks \
g++-symlinks \
binutils \
xrt \
xrt-dev \
zocl \
opencl-clhpp-dev \
opencl-headers-dev \
packagegroup-petalinux-opencv \
packagegroup-petalinux-opencv-dev \
packagegroup-petalinux-v4lutils \
"
RDEPENDS_${PN} = "${KV260_ML_ACCEL_PACKAGES}"
COMPATIBLE_MACHINE = "^$"
COMPATIBLE_MACHINE_k26-kv = "${MACHINE}"
PACKAGE_ARCH = "${BOARDVARIANT_ARCH}"
' > project-spec/meta-user/recipes-core/packagegroups/packagegroup-kv260-ml-accel.bb
- Add the custom packagegroup to the root file system configuration
echo "CONFIG_packagegroup-kv260-ml-accel" >> project-spec/meta-user/conf/user-rootfsconfig
echo "CONFIG_packagegroup-kv260-ml-accel=y" >> project-spec/configs/rootfs_config
Build the PetaLinux project and create the SD card wic image
The following commands will build the PetaLinux project and then package the output files as a wic image, which can be written to an SD card.
- Build the project
petalinux-build
- Generate the wic image file
petalinux-package --wic --bootfiles "ramdisk.cpio.gz.u-boot boot.scr Image system.dtb"
Write the wic image to an SD card
The wic image file can be written to an SD card using an imaging utility such as BalenaEtcher, or on Linux with the dd command. The following commands can be used on a Linux machine to write the wic image to an SD card. Please make sure to read the following disclaimer before proceeding:
- Write the SD card image to a blank SD card. The following command can be used to write the SD card image using Linux:
sudo dd if=images/linux/petalinux-sdimage.wic of=/dev/sd<X> status=progress
NOTE: The SD card /dev/sd<X> referenced above will be unique to your system. You will need to replace sd<X> with the appropriate drive mapping for your system. For example, sd<X> could be sda, sdb, sdc, etc., depending on how your system enumerates the SD card device. Make sure the name specified in the of= argument above is the device name, and not just a partition (i.e. of=/dev/sd<X> is correct, but of=/dev/sd<X>1 would be incorrect).
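One way to identify the right device is to list removable disks before and after inserting the SD card and see which entry appears. A sketch, assuming util-linux's lsblk is available (the awk filter keys on lsblk's RM column, which is 1 for removable media):

```shell
# List removable disk devices; the SD card should show up here with a
# size matching your card. Device names are system-specific.
lsblk -d -n -o NAME,SIZE,RM,TYPE | awk '$3 == 1 && $4 == "disk" {print "/dev/" $1}'
```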
- When the SD card write process completes you can eject the device with the following command
sudo eject /dev/sd<X>
Note: The note about SD card device enumeration in the previous step applies to this step as well.
Set up the KV260
The KV260 should be connected as described in the getting started guide located at https://www.xilinx.com/products/som/kria/kv260-vision-starter-kit/kv260-getting-started/connecting-everything.html
Insert the micro-SD card imaged in the last section into the KV260 micro-SD card slot, and apply power to the board. The board should start to boot, and when complete you will see a login prompt. Log in using the username "petalinux", and then change the password as directed by the prompt. See the figure below for a serial console example.
The Xilinx Vitis-AI Model Zoo has pre-compiled models for the B4096 DPU architecture. The example application we are creating uses the DenseBox face detection model. The following commands will download the pre-trained & pre-compiled model, then install it in the KV260 root file system. Please execute the following commands on the KV260.
cd /home/petalinux
wget https://www.xilinx.com/bin/public/openDownload?filename=densebox_640_360-zcu102_zcu104-r1.3.0.tar.gz -O densebox_640_360-zcu102_zcu104-r1.3.0.tar.gz
sudo mkdir -p /usr/share/vitis_ai_library/models
Extract the model:
sudo tar -xvzf densebox_640_360-zcu102_zcu104-r1.3.0.tar.gz -C /usr/share/vitis_ai_library/models
Download the sample application code
The following commands will download and extract the sample application code:
wget https://hacksterio.s3.amazonaws.com/uploads/attachments/1393631/face_detect_px1rBmdftj.zip -O face_detect.zip
unzip face_detect.zip
The sample application code performs image rescaling using the hardware accelerator created in the section "Compile the ML pre-processing accelerator". The input layer size for the DenseBox face detection model is 640x360, but the capture resolution is 1920x1080. The hardware accelerator is called to resize the image using the OpenCL API. The hardware accelerator management is handled by a class defined in the resize_accel.hpp header file.
Before we can compile the application, we must load the kv260-ml-accel Kria application. The following commands, executed on the KV260, will unload the default application and then load the kv260-ml-accel application:
sudo xmutil unloadapp
sudo xmutil loadapp kv260-ml-accel
After the kv260-ml-accel application loads, you should see the following in the terminal.
After the application loads, you may need to press Return/Enter on your keyboard to get back to a prompt.
The following commands will compile the application:
cd /home/petalinux/face_detect
make -j4
When compilation completes, you will see an executable named facedetect.exe.
Run the application
Before running the application we need to set up the media pipeline for the MIPI camera and AP1302 ISP. If you are using a USB camera or another source such as a video file, you can skip this step. Execute the following commands to set up the MIPI capture pipeline:
cd /home/petalinux/face_detect
./setup_media_pipe.sh
After running the script you should see output similar to the following if you have the MIPI camera attached to connector J7.
Make note of the video device (/dev/video*) output by the setup script. This value will need to be passed as an argument to the sample application.
Execute the following commands to run the application. You should see the 1920x1080 captured image with face detections overlaid on the HDMI monitor.
cd /home/petalinux/face_detect
./facedetect.exe /dev/video2
Please note that this is a very simple sample application and is not optimized for performance (i.e. it is single-threaded).
This project created a custom machine-learning accelerated application for the Kria KV260 Vision AI Starter Kit. The accelerated application included an overlay with the B4096 DPU CNN accelerator and an image-scaling accelerator.
I hope you enjoyed this project. Please follow me to stay up to date on my latest projects. I'm working on a project that describes how to add this accelerated application to the certified Ubuntu image for the Xilinx Kria KV260 Vision AI Starter Kit.
Update 2/9/2022 - Please see my additional project that uses the official Canonical Ubuntu image for the KV260 - Easy Machine Learning on Ubuntu with the Xilinx Kria KV260