In early June Avnet announced the ZUBoard, featuring the ZUB1CG, the smallest device in the AMD-Xilinx Zynq UltraScale+ MPSoC family.
This project applies the same design methodology used by the Kria family, allowing users to dynamically load their various accelerated applications from a single SD image.
Design Overview
The pre-built image includes the following accelerated apps:
- avnet-zub1cg-benchmark
- avnet-zub1cg-dualcam-dpu
- avnet-zub1cg-ar0144-dual
- avnet-zub1cg-ar0144-single
- avnet-zub1cg-ar1335-single
The avnet-zub1cg-benchmark accelerated app features the Vitis-AI 2.0 samples, implemented with the Deep Learning Processing Unit (DPU).
The avnet-zub1cg-dualcam-dpu accelerated app implements a MIPI capture pipeline, in addition to the DPU.
The avnet-zub1cg-ar0144-dual, avnet-zub1cg-ar0144-single, and avnet-zub1cg-ar1335-single apps share the same Vivado-only hardware design that contains the MIPI capture pipeline.
The avnet-zub1cg-dualcam-dpu and avnet-zub1cg-ar0144-dual apps configure the design (via device tree) for the following ar0144-dual configuration:
The On Semiconductor AP1302 device is an ISP that synchronously captures images from the two AR0144 image sensors, and provides a single side-by-side image on its MIPI interface.
The avnet-zub1cg-ar0144-single app configures the design (via device tree) for the following ar0144-single configuration:
The AP1302 ISP captures images from the AR0144 image sensor populated on the right side.
The avnet-zub1cg-ar1335-single app configures the design (via device tree) for the following ar1335-single configuration:
The AP1302 ISP captures images from the AR1335 image sensor populated on the right side, and implements auto-gain, auto-white-balance, and auto-focus.
A pre-built image is provided with a set of "accelerated apps" that can be dynamically loaded in the programmable logic.
To get started, download the following image and program it to a microSD card (16GB or larger):
- http://avnet.me/avnet-zub1cg-sbc-2021.2-sdimage
(2022/09/21 - md5sum : 1b021a30f10629acc077aa977f88582d)
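Before flashing, it is worth checking the download against the published checksum. A minimal sketch, assuming the downloaded file keeps the name from the URL above (substitute whatever name it was actually saved under):

```shell
# Verify the download against the published md5sum before flashing;
# the filename below is an assumption based on the download URL.
IMAGE=avnet-zub1cg-sbc-2021.2-sdimage.img
if [ -f "$IMAGE" ]; then
  echo "1b021a30f10629acc077aa977f88582d  $IMAGE" | md5sum -c -
else
  echo "download $IMAGE first" >&2
fi
# Writing the image is destructive - double-check the device node first:
# sudo dd if="$IMAGE" of=/dev/sdX bs=4M status=progress conv=fsync
```

The dd command is left commented out on purpose; a GUI tool such as Balena Etcher is a safer choice if you are unsure which device node is the microSD card.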
Configure the ZUBoard as shown in the following diagram:
The DualCam SYZYGY is optional; it allows you to run the stereo examples.
A live tour was given for a subset of the demos during the "Learn Embedded Design with the ZUBoard 1CG" webinar:
After booting the ZUBoard, the list of "accelerated apps" can be queried with the xmutil utility.
$ xmutil listapps
Accelerator Base Type #slots Active
avnet-zub1cg-benchmark avnet-zub1cg-benchmark XRT_FLAT 0 0,
avnet-zub1cg-dualcam-dpu avnet-zub1cg-dualcam-dpu XRT_FLAT 0 -1
avnet-zub1cg-ar0144-dual avnet-zub1cg-ar0144-dual XRT_FLAT 0 -1
avnet-zub1cg-ar0144-single avnet-zub1cg-ar0144-single XRT_FLAT 0 -1
avnet-zub1cg-ar1335-single avnet-zub1cg-ar1335-single XRT_FLAT 0 -1
This output indicates that the "avnet-zub1cg-benchmark" app is loaded by default at boot. This is determined by the following file, which can be modified if desired:
$ cat /etc/dfx-mgrd/default_firmware
avnet-zub1cg-benchmark
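To boot into a different app by default, overwrite that file with the desired app name. A sketch (the fallback to a local scratch file is only there so the snippet can be tried off-target, where the real path does not exist):

```shell
# Change the default app loaded at boot (path taken from the listing
# above); falls back to a local scratch file when run off-target.
CONF=/etc/dfx-mgrd/default_firmware
[ -w "$CONF" ] || CONF=./default_firmware
echo avnet-zub1cg-dualcam-dpu > "$CONF"
cat "$CONF"
```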
Running the "benchmark" demos
The "benchmark" app is analogous to the Kria KV260 "benchmark" app, in the sense that it contains the largest DPU that fits in the device. For the ZUBoard, this is the B512 DPU.
If you have not already done so, load the "benchmark" app using the xmutil utility:
$ xmutil unloadapp
$ xmutil loadapp avnet-zub1cg-benchmark
$ xmutil listapps
Accelerator Base Type #slots Active
avnet-zub1cg-benchmark avnet-zub1cg-benchmark XRT_FLAT 0 0,
avnet-zub1cg-dualcam-dpu avnet-zub1cg-dualcam-dpu XRT_FLAT 0 -1
avnet-zub1cg-ar0144-dual avnet-zub1cg-ar0144-dual XRT_FLAT 0 -1
avnet-zub1cg-ar0144-single avnet-zub1cg-ar0144-single XRT_FLAT 0 -1
avnet-zub1cg-ar1335-single avnet-zub1cg-ar1335-single XRT_FLAT 0 -1
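The unload/load/list sequence can be wrapped in a small shell helper; swap_app is a hypothetical name, and the command -v guard simply lets the function degrade gracefully when xmutil is not present:

```shell
# Hypothetical helper wrapping the xmutil unload/load/list sequence.
swap_app() {
  if ! command -v xmutil >/dev/null 2>&1; then
    echo "xmutil not found - run this on the booted ZUBoard" >&2
    return 0
  fi
  xmutil unloadapp && xmutil loadapp "$1" && xmutil listapps
}
swap_app avnet-zub1cg-benchmark
```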
We can query the details of the DPU (B512) inside this overlay with the xdputil utility:
$ xdputil query
{
"DPU IP Spec":{
"DPU Core Count":1,
"DPU Target Version":"v1.4.1",
"IP version":"v3.4.0",
"generation timestamp":"2021-12-15 10-30-00",
"git commit id":"706bd10",
"git commit time":2112151029,
"regmap":"1to1 version"
},
"VAI Version":{
"libvart-runner.so":"Xilinx vart-runner Version: 2.0.0-d02dcb6041663dbc7ecbc0c6af9fafa087a789de 2022-09-02-17:50:46 ",
"libvitis_ai_library-dpu_task.so":"Xilinx vitis_ai_library dpu_task Version: 2.0.0-d02dcb6041663dbc7ecbc0c6af9fafa087a789de 2022-01-20 07:11:10 [UTC] ",
"libxir.so":"Xilinx xir Version: xir-d02dcb6041663dbc7ecbc0c6af9fafa087a789de 2022-09-02-17:48:00",
"target_factory":"target-factory.2.0.0 d02dcb6041663dbc7ecbc0c6af9fafa087a789de"
},
"kernels":[
{
"DPU Arch":"DPUCZDX8G_ISA0_B512_01000020F6012200",
"DPU Frequency (MHz)":300,
"IP Type":"DPU",
"Load Parallel":2,
"Load augmentation":"enable",
"Load minus mean":"disable",
"Save Parallel":2,
"XRT Frequency (MHz)":300,
"cu_addr":"0xa0000000",
"cu_handle":"0xaaab11c02d30",
"cu_idx":0,
"cu_mask":1,
"cu_name":"DPUCZDX8G:DPUCZDX8G_1",
"device_id":0,
"fingerprint":"0x1000020f6012200",
"name":"DPU Core 0"
}
]
}
Notice that we have one kernel of type DPU with the B512 architecture:
"DPU Arch":"DPUCZDX8G_ISA0_B512_01000020F6012200",
"DPU Frequency (MHz)":300,
Before running the demos, we need to verify that we are using the B512 version of the pre-compiled ModelZoo, as shown below:
$ cd /usr/share/vitis_ai_library/
$ ls -la
total 40
drwxr-xr-x 6 root root 4096 Sep 16 2022 .
drwxr-xr-x 84 root root 4096 Mar 9 12:34 ..
lrwxrwxrwx 1 root root 11 Mar 9 12:34 models -> models.b512
drwxr-xr-x 147 root root 12288 Sep 16 2022 models.b128
drwxr-xr-x 223 root root 12288 Sep 16 2022 models.b512
drwxr-xr-x 5 root root 4096 Mar 9 12:34 samples
drwxr-xr-x 58 root root 4096 Mar 9 12:34 test
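If the models link has been repointed (for example after trying the dualcam demos), it can be restored to the B512 set. A sketch, to be run from /usr/share/vitis_ai_library:

```shell
# Repoint the link at the B512 models; -f replaces an existing link
# and -n avoids descending into the old link target.
ln -sfn models.b512 models
readlink models
# prints: models.b512
```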
Several Vitis-AI demos are provided in the "~/Vitis-AI/demo/Vitis-AI-Library" directory.
Face Detection
The face detection demo can be run as follows:
$ cd ~/Vitis-AI/demo/Vitis-AI-Library/samples/facedetect
$ ./test_video_facedetect densebox_640_360 0
Pose Estimation
The pose detection demo can be run with the USB camera as follows:
$ cd ~/Vitis-AI/demo/Vitis-AI-Library/samples/posedetect
$ ./test_video_posedetect_with_ssd 0
The pose detection demo can be run with the locally provided video as follows:
$ ./test_video_posedetect_with_ssd ../../../VART/pose_detection/video/pose.mp4
The Vitis-AI demos all include source code and can be modified for your needs, as shown in the following examples:
Face Detection with Tracking
The face detection demo, augmented with centroid-based object tracking, can be run as follows:
$ cd ~/vitis_ai_cpp_examples/facedetectwithtracking
$ ./test_video_facedetectwithtracking 0
More details on how this custom example was created can be found here:
Face Detection with Head Pose Estimation
The face detection demo, augmented with face landmarks and head pose, can be run as follows:
$ cd ~/vitis_ai_cpp_examples/facedetectwithheadpose
$ ./test_video_facedetectwithheadpose 0
More details on how this custom example was created can be found here:
License Plate Recognition
An example chaining multiple neural network inferences has been implemented for the recognition of Asian license plates, and can be run as follows:
$ cd ~/vitis_ai_cpp_examples/platerecognition
$ ./test_video_platerecognition ./video/plate_recognition_video.mp4
More details on how this example was created can be found here:
3D Object Detection
A more advanced example showcasing 3D object detection with lidar point cloud data can be run as follows:
$ cd ~/xilinx_developer/ppdemo/
$ ./demo ./ppd/vlist.txt ./ppd/ 3
More details on how this example was created can be found here:
Multi-Task example
Another more advanced example showcasing a multi-task model (common backbone, multiple heads) can be run as follows:
$ cd ~/Vitis-AI/demo/Vitis-AI-Library/apps/multitask_v3_quad_windows
$ ./multitaskv3_quad_windows_x d58cbda2-97976be7__640x360.avi -t 4
Face Applications
The pre-built image also includes examples written in Python, which can be leveraged to rapidly prototype your own ideas.
The face applications webserver can be run as follows:
$ cd ~/vitis_ai_python_examples/webserver
$ python3 webserver.py
Once the script is running, the served page can be viewed by browsing to the ZUBoard's IP address:
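If the board's address is not known, it can be queried on the target itself; hostname -I is assumed to be available on the image (fall back to ip addr if it is not):

```shell
# Print the board's IPv4 addresses (run on the ZUBoard):
hostname -I
# fall back to this if hostname -I is unsupported on the image:
# ip -4 addr show
```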
Custom Model Training
The programmable nature of the ZUBoard allows users to train and deploy their own custom models with Vitis-AI.
As an example, we went through the exercise of creating a custom dataset for the Dobble card game, and trained a classification model with TensorFlow.
The trained model was then deployed for inference using Vitis-AI:
The dobble classification demo is part of the ZUBoard pre-built image, and can be run as follows:
$ cd ~/dobble_classification
$ python3 dobble_detect_live.py
Running the "dualcam_dpu" demos
The "dualcam_dpu" app implements the B128 version of the DPU, along with a MIPI capture pipeline, configured for the "ar0144-dual" configuration.
If you have not already done so, load the "dualcam_dpu" app using the xmutil utility:
$ xmutil unloadapp
$ xmutil loadapp avnet-zub1cg-dualcam-dpu
$ xmutil listapps
Accelerator Base Type #slots Active
avnet-zub1cg-benchmark avnet-zub1cg-benchmark XRT_FLAT 0 -1
avnet-zub1cg-dualcam-dpu avnet-zub1cg-dualcam-dpu XRT_FLAT 0 0,
avnet-zub1cg-ar0144-dual avnet-zub1cg-ar0144-dual XRT_FLAT 0 -1
avnet-zub1cg-ar0144-single avnet-zub1cg-ar0144-single XRT_FLAT 0 -1
avnet-zub1cg-ar1335-single avnet-zub1cg-ar1335-single XRT_FLAT 0 -1
We can query the details of the DPU (B128) inside this overlay with the xdputil utility:
$ xdputil query
{
"DPU IP Spec":{
"DPU Core Count":1,
"DPU Target Version":"v1.4.1",
"IP version":"v3.4.0",
"generation timestamp":"2021-12-15 10-30-00",
"git commit id":"706bd10",
"git commit time":2112151029,
"regmap":"1to1 version"
},
"VAI Version":{
"libvart-runner.so":"Xilinx vart-runner Version: 2.0.0-d02dcb6041663dbc7ecbc0c6af9fafa087a789de 2022-09-02-17:50:46 ",
"libvitis_ai_library-dpu_task.so":"Xilinx vitis_ai_library dpu_task Version: 2.0.0-d02dcb6041663dbc7ecbc0c6af9fafa087a789de 2022-01-20 07:11:10 [UTC] ",
"libxir.so":"Xilinx xir Version: xir-d02dcb6041663dbc7ecbc0c6af9fafa087a789de 2022-09-02-17:48:00",
"target_factory":"target-factory.2.0.0 d02dcb6041663dbc7ecbc0c6af9fafa087a789de"
},
"kernels":[
{
"DPU Arch":"DPUCZDX8G_ISA0_B128_01000020E2012208",
"DPU Frequency (MHz)":300,
"IP Type":"DPU",
"Load Parallel":2,
"Load augmentation":"disable",
"Load minus mean":"disable",
"Save Parallel":2,
"XRT Frequency (MHz)":300,
"cu_addr":"0xa0020000",
"cu_handle":"0xaaab00237970",
"cu_idx":0,
"cu_mask":1,
"cu_name":"DPUCZDX8G:DPUCZDX8G_1",
"device_id":0,
"fingerprint":"0x1000020e2012208",
"name":"DPU Core 0"
}
]
}
Notice that we have one kernel of type DPU with the B128 architecture:
"DPU Arch":"DPUCZDX8G_ISA0_B128_01000020E2012208",
"DPU Frequency (MHz)":300,
Before running the demos, we need to verify that we are using the B128 version of the pre-compiled ModelZoo, and update the symbolic link if necessary, as shown below:
$ cd /usr/share/vitis_ai_library/
$ ls -la
total 40
drwxr-xr-x 6 root root 4096 Sep 16 2022 .
drwxr-xr-x 84 root root 4096 Mar 9 12:34 ..
lrwxrwxrwx 1 root root 11 Mar 9 12:34 models -> models.b512
drwxr-xr-x 147 root root 12288 Sep 16 2022 models.b128
drwxr-xr-x 223 root root 12288 Sep 16 2022 models.b512
drwxr-xr-x 5 root root 4096 Mar 9 12:34 samples
drwxr-xr-x 58 root root 4096 Mar 9 12:34 test
$ rm models
$ ln -sf models.b128 models
$ ls -la
total 40
drwxr-xr-x 6 root root 4096 Mar 9 13:05 .
drwxr-xr-x 84 root root 4096 Mar 9 12:34 ..
lrwxrwxrwx 1 root root 11 Mar 9 13:05 models -> models.b128
drwxr-xr-x 147 root root 12288 Sep 16 2022 models.b128
drwxr-xr-x 223 root root 12288 Mar 9 13:05 models.b512
drwxr-xr-x 5 root root 4096 Mar 9 12:34 samples
drwxr-xr-x 58 root root 4096 Mar 9 12:34 test
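A quick sanity check before launching a demo is to compare the active models link against the DPU architecture reported by xdputil. This is only a sketch: the grep pattern pulls the Bnnn token out of the "DPU Arch" string shown earlier, and both commands degrade to "unknown" when run off-target:

```shell
# Compare the active ModelZoo link with the DPU arch reported by
# xdputil; both values fall back to "unknown" off-target.
MODELS=$(readlink /usr/share/vitis_ai_library/models 2>/dev/null)
ARCH=$(xdputil query 2>/dev/null | grep -o '_B[0-9]*_' | head -n1 | tr -d '_')
echo "models link: ${MODELS:-unknown}  DPU arch: ${ARCH:-unknown}"
```

On the target, the two values should agree (models.b128 with B128, models.b512 with B512) before a demo is started.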
Stereo Face Detection
The dual inference example, or stereo face detection, can be run as follows:
$ cd ~/avnet_dualcam_python_examples
$ python3 avnet_ar0144_dual_stereo_face_detection.py
More details on how this example was created can be found here:
Going Further
To learn more about the ZUBoard, please watch the "Learn Embedded Design with the ZUBoard 1CG" webinar:
Don't have a ZUBoard?
If you have an Ultra96-V2 board (with or without the dualcam mezzanine), you can run these same designs using the u96v2 image:
- http://avnet.me/avnet-u96v2-sbc-2021.2-sdimage
(2022/09/21 - md5sum : a04ecf831b4e654f2d13e6641b92a02c)
Reuse the same instructions, but swap out the following when using the Ultra96-V2 board:
- zub1cg => u96v2
- b512 => b2304
- b128 => b1152
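The three substitutions above can be applied mechanically to any of the commands in this article, for example with a hypothetical sed one-liner:

```shell
# Translate a ZUBoard command for the Ultra96-V2 using the
# substitutions listed above:
echo 'xmutil loadapp avnet-zub1cg-benchmark' \
  | sed -e 's/zub1cg/u96v2/g' -e 's/b512/b2304/g' -e 's/b128/b1152/g'
# prints: xmutil loadapp avnet-u96v2-benchmark
```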
Please let me know in the comments below if you are using the Ultra96-V2 based design.
Conclusion
I hope this tutorial, with its pre-built SD card image, will help you get your custom AI applications up and running quickly on the ZUBoard.
If there are any other accelerated apps you would like to see on ZUBoard, please share your thoughts in the comments below.
Revision History
2022/09/23
Update project image. Add instructions to run multi-task-v3 example.
2022/09/21
Update SD image for ZUBoard, and add SD image for Ultra96-V2.
2022/09/15
Add instructions and videos on how to run demos for the following accelerated apps:
- avnet-zub1cg-benchmark
- avnet-zub1cg-dualcam-dpu
Update SD card image.
2022/09/06
Preliminary version, with recorded video covering the following accelerated apps:
- avnet-zub1cg-benchmark
- avnet-zub1cg-dualcam-dpu