The biggest challenge in deploying Artificial Intelligence (AI) based solutions is often finding suitable hardware to run our model in an efficient and cost-effective way.
Virtual machines offered by big Cloud Providers (AWS, Google Cloud, Azure) usually don't come with hardware suitable for AI / ML acceleration (FPGA, GPU, ASIC, etc).
Special instance types (ex. F1 / Inf1 in AWS) are needed to be able to run more complex AI models. These instance types are usually more expensive, and smaller projects / teams may not be able to afford them. On top of that, with low-traffic applications, the hardware is probably not fully exploited.
To deploy AI workloads in the cloud more efficiently, shared servers can be used instead. A server equipped with an accelerator card like the Xilinx VCK5000 can fulfill AI inference tasks coming from multiple applications.
In this project, I propose an Artificial Intelligence as a Service (AIaaS) platform, with a server featuring the Xilinx VCK5000 accelerator card. The platform allows clients to run different types of AI workloads remotely, using an easy-to-use API.
In the following sections I will show how I built a proof-of-concept (PoC) of this AIaaS platform.
Concept & Overview
The main idea of an Artificial Intelligence as a Service (AIaaS) platform is to offer hardware-accelerated AI workloads over an easy-to-use REST API.
This allows applications to incorporate powerful AI features without the need to run them directly on costly, acceleration-enabled machines.
It also allows multiple applications to take better advantage of the compute capacity of a single hardware-acceleration-enabled machine.
In this project I will present a proof-of-concept (PoC) demonstrating these ideas.
The main features of this PoC are the following:
- an AIaaS API Server featuring the Xilinx VCK5000 Versal development card
- several hardware-accelerated AI workflows exposed over a REST API (Image Classification, Image Batch Classification, Face Detection, Lane Detection, Object Detection in Video + more to come)
- a standard OpenAPI Specification file describing the REST API, which can be used to generate REST Clients and documentation
- a set of Jupyter Notebooks demonstrating the use of each API
- an Administration Interface that allows inspecting the server status, task history and task details
- Build & Installation scripts
Here is a quick demo video showing how the project works:
In the following sections we will go into more detail about each component of the system. Finally, we will take a look at a wide variety of possible new features and improvements.
Architecture
The high-level architecture of the system looks like this:
The system is composed of the Client, Server and API components. The contract between them is the Artificial Intelligence as a Service (AIaaS) REST API, described in an OpenAPI specification file.
The Server part is implemented as two independent parts:
The Backend Server is a web application implementing the AIaaS REST API. The application is written in Java using the Spring Boot framework, and is responsible for general management tasks.
A set of Vitis-AI MicroApps is used to implement the different hardware-accelerated AI workloads. These are C++ and/or Python applications built on top of the Vitis AI framework / libraries, and they are accelerated by the DPUs running on the Xilinx VCK5000 card.
The Client applications can use the AIaaS REST API through either manually written or automatically generated REST clients.
In the following sections, I will go into more detail about each of the components.
Vitis-AI MicroApps
The Xilinx VCK5000 card can be used to accelerate AI workloads using DPUs implemented in the programmable logic.
To access this functionality, the Vitis-AI C++ or Python APIs can be used. In this project I opted for the C++ API, as it seems to be more feature-complete.
To keep things simple, I decided to implement the different hardware-accelerated AI workloads as small self-contained applications called "Vitis-AI MicroApps".
Each micro-app implements a specific AI workload as follows:
- Image Classification (vitis_ai_image_classify.cpp) - image classification with a custom model (ex. resnet50)
- Image Batch Classification (vitis_ai_image_classify_batch.cpp) - similar to the above, but can classify multiple images at the same time; this is also a sample app for batch processing
- Image Face Detection (vitis_ai_image_face_detect.cpp) - face detection on images
- Image Lane Detection (vitis_ai_image_lane_detect.cpp) - lane detection on images
- Video YOLO V3 Object Detection (vitis_ai_video_yolov3.cpp) - YOLO V3 object detection on videos; this is also a sample app for video processing
(with many more to follow in the future)
These micro-apps usually take an image or a video as input, and produce their output in JSON format. The apps are later called by the API server.
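Since every micro-app follows the same convention (command-line arguments in, JSON on stdout), a caller can drive all of them through one generic wrapper. Here is a minimal Python sketch of this pattern; the binary and model names below are placeholders:

```python
import json
import subprocess

def parse_micro_app_output(stdout):
    """Parse the JSON document a micro-app prints on stdout."""
    return json.loads(stdout)["results"]

def run_micro_app(binary, model, input_path):
    """Invoke a micro-app binary and return its parsed results list."""
    proc = subprocess.run([binary, model, input_path],
                          capture_output=True, text=True, check=True)
    return parse_micro_app_output(proc.stdout)

# Example (placeholder paths):
# results = run_micro_app("./vitis_ai_image_classify", "resnet50", "cat.jpg")
```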
Now let's see how each of the micro-apps works!
> Image Classification
The Image Classification micro-app (vitis_ai_image_classify) implements classification of a single image, using an arbitrary model like ResNet50.
The app is called as:
$ ./vitis_ai_image_classify <model> <image-file>
and produces a JSON output like:
{
"results": [
{ "idx": 20, "class": "water ouzel, dipper,", "score": 0.999 },
{ "idx": 42, "class": "agama,", "score": 0.083 }
]
}
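A caller will usually keep just the highest-scoring class. A small post-processing sketch, using the field names from the sample response above:

```python
def top_prediction(response):
    """Return the (class, score) pair with the highest score."""
    best = max(response["results"], key=lambda r: r["score"])
    return best["class"], best["score"]

# The sample output shown above:
sample = {
    "results": [
        {"idx": 20, "class": "water ouzel, dipper,", "score": 0.999},
        {"idx": 42, "class": "agama,", "score": 0.083},
    ]
}
print(top_prediction(sample))  # ('water ouzel, dipper,', 0.999)
```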
> Batch Classification
The Image Batch Classification micro-app (vitis_ai_image_classify_batch) is similar to the previous one, but instead of processing one image at a time, it processes a batch of images, which should improve performance.
The app is called like:
$ ./vitis_ai_image_classify_batch <model> <image-file-1> <image-file-2> ...
and the result looks like:
{
"results": [
{
"image": "image1.jpg",
"results": [
{ "idx": 20, "class": "water ouzel, dipper,", "score": 0.999 },
{ "idx": 42, "class": "agama,", "score": 0.083 }
]
},
{
"image": "image2.jpg", ...
}
]
}
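When consuming the batch response, it is convenient to index the per-image results by file name. A small sketch based on the field names in the sample above:

```python
def batch_by_image(response):
    """Index the per-image results of a batch response by image name."""
    return {entry["image"]: entry["results"] for entry in response["results"]}

# A batch response in the shape shown above:
batch = {
    "results": [
        {"image": "image1.jpg",
         "results": [{"idx": 20, "class": "water ouzel, dipper,", "score": 0.999}]},
        {"image": "image2.jpg",
         "results": [{"idx": 42, "class": "agama,", "score": 0.083}]},
    ]
}
print(batch_by_image(batch)["image2.jpg"][0]["class"])  # agama,
```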
This micro-app also acts as a template for batch-processing type micro-apps.
> Face Detection
The Image Face Detection (vitis_ai_image_face_detect) app can be used to detect one or more faces in an image. It uses a model like DenseBox (320x320) and is used like:
$ ./vitis_ai_face_detect densebox_320_320 sample_facedetect.jpg
{
"results": [
{ "x": 75, "y": 65, "width": 57, "height": 70, "score": 0.997199 },
{ "x": 204, "y": 45, "width": 55, "height": 73, "score": 0.994089 }
]
}
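Downstream code typically drops low-confidence detections and converts each box to corner coordinates for drawing. A sketch based on the response fields above:

```python
def filter_boxes(response, min_score=0.9):
    """Keep confident detections, returned as (x1, y1, x2, y2) corners."""
    return [
        (r["x"], r["y"], r["x"] + r["width"], r["y"] + r["height"])
        for r in response["results"] if r["score"] >= min_score
    ]

# The sample output shown above:
sample = {"results": [
    {"x": 75, "y": 65, "width": 57, "height": 70, "score": 0.997199},
    {"x": 204, "y": 45, "width": 55, "height": 73, "score": 0.994089},
]}
print(filter_boxes(sample))  # [(75, 65, 132, 135), (204, 45, 259, 118)]
```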
> Lane Detection
The Lane Detection (vitis_ai_image_lane_detect) app performs lane detection on a road image, and it is called like:
$ ./vitis_ai_lane_detect vpgnet_pruned_0_99 ../lanedetect/sample_lanedetect.jpg
{
"results": [
{ "type": 3, "points": [
{ "x": 164, "y": 377 },
...
> Object Detection in Videos with YOLO V3
The final micro-app I implemented for this PoC demonstrates object detection in a video. It uses the YOLO V3 model, and works as follows:
$ ./vitis_ai_yolo3_video <model> <video-file>
...results for each frame
API Server
The purpose of the API Server is to expose the hardware-accelerated AI workflows as an easy-to-use REST API.
The API Server is a web application built using Java and Spring Boot. It is responsible for the following functions:
- implementing the REST APIs defined in the OpenAPI specification file
- managing temporary files and other resources
- calling the Vitis-AI MicroApps
- collecting and parsing the output generated by Vitis-AI MicroApps
- packing and returning the results to the caller
- etc.
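The core request flow (receive the input, stage temporary files, invoke a micro-app, parse and return its JSON) can be sketched in a few lines. The real server is written in Java / Spring Boot; the Python sketch below only illustrates the flow, and the binary path and model name are placeholders:

```python
import json
import subprocess
import tempfile
from pathlib import Path

def handle_classify_request(image_bytes, model="resnet50", run_cmd=subprocess.run):
    """Sketch of the server-side flow: store the upload, call the micro-app,
    parse its stdout and return the result. Paths and names are placeholders."""
    with tempfile.TemporaryDirectory() as tmp:       # temporary file management
        image_path = Path(tmp) / "upload.jpg"
        image_path.write_bytes(image_bytes)          # persist the uploaded image
        proc = run_cmd(["./vitis_ai_image_classify", model, str(image_path)],
                       capture_output=True, text=True, check=True)
        return json.loads(proc.stdout)               # parsed JSON back to the caller
```

Injecting `run_cmd` keeps the sketch testable without the actual micro-app binary.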
The REST APIs exposed by the API Server are documented using industry-standard OpenAPI specification files.
These files describe the following REST APIs:
- Image Classification - /api/v1/images/classify
- Image Batch Classification - /api/v1/images/classify/batch
- Image Face Detection - /api/v1/images/face-detect
- Image Lane Detection - /api/v1/images/lane-detect
- Video Object Detection with YOLO V3 - /api/v1/videos/yolo-v3
(with more to follow in the future)
The OpenAPI specification can also be used to generate HTML documentation, as well as clients for different programming languages.
> Admin Interface
Alongside the AIaaS API, the back-end server also exposes an Administration Interface.
In this first version, the Admin Interface can be used to inspect the state of the service and the AI task history, along with task details:
The interface can be accessed under the /admin.html path, and it is implemented as an HTML / JavaScript app. Under the hood it uses a set of Admin APIs implemented in the back-end server.
Note: as this project is still a PoC, the AIaaS API and the Admin Interface are not protected by authentication or authorization in any way.
> Docker Images & Setup Script
The back-end part of the AIaaS project runs on an extended version of the Vitis-AI Docker image.
The official image was extended with things like a Java Runtime and custom setup scripts.
The Dockerfile, the setup scripts and instructions can be found in the Backend folder of the attached GitHub repository.
Clients and Examples
To interact with the Artificial Intelligence as a Service (AIaaS) REST API, some kind of REST client is needed: either manually written REST calls, or clients generated automatically from the OpenAPI specification files.
For this PoC I'm using manually written REST calls in Jupyter Notebooks. In these notebooks I prepare the input images (or video), make the calls, and then process and visualize the results.
There is one example notebook prepared for each API:
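Under the hood, each notebook boils down to a single multipart POST against one of the endpoints. Here is a minimal stdlib-only sketch of such a call; the base URL and the `file` form-field name are assumptions about the PoC's API:

```python
import json
import urllib.request
import uuid

def build_multipart(field, filename, data):
    """Encode one file as a multipart/form-data body (stdlib only)."""
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + data + f"\r\n--{boundary}--\r\n".encode()
    return body, f"multipart/form-data; boundary={boundary}"

def classify_image(path, base="http://localhost:8080"):
    """POST an image to the classification endpoint, return the parsed JSON."""
    with open(path, "rb") as f:
        # "file" is an assumed form-field name
        body, ctype = build_multipart("file", path, f.read())
    req = urllib.request.Request(f"{base}/api/v1/images/classify", data=body,
                                 headers={"Content-Type": ctype}, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# result = classify_image("cat.jpg")
# print(result["results"][0]["class"])
```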
The Xilinx VCK5000 Versal Development Card is a PCI Express based card.
To use it we need either a Server or a Desktop PC with a free PCI Express 3.0 x16 slot.
> Hardware Installation
I opted for a Desktop PC with the following specs:
- AMD Ryzen 5 3400G CPU (with integrated graphics)
- Gigabyte X570 GAMING X motherboard
- G-Skill 16 GB DDR4 3200 MHz memory
- Seasonic Focus GX 550W ATX power supply
The VCK5000 is installed in a PCI Express 3.0(+) x16 slot. The two 8 + 6 pin PCI power connectors also need to be connected.
> OS and Packages
On the software side we need a Linux-based OS distribution like RHEL, CentOS or Ubuntu. I went with Ubuntu 20.04.3 LTS, with the Linux kernel downgraded to version 5.8.0-43-generic. (Note: at the time of writing, the latest 5.11.x kernel is not supported by the xocl and xclmgmt kernel drivers.)
Next we have to install a couple of software packages to get the VCK5000 working. We can follow the Vitis AI Setup Instructions for the VCK5000:
- by running the provided ./install.sh we install the Xilinx Runtime Library (XRT), the Xilinx Resource Manager (XRM) and the DPU V4E xclbin for the VCK5000
- then we need to download and install some DEB packages containing the firmware for the VCK5000, as well as some utilities for flashing and validation:
$ wget https://www.xilinx.com/bin/public/openDownload?filename=xilinx-vck5000-es1-gen3x16-platform-2-1_all.deb.tar.gz -O xilinx-vck5000-es1-gen3x16-platform-2-1_all.deb.tar.gz
$ tar -xzvf xilinx-vck5000-es1-gen3x16-platform-2-1_all.deb.tar.gz
$ sudo dpkg -i xilinx-sc-fw-vck5000_4.4.6-2.e1f5e26_all.deb
$ sudo dpkg -i xilinx-vck5000-es1-gen3x16-validate_2-3123623_all.deb
$ sudo dpkg -i xilinx-vck5000-es1-gen3x16-base_2-3123623_all.deb
At this point, running sudo lspci -vd 10ee: should show the VCK5000 card detected and running with the kernel drivers xclmgmt / xocl.
To flash the latest firmware we can run:
$ sudo /opt/xilinx/xrt/bin/xbmgmt flash --scan
$ sudo /opt/xilinx/xrt/bin/xbmgmt flash --update
After a cold restart, --scan should show our VCK5000 running version 4.4.6.
The validation utility can be used to check that the card is functioning correctly:
$ /opt/xilinx/xrt/bin/xbutil validate --device 0000:01:00.1
> Installing Vitis AI
At this point we should be ready to install Vitis AI.
I opted to run Vitis AI in a Docker container. So, the first step was to install Docker Engine by following the official installation guide.
To run Vitis AI in a container we need to clone the Vitis AI GitHub repository:
$ git clone --recurse-submodules https://github.com/Xilinx/Vitis-AI
$ cd Vitis-AI
Then we can use the provided ./docker_run.sh script to pull and run the latest Vitis AI container:
./docker_run.sh xilinx/vitis-ai-cpu:latest
The script also detects when a VCK5000 card is installed, and automatically attaches the PCI-E device to our container.
This should land us in a Docker container with Vitis AI.
To validate our Vitis-AI setup, we can run some demos and performance tests. There is a good number of demos that come with Vitis-AI; I chose to run the demos with the ResNet-50 model.
First we need to download and extract the VCK5000-optimized version of the ResNet-50 Vitis-AI model. The official documentation recommends downloading the models directly to the /usr/share/vitis_ai_library/models folder:
$ wget https://www.xilinx.com/bin/public/openDownload?filename=resnet50-vck5000-DPUCVDX8H-r1.4.1.tar.gz -O resnet50-vck5000-DPUCVDX8H-r1.4.1.tar.gz
$ tar -xzvf resnet50-vck5000-DPUCVDX8H-r1.4.1.tar.gz
$ sudo cp resnet50 /usr/share/vitis_ai_library/models -r
As I wanted to make these downloads persistent, I decided to link /usr/share/vitis_ai_library/models to an externally mounted folder:
$ cp -R /usr/share/vitis_ai_library/models .tmp-vck5000-models
$ sudo rm -rf /usr/share/vitis_ai_library/models
$ sudo ln -s /workspace/.tmp-vck5000-models /usr/share/vitis_ai_library/models
This will save the models in a .tmp-vck5000-models folder inside the Vitis-AI git repository on the Docker host.
Next, we need to download some sample images and videos to test with:
$ wget https://www.xilinx.com/bin/public/openDownload?filename=vitis_ai_library_r1.4.0_images.tar.gz -O vitis_ai_library_r1.4.0_images.tar.gz
$ wget https://www.xilinx.com/bin/public/openDownload?filename=vitis_ai_library_r1.4.0_videos.tar.gz -O vitis_ai_library_r1.4.0_video.tar.gz
$ tar -xzvf vitis_ai_library_r1.4.0_images.tar.gz -C demo/Vitis-AI-Library/
$ tar -xzvf vitis_ai_library_r1.4.0_video.tar.gz -C demo/Vitis-AI-Library/
(note: the sample images and videos are already saved on the shared folder)
Finally we need to compile and run the sample classification application:
$ cd /workspace/demo/Vitis-AI-Library/samples/classification
$ bash -x build.sh
To run the model on the sample image we can run:
$ source /workspace/setup/vck5000/setup.sh
$ ./test_jpeg_classification resnet50 sample_classification.jpg
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0104 03:37:21.292187 177 demo.hpp:1183] batch: 0 image: sample_classification.jpg
I0104 03:37:21.292309 177 process_result.hpp:24] r.index 109 brain coral, r.score 0.982666
I0104 03:37:21.293148 177 process_result.hpp:24] r.index 973 coral reef, r.score 0.00850172
I0104 03:37:21.293203 177 process_result.hpp:24] r.index 955 jackfruit, jak, jack, r.score 0.00662115
I0104 03:37:21.293256 177 process_result.hpp:24] r.index 397 puffer, pufferfish, blowfish, globefish, r.score 0.000543497
I0104 03:37:21.293325 177 process_result.hpp:24] r.index 390 eel, r.score 0.000329648
...
We can also check that the performance is as expected:
> ./test_performance_classification resnet50 test_performance_classification.list -t 8 -s 60
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0104 03:37:57.178503 242 benchmark.hpp:184] writing report to <STDOUT>
I0104 03:38:01.277192 242 benchmark.hpp:211] waiting for 0/60 seconds, 8 threads running
I0104 03:38:11.277294 242 benchmark.hpp:211] waiting for 10/60 seconds, 8 threads running
I0104 03:38:21.277393 242 benchmark.hpp:211] waiting for 20/60 seconds, 8 threads running
I0104 03:38:31.277499 242 benchmark.hpp:211] waiting for 30/60 seconds, 8 threads running
I0104 03:38:41.277577 242 benchmark.hpp:211] waiting for 40/60 seconds, 8 threads running
I0104 03:38:51.277688 242 benchmark.hpp:211] waiting for 50/60 seconds, 8 threads running
I0104 03:39:01.277866 242 benchmark.hpp:219] waiting for threads terminated
FPS=4540.63
We got about 4540 frames / second, which is pretty impressive performance.
> Building the AIaaS Docker Image
Now that we have Vitis-AI up and running, we can prepare the Docker image for the AIaaS on VCK5000 server.
To build the Docker image we need to navigate to the Backend folder and run:
$ docker build .
Then we can launch the new image as:
Vitis-AI $ ./docker_run.sh <image-id>
To be able to run the VCK5000 server, a few steps need to be done:
- run the ./setup.sh script to prepare the Vitis-AI environment
- copy the /VitisAI-MicroApps folder to /workspace/demo/Vitis-AI-Library/samples/, and compile the micro-apps using the ./build.sh script
- compile the API Server using the mvn package command, and copy the resulting .jar file to /workspace
After that we can launch the API Server as:
$ java -jar /workspace/vitis-aiaas-0.0.1-SNAPSHOT.jar
This will expose the REST API on port :8080, and we should be able to call it from the local machine or from a remote one.
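To quickly verify that the server is reachable (for example from a remote machine), a small stdlib-only probe against the admin page mentioned earlier can be used; the host and port below are the assumed defaults:

```python
import urllib.request

def server_reachable(base="http://localhost:8080"):
    """Return True if the AIaaS server answers on the given base URL."""
    try:
        with urllib.request.urlopen(f"{base}/admin.html", timeout=5) as resp:
            return resp.status == 200
    except OSError:
        # connection refused, DNS failure, timeout, ...
        return False
```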
As we saw, this project is still in the proof-of-concept (PoC) stage, with room for many new features and improvements.
Short-term plans include implementing more hardware-accelerated AI workloads for different domains like Automotive, Medical, Virtual Reality and others. It would also be useful to add more video processing endpoints.
Long term goals would be to:
- provide auto-generated clients for different programming languages (Python, Java, C++, etc.)
- support for custom models
- support for video streaming endpoints
- support for customizing models by pruning / re-training
- support for additional AI/ML frameworks such as TensorFlow, Caffe and others
- migrate the project to Vitis AI 2.0
- add authorization and tracking features
- and many others
Hope you enjoyed this project! :)