Background: Globally, liver cancer ranks sixth among diagnosed cancers and is the second leading cause of cancer death [1]. Accurate liver tumor segmentation, which assigns a label to each pixel in a medical image, is a prerequisite for many clinical applications such as computer-aided diagnosis, surgery planning, treatment planning, and post-treatment evaluation. To reduce the workload of manual delineation, radiologists usually rely on a computer-assisted system to perform pixel-wise lesion segmentation.
Challenges: Current computer-assisted diagnosis (CAD) systems usually require a local PC, a cloud server, and a high-speed network connection between them. In a typical workflow, the local PC uploads the medical image to the cloud server, the server processes the image, and the result is sent back to the local PC.
However, such complex CAD systems suffer from high equipment cost, slow response times, and high power consumption. For the broad adoption of CAD in medical diagnosis, a small and cost-effective system is desired. Therefore, the CAD system should run on edge devices.
Aim: In this project, the computer-assisted diagnosis system is implemented on the FPGA edge device Kria KV260 from AMD-Xilinx; we call this system KV260-CAD. The KV260-CAD can perform image acquisition, computation, storage, and display even without a network connection, and provides doctors with interaction via a wired mouse, keyboard, and display screen as well as wireless smartphones and tablets. Since the interactive deep learning network for liver tumor segmentation was designed for GPUs, the effort required to run such a network on an FPGA and the resulting performance impact are of major interest.
Approach: The application requires a camera interface and a deep learning processing unit. To test these hardware parts, a given example project that uses them was run on the Kria board first. Then, AMD-Xilinx's Vitis AI is used to compile the interactive deep learning segmentation network for the Deep Learning Processor Unit (DPU) on the FPGA. The included Vitis AI Runtime engine, with its Python API, communicates with the DPU via an embedded Linux running on the FPGA's processor system.
Take-home: This tutorial is vivid proof that you don’t need to be an FPGA engineer to rapidly run a model on the KV260 development board and proactively solve practical challenges. And, most importantly, implementing such a solution with the KV260 saves a lot of money and resources: it doesn’t require high costs or effort, only an AI development tool and a SoC!
Technologies Used
- Vitis AI: The Vitis AI development environment accelerates AI inference on Xilinx hardware platforms, including both edge devices and Alveo accelerator cards. It consists of optimized IP cores, tools, libraries, models, and example designs. It is designed with high efficiency and ease of use in mind to unleash the full potential of AI acceleration on Xilinx FPGAs and adaptive compute acceleration platforms (ACAPs). The Vitis AI development environment makes it easy for users without FPGA knowledge to develop deep learning inference applications by abstracting the intricacies of the underlying FPGA and ACAP.
- Xilinx Kria KV260 Vision AI Starter Kit: The kit comprises the K26 system-on-module (SOM), a carrier card, and a thermal solution. The SOM is very compact and includes only key components such as a Zynq® UltraScale+™ MPSoC device, memory, boot, and security module. The carrier card provides various interfacing options, including a power solution and connectors for camera, display, Ethernet, and microSD card. The thermal solution consists of a heat sink, heat sink cover, and fan. The Kria KV260 Vision AI Starter Kit is designed to give customers a platform to evaluate their target applications and ultimately design their own carrier card with the Xilinx K26 SOM. While the SOM itself has broad AI/ML applicability across markets and applications, target applications for the Kria KV260 Vision AI Starter Kit include smart city and machine vision, security cameras, retail analytics, and other industrial applications.
Diagnostic pipeline in a radiological department:
1. Liver cancer patient undergoes an abdominal CT scanning.
2. CT scans are printed out or digitally archived on the cloud server.
3. Radiologists annotate liver tumors with semantic annotations and evaluate the state of liver cancer using the standard RECIST protocol.
Human-AI interactive edge computing that further enables precise detection:
1. The KV260 platform can use Smartcam to acquire medical images from different media, such as printed clinical reports, monitors, tablets, smartphones, projectors, etc.
2. The KV260 platform can pre-process CT scans together with semantic annotations to navigate to the tumor ROIs.
3. The KV260 platform can run an efficient deep learning model on the FPGA to locally infer a precise segmentation mask, providing useful guidance for diagnosis and for surgical or radiotherapy planning.
Vitis AI workflow that builds the KV260-based CAD system:
1. On a GPU server (the host), engineers train a deep learning-based segmentation model using the TensorFlow framework.
2. The Vitis AI development environment quantizes the float32 model to INT8 using post-training quantization with a calibration dataset, ultimately reducing the model size.
3. The Vitis AI development environment compiles the quantized model into a KV260-deployable model.
Dataset Description
In our case, we use the ISBI LiTS 2017 Challenge dataset for liver tumor segmentation, which contains 131 contrast-enhanced 3D abdominal CT scans of different sizes. We further divide the dataset into a training set of 103 volumes and a test set of 28 volumes. The dataset can be found here:
https://academictorrents.com/details/27772adef6f563a1ecc0ae19a528b956e6c803ce
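As an illustration, here is a minimal sketch of loading one LiTS case with nibabel. The volume-*/segmentation-* file names follow the usual LiTS convention, and the windowing values are only a common choice, not necessarily what our training script uses.

import nibabel as nib
import numpy as np

def load_case(case_id, root="LiTS"):
    # CT volume in Hounsfield units and its label map (0=background, 1=liver, 2=tumor)
    ct = nib.load(f"{root}/volume-{case_id}.nii").get_fdata()
    seg = nib.load(f"{root}/segmentation-{case_id}.nii").get_fdata()
    ct = np.clip(ct, -200, 250)                   # abdominal intensity window
    ct = (ct - ct.min()) / (ct.max() - ct.min())  # normalise to [0, 1]
    return ct, seg

ct, seg = load_case(0)
print(ct.shape, np.unique(seg))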
Since the Response Evaluation Criteria in Solid Tumors (RECIST) [2] were published in 2000, many investigators, cooperative groups, industry, and government authorities have adopted these criteria to assess treatment outcomes. In clinical practice, radiologists usually follow RECIST to mark the longest diameter of the tumor and its perpendicular counterpart in the most significant slice (the axial slice with the largest tumor area), and then assess the tumor size. The RECIST marks used in our project are lesion diameters consisting of two lines: one measuring the longest diameter and the second measuring the longest diameter perpendicular to it in the plane of measurement [3]. Samples of the lesions and RECIST marks can be found in Fig. 3.
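For illustration, the sketch below shows one hypothetical way to rasterize such a two-line RECIST mark into a binary channel that can be stacked with the CT slice. This is an assumption about the input encoding for illustration only, not necessarily the project's exact scheme.

import numpy as np
import cv2

def recist_mark_channel(shape, long_axis, short_axis, thickness=2):
    # long_axis / short_axis: ((x1, y1), (x2, y2)) endpoints in integer pixel coordinates
    mark = np.zeros(shape, dtype=np.uint8)
    cv2.line(mark, long_axis[0], long_axis[1], color=1, thickness=thickness)
    cv2.line(mark, short_axis[0], short_axis[1], color=1, thickness=thickness)
    return mark

# e.g. stack with a 256x256 CT slice: np.dstack([ct_slice, recist_mark_channel((256, 256), ((60, 120), (180, 140)), ((115, 80), (125, 170))) ])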
Prerequisites on Host
- Linux host PC with Vitis AI installed
- Required packages installed
Step 1 on Host: Training the DNN Model
The first step is to train a DNN model for liver tumor segmentation from scratch in the cloud. Specifically, we input a liver abdominal CT with a RECIST-marked tumor and output the pixel-wise segmentation of the tumor. Here, we choose the widely used U-Net segmentation model with 0.49M parameters, built on the publicly available TensorFlow 2.6.0 framework. The input images have a size of 256x256x4.
1. Train the UNet segmentation model:
python train_unet.py
2. The trained network is saved in H5 format as “float_model.h5”:
model.save(os.path.join(base_path, model_name+'/float_model.h5'))
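For orientation, here is a minimal sketch of what a reduced U-Net such as the unet(input_size=(256, 256, 4)) used later in this tutorial could look like in tf.keras. The filter counts, loss, and optimizer are illustrative assumptions, not the project's exact configuration.

import tensorflow as tf
from tensorflow.keras import layers, Model

def conv_block(x, filters):
    # two 3x3 convolutions, the basic U-Net building block
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return layers.Conv2D(filters, 3, padding="same", activation="relu")(x)

def unet(input_size=(256, 256, 4), base_filters=16):
    inputs = layers.Input(shape=input_size)
    c1 = conv_block(inputs, base_filters)                     # encoder level 1
    p1 = layers.MaxPooling2D()(c1)
    c2 = conv_block(p1, base_filters * 2)                     # encoder level 2
    p2 = layers.MaxPooling2D()(c2)
    b = conv_block(p2, base_filters * 4)                      # bottleneck
    u2 = layers.Conv2DTranspose(base_filters * 2, 2, strides=2, padding="same")(b)
    c3 = conv_block(layers.concatenate([u2, c2]), base_filters * 2)   # decoder level 2
    u1 = layers.Conv2DTranspose(base_filters, 2, strides=2, padding="same")(c3)
    c4 = conv_block(layers.concatenate([u1, c1]), base_filters)       # decoder level 1
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(c4)           # tumor probability mask
    return Model(inputs, outputs)

model = unet()
model.compile(optimizer="adam", loss="binary_crossentropy")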
Step 2 on Host: Quantizing the DNN Model
We need to convert the 32-bit floating-point weights and activations to 8-bit integer (INT8) format. You can directly use the vai_q_tensorflow2 API:
python quantize_tf.py
We also itemize the main steps of the vai_q_tensorflow2 API for better understanding:
1. Preparing the float model (“float_model.h5”) obtained in Step 1 and the calibration set, a subset of the training dataset that represents the input data distribution:
calib_dataset = load_calib_data(data_path=data_path)
2. Quantizing the model
from tensorflow_model_optimization.quantization.keras import vitis_quantize

float_model = unet(input_size=(256, 256, 4))
float_model.load_weights(model_path, by_name=True)
quantizer = vitis_quantize.VitisQuantizer(float_model)
quantized_model = quantizer.quantize_model(calib_dataset=calib_dataset, calib_batch_size=2)
3. Saving the quantized model
quantized_model.save('quantized_model.h5')
4. Evaluating the Quantized Model
We replace the float model file with the quantized model in the evaluation script and evaluate the quantized model just as the float model. The evaluation results are shown in Table.1 and Fig.5.
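For reference, here is a minimal sketch of loading and evaluating the quantized model following the Vitis AI TensorFlow 2 flow; the quantize_scope context lets Keras deserialize the custom quantized layers. test_images and test_masks are assumed to come from the project's own data loader.

import tensorflow as tf
from tensorflow_model_optimization.quantization.keras import vitis_quantize

with vitis_quantize.quantize_scope():
    quantized_model = tf.keras.models.load_model('quantized_model.h5')

quantized_model.compile(loss='binary_crossentropy', metrics=['accuracy'])
# test_images / test_masks: pre-processed 256x256x4 inputs and ground-truth masks
quantized_model.evaluate(test_images, test_masks, batch_size=2)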
Step 3 on Host: Compiling the DNN Model to an xmodel
Since we are going to run the model on a DPU, we need to map the network to a highly optimized DPU instruction sequence. Specifically, we input the “quantized_model.h5” obtained in Step 2 and the “arch.json”, a configuration file generated during the Vitis flow. We then obtain a compiled model named “compiled_model.xmodel”.
Compiling for DPU:
vai_c_tensorflow2 -m ./quantized_model/quantized_model.h5 \
-a /opt/vitis_ai/compiler/arch/DPUCZDX8G/KV260/arch.json \
-o ./compiled_model/ \
-n unet_compiled
Prerequisites on Edge
- Flash the PetaLinux image onto the SD card using Balena Etcher.
- Interface access for the KV260, e.g. SD (J11), micro-USB (J4), IAS (J7), and Ethernet (RJ45), as shown in Fig.4.
- Set the user password:
sudo su -l root
- Internet access for the KV260:
ping 8.8.8.8
Step 1 on Edge: Acquiring CT Images with Smartcam
1. The camera interface is an important part of our project. This requires installing the smart camera application (Smartcam).
1) Obtain the Smartcam application package:
sudo xmutil getpkgs
2) Install the SmartCam accelerator package:
sudo dnf install packagegroup-kv260-smartcam.noarch
3) View available application firmware on the system:
sudo xmutil listapps
4) Unload the default KV260-DP application:
sudo xmutil unloadapp
5) Load the SmartCam accelerator firmware:
sudo xmutil loadapp kv260-smartcam
2. Print the CT image and focus the camera:
Fix the printed CT image and turn on the camera to display the image on the screen:
sudo smartcam --mipi -W 1920 -H 1080 --target dp
Adjust the camera position and focus until a clear CT image is captured, then fix the camera in place.
3. Mount the USB:
1) Create a USB drive folder:
mkdir /mnt/usb
2) Mount the USB drive in root mode:
mount -t vfat /dev/sda1 /mnt/usb
4. Use Smartcam to collect images and store them in USB:
sudo smartcam --mipi -W 1920 -H 1080 --target file
5. Use the FFmpeg application to split the collected video stream into individual images:
ffmpeg -r 30 -i /mnt/usb/out.h264 -f image2 /mnt/usb/%03d.jpeg
Step 2 on Edge: Pre-processing the Scanned CT Image
Here, we aim to produce a region of interest (ROI) together with the practitioner's interactive annotation mark, given the medical images scanned by Smartcam. This is implemented with OpenCV in Python.
The raw abdomen images go through a warping and cropping process to obtain an orthogonal rectangular view; the practitioner can then interactively annotate the image with a simple cross to indicate the tumor shape and location. Specifically, the following steps are performed:
1. Read and resize the raw images.
2. Convert the color image to grayscale.
3. Use Canny function to detect edges in the grayscale.
4. Find the contour with the largest area and longest perimeter.
5. Find and back translate the four vertices matching the original image resolution.
6. Perform a four-vertex perspective transform to obtain a rectangular view.
7. Interactively annotate lesions by drawing the long and short axes.
All of the above steps have been integrated into a Jupyter notebook: go to its directory, run `jupyter notebook`, and open `pre-process-scan.ipynb` to see the details. A minimal OpenCV sketch of steps 1-6 is also shown below.
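The following sketch of steps 1-6 uses a hypothetical scan_ct_image helper; the project notebook may differ in thresholds and details.

import cv2
import numpy as np

def order_points(pts):
    # order corners as top-left, top-right, bottom-right, bottom-left
    rect = np.zeros((4, 2), dtype=np.float32)
    s = pts.sum(axis=1)
    d = np.diff(pts, axis=1).ravel()
    rect[0] = pts[np.argmin(s)]
    rect[2] = pts[np.argmax(s)]
    rect[1] = pts[np.argmin(d)]
    rect[3] = pts[np.argmax(d)]
    return rect

def scan_ct_image(path, work_height=500):
    image = cv2.imread(path)                                   # 1. read the raw photo
    ratio = image.shape[0] / work_height
    small = cv2.resize(image, (int(image.shape[1] / ratio), work_height))
    gray = cv2.cvtColor(small, cv2.COLOR_BGR2GRAY)             # 2. grayscale
    edges = cv2.Canny(cv2.GaussianBlur(gray, (5, 5), 0), 75, 200)   # 3. Canny edges
    cnts, _ = cv2.findContours(edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 4 signature
    cnts = sorted(cnts, key=cv2.contourArea, reverse=True)     # 4. largest contour first
    quad = None
    for c in cnts:
        peri = cv2.arcLength(c, True)
        approx = cv2.approxPolyDP(c, 0.02 * peri, True)
        if len(approx) == 4:
            quad = approx.reshape(4, 2).astype(np.float32)
            break
    if quad is None:
        raise RuntimeError("No rectangular document found")
    rect = order_points(quad * ratio)                          # 5. back-translate vertices
    (tl, tr, br, bl) = rect
    w = int(max(np.linalg.norm(br - bl), np.linalg.norm(tr - tl)))
    h = int(max(np.linalg.norm(tr - br), np.linalg.norm(tl - bl)))
    dst = np.array([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]], dtype=np.float32)
    M = cv2.getPerspectiveTransform(rect, dst)                 # 6. perspective transform
    return cv2.warpPerspective(image, M, (w, h))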
Step 3 on Edge: Deploying and Running the DNN Model
1. We first deploy the model file trained on the host, the script files, and the scanned CT images to the KV260 platform, and run the test model (a minimal sketch of such an inference loop is shown after this list):
python3 app_mt.py
2. Monitor metrics such as throughput and runtime using vaitrace:
python3 -m vaitrace_py app_mt.py
3. Monitor the power consumption while running the test model:
xmutil platformstats -p
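For orientation, the sketch below shows what a minimal VART inference loop for the compiled xmodel typically looks like. It is not the project's actual app_mt.py; the xmodel path, input scaling, and post-processing are simplified assumptions.

import numpy as np
import vart
import xir

def get_dpu_subgraph(graph):
    # the compiled xmodel contains one DPU subgraph plus CPU subgraphs
    root = graph.get_root_subgraph()
    return [s for s in root.toposort_child_subgraph()
            if s.has_attr("device") and s.get_attr("device").upper() == "DPU"]

graph = xir.Graph.deserialize("compiled_model/unet_compiled.xmodel")
runner = vart.Runner.create_runner(get_dpu_subgraph(graph)[0], "run")

in_t = runner.get_input_tensors()[0]       # e.g. shape (1, 256, 256, 4), int8
out_t = runner.get_output_tensors()[0]

# placeholder input: a pre-processed ROI plus RECIST-mark channels, already scaled to int8
img = np.zeros(tuple(in_t.dims), dtype=np.int8)
out = np.empty(tuple(out_t.dims), dtype=np.int8)

job = runner.execute_async([img], [out])   # run one batch on the DPU
runner.wait(job)
mask = (out.squeeze() > 0).astype(np.uint8)  # crude binarisation of the output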
Step 4 on Edge: Benchmarking Output Performance
In this project, we use two benchmarks to evaluate our method, following two well-known challenges: the 2021 Mobile AI Workshop Challenge and the 2021 Low-Power Computer Vision Challenge (LPCVC).
The first benchmark, TotalScore_1, combines the Dice coefficient (Dice), which evaluates segmentation quality, with the runtime, which evaluates algorithm execution time.
The second benchmark, TotalScore_2, combines the system power, Dice, and throughput.
For these scores, baseline_Dice = 0.6 or 0.7, baseline_throughput = 32 GB/s on GPU, and baseline_throughput = 1.7 GB/s on DPU. The two benchmarks can be calculated by:
python3 utils/evaluation.py
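For reference, the Dice coefficient that enters both scores can be computed as in this minimal sketch:

import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    # pred, target: binary segmentation masks of the same shape
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)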
Finally, we report the benchmark results (Table.1) and illustrate the segmentation results (Fig.5) of the vanilla U-Net on the host, the reduced U-Net on the host, the vanilla U-Net on the edge, and the reduced U-Net on the edge.
Table.1 Benchmark results of Vanilla U-Net and Reduced U-Net on host and edge.
The KV260 is an advanced vision application development platform with unparalleled AI performance, pre-built hardware acceleration, and adaptability to future sensor changes. This project successfully designs a KV260-CAD system, which applies the KV260 platform to realize computer-assisted liver cancer diagnosis, and opens the field for broad applications in medical diagnostic systems. We experimented on the public ISBI LiTS 2017 Challenge dataset for liver tumor segmentation and achieved the benchmark results reported in Table.1 and Fig.5.
The system is certainly not limited to liver tumor segmentation: with some tinkering it should also support tumor detection and diagnosis, and it is suitable for various kinds of medical images.
Feel free to fork our GitHub repos and do some awesome things yourself!
References
[1] J. Fu and H. Wang, “Precision diagnosis and treatment of liver cancer in China,” Cancer Letters, vol. 412, pp. 283–288, 2018.
[2] E. A. Eisenhauer et al., “New response evaluation criteria in solid tumours: Revised RECIST guideline (version 1.1), ” Eur. J. Cancer, vol. 45, no. 2, pp. 228–247, 2009.
[3] Y. Zhang et al., “DeepRecS: From RECIST diameters to precise liver tumor segmentation,” IEEE Journal of Biomedical and Health Informatics, 2021.
Acknowledgment
We thank the ISBI LiTS 2017 Challenge for providing the liver tumor dataset. Thanks to AMD-Xilinx for providing the KV260 AI Starter Kit for the development of the prototype. We are also grateful for the Python-based document scanner from GitHub:
https://github.com/aniruddhadave/Document-Scanner/blob/master/scan.py