Heterogeneous systems such as the Zynq MPSoC are ideal for implementing image processing and machine learning inference. These systems are deployed in high-profile applications such as the higher levels of SAE driving automation, but they are also popular in less visible systems such as industrial inspection, detecting defects in manufactured products on production lines.
The Avnet mezzanine for the Ultra96 provides dual ON Semiconductor imagers and, thanks to its architecture, is designed to be flexible enough to support a range of cameras.
The MIPI mezzanine contains two imagers connected to an image co-processor (ISP), which in turn outputs a MIPI stream to the Ultra96 board.
Off the shelf, the mezzanine comes with two CAV10-000A image sensors. These offer a 1 megapixel resolution (1280 pixels by 800 lines) and output a grey-scale image. Importantly for high-speed operation, the sensors provide a global shutter; this prevents the artifacts which can appear in an image when a rolling shutter is used to capture fast-moving scenes.
Grey-scale imaging is very popular in industrial applications; in fact, one of the first operations in many image processing systems is to convert the image from color to grey scale. For example, in a typical edge detection application the first stage is conversion to grey scale.
Grey scale provides the luminance information of the image, which is one of the most important factors in finding visual features. Indeed, elements such as brightness, contrast, edges, shape, contours, texture, perspective and shadow can all be identified without the need for a color image.
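The conversion itself is just a weighted sum of the color channels. A minimal fixed-point sketch in C++ is shown below; the BT.601-style weights are one common choice for extracting luminance, not something specific to this board:

```cpp
#include <cstdint>

// Approximate BT.601 luma: Y = 0.299 R + 0.587 G + 0.114 B.
// The fixed-point weights (77 + 150 + 29 = 256) sum to a power of
// two, so the divide becomes an exact right shift.
static inline uint8_t rgb_to_grey(uint8_t r, uint8_t g, uint8_t b) {
    return static_cast<uint8_t>((77u * r + 150u * g + 29u * b) >> 8);
}
```

Because the weights sum to 256, full-scale white maps exactly to 255 and mid grey to 128, with no floating point needed in the pipeline.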
For many applications the use of a color sensor can overly complicate the algorithm due to the three separate channels.
Of course, our job as engineers is to select the correct sensor for the job, be it grey scale or color, global shutter or rolling shutter, or even line scan versus a 2D sensor.
Mezzanine Architecture
The architecture of the system enables the image co-processor to configure and output one or both cameras. When dual cameras are selected, the ISP stitches the two camera images together into a single image.
Configuration of the ISP takes place using SPI from the Zynq processor.
The combined image is sent over MIPI to the Ultra96 where an image processing pipeline is implemented in the programmable logic.
Vivado Project
The Vivado design created to support the release of the MIPI mezzanine consists of four main elements:
- Zynq MPSOC Block - Zynq configured for operation on the U96
- Live video DP - VDMA read input into the Live DisplayPort interface
- Capture Pipeline - MIPI image capture and processing
- GPIO - General purpose I/O providing the SPI, trigger and enable signals.
Starting with the MIPI input block, the design uses the Xilinx MIPI CSI-2 IP block, which is now free with Vivado. This is configured to receive four lanes of MIPI data with a line rate of 896 Mbps and the pixels encoded as YUV 4:2:2, 8-bit.
To maintain throughput, the MIPI CSI-2 IP block outputs two pixels per clock, which equates to a 32-bit bus carrying two 16-bit pixels. Each pixel contains a Y element and either a U or a V element, each of which is 8 bits.
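A simple software model of how two YUV 4:2:2 pixels might sit in one 32-bit beat is sketched below. The exact component ordering depends on the IP configuration, so the layout here is an assumption for illustration only:

```cpp
#include <cstdint>

// Assumed layout of one 32-bit beat carrying two YUV 4:2:2 pixels:
// byte 0 = Y0, byte 1 = U, byte 2 = Y1, byte 3 = V (8 bits each).
struct Yuv422Pair { uint8_t y0, u, y1, v; };

static inline Yuv422Pair unpack_beat(uint32_t beat) {
    Yuv422Pair p;
    p.y0 = beat & 0xFF;          // first pixel, luma
    p.u  = (beat >> 8) & 0xFF;   // shared chroma U
    p.y1 = (beat >> 16) & 0xFF;  // second pixel, luma
    p.v  = (beat >> 24) & 0xFF;  // shared chroma V
    return p;
}

static inline uint32_t pack_beat(Yuv422Pair p) {
    return (uint32_t)p.y0 | ((uint32_t)p.u << 8) |
           ((uint32_t)p.y1 << 16) | ((uint32_t)p.v << 24);
}
```

A model like this is handy when writing software test benches for the stream-processing IP, since pack and unpack round-trip exactly.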
YUV is used in place of the RAW pixel format because the downstream Xilinx IP blocks do not support RAW.
The downstream processing blocks include the Video Processing Subsystem, configured to operate as a scaler, along with a write frame buffer which writes the image into the PS DDR memory.
To output an image over the DisplayPort interface, the Live Video DP block is provided. This block reads the image from the PS DDR memory using the read_buffer IP block.
The live video then streams through an on-screen display; together with a test pattern generator, this enables the video streams to be layered if desired, under software control.
This AXI Stream video is then converted back to a parallel video stream, using the AXI4-Stream to Video Out block and a Video Timing Generator to drive the DisplayPort live interface.
The processor system block contains the PS configured with the Ultra96 board settings (e.g. clocks and DDR), along with a range of AXI interconnects for the master and slave AXI links.
The final block of the system uses a GPIO to provide the necessary triggers, enables and communication link to the ISP on the mezzanine.
This Vivado design provides the basis for the software development, either bare metal or PetaLinux. In this application we are going to look at the bare-metal project created using Xilinx Vitis.
SW Application
The simplest software application to create is one which configures the ISP for a single camera. We can do this using a bare-metal approach: once the platform is created, we can implement a simple application on top of it.
We can create the platform using the XSA exported from the Vivado project. This platform will contain a board support package (BSP), which provides all of the APIs and libraries necessary to configure and use the IP in the programmable logic.
If desired, the platform will also build the boot loader, enabling a complete solution to be provided from Vitis.
The application software will perform the following tasks:
- Initialize the IP blocks in the programmable logic with the desired configuration - this software configuration enables easy adaptation of the pipeline if required for different resolutions.
- Initialize the ISP with its configuration settings - this is performed over SPI and will take several minutes.
- Configure the input and output image processing chains such that images received on the MIPI path are processed and output on the DisplayPort
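The sequence above can be sketched as structured C++. Every function name below is a hypothetical placeholder standing in for the real BSP driver calls, not the actual Xilinx or Avnet API:

```cpp
#include <string>
#include <vector>

// Hypothetical placeholders for the real BSP driver calls; the names
// here are illustrative only, not the Xilinx/Avnet API.
static std::vector<std::string> call_order;
static bool init_pl_pipeline()   { call_order.push_back("pl");  return true; } // MIPI RX, VPSS, frame buffers
static bool configure_isp_spi()  { call_order.push_back("isp"); return true; } // slow: several minutes over SPI
static bool start_video_chains() { call_order.push_back("run"); return true; } // MIPI in -> DisplayPort out

// Order matters: the PL pipeline must be ready before the ISP is
// configured to stream, and the video chains are enabled last.
bool bring_up_camera() {
    if (!init_pl_pipeline())   return false;
    if (!configure_isp_spi())  return false;
    if (!start_video_chains()) return false;
    return true;
}
```

Checking the return code of each stage and stopping on the first failure makes the long ISP configuration step much easier to debug.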
The platform and application projects are provided by Avnet, enabling us to hit the ground running.
To get started with creating image processing applications, I thought it would be a good idea to outline how an HLS image processing IP block can be created. This IP block will either pass the video through or invert the pixel values, depending on a register setting.
The first thing we need to do is create a new project targeting the Ultra96-V2 in Vitis HLS.
The code needed to create this block is simple. We need to create an AXIS type and an AXIS HLS stream type; as the interface works on two pixels per clock, we need a data width of 32 bits on this bus.
#include <ap_fixed.h>
#include <ap_axi_sdata.h>
#include "hls_stream.h"
#define WIDTH 32
typedef ap_axiu< WIDTH, 1, 1, 1> AXITYPE;
typedef hls::stream<AXITYPE> AXI_STREAM;
void video_top(AXI_STREAM& vidip, AXI_STREAM& vidop, int invert);
The main function of the algorithm can be seen below: it reads the AXI Stream, manipulates the pixels and writes them out.
#include "ip.h"
#include "stdint.h"
void video_top(AXI_STREAM& vidip, AXI_STREAM& vidop, int invert){
#pragma HLS INTERFACE s_axilite port=invert
#pragma HLS INTERFACE s_axilite port=return
#pragma HLS INTERFACE axis port=vidip
#pragma HLS INTERFACE axis port=vidop
    AXITYPE dataInA;
    AXITYPE dataOutB;
    uint16_t pix1, pix2;
    while(1){
        dataInA = vidip.read();
        dataOutB = dataInA; // copy all side-channel fields: dest, keep, last, user, id, strb
        pix1 = (uint16_t) (dataInA.data >> 16); // upper 16-bit pixel
        pix2 = (uint16_t) (dataInA.data);       // lower 16-bit pixel
        if (invert == 1){ // invert the pixel values under register control
            dataOutB.data = ((16384 - pix1) << 16) | (16384 - pix2);
        }
        else {
            dataOutB.data = dataInA.data;
        }
        vidop.write(dataOutB);
    }
}
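Before synthesizing, it is worth checking the pixel arithmetic in plain C++. The sketch below models only the data field of the stream (not the AXI side-channels) and assumes in-range pixel values, mirroring the kernel above:

```cpp
#include <cstdint>

// Software model of the video_top datapath: two 16-bit pixels packed
// in a 32-bit word, each optionally inverted about 16384 as in the
// HLS code. Side-channel handling (tlast, tuser, ...) is omitted.
static inline uint32_t process_beat(uint32_t in, bool invert) {
    uint16_t pix1 = static_cast<uint16_t>(in >> 16); // upper pixel
    uint16_t pix2 = static_cast<uint16_t>(in);       // lower pixel
    if (!invert) return in;                          // pass-through path
    return (static_cast<uint32_t>(static_cast<uint16_t>(16384 - pix1)) << 16) |
           static_cast<uint16_t>(16384 - pix2);
}
```

A model like this can also be reused directly in the Vitis HLS C simulation test bench, driving the real kernel through hls::stream and comparing the outputs beat by beat.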
To get the interfacing correct, interface pragmas are used to declare a slave AXI4-Lite interface and master and slave AXI Streams.
Synthesizing and exporting this project will allow us to import it into Vivado and add it to the design.
To import the IP into Vivado, we first need to create a new IP repository. We can do this by opening the IP Catalog, right clicking and selecting Add Repository.
This will open a dialog window which enables a directory to be selected. I decided to create a new directory and use that as the repository.
Once the new repository is visible in Vivado we need to add in the IP just exported from Vitis.
Select the newly created repository, right click and select Add IP to Repository.
From the dialog which opens, select the zip file of the IP just exported. After a few seconds you should see the IP in your repository, ready for use.
Double clicking on the IP will provide the option of adding the IP to the block diagram.
Once the IP block is added to the block diagram the next step is to add it into the image processing chain.
Drag the block into the capture pipeline block and connect it between the MIPI output and the AXIS subset converter.
All that remains now is to build the XSA and update the software to enable the newly created IP block. Building the Vivado design and the Vitis application may take a few minutes.
HW Test
To get started with testing, the board was connected and a debug session started from Vitis. This downloaded the application to the board and started it running.
NOTE: this will take several minutes and may look as if it has hung; however, it is still working behind the scenes.
After a few minutes the output of the terminal window will update and the image will appear on the screen.
Running this on my Ultra96 shows the resulting images.
This has been a fun project to create and explore the new MIPI imager. I am going to work on this a little more and explore the possibility of a PYNQ overlay and how it works with PetaLinux!