Now that we know our hardware is good (part one) and have selected the algorithms we wish to integrate (part two), we are ready to accelerate the application from the processing system (PS) into the programmable logic (PL).
To perform this acceleration we need three things:
- Vivado project with interfaces made available for acceleration
- PetaLinux project with support for image processing and the Xilinx Runtime (XRT)
- Vitis Acceleration Platform, able to use OpenCL to accelerate functions from the PS to the PL
To create the PetaLinux project and the Vitis platform, and to run the Vitis accelerated application development, we will need to use a Linux machine.
Instructions on how to create a Linux VM can be found here.
Vivado Platform Creation
Again, we can build on the platform we created for part two; this time, in Vivado, we need to enable specific platform interfaces.
We can enable these interfaces using the Platform Interfaces view in Vivado, which is opened under the Window > Platform Interfaces option.
This will open a new tab, in which we can enable the platform interfaces for the design.
Platform Interfaces are the interfaces which Vitis can use for acceleration; as such, I am going to enable the following:
- S_AXI_HP1_FPD, S_AXI_HP2_FPD and S_AXI_HP3_FPD - we do not use S_AXI_HP0_FPD as it is already used by the existing design
- M_AXI_HPM1_FPD
- Eight interrupts - connected to the AXI Interrupt Controller
- Clock Wizard - providing four clocks for the accelerated hardware
Interrupts for the accelerated application are provided using the AXI Interrupt Controller.
To provide the clocks for the accelerated hardware, I added a Clock Wizard and an associated reset block for each clock.
Once implemented, these can be provided to the Vitis platform.
To make use of the Vivado design, this time we do not need to build the bitstream, as Vitis will do that later on.
However, as we have an image processing chain in the programmable logic and we want to ensure the PetaLinux image processing chain is configured correctly, I built the bitstream so I could test the PetaLinux image.
Building the bitstream will also expose any timing issues, which would prevent Vitis from completing the SD card build at a later stage.
To create the XSA, we need to run a script which correctly configures the platform for generation. Before running this script, make sure the block diagram is in a validated state.
# Derive the platform name and XSA output location from the project
set name [get_property NAME [current_project]]
set output_file [format ../hw_platform/%s.xsa $name]
set bd [format "%s.bd" [current_bd_design]]

# Configure the platform properties required by Vitis
set_property PFM_NAME [format "xilinx.com:board:%s:1.0" $name] [get_files $bd]
set_property platform.default_output_type "sd_card" [current_project]
set_property platform.design_intent.embedded "true" [current_project]
set_property platform.design_intent.server_managed "false" [current_project]
set_property platform.design_intent.external_host "false" [current_project]
set_property platform.design_intent.datacenter "false" [current_project]
set_property platform.post_sys_link_tcl_hook ./sources/dynamic_postlink.tcl [current_project]

# Get the xlconcat instance and pin number to work on now
set __xlconcat_inst_num 0
set __xlconcat_pin_num 0
set __xlconcat_inst [get_bd_cells -hierarchical -quiet -filter NAME=~xlconcat_${__xlconcat_inst_num}]
set __xlconcat_pin [get_bd_pins -of_objects $__xlconcat_inst -quiet -filter NAME=~In${__xlconcat_pin_num}]

# Only write the XSA if the interrupt concat and its constant net are present
if {[llength $__xlconcat_pin] == 1} {
    if {[llength [get_bd_nets /xlconstant_gnd_dout]] == 1} {
        puts "Passed verify test"
        write_hw_platform -unified -include_bit -force $output_file
    } else {
        puts "Missing required name for const net: net should be xlconstant_gnd_dout"
        puts "Halting XSA output"
    }
} else {
    puts "Halting XSA output"
}
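This script can be run from the Tcl console of the open project, or from the command line in batch mode. A minimal sketch of the batch invocation is shown below; the script and project file names are assumptions, so adjust them to match your own.
# Open the project and run the export script in Vivado batch mode
# (script and project names are assumptions)
vivado -mode batch -source export_platform.tcl ./u96v2_mipi.xpr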
PetaLinux Build
With the XSA available, we can now create the PetaLinux operating system configured for the Ultra96V2, along with the SYSROOT.
By the completion of the PetaLinux build, we will have the elements required for the creation of the Vitis acceleration platform.
We can get started with the project using the commands:
petalinux-create --type project --template zynqMP --name auto_s3
cd auto_s3
petalinux-config --get-hw-description=/<location of XSA>
Once the project is created, we need to configure it for the image processing chain and to work with OpenCL.
Starting with the image processing pipeline, we need to enable the following in the RootFS:
- GStreamer
- X11
- MatchBox
- OpenCV
- V4L
These can all be found under the PetaLinux Package Groups.
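To open the RootFS configuration menu where these package groups live, we can use the standard configuration command:
petalinux-config -c rootfs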
To configure the image for OpenCL support, we also need to enable a number of additional packages. In the file auto_s3/project-spec/meta-user/conf/user-rootfsconfig, append the lines below:
CONFIG_xrt
CONFIG_xrt-dev
CONFIG_zocl
CONFIG_opencl-clhpp-dev
CONFIG_opencl-headers-dev
CONFIG_packagegroup-petalinux-opencv
We can then enable these packages under the user packages section of the RootFS configuration.
However, before we can build the image, we need to implement the image processing chain as a device graph in the device tree. We also need to include the device tree additions for OpenCL support.
Device graphs enable embedded Linux solutions to group together the functions in the programmable logic which implement a larger capability, e.g. an image processing pipeline.
Within the pipeline we need to implement a number of endpoints to interconnect the IP cores within the programmable logic. A diagram of these endpoints can be seen below.
This diagram can be implemented in the system-user.dtsi in the meta-user layer to update the device tree.
/include/ "system-conf.dtsi"
/ {
amba {
mmc@ff160000 {
u-boot,dm-pre-reloc;
compatible = "xlnx,zynqmp-8.9a", "arasan,sdhci-8.9a";
status = "okay";
interrupt-parent = <0x4>;
interrupts = <0x0 0x30 0x4>;
reg = <0x0 0xff160000 0x0 0x1000>;
clock-names = "clk_xin", "clk_ahb";
xlnx,device_id = <0x0>;
#stream-id-cells = <0x1>;
iommus = <0xd 0x870>;
power-domains = <0xc 0x27>;
clocks = <0x3 0x36 0x3 0x1f>;
clock-frequency = <0xb2d0529>;
xlnx,mio_bank = <0x0>;
no-1-8-v;
disable-wp;
};
};
misc_clk_a: misc_clk_a {
#clock-cells = <0>;
clock-frequency = <25000000>;
compatible = "fixed-clock";
};
cam_reg_1v8: regulator-1v8 {
compatible = "regulator-fixed";
regulator-name = "1v8";
regulator-min-microvolt = <1800000>;
regulator-max-microvolt = <1800000>;
};
cam_reg_2v8: regulator-2v8 {
compatible = "regulator-fixed";
regulator-name = "2v8";
regulator-min-microvolt = <2800000>;
regulator-max-microvolt = <2800000>;
};
cam_reg_1v5: regulator-1v5 {
compatible = "regulator-fixed";
regulator-name = "1v5";
regulator-min-microvolt = <1500000>;
regulator-max-microvolt = <1500000>;
};
chosen {
bootargs = "earlycon console=ttyPS0,115200 loglevel=8 clk_ignore_unused root=/dev/ram rw";
};
sdio_pwrseq: sdio_pwrseq {
compatible = "mmc-pwrseq-simple";
// MIO[7] RESETN for WILC3000 active low
reset-gpios = <&gpio 7 1>;
// requires a patched pwrseq_simple.c for WILC3000
chip_en-gpios = <&gpio 8 1>;
};
// Remove V1 Power ON/OFF controller from U96 V1 DT
/delete-node/ ltc2954;
};
&mipi_csi2_rx_subsyst_0{
xlnx,vc = <0x4>;
csiss_ports: ports {
#address-cells = <1>;
#size-cells = <0>;
csiss_port0: port@0 {
reg = <0>;
xlnx,video-format = <0>;
xlnx,video-width = <8>;
mipi_csi2_rx_0_to_demosaic_0: endpoint {
remote-endpoint = <&demosaic_0_from_mipi_csi2_rx_0>;
};
};
csiss_port1: port@1 {
reg = <1>;
xlnx,video-format = <0>;
xlnx,video-width = <8>;
csiss_in: endpoint {
data-lanes = <1 2>;
remote-endpoint = <&ov5640_to_mipi_csi2>;
};
};
};
};
&v_demosaic_0 {
compatible = "xlnx,v-demosaic";
reset-gpios = <&gpio 86 GPIO_ACTIVE_LOW>;
ports {
#address-cells = <1>;
#size-cells = <0>;
port@0 {
reg = <0>;
xlnx,video-width = <8>;
demosaic_0_from_mipi_csi2_rx_0: endpoint {
remote-endpoint = <&mipi_csi2_rx_0_to_demosaic_0>;
};
};
port@1 {
reg = <1>;
xlnx,video-width = <8>;
demosaic_0_to_fb: endpoint {
remote-endpoint = <&vcap_in>;
};
};
};
};
&v_frmbuf_wr_0{
reset-gpios = <&gpio 85 GPIO_ACTIVE_LOW>;
};
&v_frmbuf_rd_0{
reset-gpios = <&gpio 87 GPIO_ACTIVE_LOW>;
};
&amba_pl {
video_in: video_cap {
compatible = "xlnx,video";
dmas = <&v_frmbuf_wr_0 0>;
dma-names = "port0";
ports {
#address-cells = <1>;
#size-cells = <0>;
port@0 {
reg = <0>;
direction = "input";
vcap_in: endpoint {
remote-endpoint = <&demosaic_0_to_fb>;
};
};
};
};
};
&i2csw_2 {
ov5640: camera@3c {
compatible = "ovti,ov5640";
reg = <0x3c>;
clocks = <&misc_clk_a>;
clock-names = "xclk";
rotation = <180>;
vdddo-supply = <&cam_reg_1v8>;
vdda-supply = <&cam_reg_2v8>;
vddd-supply = <&cam_reg_1v5>;
port {
/* MIPI CSI-2 bus endpoint */
ov5640_to_mipi_csi2: endpoint {
remote-endpoint = <&csiss_in>;
clock-lanes = <0>;
data-lanes = <1 2>;
};
};
};
};
&uart0 {
// Remove TI child node from U96 V1 DT
/delete-node/bluetooth;
};
&gpio {
/delete-property/gpio-line-names;
};
&sdhci1 {
max-frequency = <25000000>;
// cap-power-off-card not compatible with WILC3000
/delete-property/cap-power-off-card;
wilc_sdio@1 {
compatible = "microchip,wilc3000";
reg = <0>;
bus-width = <0x4>;
status = "okay";
};
// Remove TI child node from U96 V1 DT
/delete-node/wifi@2;
};
&spi0 {
is-decoded-cs = <0>;
num-cs = <1>;
status = "okay";
spidev@0 {
compatible = "rohm,dh2228fv";
spi-max-frequency = <1000000>;
reg = <0>;
};
};
&spi1 {
is-decoded-cs = <0>;
num-cs = <1>;
status = "okay";
spidev@0 {
compatible = "rohm,dh2228fv";
spi-max-frequency = <1000000>;
reg = <0>;
};
};
&amba {
zyxclmm_drm {
compatible = "xlnx,zocl";
status = "okay";
interrupt-parent = <&axi_intc_0>;
interrupts = <0 4>, <1 4>, <2 4>, <3 4>,
<4 4>, <5 4>, <6 4>, <7 4>;
};
};
Once the device tree has been completed, we are then able to build the project using the command
petalinux-build
Once the build is completed, we can package the image to create a boot file, enabling us to check that the image processing pipeline is present.
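A minimal packaging sketch is shown below, assuming the default output locations under images/linux; the exact file names may differ between PetaLinux versions.
petalinux-package --boot --fsbl images/linux/zynqmp_fsbl.elf --pmufw images/linux/pmufw.elf --fpga images/linux/system.bit --u-boot --force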
The final thing we need to do in PetaLinux is create a SYSROOT which can be used by the Vitis platform.
We can build the SYSROOT using the command:
petalinux-build --sdk
Once the SDK is built, the next step is to install it; we can find the SDK within the images/linux directory of our project.
To install it, run the command below and select the directory you want to install the SYSROOT into.
./sdk.sh
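Once the installation completes, the SYSROOT that Vitis needs sits under the sysroots subdirectory of the chosen install location; assuming a hypothetical install directory of ~/petalinux_sdk, the path to note for later is ~/petalinux_sdk/sysroots/aarch64-xilinx-linux.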
Vitis Platform Development
We are now in a position to create the Vitis acceleration platform. We can open Vitis using the command:
vitis -workspace wksp1
Once Vitis opens, you will be presented with a welcome page; select Create Platform Project.
- Enter a new project name
- Select create from XSA hardware specification
- Point to the XSA exported from Vivado
- For the software specification, select Linux Operating System and PSU Cortex-A53
This will open the platform settings options.
On the Linux domain, point the tool to the BIF file, image and boot elements, and the SYSROOT. These will all be under the PetaLinux project images/linux directory, with the exception of the SYSROOT, which is in the directory selected during the SDK installation.
Once all the information has been provided, build the platform project. This will take a few minutes, and then we are ready to start creating a test application.
To test the acceleration capabilities of the platform, we are going to create a test application. This will allow us to demonstrate that the acceleration flow functions as expected on our initial platform from Vivado.
To get started on this we need to create a new application project.
In the platform selection window, we should now see the platform we just created, available for both embedded application development and acceleration development.
On the next dialog tab, select the Vector Addition Example.
This will create a new application project with everything configured to accelerate the kernel.
From the drop-down menu on the top left, select Hardware and build the application. This may take some time, but once it has completed you can open the Vivado project; inside this project you should see the original base image processing pipeline and the accelerated kernel.
On my virtual machine the build took just under 2 hours.
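For reference, the GUI build is equivalent to compiling and linking the kernel with v++ on the command line. The sketch below is only an outline: the platform file, kernel name and source file names are assumptions taken from the Vitis vector addition template and may differ in your workspace.
# Compile the kernel to an object, then link it into an xclbin
# (platform, kernel and file names are assumptions)
v++ -c -t hw --platform u96v2_mipi.xpfm -k krnl_vadd -o vadd.xo vadd.cpp
v++ -l -t hw --platform u96v2_mipi.xpfm -o binary_container_1.xclbin vadd.xo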
To finalize the testing, we need to run the SD card image on the Ultra96V2.
We can run the application by executing the commands
cd /run/media/mmcblk0p1
export XILINX_XRT=/usr
./auto_s3_app.exe binary_container_1.xclbin
This should show the results below.
We can also check that the image processing path is still complete using the command:
media-ctl -p
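If we want to probe the capture device itself, the V4L utilities enabled earlier can be used; a quick sketch, assuming the frame buffer write channel enumerates as /dev/video0 (an assumption):
v4l2-ctl -d /dev/video0 --list-formats-ext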
We are now in a position where we have created an acceleration platform which can be used to accelerate our image processing application. Which algorithm that is depends upon your needs, but the Xilinx xfOpenCV library will provide significant support for accelerating image processing algorithms.