In our previous exercise, we created and packaged a traffic generator IP with a stream master interface. We will now use that IP to send data to the DDR memory of a Zybo Z7-10 board using direct memory access (DMA). This project is available in a GitHub repository. The project has two parts: (i) hardware development and (ii) software development. For the hardware, we create a Vivado project with the necessary IP blocks (the Zynq processor, the AXI DMA core, the traffic generator IP, etc.), then generate a bitstream and export the hardware platform. For the software, we use Vitis to create a platform project from the exported hardware and then create an application project in C to drive the FPGA logic. We will program the DMA and traffic generator registers according to our requirements. Once a DMA transaction has completed, an interrupt is generated. We write an interrupt handler that clears the interrupt flag and then re-programs the DMA parameters for the next transaction. Within the scope of this project, we only look at FPGA to DDR data transfer, not the other way around.
Vivado Project
The Tcl file (dma_test.tcl) in the project includes all the commands required, from creating the project to generating the bitstream. The steps required to generate the bitstream are (i) clone the repository, (ii) cd to the directory and launch Vivado, and (iii) enter the command "source dma_test.tcl" in the Tcl console. We will look at these Tcl commands in detail now.
First we create a project and set the board part parameter which in our case is Zybo Z7-10.
create_project trafficgen_dma ./trafficgen_dma -part xc7z010clg400-1
set_property board_part digilentinc.com:zybo-z7-10:part0:1.0 [current_project]
Add the traffic generator IP repository into the project and update the IP catalog.
set_property ip_repo_paths ./ip_repo [current_project]
update_ip_catalog
Create the Vivado block design.
create_bd_design "design_1"
Now we need to insert the Zynq PS into the block design and configure it. We also need to enable fabric interrupts, because the DMA core we are going to include needs to interrupt the processor when a memory write transaction has completed. Finally, we need to enable the high-performance AXI slave interface of the processor, which allows the DMA master interface to transfer data to DDR memory.
startgroup
create_bd_cell -type ip -vlnv xilinx.com:ip:processing_system7:5.5 processing_system7_0
endgroup
set_property -dict [list CONFIG.PCW_USE_S_AXI_HP0 {1} CONFIG.PCW_USE_FABRIC_INTERRUPT {1} CONFIG.PCW_IRQ_F2P_INTR {1}] [get_bd_cells processing_system7_0]
Alternatively, you can use the Vivado IDE to configure these parameters. If you double-click the PS in the block design, you will see interface configuration options under PS-PL configuration. The general purpose AXI master interface (M AXI GP0) is enabled by default. This interface allows software programmers to write to/read from AXI Lite registers that belong to various peripherals.
After configuring the PS, you can run connection automation which will connect the Zynq PS with the external interfaces such as DDR and Fixed IO.
# Run connection automation
apply_bd_automation -rule xilinx.com:bd_rule:processing_system7 -config {make_external "FIXED_IO, DDR" apply_board_preset "1" Master "Disable" Slave "Disable" } [get_bd_cells processing_system7_0]
Then we'll add the Vivado AXI DMA IP. Since we are only concerned about using register mode DMA, we need to disable scatter-gather mode. In addition, we will also disable the read channel because that is not going to be used in this project.
#Add AXI DMA IP
startgroup
create_bd_cell -type ip -vlnv xilinx.com:ip:axi_dma:7.1 axi_dma_0
endgroup
#Disable scatter gather DMA and disable read channel
set_property -dict [list CONFIG.c_include_sg {0} CONFIG.c_sg_include_stscntrl_strm {0} CONFIG.c_include_mm2s {0}] [get_bd_cells axi_dma_0]
If you use the GUI to do the same, IP customization interface for DMA configuration is given below for your reference.
Then we will run connection automation again to connect the various interfaces in the design. Doing this automatically inserts reset logic and the AXI memory and peripheral interconnects into the design. You will also see that the clock inputs of these modules have been connected automatically.
# Run connection automation
startgroup
apply_bd_automation -rule xilinx.com:bd_rule:axi4 -config { Clk_master {Auto} Clk_slave {Auto} Clk_xbar {Auto} Master {/processing_system7_0/M_AXI_GP0} Slave {/axi_dma_0/S_AXI_LITE} ddr_seg {Auto} intc_ip {New AXI Interconnect} master_apm {0}} [get_bd_intf_pins axi_dma_0/S_AXI_LITE]
apply_bd_automation -rule xilinx.com:bd_rule:axi4 -config { Clk_master {Auto} Clk_slave {Auto} Clk_xbar {Auto} Master {/axi_dma_0/M_AXI_S2MM} Slave {/processing_system7_0/S_AXI_HP0} ddr_seg {Auto} intc_ip {New AXI Interconnect} master_apm {0}} [get_bd_intf_pins processing_system7_0/S_AXI_HP0]
endgroup
Also, notice the following.
- S_AXI_LITE interface of the DMA IP is automatically connected to M_AXI_GP0 interface of the processor through the AXI peripheral interconnect
- M_AXI_S2MM interface of the DMA IP is automatically connected to S_AXI_HP0 interface of the processor through the AXI memory interconnect.
However, the S_AXIS_S2MM interface of the DMA is not connected yet. This is the interface that we use to send data from the traffic generator IP to DDR. So what we should do now is insert the traffic generator and make the required connections.
# Set up traffic generator
startgroup
create_bd_cell -type ip -vlnv user.org:user:trafficgen:1.0 trafficgen_0
endgroup
apply_bd_automation -rule xilinx.com:bd_rule:axi4 -config { Clk_master {/processing_system7_0/FCLK_CLK0 (50 MHz)} Clk_slave {Auto} Clk_xbar {/processing_system7_0/FCLK_CLK0 (50 MHz)} Master {/processing_system7_0/M_AXI_GP0} Slave {/trafficgen_0/S00_AXI} ddr_seg {Auto} intc_ip {/ps7_0_axi_periph} master_apm {0}} [get_bd_intf_pins trafficgen_0/S00_AXI]
Inserting the traffic generator and running connection automation will connect its slave AXI clock and reset signals, but we still have to connect the master side clock and reset signals. We will not be using a separate clock to drive the master. So we can connect the master clock to the same AXI slave clock. The same applies to the reset as well.
# Make connections
connect_bd_net [get_bd_pins trafficgen_0/m00_axis_aclk] [get_bd_pins processing_system7_0/FCLK_CLK0]
connect_bd_net [get_bd_pins trafficgen_0/m00_axis_aresetn] [get_bd_pins rst_ps7_0_50M/peripheral_aresetn]
Then we will connect the S_AXIS_S2MM interface of the DMA to the M00_AXIS streaming interface of the traffic generator, which allows us to send data to DDR memory.
connect_bd_intf_net [get_bd_intf_pins trafficgen_0/M00_AXIS] [get_bd_intf_pins axi_dma_0/S_AXIS_S2MM]
The interrupt of the DMA will be connected to the processor. This is the line through which the DMA informs the processor that a transaction has completed.
connect_bd_net [get_bd_pins processing_system7_0/IRQ_F2P] [get_bd_pins axi_dma_0/s2mm_introut]
save_bd_design
Now that we are done with connecting all the components in the block design, the next step is to create an HDL wrapper and implement the design all the way through bit-stream generation.
# Create wrapper
make_wrapper -files [get_files ./trafficgen_dma/trafficgen_dma.srcs/sources_1/bd/design_1/design_1.bd] -top
add_files -norecurse ./trafficgen_dma/trafficgen_dma.srcs/sources_1/bd/design_1/hdl/design_1_wrapper.v
Or, you can use the GUI to create the wrapper.
Now we are ready to build the bitstream.
launch_runs impl_1 -to_step write_bitstream -jobs 2
Once the bitstream generation is completed, what we should do is export hardware.
We select the platform type as Fixed,
and we will include the bitstream.
Click Finish to save the hardware description (.xsa) file.
The next step in the workflow is to create a Vitis platform project, on top of which we will create an application. In the Vivado project, select Tools -> Launch Vitis IDE. Then select a preferred workspace and launch the IDE.
Then select the hardware description file you exported previously in Vivado and click Finish.
Once the platform project is created, we can start creating the application project where we write the driver to control the FPGA hardware.
Vitis Application Project
In Vitis, select File -> New -> Application Project.
Select the platform that you previously created.
Specify a name for the application project and click Next.
Select standalone_domain and click Next.
Select Hello World application from available templates and click Finish.
After the project is created, we need to import the ps7_init.h header file together with ps7_init.c. The header file declares various functions which are executed upon initialization of the PS. To do this, right-click the src directory and click Import Sources.
Then browse to the directory dma_platform/hw, select the ps7_init.h and ps7_init.c files and click Finish.
Although we created a Hello World project, we don't really need the helloworld.c file. The reason why I picked the Hello World template is that it automatically includes the platform.h and platform.c files in the src directory. We will now delete helloworld.c and add our own source files. Similar to what we did for ps7_init.h and ps7_init.c, select the two source files (i) dmatest.c and (ii) addressparams.h. These two files are available in the GitHub repository. dmatest.c contains the main program and other subroutines such as the interrupt handler. addressparams.h is the header file that defines the offsets of the various registers used by the program. After selecting the two files, click Finish.
Now we have all the source files needed to build the project. Before we do that we'll take a look at dmatest.c and try to understand its functionality.
The main program starts by calling the following two functions. After initializing the processor, ps7_post_config should be called to activate the level shifters in the boundary between the PS and the PL of the Zynq device, and to enable transferring data between the two regions.
init_platform();
ps7_post_config();
Next, we will need to set up traffic generator parameters. First, enable the core by writing 1 to the enable register, and then write NUM_OF_WORDS to the number of words register. We have picked 64 for NUM_OF_WORDS in this example.
Xil_Out32(XPAR_TRAFFICGEN_0_S00_AXI_BASEADDR, 1);
Xil_Out32(XPAR_TRAFFICGEN_0_S00_AXI_BASEADDR+0x4, NUM_OF_WORDS);
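The two writes above imply a very small register map for the traffic generator. As a minimal sketch, assuming the map really is just an enable register at offset 0x0 and a word-count register at offset 0x4 (the struct and function names below are hypothetical and not part of dmatest.c), the same programming sequence can be expressed through a struct overlay that runs on a host machine:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register layout, inferred from the two Xil_Out32 writes:
 * offset 0x0 = enable, offset 0x4 = number of words. */
typedef struct {
    volatile uint32_t enable;     /* 0x0: write 1 to enable the core */
    volatile uint32_t num_words;  /* 0x4: words per generated frame  */
} trafficgen_regs_t;

/* Program the generator through a pointer to its register block,
 * in the same order as the writes in the text. */
void trafficgen_configure(trafficgen_regs_t *regs, uint32_t num_words)
{
    assert(regs != NULL && num_words > 0);
    regs->enable = 1u;
    regs->num_words = num_words;
}
```

On the target, the pointer would be obtained by casting XPAR_TRAFFICGEN_0_S00_AXI_BASEADDR; on a host, an ordinary struct instance is enough to check the offsets and the write order.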
Then we initialize the DMA by setting bits 0 and 12 of the S2MM DMA control register. Bit 0 is the Run/Stop bit, and setting it to 1 starts the DMA operation. Setting bit 12 enables the interrupt that is raised when a transfer completes.
The next step is to configure the interrupt controller. We set it up to respond to AXI DMA S2MM interrupts.
Xil_Out32(XPAR_AXI_DMA_0_BASEADDR + OFFSET_S2MM_DMACR, Xil_In32(XPAR_AXI_DMA_0_BASEADDR + OFFSET_S2MM_DMACR) | 0x1001);
GicConfig = XScuGic_LookupConfig(XPAR_PS7_SCUGIC_0_DEVICE_ID);
if (NULL == GicConfig)
{
return XST_FAILURE;
}
int status = XScuGic_CfgInitialize(&InterruptController, GicConfig, GicConfig -> CpuBaseAddress);
if (status != XST_SUCCESS)
{
return XST_FAILURE;
}
status = SetupInterruptSystem(&InterruptController);
if (status != XST_SUCCESS)
{
return XST_FAILURE;
}
status = XScuGic_Connect(&InterruptController, XPAR_FABRIC_AXI_DMA_0_S2MM_INTROUT_INTR, (Xil_ExceptionHandler)InterruptHandler, NULL);
if (status != XST_SUCCESS)
{
return XST_FAILURE;
}
XScuGic_Enable(&InterruptController, XPAR_FABRIC_AXI_DMA_0_S2MM_INTROUT_INTR);
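The control register write at the top of the listing above ORs in the literal 0x1001. A small sketch with named masks makes the two bits explicit; the macro and function names are illustrative assumptions, not identifiers from addressparams.h:

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions in the S2MM control register, as described in the text. */
#define DMACR_RUN_STOP   (1u << 0)   /* start the channel            */
#define DMACR_IOC_IRQ_EN (1u << 12)  /* interrupt on transfer complete */

/* The named masks must add up to the literal used in dmatest.c. */
static_assert((DMACR_RUN_STOP | DMACR_IOC_IRQ_EN) == 0x1001u,
              "masks should match the 0x1001 literal");

/* Return the control value with both bits set, leaving every other
 * bit of the current register value untouched (read-modify-write). */
uint32_t dmacr_start_with_irq(uint32_t current)
{
    return current | DMACR_RUN_STOP | DMACR_IOC_IRQ_EN;
}
```

Because the function only ORs bits in, whatever the register held before the write is preserved, which is the same effect as the Xil_In32/Xil_Out32 pair in the listing.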
Then we will set up the DMA transfer parameters: destination address and transfer length. OFFSET_MEM_WRITE is the start address of the DDR memory where data needs to be written.
Xil_Out32(XPAR_AXI_DMA_0_BASEADDR+OFFSET_S2MMDA, OFFSET_MEM_WRITE);
Xil_Out32(XPAR_AXI_DMA_0_BASEADDR+OFFSET_S2MM_LENGTH, 4*NUM_OF_WORDS);
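The length register takes a byte count, which is why the code writes 4*NUM_OF_WORDS rather than NUM_OF_WORDS. The sketch below works through that arithmetic and adds a sanity check against the width of the length field. The 14-bit width used in the usage note is an assumption about this design (the width is a configurable parameter of the AXI DMA IP, so check your own IP settings), and the function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BYTES_PER_WORD 4u  /* 32-bit stream words, as in 4*NUM_OF_WORDS */

/* Byte count to write to the length register for num_words words. */
uint32_t transfer_length_bytes(uint32_t num_words)
{
    assert(num_words <= UINT32_MAX / BYTES_PER_WORD);  /* no overflow */
    return BYTES_PER_WORD * num_words;
}

/* True if a non-zero byte count fits a length field that is
 * length_width bits wide. */
bool fits_length_register(uint32_t bytes, unsigned length_width)
{
    assert(length_width >= 1 && length_width <= 32);
    uint64_t max_bytes = (1ull << length_width) - 1u;
    return bytes > 0 && bytes <= max_bytes;
}
```

For NUM_OF_WORDS = 64 the byte count is 256, which fits comfortably in a 14-bit field (maximum 16383 bytes). A much larger frame would need a wider length field in the IP configuration.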
When an interrupt fires, the interrupt handler should first clear the interrupt flag. This is done by writing 1 to bit 12 of the DMA status register. In our code, we have a variable, frame_count, that keeps track of the number of frames. Once this count exceeds FRAME_COUNT_MAX, we return from the interrupt handler without setting up the DMA transfer parameters for the next transaction, which means that we will not receive any further interrupts from the DMA. Otherwise, we configure the DMA for the next transaction. The write address for the next DMA transaction is chosen so that it does not overlap with the previous one. In our example, we set it to OFFSET_MEM_WRITE+4*NUM_OF_WORDS*frame_count, which ensures that the first address of the next DMA transaction is right next to the last address of the current transaction.
void InterruptHandler(void)
{
//xil_printf("Interrupt triggered\n\r");
// Clear the interrupt
Xil_Out32(XPAR_AXI_DMA_0_BASEADDR+OFFSET_S2MM_DMASR, Xil_In32(XPAR_AXI_DMA_0_BASEADDR+OFFSET_S2MM_DMASR) | 0x1000);
if (++frame_count>FRAME_COUNT_MAX) return;
//Reprogram DMA transfer parameters
Xil_Out32(XPAR_AXI_DMA_0_BASEADDR+OFFSET_S2MMDA, OFFSET_MEM_WRITE+4*NUM_OF_WORDS*frame_count);
Xil_Out32(XPAR_AXI_DMA_0_BASEADDR+OFFSET_S2MM_LENGTH, 4*NUM_OF_WORDS);
}
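The address arithmetic and the stop condition in the handler can be checked on a host machine without any hardware. The following is a sketch under two assumptions that the listing does not show explicitly: frame_count starts at 0, and the first transfer is the one programmed in main(). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Destination address of a frame, mirroring
 * OFFSET_MEM_WRITE + 4*NUM_OF_WORDS*frame_count in the handler. */
uint32_t frame_dest_addr(uint32_t base, uint32_t num_words, uint32_t frame_index)
{
    return base + 4u * num_words * frame_index;
}

/* Host-side model of the handler's control flow: count how many
 * transfers are programmed in total for a given FRAME_COUNT_MAX. */
uint32_t total_transfers(uint32_t frame_count_max)
{
    assert(frame_count_max < UINT32_MAX);  /* keeps the loop finite */
    uint32_t frame_count = 0;              /* assumed initial value */
    uint32_t transfers = 1;                /* first one set up in main() */
    for (;;) {
        /* one completion interrupt arrives per programmed transfer */
        if (++frame_count > frame_count_max)
            return transfers;              /* handler returns, no re-arm */
        transfers++;                       /* handler re-programs the DMA */
    }
}
```

Under these assumptions the model reports FRAME_COUNT_MAX + 1 transfers in total, and consecutive frames of 64 words start exactly 256 bytes apart, so each buffer begins right after the previous one ends.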
The registers that we used and their descriptions can be found in the documentation of the AXI DMA core.
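One detail worth checking in that documentation is how the status register's interrupt flags are cleared. The sketch below models write-one-to-clear behaviour on a host machine, assuming that bits 12, 13 and 14 of the S2MM status register are the completion, delay and error interrupt flags and that all three clear when 1 is written to them. Treat the bit assignments as my reading of the register description, and the names as hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DMASR_IOC_IRQ (1u << 12)  /* interrupt on complete */
#define DMASR_DLY_IRQ (1u << 13)  /* delay interrupt       */
#define DMASR_ERR_IRQ (1u << 14)  /* error interrupt       */
#define DMASR_IRQ_ALL (DMASR_IOC_IRQ | DMASR_DLY_IRQ | DMASR_ERR_IRQ)

static_assert(DMASR_IRQ_ALL == 0x7000u, "three adjacent flag bits");

/* Model of a write-one-to-clear register: every interrupt flag that is 1
 * in 'written' is cleared; all other bits keep their value. */
uint32_t dmasr_after_write(uint32_t status, uint32_t written)
{
    return status & ~(written & DMASR_IRQ_ALL);
}
```

In this model, writing only 0x1000 clears the completion flag and leaves a pending error flag visible, whereas writing back the value just read ORed with 0x1000 clears every flag that happened to be set. Which behaviour you want depends on whether the handler should also notice errors.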
We have now set up everything required to run the example. Connect the Zybo to the host computer using a USB-A to micro-USB cable. Connect to the serial console using any serial port communication program. I usually use minicom for serial communication. Communication parameters are given below.
Build the project and then click Run As -> Launch on Hardware (Single Application Debug).
You will notice the serial terminal will be populated with numbers generated using the traffic generator.
Notice that the counter runs from 1 to 64, rolls back to 1, and then restarts counting which is the expected behavior of the traffic generator we created.
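Instead of reading the console output by eye, the expected pattern can be checked programmatically. This is a minimal sketch, assuming the received words are available in a uint32_t buffer and that the generator counts 1, 2, ..., NUM_OF_WORDS before wrapping; the function name is hypothetical and not part of dmatest.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if buf holds 1, 2, ..., period, 1, 2, ... for n words. */
bool is_counter_pattern(const uint32_t *buf, size_t n, uint32_t period)
{
    assert(buf != NULL && period > 0);
    for (size_t i = 0; i < n; i++) {
        uint32_t expected = (uint32_t)(i % period) + 1u;
        if (buf[i] != expected)
            return false;  /* first mismatch ends the check */
    }
    return true;
}
```

On the target, the buffer would be the DDR region starting at OFFSET_MEM_WRITE and the period would be NUM_OF_WORDS. If the data cache is enabled, the region probably needs to be invalidated (for example with Xil_DCacheInvalidateRange) before the processor reads it, so that it sees what the DMA wrote rather than stale cache lines.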
We successfully implemented and tested a register mode DMA application using the Zynq SoC and the external DDR memory chip of the Zybo Z7-10 board. In this application, we performed FPGA to DDR data transfer. In the future, we will demonstrate an application involving DDR to FPGA data transfer, processing the data on the FPGA, and writing it back to DDR.