This is part 4 of a series of tutorials (the full list is given at the bottom) describing a fast inter-arrival time pulse counter implemented in an FPGA. The series covers all aspects of the design:
- counter hardware;
- streaming;
- packaging the design;
- DMA transfers;
- USB2 transfers by the device;
- receiving USB transfers by the host (a PC application).
Everything here is based on the popular Zynq hardware platform, which makes all parts of the project easily interchangeable.
This tutorial describes the simplest possible program that runs on one of the processors in the PS and is capable of controlling our counter (turning it on and off) and receiving streamed data from it. After a prescribed amount of data has been received, the program will check it for accuracy and shut down.
2. Prerequisites
- basic knowledge of C
- completed tutorials 1-3
This section deals with creating a simple 'HelloWorld' bare metal project. These steps are described in greater detail in the tutorials that accompany your Z-turn board.
Since this is purely a software project, we'll be using Xilinx Vitis (I am using version 2022.2; an earlier version should be fine as well). Launch Vitis and create a new application project (we'll also create a platform along the way).
After the welcome screen you'll be prompted to select a platform. We don't have one yet, so click on the Create a new platform from hardware (XSA) tab. This is where we'll make use of the .xsa file created in the last tutorial. Click on Browse and navigate to your design file. You should see a screen similar to the one below once the hardware file has been loaded.
Click Next and provide a name for your project. I named mine appDMA (we are going to accept the default settings on this and the next screen). On the last screen choose Hello World from Embedded software development templates. Three projects are created as a result: design_iat2ch_wrapper, appDMA, and appDMA_system. They are located in separate folders on your hard drive and can, to a degree, be built independently.
If you right-click on the appDMA_system project and select Build, all projects (design_iat2ch_wrapper, appDMA, and appDMA_system) will be built in this order. There should be no surprises at this stage (we are just using a standard 'Hello World' template after all). You'll notice that the build process generated a BOOT.BIN file located at Workspace\appDMA_system\Debug\sd_card.
You can copy this file onto an SD card, insert the card into your Z-turn board, connect the board (USB UART port) to your computer using a USB2 cable, and receive the 'Hello World' message from your bare metal application in a serial terminal program. I used Tera Term with the following settings (note that the baud rate under Setup -> Serial port has to be set to 115200).
Next, we update the project by deleting the helloworld.c, platform_config.c, platform.c and platform.h files and copying in our own project files, which you can find below:
- initRxDma.c and initRxDma.h. These files contain functions for setting up DMA transfers.
- customIPs.c and customIPs.h. Custom high level driver for our counter and event simulator.
- testEvSimDMA.c and testEvsimDMA.h. Contain the function for testing simulated data after finishing all transfers.
- main.c. This is the program that runs on the processor and brings all the pieces together. Its job is to set up DMA, prepare the counter and the event simulator, start the counter, process (copy) data as it arrives before the DMA data buffer rolls over, and then check the stored data.
A few more words on the origins of this code. The initRxDma files are modified examples provided with Vitis. You can get access to the originals by clicking on Import Examples on the Board Support Package page. We are using DMA in interrupt (rather than polled) mode.
Finally, in addition to the high level driver customIPs we'll need the low level drivers for our IP components. These are available below (eventSimSmart and iatcollector2chSmart archived directories) and should be copied into the following locations:
- design_iat2ch_wrapper\zynq_fsbl\zynq_fsbl_bsp\ps7_cortexa9_0\libsrc
- design_iat2ch_wrapper\ps7_cortexa9_0\standalone_ps7_cortexa9_0\bsp\ps7_cortexa9_0\libsrc
I wrote these drivers based on examples autogenerated by Vivado and on discussions on the Xilinx user forum. You get access to these examples when you ask Vivado to create a custom AXI IP, a subject covered in tutorial 2.
The next section contains an explanation of the settings and the code and can be skipped if you are just interested in the result.
5. Notes on setting up and processing DMA transfers
Our goal with this project is to continuously receive data from the counter and process it in some fashion. Processing can mean different things (transferring data to the PC is one form of processing) but for now we'll simply copy the received data to a different memory location.
The operation of the DMA engine is described, among other places, in the xaxidma.h file included in your design_iat2ch_wrapper project, and you should read through the header of this file. In particular, I found this illustration very useful:
Briefly, DMA operates with a ring buffer of memory addresses. Each entry in the ring buffer is known as a buffer descriptor (or Bd). The Bds don't store any transferred data, just the addresses for where the DMA engine should put the data (so we'll need to have an additional data buffer for that).
To set everything up for successful operation, you need to set aside a data buffer and then populate all Bds in the ring so that each Bd describes a memory location in your data buffer. Then you hand the Bds over to the DMA hardware. The DMA engine uses the instructions in the Bds to put data in the appropriate locations. As soon as it has finished with a Bd, that Bd is marked as complete, telling you that it is OK to process that chunk of data (which you have to do before the DMA runs out of Bds to use). We are using DMA in scatter-gather (SG) mode, which simplifies all operations considerably. Basically, referencing the picture above, all we have to do is call XAxiDma_BdRingFromHw to get Bds from hardware and then process the data and flush that portion of the data buffer (you don't have to return the Bds to the hardware; that is done automatically in SG mode).
One more thing: when it comes to data processing, it makes sense to do it in predefined chunks. This is especially true for USB2 transfers, so, keeping this future application in mind, we'll define a certain processing data size.
Let's look at a specific example from initRxDma.h:
- I've defined the processing size to be 128 kB
- BD_SIZE is defined by the size of XAxiDma_Bd
- BUFFER_CAPACITY is how many processing chunks will be in the data buffer
- MAX_PKT_LEN is defined by the hardware setup
- MEM_BASE_ADDRESS was taken from the DMA example cited below
- to calculate how many Bds will be in the ring buffer, we take the size of our data buffer = PROCESSING_SIZE*BUFFER_CAPACITY and divide by MAX_PKT_LEN. To get the space needed for the ring buffer, multiply by the size of the Bd.
- Finally, RX_BUFFER is the space for our data buffer
We are going to take the approach of reacting to DMA interrupts since it frees the processor to do other things while we are waiting for DMA transfers to complete. The logic of this approach is simple: register a function that will be called to process the interrupt:
The job of this function is to retrieve the finished Bds from the hardware, take note of their number and return as soon as possible. Then in the main program we monitor RxDone and, if it exceeds a certain value, we process that part of the data buffer and flush it.
Now that we have all the pieces it is time to test our project. For the first time we'll be receiving real data, even if it does originate from an event simulator.
Go ahead and build your project just like you did with the 'Hello World' example, then transfer the BOOT.BIN file to your SD card. [If you run into errors, you may have to run Clean before rebuilding the project. Also, if you regenerated the board support package at some point, you'll have to copy the low level drivers again.]
Before we connect the board to the computer, we should make some connections on the input channels. Even though our counter is receiving signals from an event simulator, it is expected that the signal from channel 2 is low and there are no events on channel 1. You can leave pin 1 on J5 (channel 1) disconnected and ground pin 3 (channel 2) as shown in the picture below.
If you now connect your board to a computer you should see the following output on the serial terminal:
That's it! You have successfully received data from PL in PS and processed it.
7. Conclusion
We have tested our hardware by running it in the event simulation mode, receiving data and comparing it to our expectation. The next step, transferring this data to a computer (a device with more processing power and storage space), will be the subject of the upcoming tutorial.
8. Full list of tutorials in this series
1. Pulse counter implemented in FPGA: hardware (VHDL) design.
2. Pulse counter streaming using AXI interface and packaging the counter as a custom IP.
3. Pulse counter on Zynq: complete hardware design.
4. Setting up DMA transfers to receive data from the streaming pulse counter.
5. USB2 bulk transfers and interrupts for high data transfer rates.
6. Working event counter with USB2 transfers and communications.
7. External (PC) testing software for receiving data from the counter.