Published December 5, 2024 © GPL3+

Using MCDMA from Linux on Kria KR260

How do you use Multi Channel DMA to send data from the PS (running Ubuntu) to the PL?

Intermediate, full instructions provided, 2 hours

Things used in this project

Hardware components

AMD Kria™ KR260 Robotics Starter Kit

Digilent Pmod I2S2

Software apps and online services

Ubuntu iot limerick classic desktop 22.04

AMD Vivado Design Suite

Story

Introduction

MCDMA, or Multi-Channel Direct Memory Access, enables the transfer of two (or more) data streams from one resource to another. If you're new to DMA and find it a bit challenging, I recommend starting with this post. It gives you some pointers for learning about DMA and helps you get to know how a Linux Kernel Module (LKM) works.

To use MCDMA on Linux effectively, following "best practices" involves loading a dedicated Linux Kernel Module (LKM) into the kernel. Detailed instructions for creating and using such an LKM can be found here, but I noticed some changes were needed. In this post, I’ll walk you through how I successfully implemented MCDMA on the Kria KR260, utilizing a modified version of the user-space program mentioned on that site to interface with the LKM. I also made minor adjustments to the hardware design to get everything up and running.

The primary goal of this setup is to send two audio streams (in WAV format) to the Programmable Logic (PL) for processing (low pass/high pass filters, reverb, gain), and then output them to speakers via I2S.

To achieve this, I’ll use Vivado 2024.1 to design and export the hardware, and XSCT to generate the device tree. The device tree requires slight modifications, which I’ll also cover.

Hardware design

I wanted the design to receive two AXI streams from the PS via MCDMA and pass them on to the I2S output. I reused the AXIS to I2S transmitter from the previous project. When I built the hardware design as described in the Linux MCDMA resource (see the introduction), the AXI4-Stream Switch did not behave as expected: the TDEST field, which the MCDMA sets and which should split the stream between the two destinations, was not being used. I ended up replacing the AXI4-Stream Switch IP with an AXI4-Stream Interconnect, which does route on the destination field correctly. This way the two DMA channels travel over one DMA stream and are separated in the PL, as I wanted.

Let's start. Create a new project, select the KR260 starter kit, set the custom slots, and make the project an extensible Vitis platform. You can change both settings afterwards if needed.

Create a block design with the following IPs (refer to the previous post to find Whitney Knitter's article that describes which settings you need to set in the Zynq Ultrascale+ IP):

2x Processor System Reset, you need one for each clock domain, we have a clock domain for I2S and one for the rest of the system.
2x AXI Memory Mapped to Stream Mapper, these will control the configuration and reloading of the FIR compiler.
AXI Smartconnect
2x Constants to config coherency of the zynq
Concat to combine interrupts
Zynq Ultrascale+
AXI Interconnect
AXI BRAM Controller used to control our audioplayer from Linux
Block Memory Generator with memory type: True Dual Port RAM
AXI Multi Channel Direct Memory Access
AXI4-Stream Interconnect
2x AXI4-Stream Subset Converter
FIR Compiler
Slice
2x AXIS_I2S

I used two clock domains, one for the I2S @12.288MHz and one for the rest @100MHz:

Set the properties of the AXI MCDMA:

Set the properties of the AXI4-Stream-Interconnect

Set the properties of the AXI4-Stream Subset Converter:

Build the bitstream and export the platform (include the bitstream). Save the XSA as kr260_filter.xsa in the KR260_base_dma folder, one level above the project folder.

Device tree

Next we use xsct to create the devicetree.

This is my folder structure (on my laptop):

yuri@desktop:~/KR260_base_dma$ tree
.
├── bd
│   ├── kria_bd
├── dma_file_transfer
├── extracted
│   ├── kria_bd_wrapper
├── KR260_base
│   ├── KR260_base.cache
│   ├── KR260_base.gen
│   ├── KR260_base.hw
│   ├── KR260_base.ip_user_files
│   ├── KR260_base.runs
│   ├── KR260_base.sim
│   └── KR260_base.srcs
├── kr260_custom_platform
│   ├── dtg_output_filter
├── src
├── tb
└── xdc

In a terminal run these commands to start creating the devicetree

$ cp KR260_base/KR260_base.runs/impl_1/kria_bd_wrapper.bin dma_file_transfer/kr260_filter.bit.bin
$ cd kr260_custom_platform/
$ rm -rf dtg_output_filter/
$ source /tools/Xilinx/Vitis/2024.1/settings64.sh  
$ xsct

In xsct run these 3 commands:

xsct% hsi::open_hw_design ../kr260_filter.xsa
xsct% createdts -hw ../kr260_filter.xsa -zocl -platform-name kr260_filter -git-branch xlnx_rel_v2024.1 -overlay -compile -out ./dtg_output_filter/
xsct% exit

Next, cd into the folder that contains the generated pl.dtsi and open the file in an editor (I use micro, any text editor works):

$  cd dtg_output_filter/dtg_output_filter/kr260_filter/psu_cortexa53_0/device_tree_domain/bsp/
$  micro pl.dtsi

Add the following snippet right after the mcdma block:

dma_proxy {
    compatible = "xlnx,dma_proxy";
    dmas = <&axi_mcdma_0 0 &axi_mcdma_0 1>;
    dma-names = "dma_proxy_tx_0", "dma_proxy_tx_1";
};

This makes sure the dma_proxy LKM will recognize the 2 dma channels to transmit data.
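
The names in dma-names matter because the user-space code opens one device file per name. Here is a minimal sketch of that mapping, assuming (as audioplayer.c at the bottom of this article does) that the driver exposes each channel as /dev/<dma-name>. The helper channel_device_path is a hypothetical name used only for illustration; it is not part of the Xilinx sources.

```c
#include <stdio.h>

/* Hypothetical helper: build the device path for one dma-names entry.
 * Assumes the dma_proxy driver creates /dev/<name> for every entry,
 * which is what audioplayer.c relies on when it opens its channels.
 * Returns 0 on success, -1 if the path does not fit into out. */
int channel_device_path(const char *dma_name, char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "/dev/%s", dma_name);

    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Called with "dma_proxy_tx_0" and a 64-byte buffer, this yields /dev/dma_proxy_tx_0. If the name in the device tree and the name in the application differ, the open() in the application fails.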

Next, compile the devicetree using dtc. I don't know whether dtc works on Windows; if you work on Windows you may have to run it under Linux (for example in a VirtualBox), in which case you share your project folder with the virtual machine:

$  dtc -I dts -O dtb -o pl.dtbo pl.dtsi
$  cp pl.dtbo /data/vakken/2425/S1/RND/git/KR260_base_dma/dma_file_transfer/kr260_filter.dtbo
$  cd ../../../../../../..

Create shell.json :

$ micro dma_file_transfer/shell.json

Copy/paste this content:

{
"shell_type" : "XRT_FLAT",
"num_slots" : "1"
}

Copy files to Kria KR260

Now copy those 3 files (kr260_filter.bit.bin, kr260_filter.dtbo and shell.json) to the Kria, into a folder for the hardware design. This has to be a subfolder of /lib/firmware/xilinx/.

So log in to the Kria via ssh (you might need to copy your public key into .ssh/authorized_keys first and start the ssh service using sudo systemctl start ssh):

$ sudo mkdir /lib/firmware/xilinx/filter
$ sudo scp yuri@10.42.0.1:~/KR260_base_dma/dma_file_transfer/* /lib/firmware/xilinx/filter/

And now load the hardware design:

$ sudo xmutil listapps
$ sudo xmutil unloadapp
$ sudo xmutil loadapp filter

Kernel code

Clone the git repository and copy the dma-proxy.c to dma_audio_driver.c (in case you would like to change it, this way we also have a nice name for our module):

$ mkdir ~/Programming
$ cd ~/Programming
$ git clone https://github.com/Xilinx-Wiki-Projects/software-prototypes.git
$ cd software-prototypes/linux-user-space-dma/Software/Kernel
$ cp dma-proxy.c dma_audio_driver.c

Create a Makefile :

$ micro Makefile

Copy/Paste this content into the Makefile :

INC = /home/ubuntu/Programming/software-prototypes/linux-user-space-dma/Software/Common/
EXTRA_CFLAGS += -I$(INC)

obj-m += dma_audio_driver.o
all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean

When you copy/paste, check that the indentation before each make command is a <tab> and not <space>s. Otherwise make reports an error.

Compile the kernel code :

$ make
$ ls -al

You should see a dma_audio_driver.ko file now. Load the driver:

$ sudo insmod dma_audio_driver.ko
$ lsmod | grep dma
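
lsmod gets its information from /proc/modules, so an application can do the same check before it tries to open the DMA devices. A minimal sketch, assuming the module keeps the name dma_audio_driver from the Makefile above; module_listed is a hypothetical helper name.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if a module called name appears in a /proc/modules-style stream,
 * 0 otherwise. The module name is the first space-separated field of a line. */
int module_listed(FILE *modules, const char *name)
{
    char line[512];
    size_t name_len = strlen(name);
    int at_line_start = 1;

    while (fgets(line, sizeof line, modules)) {
        /* Only compare at the start of a line, not in the middle of a long one */
        if (at_line_start && strncmp(line, name, name_len) == 0 && line[name_len] == ' ')
            return 1;
        at_line_start = (strchr(line, '\n') != NULL);
    }
    return 0;
}
```

To use it, open /proc/modules with fopen("/proc/modules", "r") and pass the stream together with "dma_audio_driver".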

Userspace code

The code needed some small changes to be able to stream WAV files.

In main() the arguments are being checked and then the list of wav-files in the given folders is populated.

Two threads are started, one for each channel. Each thread reads its wav-file in chunks of 2048 bytes, puts each chunk into memory where the DMA controller can find it, and starts the DMA transfer.
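
To get a feeling for what a 2048-byte chunk means in time, here is a small worked calculation. It assumes the audio format from the defines in audioplayer.c (48 kHz, 16 bit, stereo); the function names are hypothetical. With that format a chunk holds 512 stereo frames, which is about 10.7 ms of audio, so each channel needs a fresh chunk roughly 94 times per second.

```c
#include <stdint.h>

/* Playback time in microseconds covered by bytes of interleaved PCM audio. */
uint64_t pcm_duration_us(uint64_t bytes, uint32_t sample_rate,
                         uint16_t bits_per_sample, uint16_t channels)
{
    uint64_t bytes_per_second =
        (uint64_t)sample_rate * channels * (bits_per_sample / 8);

    if (bytes_per_second == 0)
        return 0;
    return bytes * 1000000ULL / bytes_per_second;
}

/* Number of DMA chunks needed for data_bytes of audio (the last one may be partial). */
uint64_t chunk_count(uint64_t data_bytes, uint64_t chunk_bytes)
{
    return chunk_bytes ? (data_bytes + chunk_bytes - 1) / chunk_bytes : 0;
}
```

If the threads cannot refill a buffer within that time, the I2S output runs dry, which is why the code keeps more than one buffer in flight per channel.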

I also wanted the application to listen for control signals, so that it can Start/Stop/Pause or switch to the Next/Previous song (not all of these have been implemented in the code attached below, but you'll be able to add them). So I added an AXI BRAM controlled 'audiocontroller'. In the loop of tx_thread you will see code that reads the BRAM at 0x90050000 (mapped through /dev/mem) to see whether some bits are set. If so, it executes the command (next chan0/next chan1/stop) and clears that bit.
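
The bit handling can be isolated in one small function, which makes it easier to add commands later. This is a sketch, not the code from audioplayer.c: the bit positions follow the checks in tx_thread (bit 0, bit 1 and bit 2), while the names CMD_*, player_commands and decode_commands are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit assignments as checked in tx_thread of audioplayer.c */
#define CMD_NEXT_CHAN0 (1u << 0)  /* value 1: next track on channel 0 */
#define CMD_NEXT_CHAN1 (1u << 1)  /* value 2: next track on channel 1 */
#define CMD_STOP       (1u << 2)  /* value 4: stop/exit */

struct player_commands {
    bool next_chan0;
    bool next_chan1;
    bool stop;
};

/* Decode the control word read from BRAM into out and return the value
 * to write back: handled bits cleared, unknown bits left untouched. */
uint32_t decode_commands(uint32_t word, struct player_commands *out)
{
    out->next_chan0 = (word & CMD_NEXT_CHAN0) != 0;
    out->next_chan1 = (word & CMD_NEXT_CHAN1) != 0;
    out->stop       = (word & CMD_STOP) != 0;

    return word & ~(CMD_NEXT_CHAN0 | CMD_NEXT_CHAN1 | CMD_STOP);
}
```

Leaving unknown bits in place means a future Pause or Previous bit is not silently dropped by an older player binary.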

Create a file audioplayer.c and copy the contents of the file at the bottom of this article into it.

Testing

Make sure the correct hardware design and the kernel module are loaded:

$ sudo xmutil listapps
$ lsmod | grep dma_audio_driver

Time to collect some sample wav-files. Create a wav folder with two subfolders in your home directory and copy files into them.

$ mkdir -p ~/wav/chan0
$ mkdir -p ~/wav/chan1

You will have to find some wav-files yourself.

Build and run the application. It requires 4 parameters: the two folders with the files for each channel, and two zero-based indices that select which file in each folder to start playing (so each folder must contain more files than its index):

$ gcc audioplayer.c -o audioplayer -I ../Common/ -pthread
$ sudo ./audioplayer ~/wav/chan0 ~/wav/chan1 3 8
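
The two indices are used directly as array indices into the file lists, and the order of those lists is whatever readdir returns, so it pays to validate them. A minimal sketch of such a check, assuming the file count of the folder is already known; parse_track_index is a hypothetical helper that is stricter than the atoi call used in audioplayer.c.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a start index and check it against the number
 * of files found. Returns 0 and stores the index, or -1 if the text is not
 * a clean decimal number or lies outside 0 .. file_count - 1. */
int parse_track_index(const char *text, int file_count, int *index)
{
    char *end = NULL;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0')
        return -1;
    if (value < 0 || value >= file_count)
        return -1;

    *index = (int)value;
    return 0;
}
```

With 10 files in a folder, "3" is accepted and "10" is rejected.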

Write to the BRAM to control the audioplayer. The following controls are currently implemented:

"1": next track on channel0
"2": next track on channel1
"4": stop/exit

$ sudo devmem2 0x90050000 w 2

This could be another program or even the PL controlling the player.
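
As an illustration of such a program, here is a minimal sketch that posts a command from C. It assumes the control word is reachable through /dev/mem at 0x90050000, as in audioplayer.c, and that the base address is page-aligned; post_command is a hypothetical name. Unlike the devmem2 call above, which overwrites the whole word, it ORs the bits in so that a command that is still pending is kept.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Sketch: OR bits into the 32-bit control word at offset base of mem_path.
 * mem_path is normally "/dev/mem" and base the page-aligned BRAM address.
 * Returns 0 on success, -1 on failure. */
int post_command(const char *mem_path, off_t base, uint32_t bits)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    int fd = open(mem_path, O_RDWR | O_SYNC);

    if (fd < 0)
        return -1;

    void *map = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, base);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    volatile uint32_t *ctrl = map;
    ctrl[0] |= bits;            /* set the command bits, keep the others */

    munmap(map, page);
    close(fd);
    return 0;
}
```

On the Kria the call would be post_command("/dev/mem", 0x90050000, 2) for "next track on channel1", and like devmem2 it needs root rights.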

Finishing up

Each time you update your hardware design and want to reload it on the Kria, I suggest you:

unload your previous hardware design with xmutil unloadapp
remove the kernel driver with rmmod dma_audio_driver
copy the new files via scp
load your new hardware design with xmutil loadapp
load the kernel driver again with insmod dma_audio_driver.ko
start your application

Final notes

That's it, it is not hard at all! A small note on /dev/mem, which the audioplayer code uses: this is not best practice, as it requires us to start the audioplayer with sudo. We should use UIO for this... next time! Thanks for reading the story.

audioplayer.c

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <pthread.h>
#include <time.h>
#include <sys/time.h>
#include <stdint.h>
#include <signal.h>
#include <sched.h>
#include <errno.h>
#include <string.h>
#include <dirent.h>
#include <sys/types.h>
#include <sys/param.h>

#include "dma-proxy.h"
#include <stdbool.h>

#define TX_CHANNEL_COUNT 2
#define RX_CHANNEL_COUNT 0
#undef TX_BUFFER_COUNT                          // dma-proxy.h may already define this; we use our own value
#define TX_BUFFER_COUNT 2
#define WAV_HEADER_SIZE         44              // WAV header size (standard 44 bytes)
#define AUDIO_SAMPLE_RATE       48000   // Play at 48KHz
#define AUDIO_BIT_DEPTH         16              // 16 bit audio
#define AUDIO_CHANNELS          2               // Stereo audio
#define BUFFER_SAMPLES          2048    // Chunk size in bytes that is read from the wav-file and sent per DMA transfer
#define SINE_FREQUENCY          440     // frequency to play an A note

#define MAX_FILES 100 // Adjust as needed
#define MAX_PATH_LENGTH 1024 // Adjust as needed

const char *file_list_chan0[MAX_FILES];
const char *file_list_chan1[MAX_FILES];
int file_count_chan0;
int file_count_chan1;
int current_song_chan0;
int current_song_chan1;

#define BRAM_BASE_ADDR 0x90050000
#define BRAM_SIZE 0x2000 // Adjust as needed
typedef struct {
    int fd;
    uint32_t* bram;
} BRAMReader;

BRAMReader audiocontroller;

// Function to initialize BRAMReader
int initBRAMReader(BRAMReader* reader) {
    reader->fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (reader->fd == -1) {
        perror("open");
        return -1;
    }

    reader->bram = (uint32_t*)mmap(NULL, BRAM_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, reader->fd, BRAM_BASE_ADDR);
    if (reader->bram == MAP_FAILED) {
        perror("mmap");
        close(reader->fd);
        return -1;
    }

    return 0;
}

// Function to clean up BRAMReader
void cleanupBRAMReader(BRAMReader* reader) {
    if (reader->bram != MAP_FAILED) {
        munmap(reader->bram, BRAM_SIZE);
    }
    if (reader->fd != -1) {
        close(reader->fd);
    }
}

// Function to read data from BRAM
int readBRAMData(BRAMReader* reader, size_t offset, uint32_t* data) {
    if (offset >= (BRAM_SIZE / sizeof(uint32_t))) {
        fprintf(stderr, "Offset out of bounds\n");
        return -1;
    }
    *data = reader->bram[offset];
    return 0;
}

// Function to write data to BRAM
int writeBRAMData(BRAMReader* reader, size_t offset, uint32_t data) {
    if (offset >= (BRAM_SIZE / sizeof(uint32_t))) {
        fprintf(stderr, "Offset out of bounds\n");
        return -1;
    }
    reader->bram[offset] = data;
    return 0;
}

const char *tx_channel_names[] = { "dma_proxy_tx_0","dma_proxy_tx_1", /* add unique channel names here */ };
const char *rx_channel_names[] = { "dma_proxy_rx", /* add unique channel names here */ };

/* Internal data which should work without tuning */

struct channel {
        struct channel_buffer *buf_ptr;
        int fd;
        pthread_t tid;
        const char* filename;
};

static int verify;
static int test_size;
static volatile int stop = 0;
int num_transfers;

struct channel tx_channels[TX_CHANNEL_COUNT], rx_channels[RX_CHANNEL_COUNT];

/**
 **
 ** Functions to Load a wav file
 **
 **/

/* WAV header structure (44 bytes total) */
struct WavHeader {
    char chunkID[4];       // "RIFF"
    uint32_t chunkSize;    // Overall size of the file - 8 bytes
    char format[4];        // "WAVE"
    char subchunk1ID[4];   // "fmt "
    uint32_t subchunk1Size; // Size of the fmt chunk (16 for PCM)
    uint16_t audioFormat;  // Audio format (1 for PCM)
    uint16_t numChannels;  // Number of channels (1 for mono, 2 for stereo)
    uint32_t sampleRate;   // Sampling rate (e.g., 44100 Hz)
    uint32_t byteRate;     // Byte rate = SampleRate * NumChannels * BitsPerSample / 8
    uint16_t blockAlign;   // Block align = NumChannels * BitsPerSample / 8
    uint16_t bitsPerSample; // Bits per sample (e.g., 16 bits)
    char subchunk2ID[4];   // "data"
    uint32_t subchunk2Size; // Size of the data chunk
};

/* Function to print WAV header information */
void print_wav_header(const struct WavHeader *header) {
    printf("Chunk ID: %.4s\n", header->chunkID);
    printf("Chunk Size: %u\n", header->chunkSize);
    printf("Format: %.4s\n", header->format);
    printf("Subchunk1 ID: %.4s\n", header->subchunk1ID);
    printf("Subchunk1 Size: %u\n", header->subchunk1Size);
    printf("Audio Format: %u\n", header->audioFormat);
    printf("Number of Channels: %u\n", header->numChannels);
    printf("Sample Rate: %u\n", header->sampleRate);
    printf("Byte Rate: %u\n", header->byteRate);
    printf("Block Align: %u\n", header->blockAlign);
    printf("Bits Per Sample: %u\n", header->bitsPerSample);
    printf("Subchunk2 ID: %.4s\n", header->subchunk2ID);
    printf("Subchunk2 Size: %u\n", header->subchunk2Size);
}
/* We need to know the total filesize if we are going to read chunks of the wav file */
int get_file_size(const char* filename, int *size) {
    FILE *file = fopen(filename, "rb");
    if (!file) {
        perror("Error opening file");
        return -1;
    }
    fseek(file, 0, SEEK_END);
    *size = ftell(file);
    fclose(file);
    return 0;
}

/* Read a chunk of the wav file into memory */
int get_wav_chunk(const char *filename, size_t start_byte, size_t chunk_size, bool *last, size_t *read_size, uint32_t *data_buf) {
    FILE *file = fopen(filename, "rb");
    if (!file) {
        perror("Error opening file");
        return -1;
    }

    // Move to the specified start byte
    fseek(file, start_byte, SEEK_SET);

    // Read data into buffer
    *read_size = fread(data_buf, 1, chunk_size, file);

    // Check if this is the last chunk
    *last = (*read_size < chunk_size);

    fclose(file);
    return 0;
}

/* Read the Header information of the wav-file */
int load_wav_header(const char *filename, struct WavHeader *header, size_t *data_size) {
    FILE *file = fopen(filename, "rb");
    if (!file) {
        perror("Error opening file");
        return -1;
    }

    // Read the WAV header
    if (fread(header, sizeof(struct WavHeader), 1, file) != 1) {
        perror("Error reading WAV header");
        fclose(file);
        return -1;
    }

    // Verify the "RIFF" and "WAVE" format
    if (strncmp(header->chunkID, "RIFF", 4) != 0 || strncmp(header->format, "WAVE", 4) != 0) {
        fprintf(stderr, "Invalid WAV file format.\n");
        fclose(file);
        return -1;
    }

    // Retrieve data size
    *data_size = header->subchunk2Size;

    fclose(file);
    return 0;
}

/*******************************************************************************************************************/
/* Handle a control C or kill, maybe the actual signal number coming in has to be more filtered?
 * The stop should cause a graceful shutdown of all the transfers so that the application can
 * be started again afterwards.
 */
void sigint(int a)
{
        stop = 1;
}

/*******************************************************************************************************************/
/* Get the clock time in usecs to allow performance testing
 */
static uint64_t get_posix_clock_time_usec ()
{
    struct timespec ts;

    if (clock_gettime (CLOCK_MONOTONIC, &ts) == 0)
        return (uint64_t) (ts.tv_sec * 1000000 + ts.tv_nsec / 1000);
    else
        return 0;
}
// Function to check if a specific bit is on
bool isBitOn(uint32_t data, int bit_position) {
    // Create a mask for the desired bit
    uint32_t mask = (1 << bit_position);

    // Check if the bit is on
    return (data & mask) != 0;
}
/*******************************************************************************************************************/
/*
 * The following function is the transmit thread to allow the transmit and the receive channels to be
 * operating simultaneously. Some of the ioctl calls are blocking so that multiple threads are required.
 */
void *tx_thread(void *arg)
{
        struct channel *channel_ptr = (struct channel *)arg;
        int channel_index = (int)(channel_ptr - tx_channels);  // 0 or 1: the stream this thread feeds
        const char **file_list = (channel_index == 0) ? file_list_chan0 : file_list_chan1;
        int file_count = (channel_index == 0) ? file_count_chan0 : file_count_chan1;
        int *current_song = (channel_index == 0) ? &current_song_chan0 : &current_song_chan1;
        int buffer_id;
        size_t start_byte = 0;                     // Starts at byte 0, so the wav header is streamed as well
        size_t chunk_size = BUFFER_SAMPLES;
        size_t read_size = 0;                      // The actual number of bytes read
        bool last = false;

        printf("Start stream wav-file: %s\n", channel_ptr->filename);

        // Start all buffers being sent; stop early if the file is shorter than the buffers
        for (buffer_id = 0; buffer_id < TX_BUFFER_COUNT && !last; buffer_id += BUFFER_INCREMENT) {

                /* Set up the length for the DMA transfer and fill the transmit
                 * buffer with the next chunk of the wav-file.
                 */
                channel_ptr->buf_ptr[buffer_id].length = chunk_size;

                if (get_wav_chunk(channel_ptr->filename, start_byte, chunk_size, &last, &read_size,
                                  channel_ptr->buf_ptr[buffer_id].buffer) != 0)
                        return NULL;
                start_byte += read_size;

                /* Start the DMA transfer, this call is non-blocking */
                ioctl(channel_ptr->fd, START_XFER, &buffer_id);
        }

        /* Start finishing up the DMA transfers that were started beginning with the 1st channel buffer.
         */
        buffer_id = 0;

        while (!last && !stop) {
                /* Wait for the DMA transfer to complete and check its status,
                 * the call blocks until the transfer is done.
                 */
                ioctl(channel_ptr->fd, FINISH_XFER, &buffer_id);
                if (channel_ptr->buf_ptr[buffer_id].status != PROXY_NO_ERROR)
                        printf("Proxy tx transfer error\n");

                // Read the control word from BRAM to control the audioplayer
                size_t offset = 0;
                uint32_t data;
                if (readBRAMData(&audiocontroller, offset, &data) == 0) {
                        // Bit 0 (value 1) = next song on channel 0, bit 1 (value 2) = next song on channel 1.
                        // Each thread only handles the bit of its own channel, because start_byte is per thread.
                        if (isBitOn(data, channel_index) && file_count > 0) {
                                printf("Received command: next song on channel %d.\n", channel_index);
                                *current_song = (*current_song + 1) % file_count;
                                channel_ptr->filename = file_list[*current_song];
                                start_byte = 0;
                                data &= ~(1u << channel_index);
                                writeBRAMData(&audiocontroller, offset, data);
                        }
                        if (isBitOn(data, 2)) { // third bit, value 4 triggers the stop command
                                printf("Received command stop.\n");
                                stop = 1;
                                writeBRAMData(&audiocontroller, offset, data & ~0x0004u);
                        }
                }

                /* Refill the completed channel buffer with the next chunk and start another transfer */
                if (get_wav_chunk(channel_ptr->filename, start_byte, chunk_size, &last, &read_size,
                                  channel_ptr->buf_ptr[buffer_id].buffer) != 0)
                        break;
                start_byte += read_size;

                /* The transfer length stays chunk_size, so pad a short last chunk with silence
                 * instead of resending what is left over from the previous chunk
                 */
                if (read_size < chunk_size)
                        memset((char *)channel_ptr->buf_ptr[buffer_id].buffer + read_size, 0, chunk_size - read_size);

                ioctl(channel_ptr->fd, START_XFER, &buffer_id);

                /* Flip to next buffer and wait for it treating them as a circular list
                 */
                buffer_id += BUFFER_INCREMENT;
                buffer_id %= TX_BUFFER_COUNT;
        }

        return NULL;
}

/**
 ** NOT USED FOR NOW
*/
void rx_thread(struct channel *channel_ptr)
{
        int in_progress_count = 0, buffer_id = 0;
        int rx_counter = 0;

        // Start all buffers being received
        printf("Start all buffers\n");
        for (buffer_id = 0; buffer_id < RX_BUFFER_COUNT; buffer_id += BUFFER_INCREMENT) {

                /* Don't worry about initializing the receive buffers as the pattern used in the
                 * transmit buffers is unique across every transfer so it should catch errors.
                 */
                channel_ptr->buf_ptr[buffer_id].length = test_size;

                ioctl(channel_ptr->fd, START_XFER, &buffer_id);

                /* Handle the case of a specified number of transfers that is less than the number
                 * of buffers
                 */
                if (++in_progress_count >= num_transfers)
                        break;
        }

        buffer_id = 0;

        /* Finish each queued up receive buffer and keep starting the buffer over again
         * until all the transfers are done
         */
        while (1) {

                ioctl(channel_ptr->fd, FINISH_XFER, &buffer_id);

                if (channel_ptr->buf_ptr[buffer_id].status != PROXY_NO_ERROR) {
                        printf("Proxy rx transfer error, # transfers %d, # completed %d, # in progress %d\n",
                                                num_transfers, rx_counter, in_progress_count);
                        exit(1);
                }

                /* Verify the data received matches what was sent (tx is looped back to rx)
                 * A unique value in the buffers is used across all transfers
                 */
                if (verify) {
                        unsigned int *buffer = channel_ptr->buf_ptr[buffer_id].buffer;
                        int i;
                        for (i = 0; i < 1; i++) // test_size / sizeof(unsigned int); i++) this is slow
                                if (buffer[i] != i + rx_counter) {
                                        printf("buffer not equal, index = %d, data = %d expected data = %d\n", i,
                                                buffer[i], i + rx_counter);
                                        break;
                                }
                }

                /* Keep track how many transfers are in progress so that only the specified number
                 * of transfers are attempted
                 */
                in_progress_count--;

                /* If all the transfers are done then exit */

                if (++rx_counter >= num_transfers)
                        break;

                /* If the ones in progress will complete the number of transfers then don't start more
                 * but finish the ones that are already started
                 */
                if ((rx_counter + in_progress_count) >= num_transfers)
                        goto end_rx_loop0;

                /* Start the next buffer again with another transfer keeping track of
                 * the number in progress but not finished
                 */
                ioctl(channel_ptr->fd, START_XFER, &buffer_id);

                in_progress_count++;

        end_rx_loop0:

                /* Flip to next buffer treating them as a circular list, and possibly skipping some
                 * to show the results when prefetching is not happening
                 */
                buffer_id += BUFFER_INCREMENT;
                buffer_id %= RX_BUFFER_COUNT;

        }
}

/*******************************************************************************************************************/
/*
 * Setup the transmit and receive threads so that the transmit thread is low priority to help prevent it from
 * overrunning the receive since most testing is done without any backpressure to the transmit channel.
 */
void setup_threads(int *num_transfers)
{
        pthread_attr_t tattr_tx;
        int newprio = 20, i;
        struct sched_param param;

        printf("Setup threads...\n");
        /* The transmit thread should be lower priority than the receive
         * Get the default attributes and scheduling param
         */
        pthread_attr_init (&tattr_tx);
        pthread_attr_getschedparam (&tattr_tx, &param);

        /* Set the transmit priority to the lowest
         */
        param.sched_priority = newprio;
        pthread_attr_setschedparam (&tattr_tx, &param);

// NOT USED FOR NOW
//      for (i = 0; i < RX_CHANNEL_COUNT; i++)
//              pthread_create(&rx_channels[i].tid, NULL, rx_thread, (void *)&rx_channels[i]);

        for (i = 0; i < TX_CHANNEL_COUNT; i++)
                pthread_create(&tx_channels[i].tid, &tattr_tx, tx_thread, (void *)&tx_channels[i]);
}

void list_wav_files(const char *dir_path, const char **file_list, int *file_count) {
    struct dirent *entry;
    DIR *dir = opendir(dir_path);

    if (!dir) {
        perror("opendir");
        exit(EXIT_FAILURE);
    }

    *file_count = 0;

    while ((entry = readdir(dir)) != NULL) {
        // Check if the file has a ".wav" extension
        const char *name = entry->d_name;
        const char *ext = strrchr(name, '.');
        if (ext && strcmp(ext, ".wav") == 0) {
            if (*file_count >= MAX_FILES) {
                fprintf(stderr, "Too many files. Increase MAX_FILES.\n");
                break;
            }

            // Allocate memory for the full path
            char *full_path = malloc(MAX_PATH_LENGTH);
            if (!full_path) {
                perror("malloc");
                exit(EXIT_FAILURE);
            }

            snprintf(full_path, MAX_PATH_LENGTH, "%s/%s", dir_path, name);
            file_list[*file_count] = full_path;
            (*file_count)++;
        }
    }

    closedir(dir);
}


/*******************************************************************************************************************/
/*
 * The main program starts the transmit thread and then does the receive processing to do a number of DMA transfers.
 */
int main(int argc, char *argv[])
{
        int i;
        uint64_t start_time, end_time, time_diff;
        int mb_sec;
        int buffer_id = 0;
        int max_channel_count = MAX(TX_CHANNEL_COUNT, RX_CHANNEL_COUNT);

        printf("DMA proxy test\n");

        signal(SIGINT, sigint);


        // Check the arguments first, so we do not leave /dev/mem mapped when they are wrong
        if (argc != 5) {
                printf("Usage: audioplayer <path to folder with wav files CHAN0> <path to folder with wav files CHAN1> <# to start from on CHAN0> <# to start from on CHAN1>\n");
                exit(EXIT_FAILURE);
        }

        // Initialize the BRAM reader
        if (initBRAMReader(&audiocontroller) != 0) {
                fprintf(stderr, "Failed to initialize BRAMReader\n");
                return EXIT_FAILURE;
        }


        /* Get the wav files in the provided folders */
    list_wav_files(argv[1], file_list_chan0, &file_count_chan0);
    list_wav_files(argv[2], file_list_chan1, &file_count_chan1);
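    current_song_chan0 = atoi(argv[3]);
    current_song_chan1 = atoi(argv[4]);

    /* The start indices are zero-based and must point at an existing file */
    if (current_song_chan0 < 0 || current_song_chan0 >= file_count_chan0 ||
        current_song_chan1 < 0 || current_song_chan1 >= file_count_chan1) {
        fprintf(stderr, "Start index out of range: chan0 has %d files, chan1 has %d files\n",
                file_count_chan0, file_count_chan1);
        cleanupBRAMReader(&audiocontroller);
        return EXIT_FAILURE;
    }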


        /* Print the wav files that were found in both folders */
    for (int i = 0; i < file_count_chan0; i++) {
        printf("%s\n", file_list_chan0[i]);
        // free((void *)file_list_chan0[i]); // Free allocated memory
    }
    for (int i = 0; i < file_count_chan1; i++) {
        printf("%s\n", file_list_chan1[i]);
        // free((void *)file_list_chan1[i]); // Free allocated memory
    }
    printf("Files selected: %s %s \n",
                file_list_chan0[current_song_chan0], file_list_chan1[current_song_chan1]);
        printf("Arguments given: %d %d, total files in each folder : chan0 %d chan1 %d\n",
                current_song_chan0, current_song_chan1, file_count_chan0, file_count_chan1);


        /* Open the file descriptors for each tx channel and map the kernel driver memory into user space */
        printf("Open TX file descriptors\n");

        for (i = 0; i < TX_CHANNEL_COUNT; i++) {
                char channel_name[64] = "/dev/";
                strcat(channel_name, tx_channel_names[i]);
                tx_channels[i].fd = open(channel_name, O_RDWR);
                if (tx_channels[i].fd < 1) {
                        printf("Unable to open DMA proxy device file: %s\n", channel_name);
                        exit(EXIT_FAILURE);
                }
                printf("Opened channel %s, id: %d\n", channel_name, tx_channels[i].fd);
                tx_channels[i].buf_ptr = (struct channel_buffer *)mmap(NULL, sizeof(struct channel_buffer) * TX_BUFFER_COUNT,
                                                                                PROT_READ | PROT_WRITE, MAP_SHARED, tx_channels[i].fd, 0);
                if (tx_channels[i].buf_ptr == MAP_FAILED) {
                        printf("Failed to mmap tx channel\n");
                        exit(EXIT_FAILURE);
                }
        }

        tx_channels[0].filename = file_list_chan0[current_song_chan0];
        tx_channels[1].filename = file_list_chan1[current_song_chan1];

        /* Open the file descriptors for each rx channel and map the kernel driver memory into user space */
        printf("Open RX file descriptors\n");

        for (i = 0; i < RX_CHANNEL_COUNT; i++) {
                char channel_name[64] = "/dev/";
                strcat(channel_name, rx_channel_names[i]);
                rx_channels[i].fd = open(channel_name, O_RDWR);
                if (rx_channels[i].fd < 1) {
                        printf("Unable to open DMA proxy device file: %s\n", channel_name);
                        exit(EXIT_FAILURE);
                }
                rx_channels[i].buf_ptr = (struct channel_buffer *)mmap(NULL, sizeof(struct channel_buffer) * RX_BUFFER_COUNT,
                                                                                PROT_READ | PROT_WRITE, MAP_SHARED, rx_channels[i].fd, 0);
                if (rx_channels[i].buf_ptr == MAP_FAILED) {
                        printf("Failed to mmap rx channel\n");
                        exit(EXIT_FAILURE);
                }
        }

        /* Grab the start time to calculate performance then start the threads & transfers on all channels */

        start_time = get_posix_clock_time_usec();
        setup_threads(&num_transfers);

        /* Do the minimum to know the transfers are done before getting the time for performance */

//      for (i = 0; i < TX_CHANNEL_COUNT; i++)
//              pthread_join(tx_channels[i].tid, NULL);

        /* Grab the end time and calculate the performance */

        end_time = get_posix_clock_time_usec();
        time_diff = end_time - start_time;
        mb_sec = ((1000000 / (double)time_diff) * (num_transfers * max_channel_count * (double)test_size)) / 1000000;

        printf("Time: %ld microseconds\n", time_diff);
        printf("Transfer size: %lld KB\n", (long long)(num_transfers) * (test_size / 1024) * max_channel_count);
        printf("Throughput: %d MB / sec \n", mb_sec);

        /* Clean up all the channels before leaving */
        printf("Cleanup TX\n");
        for (i = 0; i < TX_CHANNEL_COUNT; i++) {
                pthread_join(tx_channels[i].tid, NULL);
                munmap(tx_channels[i].buf_ptr, sizeof(struct channel_buffer) * TX_BUFFER_COUNT);
                close(tx_channels[i].fd);
        }
        printf("Cleanup RX\n");
        for (i = 0; i < RX_CHANNEL_COUNT; i++) {
                munmap(rx_channels[i].buf_ptr, sizeof(struct channel_buffer) * RX_BUFFER_COUNT);
                close(rx_channels[i].fd);
        }
        /* Cleanup folder lists */
    for (int i = 0; i < file_count_chan0; i++) {
        printf("%s\n", file_list_chan0[i]);
        free((void *)file_list_chan0[i]); // Free allocated memory
    }
    for (int i = 0; i < file_count_chan1; i++) {
        printf("%s\n", file_list_chan1[i]);
        free((void *)file_list_chan1[i]); // Free allocated memory
    }
        printf("DMA proxy test complete\n");
        // Clean up resources
    cleanupBRAMReader(&audiocontroller);
        return 0;
}

Credits

Yuri Cauwerts
