In this simple tutorial I will show how to create a Vitis HLS project and integrate a custom Verilog module into it as an RTL blackbox.
There will be two blackbox functions: the first in a pipelined region (wire interfaces, ap_ctrl_none), the second in a dataflow region (FIFO interfaces, ap_ctrl_chain).
1. Create C/C++ source files (C based HLS model + Testbench).
We need to create a C/C++ model of our module for C simulation. It consists of the function source code (the expected behaviour of the module) and a testbench (I/O stimulus and result checking).
According to the RTL blackbox section of UG1399 (Vitis HLS), the blackbox is subject to several limitations:
- Should be Verilog (.v) code.
- Must have a unique clock signal, and a unique active-High reset signal.
- Must have a CE signal that is used to enable or stall the RTL IP.
- Can use either the ap_ctrl_chain or ap_ctrl_none block level control protocols.
- Supports only C++.
- Cannot connect to top-level interface I/O signals.
- Cannot directly serve as the design-under-test (DUT).
- Does not support struct or class type interfaces.
main.cpp - C/C++ Testbench.
#include <cstdio>
#include "add.hpp"

int main (void) {
    static uint32_t a[1024];
    static uint32_t b[1024];
    static uint32_t c[1024];
    static uint32_t c_stream[1024];

    // Stimulus: a[i] = b[i] = i, outputs cleared.
    for (uint32_t i = 0; i < 1024; ++i) {
        a[i] = i;
        b[i] = i;
        c[i] = 0;
        c_stream[i] = 0;
    }

    top_module(a, b, c, c_stream);

    // Check both adders against the expected sum and against each other.
    for (uint32_t i = 0; i < 1024; ++i) {
        if (c[i] != a[i] + b[i]) {
            printf("Data does not match. %u vs %u\n", c[i], a[i] + b[i]);
            return -1;
        }
        if (c[i] != c_stream[i]) {
            printf("Data does not match. %u vs %u\n", c[i], c_stream[i]);
            printf("Add modules have different results.\n");
            return -2;
        }
    }
    printf("Test successful.\n");
    return 0;
}
add.hpp - Function declarations.
#ifndef ADD_HPP
#define ADD_HPP

#include <cstdint>
#include <hls_stream.h>

void add(uint32_t a, uint32_t b, uint32_t &c);
void add_stream(
    hls::stream<uint32_t> &a,
    hls::stream<uint32_t> &b,
    hls::stream<uint32_t> &c
);
void scalar_to_stream(uint32_t a, hls::stream<uint32_t> &a_stream);
void stream_to_scalar(hls::stream<uint32_t> &a_stream, uint32_t &a);
void wrap(uint32_t a, uint32_t b, uint32_t &c);
void top_module(uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *c_stream);

#endif
add.cpp - Function definitions.
#include "add.hpp"

void add(uint32_t a, uint32_t b, uint32_t &c) {
    c = a + b;
}

void add_stream(
    hls::stream<uint32_t> &a,
    hls::stream<uint32_t> &b,
    hls::stream<uint32_t> &c
) {
    c.write(a.read() + b.read());
}

void scalar_to_stream(uint32_t a, hls::stream<uint32_t> &a_stream) {
    a_stream.write(a);
}

void stream_to_scalar(hls::stream<uint32_t> &a_stream, uint32_t &a) {
    a = a_stream.read();
}

void wrap(uint32_t a, uint32_t b, uint32_t &c) {
#pragma HLS DATAFLOW
    hls::stream<uint32_t> c_s;
    hls::stream<uint32_t> a_s;
    hls::stream<uint32_t> b_s;
    scalar_to_stream(a, a_s);
    scalar_to_stream(b, b_s);
    add_stream(a_s, b_s, c_s);
    stream_to_scalar(c_s, c);
}

void top_module(uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *c_stream) {
#pragma HLS INTERFACE mode=m_axi port=a depth=1024 bundle=first
#pragma HLS INTERFACE mode=m_axi port=b depth=1024 bundle=second
#pragma HLS INTERFACE mode=m_axi port=c depth=1024 bundle=first
#pragma HLS INTERFACE mode=m_axi port=c_stream depth=1024 bundle=second
#pragma HLS INTERFACE mode=s_axilite port=return
    main_loop_pipeline: for (uint32_t i = 0; i < 1024; ++i) {
        uint32_t c_o;
        uint32_t const a_t = a[i];
        uint32_t const b_t = b[i];
        add(a_t, b_t, c_o);
        c[i] = c_o;
    }
    main_loop_stream: for (uint32_t i = 0; i < 1024; ++i) {
        wrap(a[i], b[i], c_stream[i]);
    }
}
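Before opening Vitis it can be handy to sanity-check the streaming part of the model with a plain C++ compiler. The sketch below is not part of the original project: sim_stream, add_stream_model and wrap_model are hypothetical names, and sim_stream is only a minimal stand-in for hls::stream (the blocking FIFO is approximated by a std::queue that asserts on a read from an empty queue).

```cpp
#include <cassert>
#include <cstdint>
#include <queue>

// Hypothetical stand-in for hls::stream, so the model builds without Vitis headers.
template <typename T>
class sim_stream {
public:
    void write(const T &value) { q_.push(value); }

    T read() {
        // hls::stream would block here; the stand-in simply refuses an empty read.
        assert(!q_.empty() && "read from empty stream");
        T value = q_.front();
        q_.pop();
        return value;
    }

    bool empty() const { return q_.empty(); }

private:
    std::queue<T> q_;
};

// Same body as add_stream in add.cpp, but on the stand-in stream type.
void add_stream_model(sim_stream<uint32_t> &a,
                      sim_stream<uint32_t> &b,
                      sim_stream<uint32_t> &c) {
    c.write(a.read() + b.read());
}

// Mirrors wrap(): scalar -> stream -> add -> stream -> scalar.
uint32_t wrap_model(uint32_t a, uint32_t b) {
    sim_stream<uint32_t> a_s, b_s, c_s;
    a_s.write(a);
    b_s.write(b);
    add_stream_model(a_s, b_s, c_s);
    return c_s.read();
}
```

With this stand-in, wrap_model(1, 2) yields 3, and the sum wraps modulo 2^32 just like the uint32_t addition in add.cpp.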
2. Create config file for Vitis.
Vitis needs a config file to build the project. A basic config file should contain:
- part - FPGA part number.
- syn.top - top function name.
- tb.file - test bench file.
- syn.file - source file that is used in HLS.
In this example, a minimal version of the cfg file could look like this:
part=xc7z007sclg225-1
[hls]
syn.top=top_module
tb.file=main.cpp
syn.file=add.cpp
syn.file=add.hpp
package.output.format=ip_catalog
flow_target=vivado
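As a quick pre-flight check, the four basic keys listed above can be verified before the file is handed to Vitis. This is only a sketch: missing_cfg_keys and cfg_check_example are hypothetical helpers, not Vitis functionality, and the parser assumes plain key=value lines without surrounding whitespace, as in the listing above.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the basic keys that do not appear in the cfg text.
std::vector<std::string> missing_cfg_keys(const std::string &cfg_text) {
    const std::vector<std::string> required = {"part", "syn.top", "tb.file", "syn.file"};

    // Collect every key that appears on the left of an '=' sign.
    std::set<std::string> seen;
    std::istringstream in(cfg_text);
    std::string line;
    while (std::getline(in, line)) {
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos) continue;  // skips "[hls]" and blank lines
        seen.insert(line.substr(0, eq));
    }

    std::vector<std::string> missing;
    for (const std::string &key : required) {
        if (seen.count(key) == 0) missing.push_back(key);
    }
    return missing;
}

// Usage example on a shortened copy of the cfg from this tutorial.
inline void cfg_check_example() {
    const std::string cfg =
        "part=xc7z007sclg225-1\n"
        "[hls]\n"
        "syn.top=top_module\n"
        "tb.file=main.cpp\n"
        "syn.file=add.cpp\n";
    assert(missing_cfg_keys(cfg).empty());
}
```

The check only looks at key names; it does not validate the part number or whether the listed files exist.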
3. Create and build minimal project.
Start Vitis, select a workspace and click "Create HLS Component".
Change the component location and name, click next.
Select creation from an existing configuration file, click next.
Project structure should look like this:
There is no need to add additional flags; just double check that the top function is "top_module", click next.
Select the part (the default should be the one written in the cfg file), click next.
Confirm flow_target and package.output.format, click next.
Check the summary and click finish.
Lastly, run all the steps to ensure that everything is configured and working properly.
4. Replace C functions with RTL blackboxes.
In this step we will replace our add function with blackbox Verilog code. First, the function in the pipelined region:
Right click hls_component and click "Create RTL blackbox". This generates a JSON file that describes the connection between the Verilog module and its C function.
Select the file that contains the C description of the module.
Select the port directions and fill in the RTL group configuration (port names in the Verilog module).
Select the Verilog file, fill in the other boxes if necessary, click next.
Remove the ap_ctrl_chain_protocol strings and leave them blank. Click finish.
Repeat all these steps for the add_stream function.
Input fifo:
Output fifo:
Summary:
Do not modify the ap_ctrl_chain signals, as this module will use the ap_ctrl_chain protocol.
After this, two JSON files should be generated in our hls_component folder.
add.json
{
  "c_files": [
    {
      "c_file": "add.cpp",
      "cflag": ""
    }
  ],
  "c_function_name": "add",
  "rtl_files": [
    "add.v"
  ],
  "c_parameters": [
    {
      "c_name": "a",
      "c_port_direction": "in",
      "rtl_ports": {
        "data_read_in": "a"
      }
    },
    {
      "c_name": "b",
      "c_port_direction": "in",
      "rtl_ports": {
        "data_read_in": "b"
      }
    },
    {
      "c_name": "c",
      "c_port_direction": "out",
      "rtl_ports": {
        "data_write_out": "c",
        "data_write_valid": "c_vld"
      }
    }
  ],
  "rtl_top_module_name": "add",
  "rtl_performance": {
    "II": "0",
    "latency": "0"
  },
  "rtl_resource_usage": {
    "BRAM": "0",
    "DSP": "0",
    "FF": "0",
    "LUT": "0",
    "URAM": "0"
  },
  "rtl_common_signal": {
    "module_clock": "ap_clk",
    "module_reset": "ap_rst",
    "module_clock_enable": "ap_ce",
    "ap_ctrl_chain_protocol_idle": "",
    "ap_ctrl_chain_protocol_start": "",
    "ap_ctrl_chain_protocol_ready": "",
    "ap_ctrl_chain_protocol_done": "",
    "ap_ctrl_chain_protocol_continue": ""
  }
}
add_stream.json
{
  "c_files": [
    {
      "c_file": "add.cpp",
      "cflag": ""
    }
  ],
  "c_function_name": "add_stream",
  "rtl_files": [
    "add_stream.v"
  ],
  "c_parameters": [
    {
      "c_name": "a",
      "c_port_direction": "in",
      "rtl_ports": {
        "FIFO_empty_flag": "a_empty_flag",
        "FIFO_read_enable": "a_read_enable",
        "FIFO_data_read_in": "a"
      }
    },
    {
      "c_name": "b",
      "c_port_direction": "in",
      "rtl_ports": {
        "FIFO_empty_flag": "b_empty_flag",
        "FIFO_read_enable": "b_read_enable",
        "FIFO_data_read_in": "b"
      }
    },
    {
      "c_name": "c",
      "c_port_direction": "out",
      "rtl_ports": {
        "FIFO_full_flag": "c_full_flag",
        "FIFO_write_enable": "c_write_enable",
        "FIFO_data_write_out": "c"
      }
    }
  ],
  "rtl_top_module_name": "add_stream",
  "rtl_performance": {
    "II": "0",
    "latency": "0"
  },
  "rtl_resource_usage": {
    "BRAM": "0",
    "DSP": "0",
    "FF": "0",
    "LUT": "0",
    "URAM": "0"
  },
  "rtl_common_signal": {
    "module_clock": "ap_clk",
    "module_reset": "ap_rst",
    "module_clock_enable": "ap_ce",
    "ap_ctrl_chain_protocol_idle": "ap_idle",
    "ap_ctrl_chain_protocol_start": "ap_start",
    "ap_ctrl_chain_protocol_ready": "ap_ready",
    "ap_ctrl_chain_protocol_done": "ap_done",
    "ap_ctrl_chain_protocol_continue": "ap_continue"
  }
}
Main folder should look similar to this:
The hls_config.cfg file should now contain two new syn.blackbox.file lines:
part=xc7z007sclg225-1
[hls]
flow_target=vivado
csim.code_analyzer=0
syn.top=top_module
syn.blackbox.file=add.json
syn.blackbox.file=add_stream.json
tb.file=main.cpp
syn.file=add.cpp
syn.file=add.hpp
5. Create Verilog blackbox function.
Function "add" must have wire interfaces and use ap_ctrl_none as the block-level protocol. (Please check https://docs.amd.com/r/en-US/ug1399-vitis-hls/JSON-File-for-RTL-Blackbox for more information about module interfaces.)
According to UG1399, ports a and b are 32-bit inputs, and port c is a 32-bit output with an extra valid signal; let's call it c_vld. The module also needs ap_clk, ap_ce and ap_rst ports.
The Verilog equivalent could look like this:
add.v
`timescale 1ns/1ps
module add (
    input [31:0] a,
    input [31:0] b,
    output [31:0] c,
    output c_vld,
    input ap_ce,
    input ap_rst,
    input ap_clk
);
    reg [31:0] c_d;
    reg c_vld_d;

    assign c = c_d;
    assign c_vld = c_vld_d;

    always @(posedge ap_clk) begin
        if (ap_rst == 1'b1) begin
            c_d <= 32'b0;
            c_vld_d <= 1'b0;
        end else begin
            c_d <= (a + b) & {32{ap_ce}};
            c_vld_d <= ap_ce;
        end
    end
endmodule
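To reason about what this module does cycle by cycle, its two registers can be mirrored in a few lines of C++. This is only an illustrative sketch: AddRtlModel, clock and add_rtl_model_walkthrough are hypothetical names, and one call to clock() stands for one rising edge of ap_clk. Note that the sum becomes visible only after the edge on which a and b were sampled, which is worth comparing against the latency declared in add.json.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical software mirror of add.v; it is not generated by Vitis.
struct AddRtlModel {
    uint32_t c = 0;      // mirrors register c_d
    bool c_vld = false;  // mirrors register c_vld_d

    // One call corresponds to one rising edge of ap_clk.
    void clock(bool ap_rst, bool ap_ce, uint32_t a, uint32_t b) {
        if (ap_rst) {
            c = 0;
            c_vld = false;
        } else {
            c = ap_ce ? (a + b) : 0u;  // same effect as (a + b) & {32{ap_ce}}
            c_vld = ap_ce;
        }
    }
};

// Walk-through: reset, one enabled cycle, one stalled cycle.
inline void add_rtl_model_walkthrough() {
    AddRtlModel m;
    m.clock(true, false, 0, 0);   // reset clears both registers
    assert(m.c == 0 && !m.c_vld);
    m.clock(false, true, 3, 4);   // inputs sampled on this edge, sum visible afterwards
    assert(m.c == 7 && m.c_vld);
    m.clock(false, false, 5, 6);  // ap_ce low: output forced to 0, valid dropped
    assert(m.c == 0 && !m.c_vld);
}
```

The last step shows that a low ap_ce does not hold the previous result; it clears it, which is one of the things to look for in the waveform.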
Run C synthesis and C/RTL cosimulation. You should be able to see the add.v file packaged into the HLS module.
Open the hls_config.cfg file, use the Vitis GUI to set cosim.trace_level to all, and run cosimulation.
Click on the wave viewer. Vivado should pop up with XSIM.
Add the grp_add_fu_134 signals to the wcfg.
The function behavior is weird... Let's change the blackbox function II in the JSON and see how it impacts the simulation. Open add.json and change II to 10. Run C synthesis again and rerun C/RTL cosimulation.
Is add.v a good and working module? Is it behaving correctly? What dictates whether a module is working correctly? How does "fixing" the module impact resource usage?
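One way to build intuition for the II experiment: for a pipelined loop, a common first-order estimate of the cycle count is iteration latency + II * (trip count - 1). The sketch below only evaluates that textbook formula. pipelined_loop_cycles is a hypothetical helper, the iteration latency of 2 is an arbitrary illustrative value, and the numbers Vitis reports for this design will differ because of the m_axi accesses and the scheduler's own decisions.

```cpp
#include <cassert>
#include <cstdint>

// First-order estimate: the first iteration takes iteration_latency cycles,
// every further iteration starts ii cycles after the previous one.
constexpr uint64_t pipelined_loop_cycles(uint64_t trip_count,
                                         uint64_t ii,
                                         uint64_t iteration_latency) {
    return trip_count == 0 ? 0 : iteration_latency + ii * (trip_count - 1);
}

// The 1024-iteration loop from top_module with two different II values.
inline void ii_estimate_example() {
    assert(pipelined_loop_cycles(1024, 1, 2) == 1025);    // 2 + 1 * 1023
    assert(pipelined_loop_cycles(1024, 10, 2) == 10232);  // 2 + 10 * 1023
}
```

In this estimate, raising II from 1 to 10 makes the loop roughly ten times longer, so a change of that order in the cosimulation waveform would be consistent with the scheduler honouring the value from the JSON file.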
What about add_stream? This function is placed in a dataflow region, so it must have FIFO ports and use the ap_ctrl_chain protocol.
add_stream.v
`timescale 1ns/1ps
module add_stream (
    input [31:0] a,
    input a_empty_flag,
    output a_read_enable,
    input [31:0] b,
    input b_empty_flag,
    output b_read_enable,
    output [31:0] c,
    input c_full_flag,
    output c_write_enable,
    output ap_idle,
    input ap_start,
    output ap_ready,
    output ap_done,
    input ap_continue,
    input ap_ce,
    input ap_rst,
    input ap_clk
);
    reg a_read_enable_d;
    reg b_read_enable_d;
    reg c_write_enable_d;
    reg [31:0] c_d;
    // Declared explicitly instead of relying on implicit nets.
    wire flags_good;
    wire hs_good;

    assign a_read_enable = a_read_enable_d;
    assign b_read_enable = b_read_enable_d;
    assign c_write_enable = c_write_enable_d;
    assign c = c_d;
    assign ap_idle = !ap_start;
    assign ap_ready = ap_start;
    assign ap_done = ap_start;
    // Flags are negated: 1 means "input not empty" / "output not full".
    assign flags_good = a_empty_flag && b_empty_flag && c_full_flag;
    assign hs_good = ap_start && ap_continue;

    always @(posedge ap_clk) begin
        if (ap_rst == 1'b1) begin
            a_read_enable_d <= 0;
            b_read_enable_d <= 0;
            c_write_enable_d <= 0;
            c_d <= 0;
        end else if (ap_ce == 1'b1) begin
            a_read_enable_d <= flags_good && hs_good;
            b_read_enable_d <= flags_good && hs_good;
            c_write_enable_d <= flags_good && hs_good;
            c_d <= a + b;
        end
    end
endmodule
It seems that the module placed in the dataflow region is working:
Open add_stream.json and change latency to 10. Run C synthesis again and rerun C/RTL cosimulation. Does it affect simulation in any way?
I hope that this tutorial will be helpful to you. Try to create your own modules and functions, and have fun!