Lab: AXI4-Burst Mode (m_axi)

Simple example of AXI4-Burst Mode

This lab is an example of AXI4 data transfer in burst mode. The design reads an array of values from memory, computes the square root of each value, and writes the results back.

1) Vitis HLS: Generating RTL code from C/C++ code

In this section you learn how to create a project in Vitis HLS, synthesize your code, and generate RTL.

1.1) Download code and create a Vitis HLS project

This is the code we will be using:

#include "axi4_sqrt.hpp"
#include <string.h>
#include <math.h>

void axi4_sqrt(float *in, float *out, int len)
{
#pragma HLS INTERFACE s_axilite port=return bundle=sqrt
#pragma HLS INTERFACE s_axilite port=len bundle=sqrt
#pragma HLS INTERFACE m_axi depth=50 port=out offset=slave bundle=output
#pragma HLS INTERFACE m_axi depth=50 port=in offset=slave bundle=input
#pragma HLS INTERFACE s_axilite port=in bundle=sqrt
#pragma HLS INTERFACE s_axilite port=out bundle=sqrt

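        // On-chip buffer; len must not exceed 100 elements
        float buff[100];
        // Burst read: copy len floats from external memory into the buffer
        memcpy(buff, (const float*) in, len * sizeof(float));

        // Replace each element with its square root
        for(int i = 0; i < len; i++)
                buff[i] = sqrt(buff[i]);

        // Burst write: copy the results back to external memory
        memcpy(out, (const float*) buff, len * sizeof(float));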
}

Note that we had to include string.h to be able to use memcpy. We use memcpy instead of a for-loop (as used in AXI-streaming) to force Vitis HLS to infer an AXI4 burst. Vitis HLS does not always infer a burst from a for-loop, but it does for memcpy.

Also note that we had to set the bundles of the s_axilite ports for in and out to the same bundle as the return port. This is required because Vitis HLS will otherwise place the address registers of the m_axi ports in a separate interface (instead of a single interface for all variables, which is easier to handle in Jupyter). Please see this Xilinx docs page for more information.
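To make the data flow of the kernel concrete, here is a minimal software model in Python. It is only a sketch: the names `axi4_sqrt_model`, `to_float32` and `BUFF_WORDS` are made up for illustration, and the length check is an addition of this sketch (the C++ code above does not check len). Results are rounded to single precision to approximate the float type used by the kernel.

```python
import math
import struct

BUFF_WORDS = 100  # size of the on-chip buffer `buff` in the C++ listing


def to_float32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def axi4_sqrt_model(values):
    """Software model of the kernel: read all, take sqrt, write all."""
    if len(values) > BUFF_WORDS:
        # Guard added in this sketch only; the kernel itself has no check.
        raise ValueError("len exceeds the 100-element buffer")
    buff = [to_float32(v) for v in values]           # memcpy(buff, in, ...)
    buff = [to_float32(math.sqrt(v)) for v in buff]  # sqrt loop
    return list(buff)                                # memcpy(out, buff, ...)


print(axi4_sqrt_model([0.0, 1.0, 4.0, 9.0]))
```

A model like this can serve as a reference when checking the hardware results later in the lab.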

Download and unzip axi4_burst.zip, which contains the above code. The .cpp file in the zip file does not actually match the above code, so please replace its contents with the above code. Generate your project using the provided script.tcl file:

Linux: open a terminal, make sure your environment is set, navigate to the unzipped folder that contains script.tcl, and run the following

$ vitis_hls -f script.tcl

Windows: open the Vitis HLS command prompt, navigate to the same folder, and run the following

$ vitis_hls -f script.tcl

Now you can open your project in Vitis HLS. It should look like this:

https://i.imgur.com/iHkVmWE.png

1.2) Generate RTL code and export it

Before running C Synthesis, in Vitis HLS, go to Solution > Solution Settings, then under the “General” tab, in config_interface, de-select the m_axi_addr64 option (see the screenshot below, specifically the row with the red cross). This is required because 64-bit addresses are used by default when m_axi is used in the pragmas, and in Jupyter with PYNQ it is not clear how to write values to 64-bit addresses. De-selecting this option ensures that 32-bit addresses are used.
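To illustrate why the address width matters on the host side, here is a small sketch. With 32-bit addressing each pointer fits in a single 32-bit register write; a 64-bit pointer would have to be split into a low and a high word. The helper names `split_addr64` and `fits_in_32_bits` are hypothetical, and where the two words would sit in the register map is not assumed here, since that depends on the generated interface.

```python
def split_addr64(addr):
    """Split a 64-bit address into (low, high) 32-bit words."""
    if not 0 <= addr < (1 << 64):
        raise ValueError("address does not fit in 64 bits")
    return addr & 0xFFFFFFFF, addr >> 32


def fits_in_32_bits(addr):
    """True if the address can be written with one 32-bit register write."""
    return 0 <= addr < (1 << 32)


# Example address chosen for illustration only
lo, hi = split_addr64(0x0000_0001_2345_6000)
print(hex(lo), hex(hi))  # prints: 0x23456000 0x1
print(fits_in_32_bits(0x1600_0000))  # prints: True
```

On the Zynq-7000 part used in this lab, buffer addresses fit in 32 bits, so the single-write case is the one that applies.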

https://i.imgur.com/o6gpaSv.png

Click on Run C Synthesis to generate RTL code. After it is done, you can check your resource utilization and timing report. The latency is reported as unknown (?) because the loop bound (len) is a run-time variable.

https://github.com/KastnerRG/pp4fpgas/raw/master/labs/images/pynq19.png

Note: you can also check the addresses of variables (that you need to write to in Jupyter) in the Synthesis summary under “S_AXILITE Registers”.

Now you can export your RTL code by clicking on Export RTL:

https://bitbucket.org/repo/x8q9Ed8/images/582121524-pynq3.png

After exporting is done, you can close and exit from Vitis HLS.

2) Vivado: Generating bitstream from RTL code

In this section we import our RTL code from the previous section, add some required IPs, and generate our bitstream.

2.1) Create a new Vivado project

Open your Vivado tool and create a new project. Select an appropriate location for your project and leave the default project name as is (project_1).

Select RTL Project and check Do not specify sources at this time.

Select xc7z020clg400-1 for your part:

https://bitbucket.org/repo/x8q9Ed8/images/3090594305-pynq4.png

2.2) Import RTL code

Under Flow Navigator, click on IP Catalog. Right-click on the opened window and select Add Repository. Navigate to your Vitis HLS project > solution1 > impl > ip and select it:

https://github.com/KastnerRG/pp4fpgas/raw/master/labs/images/pynq20.png

2.3) Add IPs to your design

Under Flow Navigator, click on Create Block Design. Leave the design name as is (design_1). In the newly opened window, you can add IPs by clicking on the plus sign.

Add ZYNQ7 Processing System to your design:

https://bitbucket.org/repo/x8q9Ed8/images/3814633603-pynq6.png

Double click on the ZYNQ7 IP to customize it. In the opened window, double click on High Performance AXI 32b/64b Slave Ports:

https://bitbucket.org/repo/x8q9Ed8/images/148617913-pynq7.png

Select and check S AXI HP0 interface:

https://github.com/KastnerRG/pp4fpgas/raw/master/labs/images/pynq21.png

Add the axi4_sqrt IP to the design.

https://github.com/KastnerRG/pp4fpgas/raw/master/labs/images/pynq22.png

2.4) Automatic connections

  1. Click on Run Block Automation

  2. Click on Run Connection Automation and select all. Click OK.

https://github.com/KastnerRG/pp4fpgas/raw/master/labs/images/pynq23.png

  3. Click on Run Connection Automation again and select all. Click OK.

https://github.com/KastnerRG/pp4fpgas/raw/master/labs/images/pynq24.png

This is how the final design should look. Note: Vivado 2023 inserts an AXI Interconnect block instead of AXI SmartConnect; the design works either way.

https://github.com/KastnerRG/pp4fpgas/raw/master/labs/images/pynq25.png

2.5) Generate bitstream

  1. Save your design (CTRL+S or File > Save Block Design).

  2. Validate your design: Tools > Validate Design.

  3. In Sources, right click on design_1, and Create HDL Wrapper. Now you should have design_1_wrapper.

  4. Generate bitstream by clicking on Generate Bitstream in Program and Debug.

2.6) Post-bitstream generation

In Sources, expand design_1_wrapper::design_1_i::design_1::axi4_sqrt_0::design_1_axi4_sqrt_0_0::inst : axi4_sqrt, double click on axi4_sqrt_sqrt_s_axi_U, and note the addresses for in_r, out_r, and len: 0x10, 0x18, and 0x20 respectively. We need these addresses in our host program. They can also be found in the Vitis HLS C Synthesis summary.
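It can help to keep these offsets in one place in the host program. The sketch below collects the offsets noted above and decodes the control register at 0x00. The bit layout (ap_start, ap_done, ap_idle, ap_ready in bits 0 to 3) is an assumption based on the default ap_ctrl_hs block protocol; the names `REGS` and `decode_ctrl` are made up for illustration, and the offsets should be confirmed against your own synthesis report.

```python
# Register offsets noted in this lab (confirm against your own report).
REGS = {"ctrl": 0x00, "in_r": 0x10, "out_r": 0x18, "len": 0x20}

# Control-register bits, assuming the default ap_ctrl_hs protocol.
CTRL_BITS = [
    ("ap_start", 0x1),
    ("ap_done", 0x2),
    ("ap_idle", 0x4),
    ("ap_ready", 0x8),
]


def decode_ctrl(value):
    """Return a dict of flag name -> bool for a control-register value."""
    return {name: bool(value & mask) for name, mask in CTRL_BITS}


# An idle core that has not been started reads back with only ap_idle set.
print(decode_ctrl(0x4))
```

In a notebook, the value passed to `decode_ctrl` would come from reading offset `REGS["ctrl"]` of the IP.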

https://github.com/KastnerRG/pp4fpgas/raw/master/labs/images/pynq26.png

You can close and exit the Vivado tool.

Copy your project directory > project_1 > project_1.runs > impl_1 > design_1_wrapper.bit to your project directory > project_1 and rename it to axi4_sqrt.bit

Copy your project directory > project_1 > project_1.gen > sources_1 > bd > design_1 > hw_handoff > design_1.hwh to your project directory > project_1 and rename it to axi4_sqrt.hwh

These files need to have the same name (except for their file extension).
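The two copy-and-rename steps can also be scripted. The following is a minimal sketch that assumes the default names used in this lab (project_1, design_1) and the directory layout described above; the function name `collect_overlay_files` is hypothetical.

```python
import shutil
from pathlib import Path


def collect_overlay_files(vivado_dir, name="axi4_sqrt"):
    """Copy the .bit and .hwh files into vivado_dir under a common name.

    vivado_dir is the Vivado project folder (project_1 in this lab).
    """
    vivado_dir = Path(vivado_dir)
    proj = vivado_dir.name  # e.g. "project_1"
    bit_src = vivado_dir / f"{proj}.runs" / "impl_1" / "design_1_wrapper.bit"
    hwh_src = (vivado_dir / f"{proj}.gen" / "sources_1" / "bd"
               / "design_1" / "hw_handoff" / "design_1.hwh")
    bit_dst = vivado_dir / f"{name}.bit"
    hwh_dst = vivado_dir / f"{name}.hwh"
    shutil.copyfile(bit_src, bit_dst)  # raises FileNotFoundError if missing
    shutil.copyfile(hwh_src, hwh_dst)
    return bit_dst, hwh_dst
```

Calling `collect_overlay_files("path/to/project_1")` would leave axi4_sqrt.bit and axi4_sqrt.hwh next to each other in the project folder, ready to be moved to the board.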

3) Host program

In this section, we use Python to test our design.

3.1) Move your files

Create a new folder on your PYNQ board and move both axi4_sqrt.hwh and axi4_sqrt.bit into it.

3.2) Python code

Create a new Jupyter notebook and run the following code to test your design:

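from pynq import Overlay
from pynq import Xlnk # for Pynq >= 2.7 use `from pynq import allocate` instead
import numpy as np

ol = Overlay('axi4_sqrt.bit')
sqrt_ip = ol.axi4_sqrt_0 # if you can't find the IP, type 'ol.' and hit Tab to see what options are available
length = 40 # must not exceed the 100-element buffer in the kernel

# Physically contiguous buffers that the IP reads and writes over m_axi
# (for Pynq >= 2.7: allocate(shape=(length,), dtype=np.float32))
inpt = Xlnk().cma_array(shape=(length,), dtype=np.float32)
outpt = Xlnk().cma_array(shape=(length,), dtype=np.float32)
a = [i*i for i in range(length)]
np.copyto(inpt, a)
soft_op = np.sqrt(inpt)

sqrt_ip.write(0x20, length)                  # len
sqrt_ip.write(0x10, inpt.physical_address)   # in_r
sqrt_ip.write(0x18, outpt.physical_address)  # out_r
sqrt_ip.write(0x00, 1)                       # ap_start

# Wait for ap_done (bit 1 of the control register) before reading the results
while not (sqrt_ip.read(0x00) & 0x2):
    pass

print("Hardware Output", "Software Output \n")
for i in range(length):
    print(outpt[i], "\t\t  ", soft_op[i])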
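Instead of comparing the two printed columns by eye, the results can be checked programmatically. The sketch below is one possible way to do this, assuming a small tolerance is acceptable because hardware and software square roots may differ in the last bits. The function name `compare_outputs` and the tolerance values are illustrative choices, and the sample lists are stand-in data rather than results from the board.

```python
import math


def compare_outputs(hw, sw, rel_tol=1e-5, abs_tol=1e-6):
    """Return the indices where hardware and software results disagree."""
    if len(hw) != len(sw):
        raise ValueError("outputs have different lengths")
    return [
        i
        for i, (h, s) in enumerate(zip(hw, sw))
        if not math.isclose(h, s, rel_tol=rel_tol, abs_tol=abs_tol)
    ]


# Stand-in data: the third hardware value is wrong on purpose.
hw = [0.0, 1.0, 0.0, 3.0]
sw = [0.0, 1.0, 2.0, 3.0]
print(compare_outputs(hw, sw))  # prints: [2]
```

In the notebook, `compare_outputs(outpt, soft_op)` returning an empty list would indicate that all values match within the chosen tolerance.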