Lab: Pynq Memory Mapped IO (s_axilite)
This lab will guide you through the basics of using Pynq to develop an application on the Zynq / Ultrascale SoC. The application performs a simple hardware-accelerated function on the programmable logic. We first create the IP core that performs the function \(f(x) = 2x\) using high-level synthesis. We synthesize it to the programmable logic using the Vivado tools. Using the PYNQ infrastructure, we talk to the IP core from the ARM processor using memory-mapped I/O. We develop a Pynq notebook that sends data to the IP core, executes the core, and receives the computed results.
To simplify the steps and increase reproducibility, we will replace most GUI operations with command line scripts (so you can simply run one command instead of clicking 100 buttons).
You do have the option to use the GUI of Vitis Unified IDE. We’ll cover it here.
If you’re curious about the GUI-based classic Vitis HLS, check here for the legacy version of this lab, which uses the GUI with Xilinx Vivado and Vitis HLS 2022.2.
0) Vivado Design suite installation
Check here for installation and UCSD license server guide, though you probably would not need the license server for the projects in this course.
We provide a docker for you in case your machine is not supported by AMD FPGA tools. Check here for the docker image and instructions.
If you wish to download the software yourself, please refer to the official AMD guide here. For export control compliance, you will need to create an account if you do not already have one.
Both Vivado and Vitis are needed for the class. You should select Vitis as the product to install, which includes the Vivado design suite. AMD provides an installation guide here.
UCSD students
Campus provides UCSD Linux cloud to run AMD tools. Log in using your AD username and password here. You can select a machine under ieng6 Linux Mint Remote Desktops. One server corresponds to a physical machine.
To solve some current package issues on ieng6 machines, run these commands (you only need to run this once, no need to run this every time you log in):
mkdir -p ~/xlnx_compat_fix
ln -s /usr/lib/x86_64-linux-gnu/libtinfo.so.6 ~/xlnx_compat_fix/libtinfo.so.5
Add the following lines to your ~/.bashrc file. They load the AMD tools, add a library path that works around the package issues above, and beautify your terminal prompt.
force_color_prompt=yes
PS1='\e[33;1m\u@\h: \e[31m\W\e[0m\$ '
module load xilinx-vitis
export LD_LIBRARY_PATH=~/xlnx_compat_fix/:$LD_LIBRARY_PATH
Remember to source ~/.bashrc for the changes to take effect.
1) Vitis HLS: C/C++ to RTL
Check the HLS source code here. This contains:
mul.cpp - Implements top-level function
mul.h - header file
mul_test.cpp - test bench
__hls_config__.ini - Record important HLS project settings including target clock period and board part. Also specify the name of the top function.
Makefile - Makefile to run the HLS tool from the command line.
Project 1 already showed you how to run a HLS project. This lab will focus on its integration with other components to form a proper FPGA system.
The circuit generated by your HLS design will not work on its own. Data has to be transferred in and out of the HLS block. Therefore, the interfaces of your block (i.e., the arguments of the top-level C++ function in mul.cpp) must follow some protocol. In this lab, we use a very simple on-chip communication protocol, AXI-Lite. Note lines 19-21 in mul.cpp:
#pragma HLS INTERFACE mode=s_axilite port=return
#pragma HLS INTERFACE mode=s_axilite port=in
#pragma HLS INTERFACE mode=s_axilite port=out
The code is already functional and synthesizable. You should be able to run C simulation and synthesis just like in project 1. Simply do:
make report
Our next step is to export our design as an IP core, which can be imported into Vivado later. You can run this command:
make ip
The IP core is a .zip file located at:
mul.comp/hls/impl/ip/xilinx_com_hls_mul_1_0.zip
At this point, you can exit and close Vitis HLS.
2) Vivado: RTL to bitstream
In this section, you will import your IP core to Vivado, build the system, and generate the bitstream.
2.1) Create a new project
Open Vivado and create a new project. If you are using Linux, it is recommended to launch Vivado from the same directory as the source files.
Select RTL Project and check Do not specify sources at this time
Set default part to xc7z020clg400-1
Under IP Integrator, click on Create Block Design
2.2) Import your design
Under Project Manager, click on IP Catalog. Right click inside the newly opened ‘IP Catalog’ tab and select Add Repository. In the window that opens, navigate to your Vitis HLS project folder and select <path_to_vivado_hls_folder>/hls/impl/ip/
You can see Mul under IP Catalog.
Click Open block design, then click +, add Mul_test IP block into our block design.
The IP block will appear in the block diagram:
Note that there are no longer wires called “in” or “out”. Instead, there is a bus port named s_axi_control, and in and out have become addressable registers, because we set the interface through HLS pragmas. This AXI-Lite bus includes all the handshaking signals and the actual data channels. You can expand the bus to see all the ports.
2.3) Add connections
In the same window, search for “zynq” and add ZYNQ7 Processing System to your block design.
Your diagram should look like the following:
At the top of the Diagram window, first click and complete Run Block Automation, and then Run Connection Automation, both with default settings. Your diagram should change to show the connections and a couple of extra IPs:
2.4) Generate bitstream
In Sources, right click on design_1 and select Create HDL Wrapper
Under Program and Debug, click on Generate Bitstream and follow instructions to complete synthesis, implementation and bitstream generation.
2.5) Bitstream, .hwh, and addresses
Before closing Vivado, we need to note the addresses of our IP and its ports.
Under Sources, open mul_test_control_s_axi.v (the exact name may vary across different versions of Vivado), scroll down and note addresses for in and out ports. We need these addresses for our host program.
In the example below for the streamMul, the addresses to pay attention to are 0x00 (control bus ap_ctrl), 0x10 (output), and 0x20 (input). These are the addresses you will need to use to write data to the fabric from the ARM core, start the fabric to run your design and generate your outputs, and then read your outputs from the fabric into the ARM core on the Pynq board.
The addresses above are within our IP. However, for the CPU to interact with our IP, it will also need the base address of our IP. You can find it under Address Editor.
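The relationship between the two kinds of address can be sketched in a few lines of Python. This is only an illustration: the offsets are the ones quoted above, the base address 0x40000000 is the value used in this lab's notebook, and the helper name `absolute_address` is made up for this example. Check all of these values against your own design.

```python
# Register offsets inside the IP, as quoted in this lab.
# Verify them against your own *_control_s_axi.v file.
AP_CTRL_OFFSET = 0x00  # control register (ap_start, ap_done, ...)
OUT_OFFSET = 0x10      # output register
IN_OFFSET = 0x20       # input register

# Base address of the IP as listed in the Address Editor.
# 0x40000000 is the value used in this lab's notebook; yours may differ.
IP_BASE_ADDRESS = 0x4000_0000


def absolute_address(base: int, offset: int) -> int:
    """Return the address the CPU uses to reach one register of the IP."""
    return base + offset


for name, offset in [("ap_ctrl", AP_CTRL_OFFSET), ("out", OUT_OFFSET), ("in", IN_OFFSET)]:
    addr = absolute_address(IP_BASE_ADDRESS, offset)
    print(f"{name:8s} offset 0x{offset:02X} -> 0x{addr:08X}")
```

Note that the PYNQ MMIO object used later takes the base address once, at construction, and its read and write calls then take only the offset, so the addition above happens inside the library.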
The next step will be to run the design on an FPGA board. The software needs two files:
Bitstream file (.bit) - to be flashed to the FPGA
Hardware handoff file (.hwh) - to be read by the Pynq software to understand the architecture of the hardware design.
3) PYNQ board and Host program
Download an appropriate image for your board from here and write it to your micro SD card (PYNQ-Z2 setup instructions). If the board does not boot properly, use dedicated software such as Win32 Disk Imager to burn the .img file to the SD card instead of a simple copy-paste operation. If you set the jumpers correctly, the boot process should not take more than a few minutes.
Use the Ethernet cable to connect the board to your machine, and set a static IP address as stated in the PYNQ-Z2 tutorial. Optional: connect the JTAG port on the board to your machine using a Micro USB cable, and use serial communication software (like PuTTY or Serial Port Utility) to access the command line tools (the picture below is for demo only, you don’t need to run those commands). This is especially useful for fixing Linux-related issues on the board.
You can access the Jupyter notebook server through the board's IPv4 address in a web browser. Create a new folder and notebook. Upload design_1_wrapper.bit from vivado_project_path/project_name.runs/impl_1 and design_1.hwh from vivado_project_path/project_name.gen/sources_1/bd/design_1/hw_handoff to the folder you just created in Jupyter.
Make sure the .bit file and the .hwh file have the same name. In this case, we name them “design_1_wrapper.bit” and “design_1_wrapper.hwh”.
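If you prefer to prepare the two files on your development machine before uploading, the renaming step can be scripted. The following is a minimal sketch using only the Python standard library. The function name `stage_overlay_files` is made up for this example, and the paths you pass in are your own.

```python
import shutil
from pathlib import Path


def stage_overlay_files(bit_file, hwh_file, dest_dir):
    """Copy a .bit/.hwh pair into dest_dir, giving the .hwh the .bit file's base name."""
    bit_file, hwh_file, dest_dir = Path(bit_file), Path(hwh_file), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    bit_dest = dest_dir / bit_file.name
    # Reuse the bitstream's stem so both files share one name,
    # e.g. design_1_wrapper.bit and design_1_wrapper.hwh.
    hwh_dest = dest_dir / (bit_file.stem + ".hwh")

    shutil.copyfile(bit_file, bit_dest)
    shutil.copyfile(hwh_file, hwh_dest)
    return bit_dest, hwh_dest
```

For example, calling `stage_overlay_files("design_1_wrapper.bit", "design_1.hwh", "overlay")` would leave design_1_wrapper.bit and design_1_wrapper.hwh side by side in an overlay folder, ready to upload.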
In the notebook, run the following code to test your IP
from pynq import Overlay
from pynq import MMIO
ol = Overlay("./design_1_wrapper.bit") # designate a bitstream to be flashed to the FPGA
ol.download() # flash the FPGA
mul_ip = MMIO(0x40000000, 0x10000) # (IP_BASE_ADDRESS, ADDRESS_RANGE), told to us in Vivado
inp = 5 # number we want to double
mul_ip.write(0x20, inp) # write input value to input address in fabric
print("input:", mul_ip.read(0x20)) # confirm that our value was written correctly to the fabric
mul_ip.write(0x00, 1) # set ap_start to 1 which initiates the process we wrote to the fabric
print("output:", mul_ip.read(0x10)) # read corresponding output value from the output address of the fabric
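The notebook code above reads the output immediately after setting ap_start, which works here because the core finishes almost instantly. For slower cores, a more defensive pattern is to poll the ap_done bit before reading. The sketch below assumes the control register layout that Vitis HLS typically generates (bit 0 is ap_start, bit 1 is ap_done, and ap_done is cleared when read); confirm this against the comments in your own *_control_s_axi.v file. `run_mul` and `FakeMulDevice` are names made up for this example, and `FakeMulDevice` is a simplified software stand-in so that the sketch also runs without a board.

```python
import time

AP_CTRL, OUT_REG, IN_REG = 0x00, 0x10, 0x20  # offsets quoted in this lab
AP_START, AP_DONE = 0x1, 0x2                 # assumed: bits 0 and 1 of the control register


def run_mul(mmio, value, timeout_s=1.0):
    """Write the input, start the core, wait for ap_done, then return the output."""
    mmio.write(IN_REG, value)
    mmio.write(AP_CTRL, AP_START)
    deadline = time.monotonic() + timeout_s
    while not (mmio.read(AP_CTRL) & AP_DONE):
        if time.monotonic() > deadline:
            raise TimeoutError("IP core did not report ap_done")
    return mmio.read(OUT_REG)


class FakeMulDevice:
    """Hypothetical stand-in for the board: same read/write shape as an MMIO object."""

    def __init__(self):
        self.regs = {AP_CTRL: 0, OUT_REG: 0, IN_REG: 0}

    def write(self, offset, data):
        self.regs[offset] = data
        if offset == AP_CTRL and data & AP_START:
            self.regs[OUT_REG] = 2 * self.regs[IN_REG]  # models f(x) = 2x
            self.regs[AP_CTRL] = AP_DONE                # start bit drops, done bit is set

    def read(self, offset):
        value = self.regs[offset]
        if offset == AP_CTRL:
            self.regs[AP_CTRL] &= ~AP_DONE              # models clear-on-read ap_done
        return value


print("output:", run_mul(FakeMulDevice(), 5))
```

On the board, you would pass the real object instead, for example `run_mul(mul_ip, 5)` with `mul_ip` created as in the notebook code above.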
4) Kria board
If you are working with a Kria board, there are several necessary changes you have to make. Instead of selecting xc7z020clg400-1 as the part or pynq-z2 as the board, select xck26-sfvc784-2LV-c as the part or Kria KV260 Vision AI Starter Kit SOM as the board. This is necessary for both Vitis HLS and Vivado, because the EDA tools must know what hardware they are targeting. The target must also match the board you eventually use: if the bitstream and .hwh generated by Vivado for one board are used on a different kind of board, the Pynq software will have trouble recognizing them.
In Vivado, the steps for adding IPs are the same, but Kria has a different PS (processing system, the ARM core on board) from the PYNQ-Z2. In the “Add IP” window, select Zynq Ultrascale+ MPSoC instead of ZYNQ7 Processing System.
Then follow the green designer assistance and let the tool do “block automation” and “connection automation”. You probably have to run connection automation twice. Your block diagram should look like this (from project 2: CORDIC):
Note that there should be no ports named x, y, r, or theta, as they all become addresses on the s_axi_control bus.
You should also be able to find the module named control_s_axi_U in the file tree and locate the addresses as described in section 2.5.
Setting up a Kria board for pynq is different, and a bit more complex. Please refer to the following resources: Pynq-supporting boards (find KV260) , Basic steps, Kria pynq repo
Basics of FPGA & PS-PL interaction
At the architecture level, an FPGA SoC such as the Zynq is divided into two domains: PS and PL.
PS, or processing system, is an Arm core in charge of controlling everything: managing memory, generating clocks, etc. Consider this the CPU. The big IP block in your diagram whose name starts with “Zynq” is the PS.
PL, or programmable logic, is basically everything else. The most important part is the IP you just designed in Vitis HLS: efficient hardware dedicated to some task, usually referred to as the “accelerator”. The rest are auxiliary modules that are typically managed automatically by the tools.
The accelerator cannot access data directly. The PS has to move the data between the memory and your accelerator. Thus the accelerator and the PS must be connected by some on-chip bus protocol. The simplest protocol is AXI-Lite. If you wish to put an accelerator on an FPGA, you must specify its port types during the design phase in Vitis HLS. See section 1 for the pragmas.