This is the fourth part of the Peripheral Control With HLS series of posts. You can head to Chapter 3 to start using the Xilinx tools and flash some LEDs or Chapter 5 to get started with some numerical computing on both the PS and PL.
In this chapter, we'll be trying to control the LEDs from PYNQ, much like
what the base overlay allows us to do. We'll also try to make it a little more
convoluted for the sake of flexing our muscles. As in the previous chapter,
all of the source code for this post can be found in the leds subdirectory of
the hls repo.
Flashing LEDs. Again.
HLS
In hls/pynq_01/vitis_hls/src, you'll find a new synthesisable function:
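#include "../include/leds.hpp"

void pynq_leds(
    const int input[4], ap_uint<4>& leds
) {
// No block-level handshake: the core runs continuously
#pragma HLS INTERFACE ap_ctrl_none port=return
// input is exposed to the PS as memory-mapped AXI-Lite registers
#pragma HLS INTERFACE s_axilite port=input
// leds is a plain output with no handshaking signals
#pragma HLS INTERFACE ap_none port=leds
    static ap_uint<4> led_state = 0b0000;
    for (int i=0; i<4; ++i) {
#pragma HLS unroll
        // Bit i of the state is set when the i-th integer is positive
        led_state[i] = input[i] > 0 ? 0b1 : 0b0;
    }
    leds = led_state;
    return;
}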
We have a new argument type; good old int*! The purpose of this function is
to accept some argument from the PS via the PYNQ interface; since we're
going to be writing python code, we need to accept a variable that python
can work with, hence four integers -- one integer for each LED that we're
trying to control. If the integer is larger than zero, then the LED will be
lit up, otherwise it will be turned off. Notice that we're still assigning
leds from led_state.
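To make the mapping from four integers to four LEDs concrete, here is a minimal software model of the loop in Python. It is only a sketch for reasoning about the behaviour on a desktop; the function name led_mask is made up for this illustration and is not part of the HLS source.

```python
def led_mask(values):
    """Return a 4-bit mask where bit i is set when values[i] > 0."""
    if len(values) != 4:
        raise ValueError("expected exactly four integers, one per LED")
    mask = 0
    for i, value in enumerate(values):
        # Same test as the HLS loop: strictly positive means "on"
        if value > 0:
            mask |= 1 << i
    return mask


# LED 0 and LED 3 on, LED 1 and LED 2 off
print(format(led_mask([1, 0, -3, 7]), "04b"))  # 1001
```

Note that zero and negative values both switch an LED off, since the comparison is strictly greater than zero.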
Now we can go ahead and add this block to Vivado as we did in the previous
chapter, although we're not making input external this time. Instead, we're
going to be connecting the input port to the PS. So in Vivado add another
block in the designer window: the "ZYNQ7 Processing System", an interface to the
PS that's instantiated in the PL, and click through the helpful suggestions in
the green bar of the block designer window.
You'll see that a couple of new blocks get added and connected here; the
"Processor System Reset" and the "AXI Interconnect". The former takes reset
signals and logic and outputs reset signals for various different modules.
In our case, we only need a peripheral reset so only that output is
connected to the AXI interconnect block. This AXI interconnect block acts
as a bridge between the PS and the IP core we've instantiated on the PL.
All of the talking between the PS and PL takes place over AXI interfaces;
in this case, input of our synthesised core is an AXILite interface, and
the AXI interconnect block performs the conversion from a generic AXI output
from the PS into the register-based interface that AXILite uses. Anyway,
go ahead and generate the bitstream of the design as we did in the previous
chapter.
We now arrive at the part where we can start playing with PYNQ. When we
generated the bitstream, a bunch of files were created. The ones we're
interested in are the actual bitstream (the .bit file, which can be found
at vivado/vivado.runs/impl_1/design_1_wrapper.bit) and a hardware
handoff file (with extension .hwh, found at
vivado/vivado.gen/sources_1/bd/design_1/hw_handoff1/design_1_wrapper.hwh).
This hardware handoff file is used by PYNQ to identify the Zynq
configuration, interrupts, resets, etc.; effectively, it's a description of the core
we've synthesised for the PS to use.
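PYNQ looks for the hardware handoff file next to the bitstream under the same base name, so the two files are best copied to the board as a matching pair. The sketch below checks for that pairing; the helper names and the file name used in the example are placeholders chosen for illustration.

```python
from pathlib import Path


def handoff_path_for(bitstream):
    """Return the .hwh path expected alongside the given .bit file."""
    return Path(bitstream).with_suffix(".hwh")


def missing_overlay_files(bitstream):
    """List whichever of the .bit/.hwh pair does not exist on disk."""
    bit = Path(bitstream)
    candidates = (bit, handoff_path_for(bit))
    return [str(path) for path in candidates if not path.exists()]


# Placeholder file name; an empty list means both files are present
print(handoff_path_for("pynq_leds_bitstream.bit"))
print(missing_overlay_files("pynq_leds_bitstream.bit"))
```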
PYNQ
Go ahead and boot up the PYNQ-Z2 with the SD card containing the PYNQ image and access the filesystem from your browser.
Now we finally get to play around with Jupyter notebooks and feel like we're
learning how to use python for the first time! We want to load our bitstream
onto the PL, which can be done through the pynq module:
from pynq import Overlay
overlay = Overlay("pynq_leds_bitstream.bit")
pynq_leds_core = overlay.pynq_leds_0
We can append a ? to any object or function that we may wish to invoke to
get the documentation associated with it. First of all, let's
take a look at some information about our pynq_leds_core object:
overlay.ip_dict["pynq_leds_0"]
You'll see a dictionary is returned with properties of the object, much of it not particularly important for us. However, quite a lot of this is understandable without too much background information:
{
    "addr_range": 65536,
    "phys_addr": 1073741824,
    "registers": {
        "Memory_input_r": {
            "address_offset": 16,
            "size": 16,
            "access": "read-write"
        }
    }
}
This details everything we need to know about the memory-mapping of our
core, i.e. those addresses in the PS's address space that the IP core's
registers are mapped to. Note that all of these values are given in bytes.
From address 1073741824, the core has 65536 bytes allocated to
it. Recall that our IP core is very simple with a single input argument:
input. This is conveniently documented by the "Memory_input_r" dictionary;
starting at an offset of 16 from the start of the memory-mapped range, 16
bytes are used for the input argument of our core (looking back at our
HLS, we see that the input argument is an array of four integers, so
16 bytes total as the dictionary entry tells us). Consequently, if we
write some integer greater than zero at the locations 16, 20, 24 and 28
offset from the start of the memory-mapped address range, we should turn
on the LEDs!
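The arithmetic above can be checked with a few lines of Python. This is a sketch that assumes each int occupies 4 bytes, as the 16-byte size for four integers implies; the dictionary simply copies the values printed above.

```python
# Values copied from the ip_dict output above
phys_addr = 1073741824
addr_range = 65536
memory_input_r = {"address_offset": 16, "size": 16}

INT_BYTES = 4  # assumption: 32-bit integers, i.e. 16 bytes / 4 elements


def element_offsets(entry, element_bytes=INT_BYTES):
    """Byte offset of each array element within the core's address range."""
    start = entry["address_offset"]
    return list(range(start, start + entry["size"], element_bytes))


print(hex(phys_addr))                   # 0x40000000
print(hex(addr_range))                  # 0x10000
print(element_offsets(memory_input_r))  # [16, 20, 24, 28]
```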
This is most easily accomplished using the mmio member object of
our pynq_leds_core:
pynq_leds_core.mmio.write(16, 1)
pynq_leds_core.mmio.write(20, 1)
pynq_leds_core.mmio.write(24, 1)
pynq_leds_core.mmio.write(28, 1)
All four of the LEDs should light up at this point. We've basically duplicated
the function of the LED class that PYNQ exposes in its base overlay, but much
more excitingly, we've gone from generating an FPGA IP core to
controlling it with python -- pretty cool stuff!
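The four writes can be wrapped in a small helper. The sketch below is hypothetical: set_leds and RecordingMMIO are names invented here, and the stand-in class only records writes so that the example runs away from the board. On the PYNQ-Z2, the mmio member of pynq_leds_core would be passed in instead.

```python
class RecordingMMIO:
    """Stand-in for the mmio object: records writes, touches no hardware."""

    def __init__(self):
        self.writes = []

    def write(self, offset, value):
        self.writes.append((offset, value))


def set_leds(mmio, states, base_offset=16):
    """Write one 32-bit word per LED, starting at the input register."""
    for i, on in enumerate(states):
        mmio.write(base_offset + 4 * i, 1 if on else 0)


fake = RecordingMMIO()
set_leds(fake, [True, False, True, False])
print(fake.writes)  # [(16, 1), (20, 0), (24, 1), (28, 0)]
```

Taking the mmio object as an argument keeps the helper testable off-board, at the cost of one extra parameter.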
You may be curious about why our memory map for the registers begins at byte
16 of the valid address range. This is because the Vitis tool uses the first
16 bytes for various control signals; you can see the list of these
here.
In terms of playing around with them, we have a read/write register located at
0x0 containing the ap_start signal, i.e. whether the core has started or
not. This signal needs to be high for the core to do anything:
pynq_leds_core.mmio.read(0)
$>
For the sake of verifying this register acts how we'd expect, you can go ahead and disable the core and try setting some LEDs, which should now be unresponsive:
pynq_leds_core.mmio.write(0, 1)
pynq_leds_core.mmio.write(16, 0)
Controlling The Core
We can obtain finer control over the IP core that we've instantiated on the PL with PYNQ. This will become particularly useful in Chapter 5, but hopefully the reader will see the merits in what we're about to discuss before needing to move onto the next chapter.
HLS and Vivado
Let's modify our LED example for the final time:
void pynq_leds(
    const int input[4], ap_uint<4>& leds
) {
#pragma HLS INTERFACE mode=ap_ctrl_hs port=return bundle=BUS_A
#pragma HLS INTERFACE mode=s_axilite port=return bundle=BUS_A
#pragma HLS INTERFACE mode=s_axilite port=input bundle=BUS_A
#pragma HLS INTERFACE mode=ap_none port=leds
    ...
We've changed the return port of the block from its previous ap_ctrl_none
mode and are now introducing a block-level control protocol whereby we're able
to control and query whether the core is active, idling, interrupted, etc.
from PYNQ. First of all, we've changed the protocol to ap_ctrl_hs, where
hs stands for "handshake", which involves the signals ap_start, ap_idle,
ap_ready and ap_done. To start the core, we'll have to set ap_start
high from PYNQ; the core will then obligingly set its output ap_idle low.
ap_ready will then be output high at some point, indicating that the
block is ready for input/output. Once the core has done its processing,
ap_done will be asserted with ap_ready and ap_idle deasserted.
We've also indicated that return should use the s_axilite interface,
so that we can interact with all of the signals above using the memory-mapped
registers we used in the previous section. The other new syntax we've used
is the bundle option. Bundles allow us to use the same interface block
in Vivado for signals that use the same interface. In this case, our input
argument uses s_axilite, so we can tell the synthesis tool to use a single
AXI-lite block in our design, and both input and the control signals can
be bundled to use this single block. This allows us to save on hardware
resources, since otherwise return and inputs would have separate AXI-lite
interfaces instantiated on the PL.
That's it -- go ahead and synthesise this design, then generate a bitstream with Vivado as we have done previously.
PYNQ
Let's take a look at the memory-mapping of our core now:
overlay.ip_dict["pynq_leds_0"]["registers"]
{
    'CTRL': {
        'address_offset': 0,
        'size': 32,
        'access': 'read-write',
        'description': 'Control signals',
        'fields': {
            'AP_START': {
                'bit_offset': 0,
                'bit_width': 1,
                'description': 'Control signals',
                'access': 'read-write'
            },
            'AP_DONE': {
                'bit_offset': 1,
                'bit_width': 1,
                'description': 'Control signals',
                'access': 'read-only'
            },
            'AP_IDLE': {
                'bit_offset': 2,
                'bit_width': 1,
                'description': 'Control signals',
                'access': 'read-only'
            },
            'AP_READY': {
                'bit_offset': 3,
                'bit_width': 1,
                'description': 'Control signals',
                'access': 'read-only'
            }
        }
    },
    'Memory_input_r': {
        'address_offset': 16,
        'size': 16,
        'access': 'read-write',
        'description': 'Memory input_r',
        'fields': {}
    }
}
We've chopped out quite a lot of the output and kept only the stuff we're
interested in for this particular section. We already encountered
Memory_input_r in the previous section so needn't concern ourselves with that.
The CTRL register, though, has the signals we've just been discussing -- we can
read from/write to these bits and control the core. The core will be idle
before we actually try and do anything:
ip_core = overlay.pynq_leds_0
ip_core.mmio.read(0x0)
$> 4
So the AP_IDLE signal is high while the rest are low. We can load up the
inputs register so that the core will be ready to read them when we start
it
ip_core.mmio.write(0x10, 0x1)
ip_core.mmio.write(0x14, 0x1)
ip_core.mmio.write(0x18, 0x1)
ip_core.mmio.write(0x1C, 0x1)
and now start the core:
ip_core.mmio.write(0x0, 0x1)
which will result in all of our LEDs turning on. Checking the status of the control signals:
ip_core.mmio.read(0x0)
$> 14
Decoding 14 (0b1110) against the bit offsets above, AP_DONE, AP_IDLE and
AP_READY are high while AP_START is low, i.e. the core has finished, has
nothing to do, and is ready to accept new inputs. (The ap_done signal itself
is only asserted for a single cycle once the core finishes, so seeing AP_DONE
set here suggests the register holds it until it is read.) We can keep on
loading the Memory_input_r registers and
setting AP_START high to control when the core actually becomes active.
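Reading raw integers out of the CTRL register gets tedious, so the bit offsets from the register map can be turned into a small decoder. This is a sketch only: asserted_signals is a made-up helper name, and the bit positions are copied from the fields listed in the CTRL entry.

```python
# Bit offsets copied from the CTRL register's fields
CTRL_BITS = {"AP_START": 0, "AP_DONE": 1, "AP_IDLE": 2, "AP_READY": 3}


def asserted_signals(ctrl_value):
    """Return the names of the control signals that are high."""
    return [
        name for name, bit in CTRL_BITS.items()
        if (ctrl_value >> bit) & 1
    ]


# The value read back before starting the core
print(asserted_signals(4))  # ['AP_IDLE']
```

On the board, the value returned by ip_core.mmio.read(0x0) can be passed straight to the helper.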
Final Thoughts
Of course, for our example of turning on LEDs, it doesn't make a great deal of sense to perform this control signal handshaking -- we just want to either set the LEDs on or off. We'll find in our next lesson though that these control signals allow us to do some pretty nifty things, and are quite critical for numerical computing.