Chapter 4: Using PYNQ

This is the fourth part of the Peripheral Control With HLS series of posts. You can head to Chapter 3 to start using the Xilinx tools and flash some LEDs or Chapter 5 to get started with some numerical computing on both the PS and PL.

In this chapter, we'll be trying to control the LEDs from PYNQ, much like what the base overlay allows us to do. We'll also try to make it a little more convoluted for the sake of flexing our muscles. As in the previous chapter, all of the source code for this post can be found in the leds subdirectory of the hls repo.

Flashing LEDs. Again.

HLS

In hls/pynq_01/vitis_hls/src, you'll find a new synthesisable function:

#include "../include/leds.hpp"

void pynq_leds(
    const int input[4], ap_uint<4>& leds
) {
#pragma HLS INTERFACE ap_ctrl_none port=return
#pragma HLS INTERFACE s_axilite    port=input
#pragma HLS INTERFACE ap_none      port=leds

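    // Static, so the LED state is held in a register between calls.
    static ap_uint<4> led_state = 0b0000;

    // One comparison per LED: bit i is set when input[i] is positive.
    for (int i=0; i<4; ++i) {
#pragma HLS unroll
        led_state[i] = input[i] > 0 ? 0b1 : 0b0;
    }

    leds = led_state;
    return;
}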

We have a new argument type; good old int* (written here as const int input[4])! The purpose of this function is to accept some argument from the PS via the PYNQ interface. Since we're going to be writing Python code, we need to accept a variable that Python can work with, hence four integers: one for each LED that we're trying to control. If the integer is larger than zero, the LED will be lit up; otherwise it will be turned off. Notice that we're still assigning leds from led_state.
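To make the mapping from integers to LEDs concrete, here is a small Python model of what the function computes. It is purely illustrative: the name leds_from_inputs is ours and nothing like it exists in the HLS source.

```python
def leds_from_inputs(values):
    """Software model of pynq_leds: bit i of the result is 1 when values[i] > 0."""
    if len(values) != 4:
        raise ValueError("expected exactly four integers, one per LED")
    state = 0
    for i, value in enumerate(values):
        if value > 0:
            state |= 1 << i  # mirrors led_state[i] = 1 in the HLS code
    return state


print(format(leds_from_inputs([1, 0, -3, 7]), "04b"))  # prints 1001
```

Only the first and last entries are positive, so only bits 0 and 3 are set.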

Now we can go ahead and add this block to Vivado as we did in the previous chapter, although we're not making input external this time. Instead, we're going to connect the input port to the PS. So in Vivado add another block in the designer window: the "ZYNQ7 Processing System", an interface to the PS that's instantiated in the PL, and click through the helpful suggestions in the green bar of the block designer window.

You'll see that a couple of new blocks get added and connected here: the "Processor System Reset" and the "AXI Interconnect". The former takes reset signals and logic and outputs reset signals for various modules. In our case, we only need a peripheral reset, so only that output is connected to the AXI interconnect block. This AXI interconnect block acts as a bridge between the PS and the IP core we've instantiated on the PL. All of the talking between the PS and PL takes place over AXI interfaces; in this case, input of our synthesised core is an AXI-Lite interface, and the AXI interconnect block performs the conversion from a generic AXI output from the PS into the register-based interface that AXI-Lite uses. Anyway, go ahead and generate the bitstream of the design as we did in the previous chapter.

We now arrive at the part where we can start playing with PYNQ. When we generated the bitstream, a bunch of files got created. The ones we're interested in are the actual bitstream (the .bit file, which can be found at vivado/vivado.runs/impl_1/design_1_wrapper.bit) and a hardware handoff file (with extension .hwh, found at vivado/vivado.gen/sources_1/bd/design_1/hw_handoff1/design_1_wrapper.hwh). This hardware handoff file is used by PYNQ to identify the Zynq configuration, interrupts, resets, etc.; it is effectively a description of the core we've synthesised for the PS to use.
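PYNQ expects to find the .hwh file next to the .bit file with the same base name, so it can help to stage the pair together before uploading them to the board. A minimal sketch of that step follows; the helper name stage_overlay and the destination directory are our own inventions, not part of PYNQ or the repo.

```python
import shutil
from pathlib import Path


def stage_overlay(bit_path, hwh_path, dest_dir, name):
    """Copy the .bit and .hwh side by side under a shared base name."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    bit_out = dest / f"{name}.bit"
    hwh_out = dest / f"{name}.hwh"
    shutil.copyfile(bit_path, bit_out)  # the bitstream itself
    shutil.copyfile(hwh_path, hwh_out)  # the hardware handoff description
    return bit_out, hwh_out
```

Calling it with the two paths above, a scratch directory, and a name such as "pynq_leds_bitstream" would leave a matching pair of files ready to copy over to the board.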

PYNQ

Go ahead and boot up the PYNQ-Z2 with the SD card containing the PYNQ image and access the filesystem from your browser.

Now we finally get to play around with Jupyter notebooks and feel like we're learning how to use Python for the first time! We want to load our bitstream onto the PL, which can be done through the pynq module:

from pynq import Overlay

overlay = Overlay("pynq_leds_bitstream.bit")
pynq_leds_core = overlay.pynq_leds_0

We can append a ? to any object or function to get the documentation associated with it. First of all, let's take a look at some information about our pynq_leds_core object:

overlay.ip_dict["pynq_leds_0"]

You'll see a dictionary is returned with properties of the object, much of it not particularly important for us. However, quite a lot of this is understandable without too much background information:

{
    "addr_range": 65536,
    "phys_addr": 1073741824,
    "registers": {
        "Memory_input_r": {
            "address_offset": 16,
            "size": 16,
            "access": "read-write"
        }
    }
}

This details everything we need to know about the memory-mapping of our core, i.e. the addresses in the PS's address space through which the core's registers are reached. Note that all of these values are given in bytes. From address 1073741824 (0x40000000), the core has 65536 bytes (64 KiB) allocated to it. Recall that our IP core is very simple, with a single input argument: input. This is conveniently documented by the "Memory_input_r" dictionary: starting at an offset of 16 from the start of the memory-mapped range, 16 bytes are used for the input argument of our core (looking back at our HLS, the input argument is an array of four 32-bit integers, so 16 bytes in total, just as the dictionary entry tells us). Consequently, if we write some integer greater than zero at offsets 16, 20, 24 and 28 from the start of the memory-mapped address range, we should turn on the LEDs!
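The arithmetic is easy to check in a notebook cell. The sketch below just re-derives the per-LED offsets from the numbers in the dictionary; the constant names are ours, and the 4-byte word size assumes the 32-bit int used in the HLS.

```python
BASE_ADDR = 1073741824  # "phys_addr" from ip_dict
ADDR_RANGE = 65536      # "addr_range" from ip_dict
INPUT_OFFSET = 16       # "address_offset" of Memory_input_r
INPUT_SIZE = 16         # "size" of Memory_input_r, in bytes
WORD_BYTES = 4          # assumption: one 32-bit int per LED

# One register offset per element of the input array.
input_offsets = list(range(INPUT_OFFSET, INPUT_OFFSET + INPUT_SIZE, WORD_BYTES))
print(input_offsets)  # [16, 20, 24, 28]

# First and last byte of the range that belongs to the core.
print(hex(BASE_ADDR), hex(BASE_ADDR + ADDR_RANGE - 1))  # 0x40000000 0x4000ffff
```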

This is most easily accomplished using the mmio member object of our pynq_leds_core:

pynq_leds_core.mmio.write(16, 1)
pynq_leds_core.mmio.write(20, 1)
pynq_leds_core.mmio.write(24, 1)
pynq_leds_core.mmio.write(28, 1)

All four of the LEDs should light up at this point. We've basically duplicated the function of the LED class that PYNQ exposes in its base overlay but, much more excitingly, we've gone from generating an FPGA IP core to controlling it with Python -- pretty cool stuff!
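The four writes can be wrapped in a small loop. This is only a sketch: set_leds is a hypothetical helper of ours, and RecordingMMIO is a stand-in that logs writes so the cell runs without a board. On the PYNQ-Z2 you would pass pynq_leds_core.mmio instead.

```python
INPUT_OFFSET = 16  # address_offset of Memory_input_r
WORD_BYTES = 4     # each int occupies one 32-bit word


def set_leds(mmio, states):
    """Write one word per LED; any value > 0 turns that LED on."""
    for i, state in enumerate(states):
        mmio.write(INPUT_OFFSET + i * WORD_BYTES, int(state))


class RecordingMMIO:
    """Hypothetical stand-in for pynq_leds_core.mmio; it only logs writes."""

    def __init__(self):
        self.writes = []

    def write(self, offset, value):
        self.writes.append((offset, value))


fake = RecordingMMIO()
set_leds(fake, [1, 0, 1, 0])
print(fake.writes)  # [(16, 1), (20, 0), (24, 1), (28, 0)]
```

With the real mmio object, the same call should light alternate LEDs.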

You may be curious about why our memory map for the registers begins at byte 16 of the valid address range. This is because the Vitis tool uses the first 16 bytes for various control signals; you can see the list of these here. In terms of playing around with them, we have a read/write register located at 0x0 containing the ap_start signal, i.e. whether the core has started or not. This signal needs to be high for the core to do anything:

pynq_leds_core.mmio.read(0)
$>

For the sake of verifying this register acts how we'd expect, you can go ahead and disable the core and try setting some LEDs, which should now be unresponsive:

pynq_leds_core.mmio.write(0, 0)   # clear ap_start to disable the core
pynq_leds_core.mmio.write(16, 0)  # try to switch the first LED off

Controlling The Core

We can obtain finer control over the IP core that we've instantiated on the PL with PYNQ. This will become particularly useful in Chapter 5, but hopefully the reader will see the merits of what we're about to discuss before moving on to the next chapter.

HLS and Vivado

Let's modify our LED example for the final time:

void pynq_leds(
    const int input[4], ap_uint<4>& leds
) {
#pragma HLS INTERFACE mode=ap_ctrl_hs port=return bundle=BUS_A
#pragma HLS INTERFACE mode=s_axilite  port=return bundle=BUS_A
#pragma HLS INTERFACE mode=s_axilite  port=input  bundle=BUS_A
#pragma HLS INTERFACE mode=ap_none    port=leds
...

We've changed the return port of the block from its previous ap_ctrl_none mode and introduced a block-level control protocol, which lets us control and query whether the core is active, idling, interrupted, etc. from PYNQ. The protocol is now ap_ctrl_hs, where hs stands for "handshake", involving the signals ap_start, ap_idle, ap_ready and ap_done. To start the core, we have to set ap_start high from PYNQ; the core will then obligingly set its output ap_idle low. ap_ready is asserted once the core is able to accept new inputs. Once the core has done its processing, ap_done is asserted and the core goes back to idling, with ap_idle high again, until the next start.

We've also indicated that return should use the s_axilite interface, so that we can interact with all of the signals above using the memory-mapped registers we used in the previous section. The other new syntax we've used is the bundle option. Bundles allow ports that use the same kind of interface to share a single interface block in Vivado. In this case, our input argument uses s_axilite, so we can tell the synthesis tool to use a single AXI-Lite block in our design, and both input and the control signals can be bundled to use this single block. This allows us to save on hardware resources, since otherwise return and input would have separate AXI-Lite interfaces instantiated on the PL.

That's it -- go ahead and synthesise this design, then generate a bitstream with Vivado as we have done previously.

PYNQ

Let's take a look at the memory-mapping of our core now:

overlay.ip_dict["pynq_leds_0"]["registers"]

{
  'CTRL': {
    'address_offset': 0,
    'size': 32,
    'access': 'read-write',
    'description': 'Control signals',
    'fields': {
      'AP_START': {
        'bit_offset': 0,
        'bit_width': 1,
        'description': 'Control signals',
        'access': 'read-write'
      },
      'AP_DONE': {
        'bit_offset': 1,
        'bit_width': 1,
        'description': 'Control signals',
        'access': 'read-only'
      },
      'AP_IDLE': {
        'bit_offset': 2,
        'bit_width': 1,
        'description': 'Control signals',
        'access': 'read-only'
      },
      'AP_READY': {
        'bit_offset': 3,
        'bit_width': 1,
        'description': 'Control signals',
        'access': 'read-only'
      }
    }
  },
  'Memory_input_r': {
    'address_offset': 16,
    'size': 16,
    'access': 'read-write',
    'description': 'Memory input_r',
    'fields': {}
  }
}

We've chopped out quite a lot of the output and kept only the stuff we're interested in for this particular section. We already encountered Memory_input_r in the previous section, so needn't concern ourselves with that. The CTRL register, though, has the signals we've just been discussing: we can read from and write to it to control the core. The core will be idle before we actually try and do anything:

ip_core = overlay.pynq_leds_0

ip_core.mmio.read(0x0)
$> 4

So the AP_IDLE signal is high while the rest are low (4 is 0b0100). We can load up the input registers so that the core will be ready to read them when we start it

ip_core.mmio.write(0x10, 0x1)
ip_core.mmio.write(0x14, 0x1)
ip_core.mmio.write(0x18, 0x1)
ip_core.mmio.write(0x1C, 0x1)

and now start the core:

ip_core.mmio.write(0x0, 0x1)  # AP_START is bit 0 of the CTRL register at offset 0x0

which will result in all of our LEDs turning on. Checking the status of the control signals:

ip_core.mmio.read(0x0)
$> 14

The value 14 is 0b1110, so AP_DONE, AP_IDLE and AP_READY are all high while AP_START is low, i.e. the core has finished, has nothing to do, and is ready to accept new inputs. The ap_done signal itself is only asserted for a single cycle when the core finishes, but the CTRL register holds the AP_DONE bit until it is read, which is why we still see it here. We can keep on loading the Memory_input_r registers and setting AP_START high to control when the core actually becomes active.
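Reading raw integers back from CTRL gets tedious, so here is a minimal sketch that decodes them using the bit offsets from the register dictionary and wraps the load-then-start sequence in one function. The names decode_ctrl and run_core are ours, and FakeMMIO is a hypothetical stand-in that simply replays the two values seen above (4 when idle, 14 after a start) so that the cell runs without a board.

```python
CTRL_OFFSET = 0x00
INPUT_OFFSET = 0x10
CTRL_BITS = {"AP_START": 0, "AP_DONE": 1, "AP_IDLE": 2, "AP_READY": 3}


def decode_ctrl(value):
    """Split a raw CTRL register value into named boolean flags."""
    return {name: bool((value >> bit) & 1) for name, bit in CTRL_BITS.items()}


def run_core(mmio, values):
    """Load the inputs, set AP_START, and return the decoded status."""
    for i, value in enumerate(values):
        mmio.write(INPUT_OFFSET + 4 * i, value)
    mmio.write(CTRL_OFFSET, 0x1)  # AP_START is bit 0 of CTRL
    return decode_ctrl(mmio.read(CTRL_OFFSET))


class FakeMMIO:
    """Hypothetical stand-in for ip_core.mmio; not a model of the real core."""

    def __init__(self):
        self.regs = {CTRL_OFFSET: 0x4}  # idle, as in the first read above

    def write(self, offset, value):
        if offset == CTRL_OFFSET and value & 0x1:
            self.regs[offset] = 0xE  # pretend the core finished immediately
        else:
            self.regs[offset] = value

    def read(self, offset):
        return self.regs.get(offset, 0)


print(decode_ctrl(4))   # only AP_IDLE is True
print(run_core(FakeMMIO(), [1, 1, 1, 1]))  # AP_DONE, AP_IDLE, AP_READY are True
```

On the board, run_core(ip_core.mmio, [1, 1, 1, 1]) would stand in for the individual mmio calls, though the status you read back depends on when the read happens relative to the core finishing.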

Final Thoughts

Of course, for our example of turning on LEDs, it doesn't make a great deal of sense to perform this control signal handshaking -- we just want to either set the LEDs on or off. We'll find in our next lesson though that these control signals allow us to do some pretty nifty things, and are quite critical for numerical computing.
