Example - FIR Filter

A Finite Impulse Response filter performs a convolution on an input sequence with a fixed set of coefficients

The HLS tool will analyse the code and produce a functionally equivalent RTL circuit

Note: This course doesn't go in detail of how the transpilation works - but is beneficial to understand to take advantage of memory layouts, pipelines, etc


Vivado HLS generates an optimised, but largely sequential architecture

  • Loops and branches are transformed into control logic
  • Conceptually similar to the execution of a RISC processor
    • But the program needs to be converted to an FSM in the RTL rather than being fetched from memory
  • A sequential architecture tends to limit the number of functional units in a design with a focus on resource sharing over massive parallelism

The modern FPGAs can only operate at 1GHz speeds? Why use a FPGA then????

Think of performance in 'task time in seconds' rather than clock speed.
Higher clock speeds do not mean that it is faster.

In FPGA designs we can parallelise things more than a generic CPU instruction can.
See below


One Tap Per Clock


Critical Path: 1 mult + 1 adder
Task Latency: 4 cycles

One Sample Per Clock


Critical Path: 1 mult + 2 adder
Task Latency: 1 cycle


Speed Limits

The task interval is limited by recurrences (feedback loops) and resource limits

Recurrence - where a computation by a component depends on the previous computation by the same component

  • i.e. see the above multiplier and adder example
  • These limit the throughput even when pipelining

It's important to restructure the code to maximise the performance of the HLS tool.