Hi, I'm working on implementing an FIR filter on an FPGA (Spartan 3E). Here's what I want to accomplish:

The FIR filter coefficients are generated on a host system using LabVIEW and written to a RAM/PROM on a DSP card. The number of taps is constant, but other parameters, such as the sampling frequency and cutoff frequencies, can change according to requirements.

The FPGA reads these coefficients from the RAM/PROM and implements the FIR filter. There should be a single bit file that is downloaded to configure the FPGA.

Any pointers in the right direction would be appreciated.

Your post looks like a previous post, but perhaps you didn't get the response you were looking for ...

So, here's a more detailed response.

The Spartan 3E family ranges from:

4 to 36 multipliers
2K to 33K logic cells

Your choices for FIR implementation are:

Distributed Arithmetic FIR filters
Multiplier-based FIR filters

Distributed Arithmetic (DA) structures tend to be small and fast, but changing their coefficients is more difficult (you have to calculate the ROM LUT values from the coefficients).
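To give a feel for what "calculate ROM LUT values from the coefficients" involves, here is a minimal Python sketch of the host-side calculation. The function name and the example coefficients are made up for illustration, and it assumes the coefficients are already quantized to integers. Each table address is a bit pattern with one bit per tap, and the stored value is the sum of the coefficients whose bit is set.

```python
def build_da_lut(coeffs):
    """Build a distributed-arithmetic table for integer coefficients.

    Entry `addr` holds the sum of coeffs[k] for every bit k set in `addr`,
    so the table has 2**len(coeffs) entries.
    """
    n = len(coeffs)
    lut = [0] * (1 << n)
    for addr in range(1, 1 << n):
        low = addr & -addr              # isolate the lowest set bit
        k = low.bit_length() - 1        # index of the tap that bit selects
        lut[addr] = lut[addr ^ low] + coeffs[k]
    return lut


# Hypothetical 4-tap example (values chosen only for illustration).
coeffs = [3, -1, 4, 2]
lut = build_da_lut(coeffs)
```

The table size doubles with every tap, so in practice the taps are usually split into small groups (4 taps per table is a natural fit for 4-input LUTs) and the partial results are added together. Every coefficient change means regenerating and rewriting all of these tables.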

Multiplier-based FIR structures can take advantage of the built-in multipliers and are much easier to reload, but there is a limited number of multipliers. Multiplier structures are more flexible than DA FIR structures. Multiplier (MAC) structures range, in area versus performance, from N multipliers at 1 clock cycle per computation to 1 multiplier at N clocks per computation (where N is the number of coefficients).
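As a rough behavioral model of the "1 multiplier, N clocks" end of that range, here is a small Python sketch (not HDL; the function name is made up). The inner loop stands in for the N clock cycles that a single multiply-accumulate unit would spend on each output sample.

```python
def fir_serial_mac(coeffs, samples):
    """Behavioral model of a single-multiplier FIR: N MAC steps per output."""
    n = len(coeffs)
    delay = [0] * n                      # tap delay line, newest sample first
    out = []
    for x in samples:
        delay = [x] + delay[:-1]         # shift the new sample in
        acc = 0
        for k in range(n):               # one iteration ~ one clock on the shared multiplier
            acc += coeffs[k] * delay[k]
        out.append(acc)
    return out


# An impulse in should give the coefficients back out.
impulse_response = fir_serial_mac([3, -1, 4, 2], [1, 0, 0, 0])
```

Note that the coefficients are just a plain array here, which is the software analogue of a coefficient RAM: reloading the filter means overwriting that array, with no derived tables to rebuild.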

DA FIR structures range in computation rate from 1 clock (fully parallel) to M clocks (where M is the input bit width). The input bit width is typically no more than 16 bits, so this usually means up to 16 clocks.
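The M-clock serial case can be sketched in Python as follows. This is only a behavioral model under my own assumptions: integer coefficients, two's complement input samples of `width` bits, and a made-up function name. Each pass of the loop is one clock: bit b of every sample in the window forms a table address, and the table output is shifted and accumulated, with the sign bit subtracted.

```python
def da_fir_output(coeffs, window, width):
    """Bit-serial distributed-arithmetic dot product of coeffs and window.

    window[k] is the sample paired with coeffs[k]; samples are treated as
    two's complement numbers of `width` bits.
    """
    n = len(coeffs)
    # Table entry `addr` = sum of the coefficients selected by the bits of addr.
    lut = [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
           for addr in range(1 << n)]
    mask = (1 << width) - 1
    acc = 0
    for b in range(width):               # one clock per input bit
        addr = 0
        for k, x in enumerate(window):
            addr |= (((x & mask) >> b) & 1) << k
        term = lut[addr] << b
        if b == width - 1:
            acc -= term                   # the sign bit carries negative weight
        else:
            acc += term
    return acc


coeffs = [3, -1, 4, 2]
window = [5, -3, 7, -8]
result = da_fir_output(coeffs, window, width=8)
direct = sum(c * x for c, x in zip(coeffs, window))
```

Both `result` and `direct` should agree, and no multiplier is used anywhere in the DA path.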

Naturally, there's some wiggle room in both of the above paragraphs, as it's possible to take advantage of symmetry to cut the number of multipliers in half for the MAC-based FIR filters. Symmetry can also add another clock cycle to serial distributed arithmetic FIR filters while cutting the number of ROM LUTs in half. There are other tricks, such as polyphase decomposition for interpolation and decimation, which can also reduce multiplier and ROM LUT usage.
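The symmetry trick for the multiplier-based case can be illustrated with a short Python sketch. It assumes an even number of taps with coefficients that mirror around the center (the usual linear-phase case); the function name and values are made up. The two samples that share a coefficient are added first, so each coefficient pair costs one multiply instead of two.

```python
def fir_symmetric(half_coeffs, window):
    """One output of an even-length symmetric FIR using half the multiplies.

    The full coefficient set is half_coeffs followed by its mirror image.
    """
    n = len(window)
    assert n == 2 * len(half_coeffs)
    acc = 0
    for k, c in enumerate(half_coeffs):
        acc += c * (window[k] + window[n - 1 - k])   # pre-add, then a single multiply
    return acc


half = [1, 3]
full = half + half[::-1]                 # [1, 3, 3, 1]
window = [5, -3, 7, -8]
folded = fir_symmetric(half, window)
reference = sum(c * x for c, x in zip(full, window))
```

The cost is one extra adder per pair and one extra bit of width going into the multiplier, which is normally a good trade when multipliers are the scarce resource.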

Xilinx provides a distributed arithmetic FIR filter generator as part of ISE. It produces good results, but since it's basically a black box, you'll be dependent on the vendor and may have to perform gate-level simulation.

You may be interested in looking at a new clear-text, human-readable, Verilog-based FIR filter generator from Optunis, which also generates a testbench for impulse, step, and random response. It's new (still in beta) and uses the hard multipliers built into the Spartan 3E.
