Streamlining FIRs in System Generator

    I'm working on an FPGA implementation of a digital receiver.  I
decided to use Xilinx's System Generator for MATLAB Simulink since I
have little experience programming FPGAs and just want to get something
working.  In my design, I used the Xilinx FIR block, which implements a
distributed arithmetic filter, for a couple of low-pass filters with 16
to 64 taps. The design simulates well but I seem to be utilizing a ton
of FPGA resources and want to trim the design down a bit. What is the
most efficient implementation of a FIR for this application?  The part
is an XC2VP50 running with a 100MHz clock rate.  I'd like to keep the
sample rate at 100MHz as well. One thing I noticed is that the input
data type is 32.30 but the output data type is 50.47, which I then cast
back down as a 32.30.  Obviously this seems like a waste of resources
but I don't see anywhere to specify the internal precision for the
accumulators/multipliers.  Thanks for your time,
-Ira Thorpe
 UF Physics
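
For readers wondering where an output as wide as 50.47 comes from, here is a rough sketch of how a full-precision FIR output format grows. The 18.17 coefficient format is purely an assumption, picked because 30 + 17 gives the 47 fractional bits mentioned above. The function name is hypothetical, and the bound is a generic worst case, not what the Xilinx core necessarily does. A core that sizes its output from the actual coefficient values can come out narrower, so this is not expected to reproduce the 50-bit total.

```python
import math

def full_precision_output_format(in_bits, in_frac, coef_bits, coef_frac, n_taps):
    """Worst-case signed fixed-point format (total bits, fractional bits) of a FIR sum."""
    # A signed in_bits x coef_bits product always fits in in_bits + coef_bits bits.
    product_bits = in_bits + coef_bits
    # Adding n_taps products can grow the result by up to ceil(log2(n_taps)) bits.
    growth_bits = math.ceil(math.log2(n_taps)) if n_taps > 1 else 0
    # Fractional bits add in the multiply and are unchanged by the additions.
    return product_bits + growth_bits, in_frac + coef_frac

# Assumed 18.17 coefficients (hypothetical), 32.30 input as in the post.
for taps in (16, 64):
    bits, frac = full_precision_output_format(32, 30, 18, 17, taps)
    print(f"{taps} taps: {bits}.{frac}")
```

The point of the sketch is only that the extra width comes from the coefficient word length plus accumulator growth, so narrowing the input or coefficient formats is the lever that shrinks everything downstream.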

Re: Streamlining FIRs in System Generator


I don't believe you will be able to specify the precision of the
internal intermediate results. The only way to do this would be to
design the complete filter from scratch. I don't believe Xilinx would
provide you with synthesizable RTL code to modify the core.

Also, in the case of Spartan devices, there is a MAC (multiply
accumulate) and a DA (distributed arithmetic) version of the FIR core,
although I don't know if this is true for the XC2VP series of devices.
You may like to experiment with both cores to compare resource

Finally, in the Spartan 3 device I would try enabling coefficient
storage in BRAM instead of LUTs to save slices - there is likely a
similar option for your core.

Good luck!


Re: Streamlining FIRs in System Generator

If you have a 100 MHz clock rate and a 100 MHz sample rate, then won't
every FIR tap require its own hardware multiplier and adder, plus lots of
pipelining? Sounds like a ton of resources to me. What numbers are you

If you can clock at X * sample rate then theoretically at least, you
would only need 1/X that many multipliers. Ray Andraka's web site has
stuff you may be interested in:
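
As a separate, rough illustration of the 1/X estimate above (not taken from the site mentioned), the sketch below counts multipliers for a given clock-to-sample-rate ratio and models a folded filter in which each multiply-accumulate unit handles X taps per input sample. All function names are hypothetical, the model is behavioral Python rather than hardware, and whether the part can actually be clocked faster than the sample rate is a separate question.

```python
import math

def multipliers_needed(n_taps, clock_hz, sample_hz):
    """Rough multiplier count when each multiplier is reused X = clock/sample times."""
    reuse = max(1, clock_hz // sample_hz)  # X: clock cycles available per sample
    return math.ceil(n_taps / reuse)

def direct_fir(samples, coeffs):
    """Reference model: y[n] = sum over k of coeffs[k] * x[n - k]."""
    return [sum(c * samples[n - k] for k, c in enumerate(coeffs) if n - k >= 0)
            for n in range(len(samples))]

def folded_fir(samples, coeffs, reuse):
    """Behavioral model of ceil(N / reuse) MAC units, each covering `reuse` taps."""
    n_macs = math.ceil(len(coeffs) / reuse)
    history = [0] * len(coeffs)          # delay line, newest sample first
    out = []
    for x in samples:
        history = [x] + history[:-1]     # shift the new sample in
        acc = 0
        for cycle in range(reuse):       # one pass per fast-clock cycle
            for mac in range(n_macs):    # these run in parallel in hardware
                k = mac * reuse + cycle  # tap handled by this MAC on this cycle
                if k < len(coeffs):
                    acc += coeffs[k] * history[k]
        out.append(acc)
    return out

# 64 taps at a 100 MHz sample rate, with the clock at 1x, 2x and 4x that rate.
for clock in (100_000_000, 200_000_000, 400_000_000):
    print(clock // 1_000_000, "MHz clock:",
          multipliers_needed(64, clock, 100_000_000), "multipliers")
```

With the clock equal to the sample rate the ratio X is 1, so the count stays at one multiplier per tap, which is the situation described in the question above.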

