How to optimize my design area to fit?

Hi all,

I am currently building a Digital Down Converter (DDC) in Xilinx System Generator 9.1. Unfortunately, the design overshoots the resources of my target Virtex-4 chip by 400%. Is there any way to optimize it so that it fits?

hilo_pupu

It's unlikely that any back-end "pushbutton" optimizations will reduce your final size by a factor of 4.

If it is possible to reduce your size, you'll need to do the tweaking in System Generator. Make sure you aren't over-specifying the speed requirements. Sometimes you can save a lot of area by time-multiplexing to share resources. Without knowing more details of your design, it's hard to say exactly how to accomplish this.

HTH, Gabor


Yes, suitable optimization functions can only be found in the minds of some engineers these days. I would be surprised if anybody found a set of switches that would "optimize this DDC design by 400% in area".

filter001

That is a bad overshoot! The first thing I would look at is memory usage, i.e. are you using block RAMs? Make sure you don't have some memories being turned into flops! I have had some versions of Synplicity do this on pretty benign code, where previous versions synthesized the memory as block RAM. Look at your synthesis report.

Another thing you can do is logic sharing, i.e. use one block to serve multiple parts of the design by clocking it at a higher frequency than the rest of the design and multiplexing its inputs and outputs.
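To put rough numbers on the logic-sharing idea, here is a minimal Python sketch of the arithmetic. The function name and the example clock and sample rates are made up for illustration, and it ignores the extra multiplexers and control logic that sharing costs.

```python
import math

def multipliers_needed(num_products, clock_hz, sample_rate_hz):
    """Multipliers required if each one is reused on every clock cycle
    available between two consecutive input samples."""
    cycles_per_sample = clock_hz // sample_rate_hz  # the folding factor
    if cycles_per_sample < 1:
        raise ValueError("clock must be at least as fast as the sample rate")
    return math.ceil(num_products / cycles_per_sample)

# Hypothetical case: 20 products per sample, 200 MHz clock.
full_sharing = multipliers_needed(20, 200_000_000, 10_000_000)    # 20 cycles per sample
partial_sharing = multipliers_needed(20, 200_000_000, 100_000_000)  # 2 cycles per sample
no_sharing = multipliers_needed(20, 100_000_000, 100_000_000)     # 1 cycle per sample
print(full_sharing, partial_sharing, no_sharing)
```

The point is only that the saving scales with how many clock cycles you have per sample, which is why relaxing an over-specified sample rate or raising the clock can pay off in area.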

You need to get a handle on which blocks are the biggest and tackle those first, provided they are not 3rd party IP.

Ultimately, of course, you may have to face the fact that your target FPGA is far too small. You wouldn't be the first one ...

Regards

--

Dunstan Power

ByteSnap Design Ltd,
Web:	www.bytesnap.co.uk

Not knowing anything about the system generator, but knowing a thing or two about DDC, my suspicion is that you have run into one of two issues (assuming this is the usual DDS-CIC-FIR type downconverter):

- Your sin/cos LUT is not in RAM or has more entries than needed. If you want, I can send you a MATLAB script that shows the tradeoff between LUT length and wordlength versus noise, which may help you design this better.

- Your FIR filter is attempting to do many multiplies in parallel, which is very resource intensive. One of the reasons for downsampling is that it reduces the data rate in the FIR, so instead of 20 parallel multipliers in a 20 tap FIR you can do the multiplies sequentially in between new samples coming into the FIR stage.
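To illustrate the second point, here is a small behavioral model in Python (not HDL, and not tied to any System Generator block). It compares a fully parallel FIR with a single multiply-accumulate unit that only computes the outputs kept after decimation. The tap values, input data, and decimation factor are arbitrary placeholders.

```python
def fir_parallel(samples, taps):
    """Reference model: one multiplier per tap, one output per input sample."""
    history = [0] * len(taps)
    out = []
    for x in samples:
        history = [x] + history[:-1]  # shift the new sample into the delay line
        out.append(sum(h * c for h, c in zip(history, taps)))
    return out

def fir_sequential_mac(samples, taps, decimation):
    """One shared MAC unit; only the outputs that survive decimation are computed."""
    history = [0] * len(taps)
    out = []
    mac_cycles = 0
    for i, x in enumerate(samples):
        history = [x] + history[:-1]
        if i % decimation != decimation - 1:
            continue  # this output would be thrown away, so skip the work
        acc = 0
        for h, c in zip(history, taps):  # one multiply-accumulate per clock
            acc += h * c
            mac_cycles += 1
        out.append(acc)
    return out, mac_cycles

taps = list(range(1, 21))      # 20 placeholder integer taps
samples = list(range(1, 41))   # 40 placeholder input samples
decimation = 4

reference = fir_parallel(samples, taps)[decimation - 1::decimation]
shared, mac_cycles = fir_sequential_mac(samples, taps, decimation)
macs_per_input = mac_cycles / len(samples)
print(shared == reference, macs_per_input)
```

In this model the shared MAC needs taps / decimation multiply cycles per input sample, so with 20 taps and decimation by 4 a clock of about five times the input sample rate would let one multiplier replace twenty.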

Chris

Chris Maryan

Any of the resources in a Xilinx FPGA (block RAMs, multipliers, LUTs, FFs, DCMs) can be the first to max out. You need to look at your .mrp report to get an idea of which resources maxed out first. Cutting an implementation back by 75% while keeping identical functionality sounds pretty tough.

However, Basic Debug 101 should tell you to go back to the Core Generator and selectively enable/disable certain features of your DDS, so you can get an idea of the resource size of each block element. Then you can explore whether parts of the design could be reduced in performance to give area reductions. For example, the multipliers in the V4 DSP slices are 18x18, so you might want to choose tap coefficients of 18 bits or less. Otherwise multiple slices may be used for a single multiply.
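As a rough rule of thumb for the 18x18 point, the Python sketch below estimates how many 18x18 signed multipliers a wider signed product needs if each operand is split into one signed 18-bit piece plus 17-bit unsigned pieces. This splitting scheme and the function names are my assumptions for illustration; what the tools actually infer may differ, so check the map report.

```python
import math

def operand_pieces(width):
    """Pieces a signed operand splits into for 18-bit signed multiplier inputs.
    The top piece carries the sign (up to 18 bits); each lower piece carries
    17 magnitude bits, zero-extended to fit a signed 18-bit input."""
    if width < 1:
        raise ValueError("width must be positive")
    if width <= 18:
        return 1
    return 1 + math.ceil((width - 18) / 17)

def estimated_multipliers(data_width, coeff_width):
    """Estimated 18x18 multipliers for one data-times-coefficient product."""
    return operand_pieces(data_width) * operand_pieces(coeff_width)

for data_width, coeff_width in [(18, 18), (18, 19), (24, 18), (35, 35)]:
    print(data_width, coeff_width, estimated_multipliers(data_width, coeff_width))
```

Under these assumptions, going from 18-bit to 19-bit coefficients doubles the multiplier count per tap, which is the kind of cliff worth checking for in a filter with many taps.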

I think the FIR is usually the resource hog in the downconverter. You might want to temporarily skip the DDS, and compile the FIR with different attribute sizes to get an idea of what the resource usage might be.

Good luck.

--

Regards,
John Retta
Owner and Designer
Retta Technical Consulting Inc.

Colorado Based Xilinx Consultant

phone : 303.926.0068
email : jretta@rtc-inc.com
web   :  www.rtc-inc.com

If your DDS is synthesizing a frequency that divides the main clock evenly, you can implement it as a small lookup table. Otherwise, I'd advise using a first-order Taylor series sin/cos calculation. This takes only a single BRAM and, I think, three multipliers (which may be shared, depending on the folding factor).
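For anyone who wants to see why the Taylor correction lets you get away with a small table, here is a floating-point Python sketch. The table size and phase width are arbitrary choices of mine, and a real implementation would use fixed-point values in a BRAM. It uses sin(a + d) ≈ sin(a) + d·cos(a) and cos(a + d) ≈ cos(a) − d·sin(a), where a is the coarse table angle and d is the leftover phase.

```python
import math

TABLE_BITS = 8    # assumed coarse table size: 256 sin/cos pairs
PHASE_BITS = 16   # assumed phase accumulator width fed to the table
N = 1 << TABLE_BITS
FRAC_BITS = PHASE_BITS - TABLE_BITS

SIN = [math.sin(2 * math.pi * k / N) for k in range(N)]
COS = [math.cos(2 * math.pi * k / N) for k in range(N)]

def sincos_lut(phase):
    """Plain truncated lookup: the low phase bits are simply dropped."""
    idx = phase >> FRAC_BITS
    return SIN[idx], COS[idx]

def sincos_taylor(phase):
    """Same table plus a first-order correction from the dropped phase bits."""
    idx = phase >> FRAC_BITS
    frac = phase & ((1 << FRAC_BITS) - 1)
    delta = 2 * math.pi * frac / (1 << PHASE_BITS)  # leftover angle in radians
    s, c = SIN[idx], COS[idx]
    return s + delta * c, c - delta * s

def max_sin_error(fn):
    """Worst-case sine error over every phase value."""
    worst = 0.0
    for phase in range(1 << PHASE_BITS):
        exact = math.sin(2 * math.pi * phase / (1 << PHASE_BITS))
        worst = max(worst, abs(fn(phase)[0] - exact))
    return worst

lut_error = max_sin_error(sincos_lut)
taylor_error = max_sin_error(sincos_taylor)
print(lut_error, taylor_error)
```

The plain lookup error is on the order of the leftover angle d, while the corrected error is on the order of d²/2, so the same accuracy needs a far shorter table at the cost of a couple of multiplies.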

To ensure that you are using the minimum number of filters in your filter chain, read Crochiere and Rabiner's 1975 paper "Optimum FIR Digital Filter Implementations for Decimation, Interpolation, and Narrow-Band Filtering". And read this appnote, which includes a SysGen design that you can modify:

formatting link

-Kevin

Kevin Neilson
