Distributed Arithmetic

Apr 10, 2006 5 Replies

nimayshah 20 years ago

Hello everybody, I am new to the group and have a question. I am in my final-semester training, where my project is "Design and Implementation of IP Core for Generic FIR Filter using Distributed Arithmetic". I have already written the code, simulated it, and tested it on a Xilinx Virtex XCV1000. The results agree with those of the DA MATLAB model I designed. I have compared the synthesis report of my core with that of the Xilinx CoreGen DA FIR v9.0. The area usage is much the same, but my core's frequency is almost half. Another stark difference in the synthesis reports is that my core's LUT synthesizes into a Block RAM, while the Xilinx core uses nothing like that. So my questions are:

1) What does the Xilinx core use for storing the LUT contents?
2) What can I do for speed optimization?

Please send in your replies, as time is running out fast. Regards, Nimay Shah
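For readers unfamiliar with the technique under discussion, here is a minimal Python sketch of a bit-serial distributed-arithmetic FIR filter. It is not the poster's Verilog core or the Xilinx core; the function names, the 8-bit sample width, and the integer coefficients are all assumptions chosen for illustration. It shows where the look-up table comes from: it holds every possible partial sum of the coefficients and is addressed by one bit from each tap.

```python
def build_da_lut(coeffs):
    """Precompute all 2**N partial sums of the N coefficients."""
    n = len(coeffs)
    return [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
            for addr in range(1 << n)]


def da_dot(lut, samples, width):
    """Bit-serial inner product of the coefficients and the samples.

    Samples are treated as `width`-bit two's-complement integers.
    """
    acc = 0
    for bit in range(width):
        # Gather bit `bit` of every sample into one LUT address.
        addr = 0
        for k, x in enumerate(samples):
            addr |= ((x >> bit) & 1) << k
        term = lut[addr] << bit
        # The MSB of a two's-complement word carries negative weight.
        acc += -term if bit == width - 1 else term
    return acc


def da_fir(coeffs, signal, width=8):
    """FIR filter built from the LUT (hypothetical 8-bit default width)."""
    lut = build_da_lut(coeffs)
    taps = [0] * len(coeffs)
    out = []
    for x in signal:
        taps = [x] + taps[:-1]          # shift the new sample into the delay line
        out.append(da_dot(lut, taps, width))
    return out


def direct_fir(coeffs, signal):
    """Plain multiply-accumulate FIR, used only as a reference."""
    taps = [0] * len(coeffs)
    out = []
    for x in signal:
        taps = [x] + taps[:-1]
        out.append(sum(c * t for c, t in zip(coeffs, taps)))
    return out
```

For an N-tap filter the table has 2**N entries, which is why a synthesis tool may map it to either Block RAM or CLB-based memory depending on its size and how it is described.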

nimayshah 20 years ago

I also forgot to mention that the HDL used is Verilog and the synthesis tool is XST. Could the frequency problem be due to the fact that I am using both clock edges for different processes?

Peter Alfke 20 years ago

If you use both clock edges, any path between the two clock domains has less than half a clock period available (since it must also accommodate any duty-cycle deviation from 50%). That's not a smart design decision if you want to achieve high performance. Peter Alfke
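The arithmetic behind this point can be sketched in a few lines of Python. The function name and the example figures (a 100 MHz clock, a 5% duty-cycle error) are hypothetical and are not taken from the thread or from any data sheet; the sketch only shows how the time available to a path shrinks when it is launched on one clock edge and captured on the opposite one.

```python
def path_budget_ns(clock_mhz, duty_error_pct=0.0, opposite_edge=False):
    """Time available to a register-to-register path, in nanoseconds.

    duty_error_pct is the worst-case deviation of the duty cycle from
    50%, expressed as a percentage of the clock period (an assumption
    made for this sketch).
    """
    period_ns = 1000.0 / clock_mhz
    if not opposite_edge:
        # Launch and capture on the same edge: a full period is available.
        return period_ns
    # Opposite edges: half a period, minus the duty-cycle uncertainty.
    return period_ns * (0.5 - duty_error_pct / 100.0)


same_edge = path_budget_ns(100.0)
cross_edge = path_budget_ns(100.0, duty_error_pct=5.0, opposite_edge=True)
```

With these assumed figures a same-edge path has the full 10 ns period, while a rising-to-falling path has less than half of that.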

nimayshah 20 years ago

Thank you, Peter, but do you have any idea about the Block RAM issue? That is more important to me right now. Nimay

Symon 20 years ago

Hi Nimay, Here's an answer to 2)

The BlockRAMs (BRAMs) are slower than the CLB-based RAMs. Check out the clock-to-output time for the BRAMs in the data sheet (Tcko, or something like that). So, use two BRAMs and interleave between them. You could use both ports, but I guess you're using one port for dynamic loading? Cheers, Syms.

Nial Stewart 20 years ago

Are you using 'LUT' here as a general abbreviation for Look Up Table, i.e. where you store coefficients, etc.?

In FPGA parlance a LUT is a (usually) 4-input, single-output combinatorial logic module.

If you want to store hard-assigned values in the FPGA you can use the flip-flops; these can be initialised to specific values on power-up. They will operate much faster than Block RAMs, but will use a lot more of the FPGA fabric. What do the P&R reports of the two cores show the logic usage as?

The speed your design runs at can also be down to the way you've structured the design, the amount of combinatorial logic between registers, etc. This is the sort of thing a more experienced designer will take into account when doing the initial hardware architecture design. (It's a Hardware Description Language, remember, not software.)
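The link between combinatorial depth and clock speed can be illustrated with a simplified Python model. It ignores routing delay and clock skew, assumes the logic can be split evenly across pipeline stages, and uses made-up delay values; none of the numbers or function names come from the thread or from a device data sheet.

```python
def fmax_mhz(clk_to_out_ns, logic_ns, setup_ns):
    """Maximum clock frequency for one register-to-register path.

    Simplified model: the period must cover the source register's
    clock-to-output delay, the combinatorial logic, and the
    destination register's setup time.
    """
    return 1000.0 / (clk_to_out_ns + logic_ns + setup_ns)


def pipelined_fmax_mhz(clk_to_out_ns, logic_ns, setup_ns, stages):
    """Same path with the logic split evenly over `stages` register stages."""
    return fmax_mhz(clk_to_out_ns, logic_ns / stages, setup_ns)


# Hypothetical delays: 1 ns clock-to-out, 8 ns of logic, 1 ns setup.
unpipelined = fmax_mhz(1.0, 8.0, 1.0)
two_stage = pipelined_fmax_mhz(1.0, 8.0, 1.0, stages=2)
```

In this model a slower source element, such as a memory with a longer clock-to-output time, lowers the result in the same way as extra logic does, and adding a register stage trades latency for a shorter critical path.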

Sorry if this is teaching you to suck eggs but it's not clear exactly what you mean in your question.

Nial
