Pipelining can reduce the slice usage

Nov 14, 2006 2 Replies


Patrick Dubois 19 years ago

Hello all,

I realized something interesting today: adding registers to your design can actually reduce the slice usage (in Virtex-II Pro). For example, I started a design with minimal pipelining to keep it simple at first.

Before pipelining, I had the following usage: Slices: 2433 Flip Flops: 2287 LUT: 2981 (seems a bit high, might be a typo in my notebook)

After pipelining, I get: Slices: 1973 Flip Flops: 2615 LUT: 2069

I knew from the beginning that pipelining would be needed, but I didn't realize that it could save me some slices (on top of the obvious max frequency increase).

I might as well throw in a question while I'm at it. Now I need to pipeline further (I need to go from 78 MHz to 100 MHz), but it gets more complicated: I would need to pipeline the function "to_signed (from float32)" from the VHDL-200X float_pkg. Any suggestions on how to do that? I read somewhere that one can add an extra level of registers and let the tool figure out the register retiming. I would use xst with the "register_balancing yes" flag, but I'm not sure how good xst is at register balancing.

Thanks.

Patrick Dubois


Daniel S. 19 years ago

Hi,

LUT usage goes down when some pipelining is added to large combinational functions because synthesis does not need to do as much logic replication to improve timing, or use as many CLB resources (like LUTs) for signal buffering on high-fanout paths.

As for register balancing, if you enable moving the first/last FFs along with register balancing, XST usually does a pretty decent job relocating up to two registers - here, I mean two extra registers besides the input and output ones. Beyond that, results tend to vary wildly and are mostly underwhelming.
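To make the idea of register balancing concrete, here is a toy Python model, not anything XST actually runs. It assumes a purely hypothetical chain of combinational blocks with made-up delays in nanoseconds, and it ignores routing, setup and clock-to-out times. The function `best_retiming` is an illustrative name: it tries every placement of the extra registers between blocks and keeps the one with the shortest worst-case stage.

```python
from itertools import combinations

def best_retiming(block_delays_ns, extra_registers):
    """Return (cut positions, worst stage delay) for the best register placement.

    A cut at position i places a register between block i-1 and block i.
    """
    n = len(block_delays_ns)
    best = None
    for cuts in combinations(range(1, n), extra_registers):
        bounds = (0, *cuts, n)
        # Delay of each register-to-register stage for this placement.
        worst = max(sum(block_delays_ns[a:b]) for a, b in zip(bounds, bounds[1:]))
        if best is None or worst < best[1]:
            best = (cuts, worst)
    return best

# Hypothetical block delays (ns); these are not measurements from any real design.
delays = [4, 3, 2, 4]
for regs in range(3):
    cuts, worst = best_retiming(delays, regs)
    print(f"{regs} extra register(s): cuts={cuts}, worst stage={worst} ns, "
          f"~{1000 / worst:.1f} MHz")
```

In this model, registers left at the output of the whole chain do nothing for timing until they are moved into the logic, which is the job being handed to the tool when balancing is enabled.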

After experimenting with pipelining some of my oldish code using XST's balancing, I feel like anything that needs more than two levels of automatic pipelining should be rewritten to be more explicitly pipelined. Otherwise, it becomes difficult to figure out exactly what should be happening exactly when, and this makes debugging more painful than necessary.

Semi-automatic pipelining (up to two intermediate FFs) usually works well with XST and I have been consistently happy(*) with results ever since I started sticking to this as the upper limit of pipelining automation.

(*) Well, as happy as [...]


Patrick Dubois 19 years ago

Thanks for the tips Daniel, I'll keep my extra registers to two levels then.

I finally achieved timing by using register balancing, but also by using truncated rounding instead of the "correct" rounding for the floating-point operations. I also had my fair share of xst crashes (even with v8.2 SP3) with the new VHDL-200x libraries...
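For readers unfamiliar with the trade-off, the following Python sketch contrasts the two rounding styles when converting a real value to an integer. It assumes that "truncated" means round toward zero and that "correct" means round to nearest with ties to even (the IEEE 754 default); the function names are illustrative and are not the float_pkg API.

```python
import math

def to_int_truncate(x):
    """Round toward zero: the fractional bits are simply dropped."""
    return math.trunc(x)

def to_int_nearest_even(x):
    """Round to nearest, ties to even (Python's built-in round() does this)."""
    return round(x)

for value in (2.7, -2.7, 2.5, 3.5):
    print(value, to_int_truncate(value), to_int_nearest_even(value))
```

In hardware, truncation only discards bits, whereas round-to-nearest generally needs an extra increment (and its carry chain) after the shift, which is one plausible reason it helps timing.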

Patrick
