Guidelines for Timing Closure on FPGAs

- Kiran

Wed, Aug 4, 2004 6:15 AM

Hi,

I would like to know if there is any literature from Xilinx, Altera or anywhere that gives the designer a set of design guidelines in order to achieve timing closure on FPGAs. For example, I have seen some people use a guideline which says that the number of combinatorial levels in the design should not exceed say 7 levels or so. Another example is that the combinatorial delay should not exceed 50% of the clock period so that there is enough margin for routing delays.

I have come across these through word of mouth. I would really like to get my hands on some literature to back these up.

Thanks, Kiran.

- Allan Herriman

Wed, Aug 4, 2004 6:59 AM

Ouch, 7 levels! I'm currently writing something that will run at > 300 MHz. I'm restricting it to 1 level of combinatorial logic and limiting the fanout to 10 ('cause I'm in a slow speed grade part).

Obviously rules of thumb are based around certain assumptions. You need to know how fast your chip is (w.r.t. your clock rate), whether you are doing floorplanning, etc.

Note that FPGA flip flops are basically free. Don't be afraid to use them to make the routing easier.

The 50% guideline is fairly good, although floorplanning (good or bad) can make a difference. Note that a tightly packed part will often produce a few excessively long delays due to routing congestion.
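To make the 50% guideline concrete, here is a minimal Python sketch of the arithmetic. The function names and the default fraction are assumptions for illustration, and the delay values would have to come from your own timing report; this is not taken from any vendor tool.

```python
def logic_budget_ns(clock_mhz, logic_fraction=0.5):
    """Return the part of the clock period (in ns) reserved for logic delay.

    logic_fraction=0.5 encodes the "50% of the clock period" rule of thumb;
    the remainder is left as margin for routing delay.
    """
    period_ns = 1000.0 / clock_mhz
    return period_ns * logic_fraction


def meets_guideline(logic_delay_ns, clock_mhz, logic_fraction=0.5):
    """True if the combinatorial (logic-only) delay fits inside the budget."""
    return logic_delay_ns <= logic_budget_ns(clock_mhz, logic_fraction)


if __name__ == "__main__":
    # 100 MHz clock: 10 ns period, so 5 ns is available for logic.
    print(logic_budget_ns(100.0))
    # 300 MHz clock: roughly 3.33 ns period, so about 1.67 ns for logic.
    print(round(logic_budget_ns(300.0), 2))
    # A hypothetical path with 2 ns of logic delay fails the rule at 300 MHz.
    print(meets_guideline(2.0, 300.0))
```

The fraction is a parameter because, as noted above, the right margin depends on the device, the clock rate and how much floorplanning is done.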

As for literature, I've never seen these rules in writing. Experience is the key.

Regards, Allan.

- Guy Eschemann

Wed, Aug 4, 2004 7:37 AM

Check out the "Timing Closure - 6.1i" TechXclusive on the Xilinx support site.

Have fun, Guy.

- Subroto Datta

Wed, Aug 4, 2004 1:56 PM

Hi Kiran,

The chapter on "Design Optimization for Altera Devices" in the Quartus Handbook can answer some of your questions. It covers techniques for both resource closure (fitting with the smallest number of resources) and timing closure.

The timing reports in the Quartus tool are very useful: they give you the delay of the critical paths and break it down into logic cell delay and routing delay. Quartus II 4.1 has two tools, the Resource Optimization Advisor and the Timing Optimization Advisor, which help you identify which tool capabilities can help you achieve your optimization goals. These are accessed from the Tools menu and should be used after you have compiled your design.
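The logic-versus-routing split of a critical path can be sketched in a few lines of Python. The list-of-tuples input below is a hypothetical format invented for this example, not the actual Quartus report format, and the delay values are made up.

```python
def delay_breakdown(path):
    """Sum a path's delay by kind and return (totals_ns, shares).

    path is a list of (kind, delay_ns) tuples, where kind is either
    "logic" or "routing". This layout is an assumption for illustration.
    """
    totals = {"logic": 0.0, "routing": 0.0}
    for kind, delay_ns in path:
        totals[kind] += delay_ns
    total = totals["logic"] + totals["routing"]
    # Guard against an empty path so we never divide by zero.
    shares = {k: (v / total if total else 0.0) for k, v in totals.items()}
    return totals, shares


if __name__ == "__main__":
    # Hypothetical critical path: two logic cells and two routing segments.
    example_path = [
        ("logic", 0.5),
        ("routing", 1.0),
        ("logic", 0.5),
        ("routing", 2.0),
    ]
    totals, shares = delay_breakdown(example_path)
    print(totals)
    print(shares)
```

A path dominated by routing delay usually points to placement or congestion, while one dominated by logic delay points to too many levels of logic.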

Hope this helps.

Subroto Datta, Altera Corp.

- Kiran

Thu, Aug 5, 2004 6:27 AM

Hi Allan,

You are right. It depends on the device, required clock rate, design, etc.

How do you control fan-out? Is it through a tool-setting or through coding style itself?

Your point about using flip flops to make the routing easier is not very clear. Do you mean that we should insert registers in the paths in the RTL code?

Is there a rule of thumb with respect to device occupancy? Should it be restricted to, say, 75% of the device so that routing does not get congested?

Thanks, Kiran.

- Allan Herriman

Thu, Aug 5, 2004 7:43 AM

I control fanout through coding style. For example, if my source code has a signal feeding eight other LUTs, I know that the fanout is eight.

Note that for most designs (which aren't pushing the technology to the limits) it's quite reasonable to ignore fanout in your source code and rely on the fanout limit in your synthesiser. This should have a default of 50-100 or so. It will replicate logic to maintain this limit. E.g. if the fanout limit is 100 and you have 200 loads on a flip flop, the synthesiser will replicate the flip flop (so that there are two flip flops fed with identical inputs) and each flip flop will drive 100 loads.
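The replication arithmetic in that example can be sketched in Python as follows. The function name is hypothetical, and the even split of loads is an assumption: a real synthesiser decides which loads go to which copy using its own heuristics.

```python
import math


def replicate_for_fanout(loads, fanout_limit):
    """Return the number of loads driven by each copy of a flip flop.

    The driver is duplicated until no copy exceeds fanout_limit.
    Loads are spread as evenly as possible (an assumption; real tools
    may partition loads differently).
    """
    if fanout_limit < 1:
        raise ValueError("fanout_limit must be at least 1")
    copies = max(1, math.ceil(loads / fanout_limit))
    base, extra = divmod(loads, copies)
    # The first `extra` copies take one additional load each.
    return [base + 1 if i < extra else base for i in range(copies)]


if __name__ == "__main__":
    # The case from the post: 200 loads, limit 100 -> two copies of 100.
    print(replicate_for_fanout(200, 100))
    # Below the limit, nothing is replicated.
    print(replicate_for_fanout(8, 10))
```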

My experience is that automatic replication makes floorplanning harder.

Yes, insert registers in the paths in the RTL code.
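As a rough illustration of what inserting registers costs, the Python sketch below estimates how many register stages are needed to keep each stage under a chosen number of logic levels. The function name and the returned keys are assumptions for this example, and it treats the levels as evenly divisible, which real logic often is not.

```python
import math


def pipeline_plan(total_levels, max_levels_per_stage):
    """Estimate the pipelining needed to cap logic levels per stage.

    Returns the number of stages, the number of register stages to insert
    between them (each may be many flip flops wide), and the extra latency
    in clock cycles.
    """
    if total_levels < 0 or max_levels_per_stage < 1:
        raise ValueError("invalid level counts")
    stages = max(1, math.ceil(total_levels / max_levels_per_stage))
    return {
        "stages": stages,
        "register_stages_to_insert": stages - 1,
        "extra_latency_cycles": stages - 1,
    }


if __name__ == "__main__":
    # A 7-level path cut down to 1 level per stage needs 6 register stages.
    print(pipeline_plan(7, 1))
    # The same path with up to 3 levels per stage needs only 2.
    print(pipeline_plan(7, 3))
```

The trade is latency for clock rate: every inserted register stage delays the result by one cycle, so the surrounding logic has to tolerate that.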

The obvious hard limit is 100% (although I once had Maxplus make some FFs out of comb. logic in a CPLD so that I actually used more than 100% of the flip flops in the device!). A practical limit is 50% to 100%. I've never seen anyone suggest a utilisation of less than 50%.

There is also a tradeoff with development time. Lower utilisation means faster build times. I've seen projects which have had larger FPGAs on the prototypes (to speed code development) and smaller FPGAs on the production units (to lower costs). This is also important if the requirements for your project are not fully understood: you have room to manoeuvre during development.

A lot depends on your design. It may end up being flip flop limited (which would be unusual in an FPGA, but more common in a CPLD), or it may be block ram limited, etc.

My current chip is using most of the block ram but only about 40% of the FFs and logic. YMMV.
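To see which resource a design is limited by, utilisation can be computed per resource type. This is a minimal Python sketch; the resource names and counts are made-up numbers for illustration, not figures from any real device or from the chip described above.

```python
def utilisation(used, available):
    """Return the fraction of each resource type that is in use."""
    return {name: used[name] / available[name] for name in used}


def limiting_resource(used, available):
    """Return the name of the resource with the highest utilisation."""
    util = utilisation(used, available)
    return max(util, key=util.get)


if __name__ == "__main__":
    # Hypothetical device: block RAM nearly full, FFs and LUTs at 40%.
    used = {"block_ram": 90, "flip_flops": 4000, "luts": 4000}
    available = {"block_ram": 100, "flip_flops": 10000, "luts": 10000}
    print(utilisation(used, available))
    print(limiting_resource(used, available))
```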

Regards, Allan.