Disabling Xilinx clock enable usage...

- johnp
  posted Tue, Nov 22, 2005 6:19 PM

I'm working on a high-speed design in a Xilinx V2Pro and I'm running into a timing problem. Instead of packing logic into LUTs, XST wants to use the Enable signal in the CLB. To use the Enable, it needs an extra LUT to create the Enable signal, so I get routing delays and an extra CLB delay.

Here's some sample code:

reg [3:0] sig4; wire [3:0] sig3;

always @(posedge clk) if (sig1 & ~sig2) sig4 <= sig3;

- allanherriman
  posted Tue, Nov 22, 2005 6:26 PM

Your coding style exactly matches the template for clock enable synthesis:

always @(posedge clk) if (condition) sig4

- allanherriman
  posted Tue, Nov 22, 2005 6:32 PM

Another approach could be to pipeline the (sig1 & ~sig2) calculation, so the enable path doesn't need to pass through a LUT. (You may need to make other changes to get the logic correct.)
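
The pipelining idea can be sketched roughly as follows. This is only an illustration, not code from the thread: the signal names sig1 through sig4 follow the original post, while the module name and the en_q register are hypothetical. Note that the registered enable arrives one cycle later than the combinational one, so sig3 may need a matching delay for the logic to stay correct.

```verilog
// Sketch only: sig1..sig4 follow the original post;
// pipelined_enable and en_q are hypothetical names introduced here.
module pipelined_enable (
    input  wire       clk,
    input  wire       sig1,
    input  wire       sig2,
    input  wire [3:0] sig3,
    output reg  [3:0] sig4
);
    reg en_q = 1'b0;

    always @(posedge clk) begin
        // Stage 1: compute the enable one cycle early and register it.
        en_q <= sig1 & ~sig2;
        // Stage 2: the registered enable can drive the CE pins directly,
        // with no LUT between the flip-flop output and CE.
        if (en_q)
            sig4 <= sig3;
    end
endmodule
```

Whether this actually helps depends on routing: the CE net still fans out to all four flip-flops.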

Regards, Allan

- Antti Lukats
  posted Tue, Nov 22, 2005 6:37 PM

"johnp" wrote in message news: snipped-for-privacy@g44g2000cwa.googlegroups.com...

Hi John,

In your example XST does exactly what it should do, given your code.

If you want the synthesis to avoid using the clock enable, then you should rewrite your code.

antti

- John_H
  posted Tue, Nov 22, 2005 8:46 PM

I would respectfully disagree.

A decent synthesizer should *not* produce an extra level of logic with an actual increase in area unless - and it's hard to see this as the case - the extra fanout for a heavily loaded signal causes timing problems elsewhere in the design.

In a properly constrained design, a decent synthesizer should *not* produce logic that violates the timing constraints if there's an available solution that meets the timing. Unfortunately we have to spend much of our time tuning things manually to get the "obvious" to happen.

- Antti Lukats
  posted Tue, Nov 22, 2005 10:53 PM

"John_H" wrote in message news:ZsLgf.36$ snipped-for-privacy@news-west.eli.net...

Dear John (and John),

I am glad to see someone disagree with me once in a while, but the issue isn't that simple.

The way XST synthesizes the example in the original posting DOES NOT add extra delay and is in most cases the most effective coding. The flip-flops are fed either by a direct connection that bypasses the LUT in their slice, or by logic packed into the same slice as the FF, in which case the delay before the FF is absolutely minimal (LUT to FF in the same slice). In most cases the delays in the clock enable path and the data path overlap rather than add up, which saves a bit of the timing budget, so the clock enable version would be faster; implementing the clock enable emulation in the D input would make one delay path longer and the overall timing worse.

OTOH, in some cases the version without clock enable may yield better overall timing, depending on where the critical path is, but my bet is that no "decent" synthesizer would optimize the clock enable out of the sample code based on critical path analysis alone. It would be possible, yes, but I would be surprised to see a synthesis tool actually do that without explicit coding or constraining. Hm, maybe I am wrong and some synthesis tool is already that smart :)

I do AGREE that the synthesis tools do not always do the best job: in cases where a solution that meets timing is available, that solution is not used automatically and needs manual 'tuning'.

Antti

- johnp
  posted Tue, Nov 22, 2005 11:22 PM

All -

Of course, I realize that my code sample REALLY wants to map to the Enable pin: always @(posedge clk) if (condition) sig4

- Duane Clark
  posted Wed, Nov 23, 2005 12:20 AM

That would require that the synthesis tool specifically check whether the default value on the right is the same signal as the one being assigned to. While I suppose it is possible that a synthesis tool might do that, I kind of doubt it. That would seem to me to require that the tool designer deliberately coded the tool to de-optimize (is that a word ;) the code.
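
For illustration, here is a sketch of the kind of coding this reply seems to refer to: an explicit else branch that assigns the register back to itself. It is an assumption about the form under discussion, not code recovered from the thread; the module name is hypothetical and the signal names follow the original post. In simulation it behaves exactly like the version with no else branch, which is why a tool would have to go out of its way to treat the two differently.

```verilog
// Sketch only: ce_explicit_hold is a hypothetical module name.
module ce_explicit_hold (
    input  wire       clk,
    input  wire       sig1,
    input  wire       sig2,
    input  wire [3:0] sig3,
    output reg  [3:0] sig4
);
    always @(posedge clk) begin
        if (sig1 & ~sig2)
            sig4 <= sig3;
        else
            sig4 <= sig4;  // explicit hold; same behaviour as omitting the else
    end
endmodule
```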

- JustJohn
  posted Wed, Nov 23, 2005 1:57 AM

I'd bet it would. I am continually amazed at how good the synthesis optimizers are getting.

OP can probably force the results he's asking for by using a 'keep' attribute, if he really wants to. I don't know how to express it in Verilog, but the VHDL code follows. See 'KEEP' in the constraints guide for Verilog syntax.

I couldn't resist on this one, had to do the experiment, and Antti is 100% right in this instance. P/R into an XC2V40-5 with the CE logic in the same LUT along with the CE MUX gives _slower_ results than putting only the CE logic into a LUT and using the CE pin and its built-in CE MUX:

Using CE pin: under 2 ns. Using LUT : over 2 ns.

An odd thing is that XST infers 2 FFs for the LUT version. (Did someone say pushing a rope?)

Nice that the synthesis tools keep getting better, and there are fewer opportunities to second-guess them.

Regards, John

entity CE_Inferral is
  Port ( clk   : in  std_logic;
         rst   : in  std_logic;
         a_in  : in  std_logic;
         b_in  : in  std_logic;
         c_in  : in  std_logic;
         q_out : out std_logic);
end CE_Inferral;

architecture Behavioral of CE_Inferral is
  signal q     : std_logic;
  signal a     : std_logic;
  signal b     : std_logic;
  signal c     : std_logic;
  signal d_lut : std_logic;
  attribute keep : string;
  attribute keep of d_lut : signal is "true";
begin
  q_out
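
For readers who want the Verilog side of the 'keep' idea, a rough sketch follows. It is not the poster's code: the module name is hypothetical, the signals follow the original post rather than the VHDL entity, and d_lut is borrowed from the VHDL only as a name for the intermediate net. The attribute is written in Verilog-2001 `(* ... *)` syntax, which is an assumption about tool support; older XST releases may instead expect a comment-style form along the lines of `// synthesis attribute keep of d_lut is "true";`, so the Constraints Guide for the version in use is the place to check.

```verilog
// Sketch only: ce_in_lut is a hypothetical module name.
module ce_in_lut (
    input  wire       clk,
    input  wire       sig1,
    input  wire       sig2,
    input  wire [3:0] sig3,
    output reg  [3:0] sig4
);
    // The hold mux is written out explicitly and the net is marked
    // with keep so the tool is asked not to absorb it.
    (* keep = "true" *) wire [3:0] d_lut;
    assign d_lut = (sig1 & ~sig2) ? sig3 : sig4;

    // No enable condition here: the flip-flops load d_lut every cycle.
    always @(posedge clk)
        sig4 <= d_lut;
endmodule
```

As the experiment reported in this post suggests, forcing the mux into the LUT is not guaranteed to be faster than using the CE pin, so results should be checked after place and route.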

- allanherriman
  posted Wed, Nov 23, 2005 3:16 AM

The OP is using XST 6.2.03. It's not that smart.

Regards, Allan.

- johnp
  posted Wed, Nov 23, 2005 3:25 AM

Since my 'sig3' vector is four bits wide, the signal from the CE logic needs to fan out to the 4 flip flops. Now we get routing delay.

Antti's example may be correct, but for the 4 bit wide destination, I think I get a performance penalty.

I love synthesis, but... It sure would be nice to have an easier way to direct it! In any event, it sure beats schematics.

John Providenza

- Mike Treseler
  posted Wed, Nov 23, 2005 7:59 PM

It not only beats them, it draws them for me:

formatting link

-- Mike Treseler