Disabling Xilinx clock enable usage...

I'm working on a high speed design in a Xilinx V2Pro and I'm running into a timing problem. Instead of packing logic into LUTs, XST wants to use the Enable signal in the CLB. To use the Enable, it needs to use an extra LUT to create the Enable signal, so I get routing delays and an extra CLB delay.

Here's some sample code:

reg [3:0] sig4; wire [3:0] sig3;

always @(posedge clk) if (sig1 & ~sig2) sig4

Reply to
johnp

Your coding style exactly matches the template for clock enable synthesis:

always @(posedge clk) if (condition) sig4

Reply to
allanherriman

Another approach could be to pipeline the (sig1 & ~sig2) calculation, so the enable path doesn't need to pass through a LUT. (You may need to make other changes to get the logic correct.)
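A minimal sketch of what pipelining the enable might look like, using the signal names from the original post. The module name, `en_q`, and `sig3_q` are hypothetical, and the assignment `sig4 <= sig3` is assumed because the original code is cut off. Note that `sig4` now updates one clock later than before, which is the kind of "other change" that has to be checked against the rest of the design.

```verilog
// Sketch only: clock enable driven straight from a flip-flop.
// en_q and sig3_q are hypothetical names, not from the original post.
module ce_pipelined (
    input  wire       clk,
    input  wire       sig1,
    input  wire       sig2,
    input  wire [3:0] sig3,
    output reg  [3:0] sig4
);
    reg       en_q;    // registered copy of the enable condition
    reg [3:0] sig3_q;  // data delayed one cycle so it stays aligned with en_q

    always @(posedge clk) begin
        en_q   <= sig1 & ~sig2;  // the LUT now sits in front of a register
        sig3_q <= sig3;
        if (en_q)
            sig4 <= sig3_q;      // enable path is register-to-CE, no LUT in between
    end
endmodule
```

If `sig3` is already available a cycle early, the `sig3_q` stage may not be needed; that depends on the surrounding logic.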

Regards, Allan

Reply to
allanherriman

"johnp" schrieb im Newsbeitrag news: snipped-for-privacy@g44g2000cwa.googlegroups.com...

Hi John,

In your example, XST does exactly what it should, given your code.

If you want synthesis to avoid using the clock enable, then you should rewrite your code.

antti

Reply to
Antti Lukats

I would respectfully disagree.

A decent synthesizer should *not* produce an extra level of logic with an actual increase in area unless - and it's hard to see this as the case - the extra fanout for a heavily loaded signal causes timing problems elsewhere in the design.

In a properly constrained design, a decent synthesizer should *not* produce logic that violates the timing constraints if there's an available solution that meets the timing. Unfortunately we have to spend much of our time tuning things manually to get the "obvious" to happen.

Reply to
John_H

"John_H" schrieb im Newsbeitrag news:ZsLgf.36$ snipped-for-privacy@news-west.eli.net...

Dear John (and John),

I am glad to see someone disagree with me once in a while, but the issue isn't that simple.

The way XST synthesizes the example in the original posting DOES NOT add extra delay, and it is in most cases the most effective coding. The flip-flops are fed either by a direct connection that bypasses the LUT in their slice, or by logic packed into the same slice as the FF, in which case the delay before the FF is absolutely minimal (LUT to FF in the same slice). In most cases the delays on the clock enable path and the data path overlap, which recovers a little of the timing budget, so the clock enable version is faster. Implementing the clock enable emulation in the D input would make one delay path longer and the overall timing worse.

OTOH, in some cases the version without clock enable may yield better overall timing, depending on where the critical path is. But my bet is that no "decent" synthesizer would optimize the clock enable out of the sample code based on critical path analysis alone. It would be possible, yes, but I would be surprised to see a synthesis tool actually do that without explicit coding or constraining. Hm, maybe I am wrong and some synthesis tool is already that smart :)

I do AGREE that the synthesis tools do not always do the best job: in cases where a solution that meets timing is available, that solution is not used automatically and needs manual 'tuning'.

Antti

Reply to
Antti Lukats

All -

Of course, I realize that my code sample REALLY wants to map to the Enable pin: always @(posedge clk) if (condition) sig4

Reply to
johnp

That would require the synthesis tool to check specifically whether the default value on the right-hand side is the same signal as the one being assigned. While I suppose it is possible that a synthesis tool might do that, I kind of doubt it. It seems to me that the tool designer would have had to deliberately code the tool to de-optimize (is that a word ;) the code.
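For readers following along, the form being debated appears to be an explicit hold branch. The code in the preceding post is cut off, so the `sig4 <= sig3` assignment and the `else` branch below are assumptions, and the module name is hypothetical. A minimal sketch:

```verilog
// Sketch only: explicit "hold" branch that assigns the register to itself.
// Behaviorally identical to leaving the else branch out.
module ce_explicit_hold (
    input  wire       clk,
    input  wire       sig1,
    input  wire       sig2,
    input  wire [3:0] sig3,
    output reg  [3:0] sig4
);
    always @(posedge clk) begin
        if (sig1 & ~sig2)
            sig4 <= sig3;   // load new data (assumed right-hand side)
        else
            sig4 <= sig4;   // hold: the "default value" is the register itself
    end
endmodule
```

Since both spellings describe the same register behavior, whether the hold ends up on the CE pin or in a LUT is a mapping decision made by the tool, not something the simulation semantics decide.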

Reply to
Duane Clark

I'd bet it would. I am continually amazed at how good the synthesis optimizers are getting.

OP can probably force the results he's asking for by using a 'keep' attribute, if he really wants to. I don't know how to express it in Verilog, but the VHDL code follows. See 'KEEP' in the constraints guide for Verilog syntax.

I couldn't resist on this one and had to do the experiment: Antti is 100% right in this instance. P/R into an XC2V40-5 with the CE logic in the same LUT as the CE MUX gives _slower_ results than putting only the CE logic into a LUT and using the CE pin and its built-in CE MUX:

Using CE pin: under 2 ns.
Using LUT: over 2 ns.

An odd thing is that XST infers 2 FFs for the LUT version. (Did someone say pushing a rope?)

Nice that the synthesis tools keep getting better, and there are fewer opportunities to second-guess them.

Regards, John

entity CE_Inferral is
    Port ( clk   : in  std_logic;
           rst   : in  std_logic;
           a_in  : in  std_logic;
           b_in  : in  std_logic;
           c_in  : in  std_logic;
           q_out : out std_logic);
end CE_Inferral;

architecture Behavioral of CE_Inferral is
    signal q     : std_logic;
    signal a     : std_logic;
    signal b     : std_logic;
    signal c     : std_logic;
    signal d_lut : std_logic;
    attribute keep : string;
    attribute keep of d_lut : signal is "true";
begin
    q_out
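A rough Verilog counterpart of the 'keep' idea, applied to the signals from the original post. This is a sketch under assumptions: the module name is hypothetical, `d_lut` is borrowed from the VHDL above, `sig4 <= sig3` is the assumed original assignment, and the Verilog-2001 `(* keep = "true" *)` attribute form may not be accepted by every XST version (older releases may need the meta-comment form described in the constraints guide).

```verilog
// Sketch only: write the hold mux on the D input and ask the tool to
// preserve that net, so the mux is less likely to be folded into the CE pin.
module ce_forced_lut (
    input  wire       clk,
    input  wire       sig1,
    input  wire       sig2,
    input  wire [3:0] sig3,
    output reg  [3:0] sig4
);
    // Attribute syntax is an assumption; check the constraints guide
    // for the form your tool version expects.
    (* keep = "true" *) wire [3:0] d_lut;

    // Each bit depends on sig1, sig2, sig3[i] and sig4[i]: four inputs.
    assign d_lut = (sig1 & ~sig2) ? sig3 : sig4;

    always @(posedge clk)
        sig4 <= d_lut;   // no conditional here, so no enable to infer
endmodule
```

Per the measurement reported in this post, the CE-pin version was the faster one on that device, so a variant like this is worth comparing in P/R before adopting it.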

Reply to
JustJohn

The OP is using XST 6.2.03. It's not that smart.

Regards, Allan.

Reply to
allanherriman

Since my 'sig3' vector is four bits wide, the signal from the CE logic needs to fan out to the 4 flip flops. Now we get routing delay.

Antti's example may be correct, but for the 4 bit wide destination, I think I get a performance penalty.

I love synthesis, but... It sure would be nice to have an easier way to direct it! In any event, it sure beats schematics.

John Providenza

Reply to
johnp

It not only beats them, it draws them for me:

formatting link

-- Mike Treseler

Reply to
Mike Treseler
