Simulation deltas

Hi,

This question deals both with an actual problem, and with some more conceptual thoughts on simulation deltas and how an RTL entity should behave with regards to this.

This post regards the case of a simulation with ideal time - that is, no delays (in time) modelled, rather trusting only simulation deltas for the ordering of events.

*Conceptual*

I would argue that for a well-behaved synchronous RTL entity, the following must be true:

*All readings of the input ports must be made *on* the delta of the rising flank of the clock - not one or any other number of deltas after that.*

Would people agree on that?

It follows from the possibility of other logic, hierarchically above the entity in question, altering the input ports as little as one delta after the rising flank. That must be allowed.
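As a minimal sketch of such a well-behaved entity (illustrative code only; the entity and port names are invented, not from any vendor library):

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity well_behaved_reg is
  port (
    clk : in  std_logic;
    we  : in  std_logic;
    d   : in  std_logic_vector(4 downto 0);
    q   : out std_logic_vector(4 downto 0)
  );
end entity well_behaved_reg;

architecture rtl of well_behaved_reg is
begin
  -- Clocked directly by the input clock port: 'we' and 'd' are sampled
  -- on the exact delta of the rising flank, so a driver above may change
  -- them one delta later without affecting this register.
  process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        q <= d;
      end if;
    end if;
  end process;
end architecture rtl;
```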

*My actual problem*

After a lot of debugging of one of my simulations, I found a Xilinx simulation primitive (IDELAYE2 in Unisim) *not* adhering to the statement in the previous section, which had caused all the problems.

See the signals plotted here:

formatting link

It's enough to focus on the "ports" section. The ports are:

- c: in, the clock

- cntValueIn: in

- ld: in, writeEnable for writing cntValueIn to an internal register

- cntValueOut: out, giving the contents of that register

As can be seen, my 'ld' operation is de-asserted one delta after the rising flank. I argue this should be OK, but it is obvious that the data is never written (cntValueOut remains 0). If I delay the de-assertion of 'ld' just one more delta, the write *does* take effect as desired.

I would argue this is a (serious) flaw of the Xilinx primitive. Would people agree on that as well?

(The following is not central for the above discussion, may be skipped.)

I have checked the actual reason for the problem. See the "internals" section of the signals. First, Xilinx delays both the clock and the ports to the *_dly signals. Fully OK, if from now on operating on the delayed signals. The problem is that the process writing to the internal register is not clocked by c_dly, but by another signal, c_in, which is delayed *one more* delta. This causes my requested 'ld' to be missed. (c_in is driven from c_dly in another process, inverting the clock input if the user has requested that.)

I argue that synchronous entities must be modelled in such a way that all processes reading input ports *must* be clocked directly by the input clock port - not by some derived signal that is lagging (if only by one delta). If this is not possible, the input ports being read must be delayed accordingly. In this case, if Xilinx wishes to conditionally invert the clock like this, causing another delta of delay, the input ports must also be delayed the corresponding number of deltas.
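A stripped-down sketch of the failure mode described above (my own reconstruction based on the description, not Xilinx's actual source):

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity idelay_like is
  port (
    c           : in  std_logic;
    ld          : in  std_logic;
    cntValueIn  : in  std_logic_vector(4 downto 0);
    cntValueOut : out std_logic_vector(4 downto 0)
  );
end entity idelay_like;

architecture sketch of idelay_like is
  signal c_dly, ld_dly  : std_logic;
  signal cntValueIn_dly : std_logic_vector(4 downto 0);
  signal c_in           : std_logic;
begin
  -- Stage 1: clock and ports delayed together (+1 delta) - fine, as
  -- long as everything downstream operates only on the *_dly signals.
  c_dly          <= c;
  ld_dly         <= ld;
  cntValueIn_dly <= cntValueIn;

  -- Stage 2: the conditional-inversion stage adds one *more* delta to
  -- the clock only.
  c_in <= c_dly;

  -- The register is clocked by c_in (c + 2 deltas) but reads ld_dly
  -- (ld + 1 delta). An 'ld' de-asserted one delta after the rising
  -- flank of c has already propagated to ld_dly on the delta where
  -- c_in rises, so the write is missed.
  process (c_in)
  begin
    if rising_edge(c_in) then
      if ld_dly = '1' then
        cntValueOut <= cntValueIn_dly;
      end if;
    end if;
  end process;
end architecture sketch;
```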

Cheers, Carl

Reply to
Carl


I would not agree, conceptual reasoning is as follows:

- The clock causes something to happen

- Something that causes 'something else' to happen must precede 'something else' because this is a causal world we live in.


Hierarchy does not alter signals. You can go through as many levels of hierarchy as you want and it will not change the time (including simulation delta time) that a signal changes. What *will* change that time are statements such as 'clk_out <= clk_in;'.

Reply to
KJ


I don't really get what your two points mean in this context. I do understand and agree on the literal meaning of them.

I don't think those points necessarily address my issue. My issue doesn't only relate to causality. The main problem is to determine *exactly when something is sampled*.

Since you don't agree with the statement, however: how then should synchronous elements communicate with each other? If I clock a unit with 'clk', and I can't expect that unit to sample the input ports (which I drive) on (exactly on, without any delta delays) the rising edge of 'clk', then how long after the edge must I hold the input data stable? One delta? Two, ten? One ps, one ns?

(If the answer is anything more than deltas, i.e. involving time, we are no longer in functional modelling, which was an assumption for this question.)

Or how would you suggest the problem I illustrated should be avoided?


Reply to
Carl


Actually, I misread a bit your actual question, I do agree that inputs should get sampled on only one simulation delta cycle...and they do. For some reason, I thought you were talking about outputs being generated.

In any case, your conceptual question doesn't relate to the problem that you are seeing with the Xilinx primitive. I have no idea whether it correctly models the primitive or not, but let's assume for a moment that it is correct. Since that primitive is attempting to model reality, there very well would be a delay between the input clock to that primitive and when that primitive actually samples input signals. If that is the situation, then inputs must also model reality in that they cannot be changing instantaneously either. Inputs to such a model must meet the setup/hold constraints of the design.

When you're performing functional simulation, there can be an assumption that you can ignore setup/hold time issues. This is an invalid assumption if you include parts into your model that model reality where delays do occur. The model is not wrong in that case, it is your usage of that model.

Just like on a physical board, on the input side to such a model, you need to ensure that you do not violate setup or hold constraints. If you do, then a physical board will not always work; in a simulation environment your simulation will fail (which is what you're experiencing). On the output side of a model, you need to make sure that you're not sampling too early (i.e. sooner than the Tco min).

Kevin Jennings

Reply to
KJ

Then perhaps the error in the xilinx case is that they are applying a physical model when you call up a behavioral simulation. I remember that the BRAM models (at least for VHDL) had a similar issue causing the behavioral simulation to look as if the readout was not registered unless you had some delay on the address inputs.

--
Gabor
Reply to
GaborSzakacs


I would agree with Kevin's assessment and offer an easy solution. As soon as you involve vendor supplied models you might as well just assume that they are not purely behavioral in the sense you are describing. The easy way to deal with this is to move edges of stimulus signals in test benches to the falling edge of the clock, and to ensure your clock is running in simulation at an appropriate time period as it would in the real hardware.

Reply to
matt.lettau

The problem with that approach is that the vendor IP is driven by user IP and not the test bench directly. You certainly don't want the user IP (for synthesis) working on the opposite clock edge. In the past I have worked around the Xilinx model issues by adding unit delays in the code that instantiates it, but even that leaves a bad taste in my mouth, as it shouldn't be necessary for behavioral simulation.

--
Gabor
Reply to
GaborSzakacs

I didn't see anything in the OP indicating whether the driving signals were testbench or design...but you could be right.

Again the way to fight a model that tries to model reality is with more 'reality' of your own. Make the assignments that assign to signals that connect with the primitive be delayed by 1 ns (i.e. "a <= b after 1 ns;").
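In VHDL that work-around might look like the following fragment (a sketch only; the signal names are invented for illustration):

```vhdl
-- Delay everything that drives the primitive by a small non-zero time,
-- so the inputs are guaranteed stable across the model's internal delta
-- delays on the clock path - mimicking real Tco/Tpd on a board.
ld_to_prim  <= ld_int  after 1 ns;
cnt_to_prim <= cnt_int after 1 ns;
```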

Reply to
KJ

This is a specious argument. Delta delays are not in any way related to physical delays and are intended to deal with issues in the logic of simulation, not real world physics. If the Xilinx primitive is trying to model timing delays it has done a pretty durn poor job of it since a delta delay is zero simulation time.

This model is clearly *not* modeling timing delays. Just read his description of the problem and you will see that.

This discussion is not at all about setup or hold times. The OP is performing functional simulation which is very much like unit delay simulation. The purpose of delta delays is to prevent the order of evaluating sequential logic from affecting the outcome. So the output of all logic gets a delta delay (zero simulation time, but logically delayed only) so that the output change is indeed causal and can not affect other sequential elements on that same clock edge.
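For instance (an illustrative fragment of my own), this is why a two-stage chain shifts by exactly one stage per clock regardless of process evaluation order:

```vhdl
-- Both processes wake on the same edge. 'a' is *scheduled* with a delta
-- delay (zero simulation time), so p2 still reads the old 'a' on this
-- edge - whichever process the simulator happens to evaluate first.
p1 : process (clk)
begin
  if rising_edge(clk) then
    a <= d;
  end if;
end process;

p2 : process (clk)
begin
  if rising_edge(clk) then
    b <= a;
  end if;
end process;
```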

In fact, this is the classic problem where a logic element is inserted into the clock path for some sequential elements and not others creating the exact problem the OP is observing. Normally, designers know not to do this. I guess someone at Xilinx was out that day in the training class.

--

Rick
Reply to
rickman


Nothing at all specious, it is correct. If you're connecting to a block that models delays (and the OP's does), then the solution is to model reality as well on the inputs in order to meet setup/hold time as well as to not sample outputs before Tco max. Whether those delays are caused by the model using delta delays or real time delays does not change the fact that the solution I provided is correct. It will be correct if the offending model uses delta delays or actual post-route delays.


I did read the post, and there are timing delays. Just because the delays are simulation deltas does not make them 'not a delay'. Since the model he is using implements these delays, the user needs to account for that. If you don't want to account for it, then you should use a different model.


I agree that the OP's problem is not about setup or hold times. The work around/solution I suggested was to add delays in order to conform with setup or hold times, "Just like on a physical board...". My solution has a direct connection with reality (i.e. a physical board with the design programmed in), other solutions might not.

If you're adding something to work around some problem, you're on much firmer ground if there is an actual basis that can be traced back to specifications. On the assumption that the external thing connected to the part being worked around is a physical part, ask yourself if adding Tpd and Tco delays to that model makes it closer or farther away from a 'true' model of that part.

Someone else posted that they typically worked around this by changing the inputs to be driven by the opposite edge of the clock. That probably works also, but again ask yourself does that make the simulation model closer to reality? Don't think so.

Of course, there is also the possibility that the stuff connecting to the Xilinx primitive is itself internal to the device, in which case I suggested adding a 1 ns delay (or really whatever small non-zero time delay you want). Again, inside a real device, the output of a flop will not change in zero time, so adding a small nominal delay as a work around can be justified as modeling reality.

In any case, the work around you use should have a rational basis for being the way it is. If the only justification is that 'it was the only way I could get the sim to run' then there is probably a design error that is being covered up, rather than a model limitation that is being worked around.

Kevin Jennings

Reply to
KJ

I'm not going to argue with you about this. The models are wrong by conventions of VHDL. I have seen no evidence that the models are trying to simulate timing delays. A delta delay is *zero* time in the simulation. If they wanted to model timing delays they would use a time delay, not delta delays. The problem with using delta delays is that they don't even approximate timing values and they corrupt functional simulation as the OP is seeing. It is a bit absurd to expect users to insert delta delays in their code to fake out imagined timing delays of 0 ns. There is no utility to this concept.

But this is not relevant. I would prefer to add the delta delays where needed and to document them as being required to deal with the errors in the Xilinx models which is why they are there, not to add timing information to a functional simulation which is a bit absurd.

I would consider this to be adding an error to work around the Xilinx error.

Now you are starting to understand delta delays. That is what VHDL does in the simulation. The output of a sequential element changes 1 delta delay after the clock edge. You are proposing that additional delta delays be added by the user to compensate for the delta delays being introduced in the clock path by the corrupt Xilinx model. This is in conflict with best design practices.

I feel that Xilinx should have added those delays to the input data path so that the rest of the simulation can be written like a standard VHDL design.

The rational basis is not "it was the only way I could get the sim to run", it is "this is the best way to work around the Xilinx model problems". Ideally the fixes would be added to a wrapper around the offending Xilinx code if possible.

--

Rick
Reply to
rickman


Uh huh...when I say to add delays as a work around, you see it as 'not relevant' and 'absurd', but when you suggest adding delta delays you think you're relevant...OK...gotcha.

If you had actually put *any* thought into the problem you would see that all of the 'as being required' places that one would need to add delays would be the inputs (as I suggested) and the delays...well, you never suggested any amount for a delay (where I did). Good tip!


Ah yes, the 'conflict with best design practice' canard. I suggested using a different model if available, and if you're stuck with the model, then here is the way to work around it. What I suggested can be traced back to specifications, what you suggest...well, not so much. Just how many 'delta delays' do you think you can add and trace that code back to a specification?

Is what you 'feel' supposed to be relevant?

So are you suggesting that one should do nothing until the Xilinx model is fixed? When I encounter a bug, I submit it to the vendor and work on a work around since I can't depend on them to field a fix in a time frame usable by me. I guess you live in a different world where it is OK to say development has stopped while you wait for a supplier to fix something.


So rather than accepting a solution that I suggested that has a basis that can be traced back to specification, can be reused regardless of how many delta delays get added sometime in the future (seems that you forgot about that possibility) you're into

- Railing on Xilinx

- Waiting for them to fix the model

- Or add a magic wrapper, being apparently clueless that no wrapper will fix the problem, and dismissing my work around as being 'not relevant', 'absurd', etc.

Gotcha.

I'm done with this thread, catch you in the future in some other thread.

Kevin Jennings

Reply to
KJ

You might extract some useful info from this discussion:

formatting link

Delta delays avoid a lot of simulation nasties like race conditions but still suffer from some real world implementation issues as you have discovered.

Good luck, Hans

formatting link

Reply to
HT-Lab

I once got bitten by this sort of thing. Turned out that the default modelsim timing granularity was too big and the simulation rounded delays down to zero.

Colin

Reply to
colin

Just to clarify, this is not a post-route simulation. This is a simulation of a larger custom RTL design. In various parts of it, some primitives from the Xilinx Unisim library are used.

There are numerous workarounds of course, they are obvious to all of us, and which someone would choose is much a matter of taste - for me, this is not the central discussion here. I rather seek the lesson to learn (if any) after having spent half a day of debugging, finally having found the behaviour of this primitive to be the cause of the problem.

What the discussion boils down to is if functional models may behave like this. If the answer is yes, there should be a general design practice that should always be used when interfacing to RTL logic or functional models you haven't developed yourself.

I see from the discussion that the arguments regarding this differ. My original post suggested me leaning towards the Xilinx primitive being flawed, and also after having taken in the arguments above, this is still my opinion. The Unisim library contains simulation primitives. For functional simulation (there's the Simprim library for timing simulations) they should follow the design practice of the interfacing logic only being required to hold the input signals valid *on* the active edge of the input clock. Not longer (also not in terms of deltas).

One effect of the user being required to hold inputs active any longer (say, adding 'after 1 ns' to any interfacing logic signals) would be a (sometimes) significant increase of simulation time. One of the powers of functional simulation is that changes only happen on the clock flanks, and changes around the clock flanks are separated only by deltas, not by time. (Remember, VHDL signals are expensive in this regard. Reducing signal changes means everything to efficient simulations.)

There is another side of this discussion, that is not about how to interface to models/logic by others, but rather how to select your own design rules to avoid these problems within the code you develop on your own. However, I believe a designer seldom has a legal reason to mess with the clock path in RTL code. Typically, vendor primitives are instantiated for any such functionality (clock muxing etc.). There might be situations where you rather _infer_ than instantiate though, and then this *does* become a problem. However, I never came across such a situation.

Reply to
Carl


A very related topic, yes.

If you have several clocks in your design, you must make sure any edges supposed to occur simultaneously *do* occur simultaneously, also in regards to deltas. This requires some care when generating your test bench clocks. I make sure to generate them from within one and the same process, keeping the desired phase relationship. Generating one, and dividing the second from the first, is doomed to fail. Logic interfacing to both clocks then has a big risk of missing signal transactions.
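A sketch of that single-process scheme for, say, a 100 MHz clock and a phase-aligned 50 MHz clock (illustrative code, not from the thread):

```vhdl
-- Both clocks are driven from within one process, so their coincident
-- rising edges are scheduled at the same time *and* on the same delta.
-- Dividing clk50 from clk100 in a separate process would instead lag
-- clk50 by one delta per assignment.
clk_gen : process
begin
  clk100 <= '1';  clk50 <= '1';   -- shared rising edge, one delta
  wait for 5 ns;
  clk100 <= '0';
  wait for 5 ns;
  clk100 <= '1';  clk50 <= '0';
  wait for 5 ns;
  clk100 <= '0';
  wait for 5 ns;
end process clk_gen;
```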

Reply to
Carl


That though must have been due to time delays rather than delta delays. (I know Xilinx states its primitives require 1 ps precision, whereas default precision for ModelSim is 1 ns.)

Reply to
Carl


- May they, 'yes'

- Should they, 'no'

- Design practice to add delays should be applied when using models that happen to need it. Since most do not, applying it to all models that you haven't developed is a waste.


Submit a bug report.


Whether this is true or not depends on how you work around the problem. A simulator will schedule a signal change to occur at a certain time. Whether that time is on the next simulation delta or 1 ns into the future doesn't change the fact that a new event will need to be evaluated at *some* time in the future. Now it could come down to how you implement that delay.

Not sure what functional simulators you're talking about here. Modelsim certainly doesn't work this way (i.e. changes only happen on the clock edges). Changes on any signal cause events to be scheduled on others.

Kevin Jennings

Reply to
KJ

I would not class this as a problem with delta delays. The problem is the design of a module which fails in ordinary usage. I remember learning a long time ago that you *never* run the clock through anything that will delay it more than the data, including delta delays. Obviously the designer of the Xilinx module forgot that rule and added logic to the clock path that needs to be compensated for in the data paths to any sequential elements on that delayed clock.

Just as adding delay to a clock in a real world design can cause the design to fail, adding delta delays to a clock in a functional simulation can cause the design to fail in simulation.

--

Rick
Reply to
rickman


I intended "should" rather than "may" so I'm with you here.




My mental picture was that there is a significant difference between signal changes separated by deltas versus changes separated by time - in terms of simulation performance. However, I was probably wrong here and after having considered it I no longer see a good reason why there should be such a difference.


Yes of course; I don't mean the _simulator_ has any influence on this. But the user can, for a simulation run, make sure changes only appear on clock flanks (within deltas). But this of course depends on the design simulated and the stimulus.

Reply to
Carl
