VHDL Synchronization - two-stage FF on all inputs?

Hello,

I know this topic has been beaten to death, but I am still a bit unclear on some things.

I've recently encountered metastability issues that caused my FPGA to do unpredictable things. Someone suggested that I synchronize my inputs to the clock domain, and that seemed to solve the issue. Googling this topic showed that a two-stage flip-flop is sufficient to increase the MTBF for metastability to acceptable levels. My question is: do I need to do this for all input signals? How would one do this with a design containing 30 to 40 input signals? For which types of inputs can I get away without a two-stage FF?

So in the sample code below, either INA or INB can change state while Clk is transitioning, and this could lead to metastability. Synchronizing INA and INB would help, but what about INC and IND?

Would the experts just synchronize everything and forget about it?

-----------------------------------------------------------------------------------------

-- Sample code with asynchronous inputs

-----------------------------------------------------------------------------------------

library ieee;
use ieee.std_logic_1164.all;

entity blackbox is
    port (
        Clk  : in  std_logic;
        RST  : in  std_logic;
        INA  : in  std_logic;
        INB  : in  std_logic;
        INC  : in  std_logic;
        IND  : in  std_logic;
        OUTA : out std_logic   -- "Out" is a reserved word in VHDL, so the output is renamed here
    );
end blackbox;

architecture Behavioral of blackbox is
begin

    BB : process (Clk, RST) is
    begin
        if (RST = '1') then
            OUTA <= '0';
        elsif rising_edge(Clk) then
            OUTA <= (INA or INB) and (INC or IND);   -- example logic only, to make the snippet complete
        end if;
    end process BB;

end Behavioral;

Reply to
hvo

Congratulations, you're asking a fundamentally difficult question. Expect to struggle with this answer.

What you're looking for is what I call a "single point" of clock domain crossing. What that means is that, for any set of signals that are theoretically time-aligned, you need to have a single point (i.e. one flip-flop) where those signals all neck down to, then all fan back out from in your internal logic.

Take a UART receiver. You've got several things inside of the state machine that all need to have the same simultaneous opinion of the state of the RX line. So you resynchronize the RX line with a dual flop synchronizer, and everything inside of your synchronous state machine works off of this synchronized version. If your clock rate is, for instance, 100 MHz, then everything on your synchronous side knows that whatever state the sync_rx signal is in, it'll be constant for 10 ns after each clock. Since everyone is only looking at that signal at the clock edge, and the timing analysis tools make sure that the signal is settled at each destination in under 10 ns, you're guaranteed that everything using sync_rx believes it's got the same state.
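Roughly like this, as a minimal sketch (entity and signal names are invented for illustration, not from any real design):

library ieee;
use ieee.std_logic_1164.all;

entity rx_sync is
    port (
        Clk     : in  std_logic;
        RX      : in  std_logic;   -- asynchronous input from the outside world
        sync_rx : out std_logic    -- the only version the state machine should look at
    );
end rx_sync;

architecture rtl of rx_sync is
    signal meta : std_logic := '1';   -- power-up to idle-high, matching a UART line
    signal ff2  : std_logic := '1';
begin
    process (Clk)
    begin
        if rising_edge(Clk) then
            meta <= RX;     -- first flop: the only place RX is sampled, may go metastable
            ff2  <= meta;   -- second flop: gives the first a full period to settle
        end if;
    end process;

    sync_rx <= ff2;
end rtl;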

Now let's say you hadn't resynchronized it. RX can change any time it likes, let's say 5 ns before your clock edge. If it takes 3 ns to get to point A in your state machine, point A will see the new value when the clock edge hits. But if it takes 6 ns to get to point B, it will still see the old value. You've violated the fundamental rule of synchronous design, which is that come the clock edge all data signals are static, and as a result your circuit will behave unpredictably, and unsimulateably.
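In code, the broken version of that looks deceptively innocent (names invented for illustration only):

library ieee;
use ieee.std_logic_1164.all;

entity bad_cdc is
    port (
        Clk     : in  std_logic;
        RX      : in  std_logic;   -- asynchronous, sampled directly by two registers
        point_a : out std_logic;
        point_b : out std_logic
    );
end bad_cdc;

architecture rtl of bad_cdc is
begin
    -- RX reaches the two flops through different routing delays, so on a clock
    -- edge that lands near an RX transition, point_a can capture the new value
    -- while point_b still captures the old one.
    process (Clk)
    begin
        if rising_edge(Clk) then
            point_a <= RX;
            point_b <= RX;
        end if;
    end process;
end rtl;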

So, given your example, you don't necessarily need to resynchronize any of that. However, given two copies of your example and the desire to have them always produce the same output, you would need to resynchronize those inputs, then split the resynchronized signals off to be the inputs to your logic.

Though keep in mind that your reset is asynchronous as well. For all the same reasons that any other asynchronous signal can be squirrely around the clock edges, you may well find that one of your two copies of that logic comes out of reset one clock cycle before the other does. Does that matter? That's fundamentally a question of your application. But it's why you see people generate reset signals that are asynchronous on their assertion, but synchronous on their deassertion.
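A sketch of such a reset bridge, assuming an active-high asynchronous reset (names are mine, not from the original post):

library ieee;
use ieee.std_logic_1164.all;

entity reset_bridge is
    port (
        Clk     : in  std_logic;
        ARST    : in  std_logic;   -- raw asynchronous reset, active high
        RST_out : out std_logic    -- asserts asynchronously, deasserts on a clock edge
    );
end reset_bridge;

architecture rtl of reset_bridge is
    signal ff1, ff2 : std_logic := '1';
begin
    process (Clk, ARST)
    begin
        if ARST = '1' then
            ff1 <= '1';            -- assertion is immediate (asynchronous)
            ff2 <= '1';
        elsif rising_edge(Clk) then
            ff1 <= '0';            -- deassertion ripples through two flops,
            ff2 <= ff1;            -- so it is synchronous to Clk
        end if;
    end process;

    RST_out <= ff2;
end rtl;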

There's a lot more to the topic than just this, but that's at least a quick introduction to the sorts of things that come up.

--
Rob Gaddi, Highland Technology -- www.highlandtechnology.com 
Email address domain is currently out of order.  See above to fix.
Reply to
Rob Gaddi

(snip)

Yes.

While metastability can be a problem, much more common is multiple signals crossing clock domains without appropriate synchronization.

Also, metastability decreases exponentially with decreasing clock rate.

Multiple signals crossing a clock domain can fail easily even at low clock rates.

There is also the case where multiple signals have to cross a clock domain and remain consistent. One favorite case is the counter for a RAM-based FIFO. Converting to Gray code (or counting in Gray code) means that only one bit changes at any count transition. Either the before or after value will be seen on the other side, but not any other miscombination of bits.
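A sketch of how the write-pointer side of such a FIFO might produce its Gray-coded count (generic width W, names invented for illustration):

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity gray_ptr is
    generic ( W : natural := 4 );
    port (
        Clk      : in  std_logic;
        inc      : in  std_logic;
        gray_out : out std_logic_vector(W-1 downto 0)  -- only one bit changes per increment
    );
end gray_ptr;

architecture rtl of gray_ptr is
    signal bin  : unsigned(W-1 downto 0) := (others => '0');
    signal gray : unsigned(W-1 downto 0) := (others => '0');
begin
    process (Clk)
        variable nxt : unsigned(W-1 downto 0);
    begin
        if rising_edge(Clk) then
            if inc = '1' then
                nxt  := bin + 1;
                bin  <= nxt;
                -- register the Gray-coded value of the new count before it
                -- crosses the clock domain
                gray <= nxt xor ('0' & nxt(W-1 downto 1));
            end if;
        end if;
    end process;

    gray_out <= std_logic_vector(gray);
end rtl;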

-- glen

Reply to
glen herrmannsfeldt

Can you explain this claim? I can't think of any reason why metastability would be anything but linear with clock rate if the same slack time is available.

--

Rick
Reply to
rickman

formatting link

Note the straight lines on the log/linear graph in Figure 2.

For this particular family (V2P) the constants are such that with only a modest number of ns of slack, the expected failure rate drops to less than once in the lifetime of the universe.

Regards, Allan

Reply to
Allan Herriman

That's what I thought. You are confusing the time allowed for the metastable event to settle with clock speed. The two are not the same thing at all. I can lay out a circuit with a design goal of 100 MHz and end up with 2 ns slack on the critical route while I lay it out with a design goal of 50 MHz and end up with 1 ns slack on the critical path. The *only* thing that is important is the slack time where the output has time to settle.

It is interesting that the labeling of the two graphs and the text in the body of the document all describe this data differently and *none* of them can be interpreted as "slack" time by what I am reading. But I'm sure that is what they mean.

"Metastable measurement results are listed in Table 1 and are plotted in Figure 2. The time plotted on the horizontal axis includes the clock-to-out delay of QA, plus a short interconnect delay, plus the setup time at the input of the QC flip-flop"

"Cloc-to-Q + Setup + Metastable Delay (ns)"

"Cloc-to-Q + Setup Metastable Delay (ns)"

I expect the label in Figure 3 is just a typo. But given that the label doesn't make sense either way I'm not sure I care.

--

Rick
Reply to
rickman

Actually, it is much more likely that the problem you saw has absolutely nothing to do with metastability. What you described, and how you fixed it, sounds like either a clock domain crossing or possibly clock skew between the part generating the input signal and your internal design clock... in short, it is a setup/hold time problem. If what you added to fix your design is a single flip flop per input signal, then that also suggests the problem is not metastability. The point here is not to belabor terminology, but to help you understand it.

Any of your inputs that lead through logic that ends in a flip flop (which will be every input signal in nearly any design) will have setup and hold time requirements that must be met. To find out what your requirements are, review the timing analysis report for your design, where the setup and hold time requirements for each input will be listed. Those timing requirements are relative to some clock signal in your design. Do those signals actually get generated in the same clock domain? If so, is Tco + PCB prop delay + Setup time requirement + Clock skew less than the clock period? If those signals are not generated in the same clock domain, then ask yourself how that input is going to be able to meet the setup and hold time requirements. (Hint: The answer is that it will not.) This is the situation where you are crossing clock domains and the synchronizing flip flop that you added is needed.
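To put some purely made-up numbers on that: if Tco is 5 ns, PCB propagation delay is 2 ns, the setup requirement is 3 ns and clock skew is 1 ns, the total is 11 ns. That fits easily within a 50 ns (20 MHz) clock period, but would fail a 10 ns (100 MHz) one.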

The setup/hold time requirements will still not be met at the input to that flip flop, but that is 'OK', since the rest of your design uses the output of the flip flop, not the input. Now that synchronizing flip flop might misbehave (i.e. take a longer than normal time to settle) because the inputs did not meet the setup/hold time requirements, so the solution there is to add a second flip flop and have the rest of your design use the output of the second flip flop. The first flip flop's 'misbehavior' is what is called metastability. Generally, a two flip flop chain is all that is required to cross the clock domain and reduce metastability to an acceptably low rate.

The short answer to your question though is that whenever an input signal does not meet the setup and hold time requirements of your design, you will need to synchronize it first before using it elsewhere in your design.

Kevin Jennings

Reply to
KJ

If you clock a FF at time t, and you don't use the result until time t + Ts (Ts being your clock period), then the FF has that whole Ts-long period to resolve the metastability one way or another. That part will be exponential with Ts.

But yes, you would expect a linear part, too.

--

Tim Wescott 
Wescott Design Services 
http://www.wescottdesign.com
Reply to
Tim Wescott

By having lots of such inputs -- but see the other, more detailed answers for a better idea.

Any input that is guaranteed to be settled when you clock it in.

--

Tim Wescott 
Wescott Design Services 
http://www.wescottdesign.com
Reply to
Tim Wescott

You are forgetting about the clock-to-output time of the FF, the propagation time through the logic and routing, and the setup time of the next FF. These times all need to be subtracted from the clock cycle time, yielding the slack time. This is the only number that matters for resolving metastability.

The linear aspect comes from the probability of having a metastable event that needs to be resolved. Think of them as edge collisions between the clock and the data input. The more of them you have in a second, the more likely a metastable event.
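The usual back-of-the-envelope model captures both pieces (the constants are device-specific, so treat this as the shape of the curve rather than gospel):

MTBF = e^(t_slack / tau) / (T0 * f_clk * f_data)

The denominator (clock frequency times async-data edge rate times an aperture constant T0) is the linear "collision rate" part; the exponential in the slack time, with regeneration time constant tau, is why a little extra settling time buys orders of magnitude.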

--

Rick
Reply to
rickman

(snip, I wrote)

I will claim that it is ambiguous enough not to be wrong, but ...

In the long time (slow clock) limit, that is pretty insignificant.

People will say "My clock is only 10kHz, metastability can't be a problem", which is pretty much right.

But when clocking multiple signals with different delays, they can end up on different sides of a clock edge, even for very slow clocks.

-- glen

Reply to
glen herrmannsfeldt

I think by popular opinion, my issue is not metastability but rather a clock domain crossing problem, as many have pointed out. This explains why adding a single synchronizing FF fixed my issue, as Kevin pointed out. Also, there is an interesting point in the conclusion of Xilinx's XAPP094 stating that "Modern CMOS circuits are so fast that this metastable delay can safely be ignored for clock rates below 200 MHz." This also supports why I don't think it's a metastability problem, since my CLK rate is 20 MHz.

I guess what I am taking away from all this is that not all signals need to be synchronized with a FF, only those that fan out to multiple processes that need to be time aligned. This ensures that two identical processes have the same output given the same input.

Now the synchronizing FF's output can itself be metastable, in which case a second FF will reduce that probability significantly.

Am I on the right path, or completely out in left field?

PS: is there a way to attach a picture in this forum?

Thanks HV.


Reply to
hvo

??? Not being wrong doesn't sound the same as being right....

Not sure what that means in the real world.

It may be right, but is not relevant to the original issue.

--

Rick
Reply to
rickman

That is an interesting load of BS.... er, I mean "opinion". The question is how you define "modern CMOS circuits". They are talking about their own specific FPGA circuits, and the claim may have been true at the time. Someone pointed out to me in another thread that since that app note was written in 1997, Xilinx processes have changed considerably and the gain/bandwidth product has actually dropped. So please take the 200 MHz figure with a grain of salt and verify your metastable MTBF before deciding to ignore it.

Besides, the extra FF almost never makes a difference to the design.

Metastability is the same thing as the multiple destination problem, but it can happen one FF later. If the FF being fed the async signal only feeds one other FF, you don't have much of a problem. But if it feeds multiple FFs, then it is a bigger problem, more likely to bite you. Still, it is easier to fix something that may not be broken than to analyze the issue to death.

BTW, the issue is *not* fanning out to multiple processes. It is driving multiple FFs. If the input is driving a FF that controls the enable on a bunch of logic in the same process, this can be a problem.

No, but you can post a picture somewhere and give a link.

The way I look at metastability is to imagine the output of each FF that may go metastable as oscillating for a few ns after the clock. So either add another FF to resolve the metastability before the rest of the logic sees it, or verify there is sufficient settling time in the existing path by adding a timing constraint. This slack time is the parameter that is in the exponential term for MTBF, so a little bit goes a long way.
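If you are in Xilinx land, it is also worth marking the pair so the tools know they form a synchronizer and keep the flops close together. As I recall the attribute syntax, something like this (my sketch; it does not replace a timing constraint, it just labels the flops, and the body is the same two-flop chain shown earlier):

library ieee;
use ieee.std_logic_1164.all;

entity sync2 is
    port (
        Clk      : in  std_logic;
        async_in : in  std_logic;
        sync_out : out std_logic
    );
end sync2;

architecture rtl of sync2 is
    signal meta, ff2 : std_logic := '0';
    -- hint to Xilinx synthesis/placement that these flops are a synchronizer
    attribute ASYNC_REG : string;
    attribute ASYNC_REG of meta : signal is "TRUE";
    attribute ASYNC_REG of ff2  : signal is "TRUE";
begin
    process (Clk)
    begin
        if rising_edge(Clk) then
            meta <= async_in;
            ff2  <= meta;
        end if;
    end process;

    sync_out <= ff2;
end rtl;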

There are tons of other references to explain this stuff. Do a web search and you'll find more than you can read.

--

Rick
Reply to
rickman

(snip)

(snip)

Yes it is exponential, off of a pretty small time. If you are careful, you might get your slack time within about 5% of the clock period. Most likely, there is at least 5% already in the times given.

Much of the delay changes with process, voltage, and temperature, so they have to build in some margin for all those. If you are within 5%, then the time for an additional FF to resolve is about 20 times longer.

Now, consider increasing the clock period by a factor of 10. If you still have a slack time within 5%, the slack time is now also 10 times longer.

About the only way it could fail is using the favorite trick of overclockers: turn up the clock frequency until it fails, then back down just a little bit. But then it is more likely that it fails by not meeting timing that you didn't check for than by metastability.

The other way is with a metastability locked-loop. I might not have invented it in discussion here some years ago, but I believe I first named it. That is a circuit that adjusts the phase of one clock such that it comes as close as possible to the actual metastability point, similar to the way PLLs lock onto a frequency.

-- glen

Reply to
glen herrmannsfeldt

Your original problem was most definitely *not* metastability. However, mitigating the probability of metastability is still worthwhile. It's important to understand the mechanisms involved. From a simple perspective, you can consider that any flip-flop has a "window" near the sampling clock edge where metastability can happen. For modern CMOS, that window is very small, probably less than 1 ps. In any case it's *much* smaller than the window you normally try to stay out of between setup and hold when using synchronous logic.

The chance of getting a metastable event at the first flip-flop when introducing an asynchronous signal is simply the probability that an edge of the incoming signal falls within this metastability window. Note that the expected failure rate is related both to the clock rate, which determines how often in time a window is "open", and to the edge rate of the incoming signal.

Now we come to why you want a second flip-flop. A metastable event has the effect of increasing the clock to output timing of the first flip-flop. There is theoretically no upper bound on the amount of time that the event can last, but the chances of the event lasting any particular length of time go down *very* quickly as the length of time goes up. In real world applications, there are secondary processes (mostly system noise) that "help" an event to end, in a way similar to a coin standing on edge on a bar where there are a lot of patrons picking up and setting down mugs. In any case you can see that you want "slack" time in the path from this first flip-flop to all other synchronous elements.

The second flop is an easy way to add a lot of slack in the path. However it has a secondary impact on the chance of failure. When the first flop has an event that increases its time such that all subsequent flops no longer meet setup requirements, your circuit will fail. With the second flip-flop in place, instead of having an upper bound after which the circuit will fail, what you need for failure is an event that causes the second flip-flop to go metastable. This means that instead of the probability of an event being greater than "x", you are now looking at the probability of an event being exactly "x" +/- something very small. So even if the first path doesn't have the slack to prevent a metastable event from violating the setup/hold of the second flop, the system won't actually fail unless the event is within a very small range. This dramatically improves the MTBF.

Now deciding whether you really need a second flop depends on requirements for MTBF and the amount of slack you can give between the first flop and all of its loads. At a low clock frequency it's likely that you can ensure enough slack that you don't need the second flop to meet the MTBF requirements. A slower clock also means that you add more delay by inserting another flop. If latency is an issue, you probably don't want to do that. It's a bit counterintuitive, but in this case you could actually improve MTBF without adding delay by using a second flop on the opposite clock edge, assuming you can meet timing to subsequent flops in 1/2 clock period.
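A sketch of that opposite-edge arrangement (names invented; assumes the downstream logic runs on the rising edge):

library ieee;
use ieee.std_logic_1164.all;

entity half_cycle_sync is
    port (
        Clk      : in  std_logic;
        async_in : in  std_logic;
        sync_out : out std_logic
    );
end half_cycle_sync;

architecture rtl of half_cycle_sync is
    signal meta : std_logic := '0';
    signal ff2  : std_logic := '0';
begin
    -- first stage samples the asynchronous input on the rising edge
    process (Clk)
    begin
        if rising_edge(Clk) then
            meta <= async_in;
        end if;
    end process;

    -- second stage on the falling edge: the first flop gets half a period to
    -- settle, and downstream rising-edge logic sees sync_out with the same
    -- latency as a single flop, provided it meets timing in half a period
    process (Clk)
    begin
        if falling_edge(Clk) then
            ff2 <= meta;
        end if;
    end process;

    sync_out <= ff2;
end rtl;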

--
Gabor
Reply to
GaborSzakacs

I suspect that the paper (which doesn't sound very thorough) is presupposing that you take a design with a given propagation delay, and just start turning the frequency down on the clock.

--

Tim Wescott 
Wescott Design Services 
http://www.wescottdesign.com
Reply to
Tim Wescott

On 2014-12-10 hvo wrote in comp.arch.fpga:

But why would you want to save a few FFs? Are you running short? Or is the added delay too much for you? Adding the FFs can also help the tools place your design more easily.

Okay, my current design is in a luxury position. I needed the Xilinx Zynq for its dual-core processor and a little bit of FPGA. I'm left with a huge amount of unused FFs even after using triple synchronizers on all inputs.

--
Stef    (remove caps, dashes and .invalid from e-mail address to reply by mail) 

Children are like cats, they can tell when you don't like them.  That's 
when they come over and violate your body space.
Reply to
Stef

Much easier than increasing the propagation delay. If your clock is 1 MHz, it takes a lot of logic and routing to make a significant fraction of that in propagation delay.

-- glen

Reply to
glen herrmannsfeldt

I agree with everything you've said up to this point.

This all sounds right except the part about the noise. Is there a way to prove that? I have never heard an analysis that proves that noise has any real impact on the rate of resolution of metastability. I'm not saying this is wrong, I just haven't seen anything like a proof.

I don't follow this at all. The same reasoning applies to one case as the other, and any metastability resolution time of the first FF greater than "x" will make the second FF go metastable. However, note that the second FF also has a chance of resolving before time "y", before impacting functional circuitry. The problem is mitigated both by the fact that both FFs have to persist in a metastable state and by the fact that the slack time for the "empty" path is typically *much* longer than needed to resolve the metastable event and *not* propagate it to the second FF, other than as an exceedingly rare event (billions of operating years).

I don't agree. You are better off with one looooong period than two short ones. Part of the reason is that the added FF has delays that subtract from the slack time, but more importantly the exponential term is multiplicative, er... exponential actually, while having two delays is just additive. e^2n >> 2*e^n

--

Rick
Reply to
rickman
