Metastability mitigation and I/O registers

- R
- RCIngham
  
posted
10 years ago

Thu, Jul 18, 2013 9:13 AM

This is a design targeted at a Microsemi ProASIC3 device, but I expect that the answer should be technology-independent.

I have to input a bunch of discrete signals to send over a telemetry link. Therefore I will be doing the usual 2-register metastability mitigation. Should I pack the first register into the I/O cell, as per usual bus-related practice, or would it be better not to do that (timing is completely asynchronous) in the hope that the pair of registers will be placed closely together and hence improve the overall metastability MTBF?

I haven't found anything on this in the Actel/Microsemi web-site or on C.A.F., but maybe my google-fu is lacking today.

Thanks in anticipation, Robert


- C
- Chris
  
posted
10 years ago

Thu, Jul 18, 2013 12:14 PM

It shouldn't matter where you put the register. Once the signal is in your clock domain, the P&R tools should either ensure timing is met from reg to reg or flag a timing error, provided things are constrained correctly. I would pack the reg at the I/O buffer.

- G
- GaborSzakacs
  
posted
10 years ago

Thu, Jul 18, 2013 2:15 PM

The only reason why it would make a difference is if the IOB flops either had much poorer metastability performance (larger metastable window and/or longer metastable event time), or if you cannot get a short enough path to the next fabric flop from the IOB flop to leave enough slack for a metastable event. To some extent this depends on your clock frequency, since it's generally easier to get more slack for metastability to resolve when you have a longer clock period.

I think only Microsemi could answer the question about the metastability performance difference between an IOB flop and an internal flop. Presumably you can get all the path timing details you need from the tools.

--
Gabor

- C
- Chris
  
posted
10 years ago

Thu, Jul 18, 2013 3:28 PM

The data sheet for the ProASIC3 shows 260 to 350 ps of setup time for the I/O buffer register, depending on device speed grade, while the VersaTile (internal FF) setup is 430 to 570 ps, depending on the speed grade.

Again the tools will warn you if you are not meeting the setup time into the second flop.

I would use the io buff reg.

- G
- GaborSzakacs
  
posted
10 years ago

Thu, Jul 18, 2013 4:00 PM

I think you're missing two points about metastability here.

1) The input setup and hold times only define the window where the input data transition is not guaranteed to show up on the output at the associated clock edge. The actual window where an input data transition results in output metastability is usually much smaller than this, perhaps on the order of 1 to 10 ps. The fact that the two flop types have similar setup and hold timing does not necessarily mean that they are designed the same or that they have similar metastability characteristics.

2) The tools only guarantee that you meet setup at the second flop when there is no metastability at the first one. If you want to allow extra settling time to cover most metastable events, then you need to ensure there is additional slack in this path. I know you can do this with Xilinx tools, but I'm not familiar with the Libero tools. In any case, if metastability is really a concern, it would be good to know how much slack you need to cover, say, 99.9% of metastable events on the input flop so you can apply an adequate constraint to the path between flops.
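For reference, the relationship between slack and failure rate is usually written with the standard synchronizer MTBF model below. The symbols are not from this thread: t_r is the settling time available before the second flop samples (clock period minus clock-to-out, routing and setup), tau is the flop's resolution time constant, T_0 is a device-specific constant related to the width of the metastability window, and f_clk and f_data are the clock frequency and the input transition rate. Treat it as a sketch of the usual model, not as vendor data.

```latex
\mathrm{MTBF} = \frac{e^{\,t_r/\tau}}{T_0 \, f_{clk} \, f_{data}}
```

Because t_r sits in the exponent, every additional tau of slack multiplies the MTBF by e, which is why a small window and a little extra slack go a long way.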

--
Gabor

- C
- Chris
  
posted
10 years ago

Thu, Jul 18, 2013 4:20 PM

Gabor, yep on the first point: I was assuming that the I/O buffer had better performance based on the setup time only, but you are more correct in that it may not have better performance from a metastability standpoint... Although I would still assume it if there is no other data from Actel. I didn't see anything else in my one-minute search. On the second point, I can see your point: if you do hit a metastable event on the first flop, you need something to ensure that either the reg delay or path timing would keep you from propagating through to the next flop, and I am not sure the tools would tell you this. This is the case whether or not you use an I/O buffer or internal FF. I still stand by my suggestion that, with the info I saw, I would pack the buffer with a reg.

- J
- jonesandy
  
posted
10 years ago

Thu, Jul 18, 2013 5:52 PM

Robert,

Use a DDR input register. The PA3E DDR input register's falling-edge sample is re-registered in the IOB on the rising edge of the IOB clock, so it is already a two-stage synchronizer/meta-rejector, albeit at twice the clock frequency. The DDR input register has a dedicated connection (and delay) between the two registers, so the delay is immune to P&R.

If your clock rate is near the high end of the range for an IOB register, I would recommend a standard IOB register for the synchronizer and a fabric register for the meta-rejector, with a timing constraint on the path to ensure sufficient margin for metastable events.

Andy

- R
- rickman
  
posted
10 years ago

Thu, Jul 18, 2013 6:05 PM

Are you doing any logic at all on these data signals? If not, I can't imagine there wouldn't be enough delay to provide metastability protection. If you aren't doing any logic on these signal chains, why would it matter if one of the bits goes meta stable?

Think of it like quantum mechanics. The sample that was taken at the time of a transition doesn't really "know" if it is a one or a zero. It only then matters when you "look" at it with logic. If you are just sending the bit somewhere else I can't see how the fact that it is not resolved by the next clock edge can have an impact on anything. It just makes the next stage metastable and you have a second shot at resolving it.

How fast is your clock and your input transition rate? That will help determine how much slack you need to resolve the metastability to a vanishingly small probability. I believe you should be able to spec the slack time in the path you are talking about. Then you don't care where the FFs are, just that the routing meets your timing requirement.
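As a rough illustration of how clock rate and input transition rate set the slack requirement, here is a minimal Python sketch using the common model MTBF = exp(t_r / tau) / (T0 * f_clk * f_data). The function names are made up for this example, and the tau and T0 values are placeholders, not ProASIC3 figures; real values have to come from the vendor's characterisation data.

```python
import math

def mtbf_seconds(t_r, f_clk, f_data, tau, t0):
    """MTBF in seconds for settling time t_r (common exponential model)."""
    return math.exp(t_r / tau) / (t0 * f_clk * f_data)

def required_slack(target_mtbf, f_clk, f_data, tau, t0):
    """Settling time in seconds needed to reach target_mtbf (same model, inverted)."""
    return tau * math.log(target_mtbf * t0 * f_clk * f_data)

# Placeholder device constants (assumptions for illustration only).
TAU = 100e-12   # resolution time constant, seconds
T0 = 1e-10      # metastability window constant, seconds

SECONDS_PER_YEAR = 365.25 * 24 * 3600
target = 1000 * SECONDS_PER_YEAR  # "a lot of years"

for f_clk in (10e6, 100e6):
    slack = required_slack(target, f_clk, 1e6, TAU, T0)
    print(f"f_clk = {f_clk / 1e6:.0f} MHz: need about {slack * 1e9:.2f} ns of settling time")
```

Note that the required slack grows only logarithmically with the clock and data rates, while the available slack shrinks linearly with the clock period.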

--

Rick

- G
- glen herrmannsfeldt
  
posted
10 years ago

Thu, Jul 18, 2013 6:30 PM

(snip)

This should be true, but not usually a problem.

The reason for the double FF's is that if the first goes metastable, you have almost the whole clock period for it to settle before the clock edge on the second one.

In a normal system, you might have 80% or 90% of the clock period in logic delay between two FFs. If one went metastable, in addition to the logic delay, it could miss setup for the next. As settling is exponential, with minimal delay you have from five to ten times as long, which usually gets you pretty far out on the time scale.

As far as I know, you can't design a system such that the minimal logic path is still too long, but if you did a third FF should be enough. Well, I suppose if the tools put a really long routing delay in there it could do it.

I have never done a design where the most logic between FFs was less than two CLBs. If you manage that, I might believe it.

It is usual for FPGAs to have zero hold time. That is, the output of one can't change too fast for the input of another on the same clock. (Mostly that is because of the minimum amount of logic and routing between them.) Metastability only happens after the clock edge, not before.

Just to be sure, since I am not sure the OP mentioned it, the question is really about metastability, and not clocking of a multiple bit register, such that some bits go on one clock edge, and some on the next. That also occurs when crossing clock domains, causes systems to fail, but is not related to metastability.

-- glen

- G
- GaborSzakacs
  
posted
10 years ago

Thu, Jul 18, 2013 6:59 PM

That's a good point, and you can often get by with a single flip-flop to register the input, followed by a timing constraint on paths from this flop's output that gives some extra margin for metastability to resolve. This is especially easy if you have a slow system clock, where it also helps to reduce latency. I rarely think about metastability in my designs because it happens so infrequently, and I usually assume people concerned about it have some requirement for very high system reliability (my designs mostly deal with video, where a bad pixel here or there isn't noticeable). Beginners in particular often suspect metastability as soon as they have problems with asynchronous inputs, while in almost all cases the failures are due to registering the same asynchronous input with more than one flop, which causes race conditions and often FSM failure.

--
Gabor

- G
- Gabor
  
posted
10 years ago

Sat, Jul 20, 2013 3:33 AM

The point is that after the sampling flop you want to treat this signal as synchronous. Now suppose the output of the sampling flop goes metastable for longer than the slack time on its output path. This is where you can no longer treat this as a synchronous signal. If it goes to more than one downstream flop, then you have the possibility that one will sample the transition on this clock edge and the other won't sample the transition until the next clock. This is how FSMs go "zero-hot." Adding a second flop with no logic between reduces the chance of this happening downstream of the second flop by many orders of magnitude. This is because the metastability event from the first flop would now need to resolve right in the metastability window of the second flop in order to cause any further issue downstream. As I said before, this window can be extremely small, much smaller than the setup/hold window.

"Vanishingly small" may mean one thing to one person and another thing to others. If for example you can say that 1 ns of slack gets you to an event rate of 1 per month, is that "vanishingly small" or do you want to add that second flop and get to something like one event per millennium? I rarely use a two-stage synchronizer, but then again nobody's life depends on the operation of my logic designs.

--
Gabor

- G
- glen herrmannsfeldt
  
posted
10 years ago

Sat, Jul 20, 2013 4:23 AM

(snip on metastability)

Yes, but note that problem is still there even without metastability. If the transition is close to the clock, and the delay to the different FFs is (even slightly) different, they can clock the input on different clock edges.
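The race described here can be shown with an idealised model, sketched below in Python. It assumes ideal flops (no setup/hold window, no metastability) clocked at multiples of the clock period; the function name and the delay values are invented for illustration.

```python
import math

def capture_edge(transition_time, route_delay, clock_period):
    """Index k of the first clock edge (at k * clock_period) at which an
    ideal flop sees the new input value, given the routing delay to it."""
    return math.ceil((transition_time + route_delay) / clock_period)

T = 10.0        # clock period, ns
t_in = 19.95    # asynchronous input changes just before the edge at 20 ns

edge_a = capture_edge(t_in, 0.02, T)  # short route: arrives before the edge
edge_b = capture_edge(t_in, 0.10, T)  # longer route: arrives just after it

print(edge_a, edge_b)  # the two flops disagree for one full clock cycle
```

A transition that lands well away from a clock edge is captured on the same edge by both flops; only the difference in routing delay matters, which is why registering the input once and fanning out from that register removes the problem.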

FIFOs use gray code to resolve that problem.
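For readers who have not met the technique: asynchronous FIFO pointers are typically passed between clock domains in Gray code, so that only one bit changes per increment and a sample taken during a transition yields either the old or the new value, never a mixture. A minimal Python sketch of the encoding (helper names are illustrative):

```python
def to_gray(n: int) -> int:
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)

def bits_changed(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

# Plain binary: 3 -> 4 flips three bits (011 -> 100).
# Gray code:    3 -> 4 flips one bit   (010 -> 110).
print(bits_changed(3, 4), bits_changed(to_gray(3), to_gray(4)))

for i in range(15):
    assert bits_changed(to_gray(i), to_gray(i + 1)) == 1
```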

If the critical path delay is 80% of the clock period, the system can fail if the metastability time is 20% of the clock period.

If you have a 100 MHz clock, and one event per month, that is one in 2.5e14 clock cycles. As metastability resolves exponentially, with a full cycle in between it will fail about every 2.5e14 to the fifth power cycles, or about one in 4e57 months. That should be long enough for just about everyone.
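The arithmetic in that estimate can be checked with a few lines of Python. This is only a sketch of the reasoning as stated: it assumes the failure probability per cycle scales as exp(-t / tau) with no prefactor, so five times the settling time raises the per-cycle probability to the fifth power.

```python
cycles_per_month = 2.5e14          # 100 MHz clock, figure used in the post

# One failure per month with 20% of a period available for settling.
p_fail_20pct = 1.0 / cycles_per_month

# A full period is five times as long; exponential settling means the
# per-cycle failure probability is raised to the fifth power.
p_fail_full = p_fail_20pct ** 5

cycles_between_failures = 1.0 / p_fail_full
months_between_failures = cycles_between_failures / cycles_per_month

print(f"{cycles_between_failures:.2e} cycles, {months_between_failures:.2e} months")
```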

-- glen

- R
- rickman
  
posted
10 years ago

Sat, Jul 20, 2013 6:16 AM

Yes, this is all classic metastability stuff. If this is just data and is not used to control any FSMs or otherwise branches out to multiple FFs, metastability won't matter. That was my point about it only mattering when the value of the signal is "looked" at. If this is just data being clocked into another FF it just doesn't matter, the next FF just gives you more metastability resolution time before the signal reaches some point in the circuit where it does matter.

I think the term vanishingly small is pretty universal. That is why I didn't say "1 failure per month" or "1 failure per millennium". As to what numbers are required, that is up to the designer and the application. How much timing slack is required depends on the failure rate needed, the clock and data rates and the details of the logic family used.

At one point some folks at Xilinx made a pretty good case for standardizing on 2 ns since it gives you a 1 in a billion year failure rate (or something like that) with 100 MHz clocks and data. I don't recall the exact numbers, but they made it clear that anything longer than 2 ns was gravy for most designs in FPGAs. But more recently I have read here that the newer FPGA families are trending back in the other direction so that longer slack times may be needed for high end designs. But it doesn't seem like there is a "metastability voice" at the FPGA companies anymore. Do they even publish the relevant numbers for recent families?

--

Rick

- R
- RCIngham
  
posted
10 years ago

Mon, Jul 22, 2013 3:14 PM

Gentlemen all,

Thank you for an interesting discussion. Just to clarify a few points:

It's a ProASIC3 design, so not bleeding edge technology, but there is enough data on the Microsemi web-site to calculate MTBFs if the inter-FF delay is known.

I am assuming that each of these inputs is independent.

The system clock is < 10 MHz, but a "rule-of-thumb" that works for higher frequencies would be nice.

The design is flight safety-critical, so I would like the MTBF relating to metastability to be "a lot of years", but it isn't (so far) a quantified system requirement.

While I am sorting out the "build from a script, not the GUI" process, I think that I will add the appropriate sort of timing constraint to my trial sand-box design, and see where it breaks for both cases. I will then report back.

Many thanks, Robert


- R
- RCIngham
  
posted
10 years ago

Tue, Jul 23, 2013 3:01 PM

Well, as I couldn't work out how to specify the timing constraint(s) I needed, I did some post-layout simulations instead. I split the bunch of discretes into two, half with '-register yes' and the others with '-register no' in the 'set_io' constraint, then swapped the constraints over for a second build+simulate pass.

In both cases the arrival at the 'D' input of the second register was slightly faster and slightly more consistent in the '-register yes' cases. The time delta between groups was about 150ps, so it might be worth having. And I now have a possibly accurate CK-to-D time to put in my Metastability MTBF spreadsheet.
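To put the 150 ps difference in perspective: in the usual exponential model, extra settling time dt multiplies the MTBF by exp(dt / tau). The Python sketch below tabulates that factor for a few values of tau. The tau values are placeholders chosen for illustration, not ProASIC3 data, and the function name is invented here.

```python
import math

def mtbf_gain(extra_settling_s, tau_s):
    """Factor by which MTBF improves when settling time grows by extra_settling_s."""
    return math.exp(extra_settling_s / tau_s)

DELTA = 150e-12  # measured difference between the two register placements

for tau_ps in (50, 100, 200):  # placeholder resolution time constants
    gain = mtbf_gain(DELTA, tau_ps * 1e-12)
    print(f"tau = {tau_ps} ps: MTBF improves by a factor of about {gain:.1f}")
```

With a clock below 10 MHz the settling time available is already on the order of 100 ns, so the absolute MTBF is likely dominated by the clock period rather than by this delta.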

YMMV for other technologies.
