Should I worry about metastability?

Hi, I need to synchronize an incoming 27 MHz signal (50% duty cycle) with an internal clock running at 108 MHz (which is 27*4, but the signals do not have a known phase relationship). The target technology is Spartan-IIE.

Is a simple 2-stage DFF synchronizer a safe way to handle this? (I remember a Xilinx article stating that metastability can be ignored for clock rates < 200 MHz.)

Many thanks, Guy.

Guy Eschemann

I don't think you can ever ignore metastability. It will get up and bite you if you do.

Simon

Simon Peacock

Quick question, is your 27 MHz signal a single bit line or a bus?

If it's a single bit, I was taught that 2 stage FFs was "good enough," but I think it depends on how you're planning to use that single bit line. Is it merely some sort of status flag thing? It's not a data stream is it?

Now if your 27 MHz signal is a bus, then you should consider using an asynchronous FIFO (I think Coregen has a core for that, or if not, I'm sure there's a reference design on Xilinx's website). The problem with using 2 stage FFs with a bus is that some bits might go metastable, and get delayed one cycle, while others don't, so the bits on your bus get misaligned. For example, if you have a 4-bit bus that transitions from 0x0 to 0xF, on the 108 MHz side you might see it go from 0x0 to 0xE then finally to 0xF.
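To make the 0x0 to 0xF example concrete, here is a minimal Python sketch (a behavioural model, not HDL, and not from the original post). It assumes every bus bit has its own synchronizer and that the bits in a hypothetical `late_mask` happen to be captured one destination clock later than the others. The function name and the mask are made up for illustration.

```python
def observed_sequence(old, new, late_mask):
    """Values seen on three consecutive destination-clock edges when the
    bits in late_mask arrive one cycle after the rest of the bus."""
    # On the first edge after the change, the late bits still show the old value.
    intermediate = (new & ~late_mask) | (old & late_mask)
    return [old, intermediate, new]


# Bit 0 is the straggler in this (assumed) scenario.
seq = observed_sequence(0x0, 0xF, late_mask=0x1)
print([hex(v) for v in seq])  # ['0x0', '0xe', '0xf']
```

The intermediate 0xE was never driven by the source, which is why related bits need a FIFO or some other handshake rather than independent per-bit synchronizers.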

One important thing to keep in mind is that even if you have two "separate" "single bit" signals coming in, you might have to treat them as a bus if they are related to each other. Again, it depends on how you're planning to use them.

--Vinh

Vinh Pham

It might be a good idea to see what situation that < 200 MHz thing applies to.

Metastability can happen at any frequency. Say you have just a 1 Hz signal that you're resynching to just a 2 Hz clock. If the phase relationship is just right, your FFs can go metastable all the time.

Besides, bugs caused by metastability are a pain to track down. If you're unlucky, they'll occur with the customer's equipment, but not with the one in your lab. Worse yet, when you pay to ship the customer's equipment back to your lab, the bug mysteriously disappears. Prevention is far more valuable than a cure, in this case. :-)

--Vinh

Vinh Pham

See Vinh's response in regard to whether this 27MHz thing is one or more signals.

For a better understanding of metastability, I recommend:

formatting link

A simple 2-stage synchronizer MAY be sufficient. The primary requirement is that there be the maximum possible slack time in the path between the two FFs that make up your synchronizer. At 108 MHz, the cycle time is about 9.25 ns.

If the first FF has a clock to output delay of .5 ns, and the second FF has an input setup time of .5 ns, and the max delay of the routing between them is 1ns, then you have a slack of 7.25ns.

A 3 stage synchronizer would effectively have double this slack time (2 x 7.25 ns), but would add another cycle of latency.

If you don't control the routing delay, your efforts may be wasted.

For example, if you set the timespec to 9 ns for this path, the current Xilinx router will stop optimizing the path as soon as it has met this requirement, and you will have a miserable synchronizer because the slack time will be far less than 7.25 ns. Maybe as little as 0.25 ns. You need to set specific timespecs that cover this critical path, and set them far more aggressively than the clock frequency would indicate. I would recommend a target of 3 ns for the path, which, if the spec is met, will give you at least 6 ns of slack.
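The slack arithmetic in this post can be checked with a few lines of Python. This is only a sketch: the function name is invented here, and the path delays are the example figures from the post (0.5 ns clock-to-out, 1 ns routing, 0.5 ns setup), not Spartan-IIE data sheet values. Note that the exact period at 108 MHz is 9.26 ns, so the results differ from the post's rounded 9.25 ns figures by 0.01 ns.

```python
def synchronizer_slack_ns(clock_mhz, path_delay_ns):
    """Settling time left for the first FF: clock period minus the
    total FF-to-FF path delay (clock-to-out + routing + setup)."""
    period_ns = 1e3 / clock_mhz
    return period_ns - path_delay_ns


cases = {
    "0.5 + 1.0 + 0.5 ns path": 0.5 + 1.0 + 0.5,
    "path just meets a 3 ns timespec": 3.0,
    "path just meets a 9 ns timespec": 9.0,
}
for label, delay in cases.items():
    print(f"{label}: {synchronizer_slack_ns(108, delay):.2f} ns slack")
```

The last case is the one to worry about: a path that only just meets a 9 ns constraint leaves about a quarter of a nanosecond for the first flip-flop to resolve.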

The most useful thing you can do to help the tools is to put placement constraints on the two FFs of the synchronizer, and place them in the same CLB. This makes it far easier to meet the aggressive timespec that you must also specify.

The claim that metastability can be ignored below 200 MHz is NEVER (absolutely, categorically NEVER) true. (Can you remember which article you saw this in?)

Well designed synchronizers are your friend!

Philip

Philip Freidin Fliptronics

Philip Freidin

As Vinh says, once you have more than one related signal that you need to synchronize, the two flop method breaks down. Have a look at the following tutorial/paper that describes various ways to handle asynchronous design. I've posted it before, but it never hurts to do so again...

formatting link

- Paul Leventis Altera Corp.

Paul Leventis

My 27 MHz signal is a single bit line, not a bus. Actually it is the clock line for an 8-bit video data stream. My problem is to synchronize this incoming data stream with the 108MHz system clock.

I know I could use an asynchronous FIFO for this, but that might be overkill. In order to save resources, wouldn't it be better to synchronize the 27 MHz clock with my system clock (a 2-stage pipeline should do the job), then use some logic to detect the appropriate edge and use the output of the edge detector as a clock enable signal for the input data FFs, which are clocked by the system clock?

Regards, Guy.

Guy Eschemann

At the risk of proposing something you've already thought of, is there really no way to use the 108 MHz clock in the FPGA as the source of the 27 MHz clock that video data source uses, so that there is a known phase relationship between the two? If you can swing that, they'll be in the same domain and all you have to worry about is a non-varying setup time.

Marc

Marc Randolph

I think the frequency reference is talking about ignoring metastability once you have used the 2 FF sync trick. The sync circuit depends on the timing slack between the two FFs to allow the output of the first FF to settle. Above 200 MHz the slack time gets pretty short even if your routing is pretty good.

The other effect of frequency is on MTBF. This is a linear effect related to the product of the two frequencies. So when reading a pushbutton at 32 kHz, I would not worry too much about metastability. But at much higher clock rates without a sync circuit, it can be a problem as you say.

--
Rick "rickman" Collins

rick.collins@XYarius.com
rickman

Since your internal clock is 4 times the external clock, you should have no problem. Use the 27 MHz clock to register the data and sync the 27 MHz clock to the 108 using two FFs. The resulting synchronized clock will give a rising edge between a quarter and a half period of the 27 MHz clock and can be safely used to enable reclocking of the data. You should use an edge detector which will delay the clock one more quarter of a 27 MHz period and will still be safe.

               ---                                      ---    Sync'd
Data ---------|D Q|------------------------------------|D Q|------
27 MHz --+----|>  |                          ---       |   |    Data
         |     ---             +------------|   |      |   |
         |     ---       ---   |   ---      | & |------|CE |
         +----|D Q|-----|D Q|--+--|D Q|----O|   |      |   |
108 MHz ------|>  |     |>  |     |>  |      ---       |>  |
               ---       ---       ---                  ---

View this in a monospaced font.
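For anyone who wants to convince themselves of the timing, here is a rough cycle-level Python model of the circuit described above. It is a sketch under stated assumptions, not the poster's code: flip-flops are ideal (no metastability is modelled), time is measured in 108 MHz periods, the 27 MHz clock has a 50% duty cycle, and `phase` is an arbitrary non-integer offset so that edges never coincide. All function names are invented for this illustration.

```python
import math


def clk27(t, phase):
    """Level of the 27 MHz clock at time t (unit: one 108 MHz period)."""
    return ((t - phase) % 4.0) < 2.0


def data27(t, phase):
    """Word held by the data FF clocked on the 27 MHz rising edge.
    Word k is loaded at time phase + 4*k."""
    return math.floor((t - phase) / 4.0)


def simulate(phase, n_ticks):
    """Run the two-FF synchronizer plus edge detector for n_ticks
    108 MHz edges and return the words captured by the output FF."""
    ff1 = ff2 = ff3 = False
    captured = []
    for tick in range(n_ticks):
        # Clock enable: second sync FF high, edge-detect FF still low.
        ce = ff2 and not ff3
        if ce:
            captured.append(data27(tick, phase))
        # All FFs update together on the 108 MHz edge.
        ff1, ff2, ff3 = clk27(tick, phase), ff1, ff2
    return captured


for phase in (0.1, 0.5, 0.9):
    print(phase, simulate(phase, 40))
```

In this model every word is captured exactly once and in order, and the capture always lands two to three 108 MHz periods after the 27 MHz edge that loaded the word, well clear of the next edge.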

--
Rick "rickman" Collins

rick.collins@XYarius.com
rickman

That's not a metastability issue. It's a simple setup/hold time issue. Some of the bits get there before the clock and some get there after. Even if they all get there at the same time, the setup times will differ (slightly) from FF to FF or the routing will be different or ...

--
The suespammers.org mail server is located in California.  So are all my
other mailboxes.  Please do not send unsolicited bulk e-mail or unsolicited
Hal Murray

"worry" is an interesting word.

What are the costs of getting it wrong? You should worry a lot more if your design will go on an expensive satellite or will be controlling a nuclear reactor.

If you want your circuit to work correctly, you always have to pay attention when crossing clock domains. In many cases, it is easy to get it good-enough.

You can never eliminate metastability problems, just reduce the probability/MTBF.

By "good-enough" I mean that the chances of metastability causing troubles are so low that you should spend your time looking for problems in other areas. For example the MTBF might be the age of the universe.

I'd suggest finding that Xilinx article and understanding it. As others have said, the key is slack time. If you give the tools a chance, they can do stupid things. You need to supply the right constraints to make sure they don't.

--
The suespammers.org mail server is located in California.  So are all my
other mailboxes.  Please do not send unsolicited bulk e-mail or unsolicited
Hal Murray

You should always be concerned about metastability, whenever asynchronous signals are being synchronized. Let me add some numbers to Phil Freidin's excellent comments:

Metastability creates unpredictable additional settling delays (even oscillations can be considered delays to valid out). The probability of a specific max delay depends on the clock rate, the data rate, and the IC technology.

For Virtex-IIPro, we measured and extrapolated the following data for your case: If your data and clock were truly asynchronous (!), and you could guarantee 3 ns of slack between the two stages of your double-synchronizer, then you would get, statistically, one failure in a billion years. For every extra ns of slack, the MTBF would be a million times longer. (Since these numbers are for Virtex-IIPro, add an extra nanosecond for the slightly slower Spartan-II.) As you can see, with a little care in short routing between the two flip-flops, you need not lose any sleep.
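The rule of thumb quoted here (a billion years at 3 ns of slack, a factor of a million per extra nanosecond, one nanosecond penalty for Spartan-II) can be turned into a small calculator. This Python sketch only restates those figures. The function name and the `family_penalty_ns` parameter are invented for illustration, and extrapolating the rule below 3 ns or to other clock and data rates is an assumption, not something the post claims.

```python
import math


def mtbf_years(slack_ns, family_penalty_ns=0.0):
    """MTBF from the quoted rule: 1e9 years at 3 ns of effective slack,
    multiplied by 1e6 for every additional nanosecond."""
    effective_ns = slack_ns - family_penalty_ns
    return 1e9 * 1e6 ** (effective_ns - 3.0)


# Implied resolution time constant: a factor of 1e6 per ns means
# tau = 1 ns / ln(1e6), roughly 72 ps.
tau_ps = 1000.0 / math.log(1e6)

print(f"Virtex-IIPro, 3 ns slack: {mtbf_years(3.0):.0e} years")
print(f"Spartan-II,   4 ns slack: {mtbf_years(4.0, family_penalty_ns=1.0):.0e} years")
print(f"Spartan-II,   6 ns slack: {mtbf_years(6.0, family_penalty_ns=1.0):.0e} years")
print(f"implied tau: about {tau_ps:.0f} ps")
```

The exponential dependence on slack is the reason the routing between the two flip-flops matters far more than the number of stages.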

But you said that the clock was not truly asynchronous, but was 4 times the data rate, with an unknown (but stable?) phase relationship.

I would solve that with an adaptive circuit, driv>

Peter Alfke

Ah, that makes sense. Thanks rick.

Vinh Pham

True, thanks for catching that. Yeah, violating setup/hold doesn't automatically = metastability. I was using the word a little too liberally. "Oh your 401K doing poorly this year? Yeah that's metastability in action there."

Vinh Pham

Guy, this is the cheapest way to sync your incoming video (good job Rick) and it works (I have used this trick many times in the past).

But this is correct only for a duty cycle of >25% (high time). If you can't guarantee a high time of at least 1/4 cycle for your 27 MHz clock, you must create a 13.5 MHz clock derived from the 27 MHz clock and detect both edges of the 13.5 MHz clock with the 108 MHz clock (because if your high time is too small, there is some chance you will never see it with the 108 MHz clock).

(Please view in fixed-width font, e.g. Courier)

               ---                                      ---    Sync'd
Data ---------|D Q|------------------------------------|D Q|------
27 MHz --+----|>  |                                    |   |    Data
         |     ---                                     |   |
         |                                             |   |
         | +---------+                                 |   |
         | |   ---   |                                 |   |
         | +--|D Q|O-+---+                             |   |
         +----|>  |      |                             |   |
               ---       |                             |   |
         +---------------+                   ---       |   |
         |                     +------------|   |      |   |
         |     ---       ---   |   ---      |XOR|------|CE |
         +----|D Q|-----|D Q|--+--|D Q|-----|   |      |   |
108 MHz ------|>  |     |>  |     |>  |      ---       |>  |
               ---       ---       ---                  ---
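The duty-cycle caveat can be illustrated with another small Python model. Again this is only a sketch with ideal flip-flops and invented names. Time is in 108 MHz periods, the assumed narrow clock is high for 0.8 of a period (20% duty), and its phase is chosen so that the high pulse falls entirely between two 108 MHz edges.

```python
import math


def narrow_clk(t, phase, high_time):
    """27 MHz clock level with an arbitrary high time (period = 4 units)."""
    return ((t - phase) % 4.0) < high_time


def divided_clk(t, phase):
    """13.5 MHz level from a toggle FF (reset low) clocked by the
    27 MHz rising edges at phase, phase + 4, phase + 8, ..."""
    edges_seen = max(0, math.floor((t - phase) / 4.0) + 1)
    return edges_seen % 2 == 1


def count_enables(level_fn, both_edges, n_ticks):
    """Count clock-enable pulses produced by the 3-FF chain."""
    ff1 = ff2 = ff3 = False
    count = 0
    for tick in range(n_ticks):
        # XOR detects both edges, AND-NOT detects rising edges only.
        ce = (ff2 != ff3) if both_edges else (ff2 and not ff3)
        count += ce
        ff1, ff2, ff3 = level_fn(tick), ff1, ff2
    return count


direct = count_enables(lambda t: narrow_clk(t, 0.1, 0.8), False, 40)
toggled = count_enables(lambda t: divided_clk(t, 0.1), True, 40)
print(direct, toggled)  # 0 10
```

Sampling the narrow pulse directly misses every cycle in this scenario, while the divide-by-two version yields one enable per 27 MHz period because each level of the divided clock lasts a full four 108 MHz periods.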

regards fe

FE

Just out of curiosity, has anyone ever used the phase-shifting capabilities of the Xilinx DCMs to implement an adaptive clocking circuit to avoid metastability? Once the clocks are in phase, what's the standard drift, i.e. how often would it be necessary to verify if the phase relationship is still right? Are there any reasons not to do this?

Pierre-Olivier
-- to email me, remove the obvious from my address --

PO Laprise

Pierre-Olivier,

The simple answer is yes, the phase shifting feature has been used to avoid the switching region of the clock signal.

If the DCM is used for this purpose, it will automatically adapt for phase shift with temperature and voltage, so you won't have to.

There are fixed phaseshift modes where you find the best place to sample, and then change the phase constant, or modes where the phase shift is variable and you either train on-line (or off-line) and use the resulting phase shift that gave the best result.

If the clock and the data are really asynchronous, no amount of phase shifting will ever remove the possibility of metastability. But if the issue is one of known frequency, but unknown phase, then a self-training DCM phase shift design might solve the problem.

Austin

Austin Lesea


Although in most cases it is probably not enough to worry about, the phase might shift between the two devices due to differences in how they age, or how temperature affects them. If one moves one way, the other moves the other way, and the eye is small enough, it could spell trouble.

I wasn't responsible for the implementation, so I do not know the details, but we do/did have an application that required using a DCM to find the eye of the data (V2Pro wasn't going to be available in our time frame, so we could not use the Rocket IO).

The phase shift is set up to be controlled by software, and they shift it across a wide band, inspecting the "data error" bit as they go. Once they know where the good data eye is, they lock the DCM there.
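I don't know how that software was actually written, but the sweep-and-lock idea can be sketched in Python as follows. The sketch assumes the sweep has already been done and that the "data error" bit observed at each phase setting is available as a list; it then picks the middle of the longest error-free run. The function name is hypothetical, no real DCM interface is modelled, and wrap-around of the phase range is ignored for simplicity.

```python
def find_eye_center(error_at_setting):
    """Given one bool per phase setting (True = data error seen),
    return the index in the middle of the longest error-free run."""
    best_start, best_len = None, 0
    run_start = None
    # A trailing True acts as a sentinel that closes a run ending at the last setting.
    for i, err in enumerate(list(error_at_setting) + [True]):
        if not err and run_start is None:
            run_start = i
        elif err and run_start is not None:
            if i - run_start > best_len:
                best_start, best_len = run_start, i - run_start
            run_start = None
    if best_start is None:
        raise ValueError("no error-free phase setting found")
    return best_start + (best_len - 1) // 2


sweep = [True, True, False, False, False, False, False, True, True, False, True]
print(find_eye_center(sweep))  # 4
```

Locking at the centre of the widest clean window gives the most margin against later drift in either direction.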

The concept that the phase relationship could move over a very long period of time had been brought up, but since I wasn't directly involved with this, I do not know what solution was settled on. My suggestion was to use a second DCM and monitor circuit to do a new sweep every so often. If the monitor circuit finds that the main circuit is too close to the edge of the data eye, the main DCM would be shifted by a few digits to move it back close to the middle. The downside is that this requires the duplication of a non-trivial amount of circuitry in our case, not to mention chewing up DCMs pretty quickly if the number of interfaces gets very high.

Have fun,

Marc

Marc Randolph

Marc,

You are describing just one of many interfaces that I have seen or heard of.

Some use one DCM multiplexed across many others to find centers of bits (although this is tough to do as it requires multiplexed clocks).

Others are not concerned with finding the center more than once, as they characterized the total phase shift external to the FPGA (internal phase shift is already updated/corrected constantly by the DCM).

If it is V2 to V2, and two DCMs are used, one at the transmit, and one at the receive with a forwarded clock, then that again requires calibration once.

Margin is the key: the less margin you have, the more difficult it is to solve the problem, even with all the tricks you can possibly hide up your sleeves.

Austin

Austin Lesea
