100 Mbit manchester coded signal in FPGA

and also the very similar

formatting link

Problem is, a CPLD osc is not 'about the same': those devices use unbuffered inverters for the Xtal osc, and Schmitt post-buffers.

- no presently available CPLDs offer unbuffered inverters; they have loop gains even higher than buffered CMOS gates, which typically chain 3 inverters.

still leaves edge effects, which are not ideal on a clock source :)

-jg

Reply to
Jim Granville

I am not familiar with USB data recovery designs, but I have worked with Manchester encoding before. A 4x clock is on the hairy edge for data recovery. The way Manchester encoding works, you start the bit in the opposite state and transition to the correct state in the middle of the bit. You can also see a transition at the start of a bit if it matches the previous bit. To sample the data you have only half a bit time. You have to have a lockout period to prevent a false edge detection on the edge between bit cells.

Since you have a one clock cycle uncertainty in the timing of the detection, you have trouble timing the length of the lockout period for detecting the edge in the next middle of a bit cell. At 4x any timing error will result in eventual slips in timing. You need to assure that you clock just a bit faster than 4x so you never have the edge detector locked out when in the center of a bit cell.

The logic of a Manchester encoder or decoder is simple. But you have to be careful to sample fast enough to make it work. A 4x clock will not work reliably. A 4.1x clock will.
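The encode/decode scheme described above can be sketched in Python. This is a simplified model, not anything from the thread: the function names, the 8x oversampling ratio, and the "low-to-high edge means 1" polarity are all illustrative assumptions, and the decoder uses the 3/4-bit lockout described in the post.

```python
def manchester_encode(bits, spb=8):
    """Manchester-encode bits at spb samples per bit.
    Assumed convention (one of the two in use): a 1 is low->high,
    so the first half-bit is the complement of the bit value."""
    out = []
    for b in bits:
        first = 0 if b else 1
        out += [first] * (spb // 2) + [1 - first] * (spb - spb // 2)
    return out

def manchester_decode(samples, spb=8):
    """Edge-triggered decoder with a 3/4-bit lockout, per the scheme
    described above: each mid-bit edge yields a bit (edge direction =
    bit value), then edges are ignored for the next 3/4 of a bit time
    so the optional inter-bit transition cannot fire the detector."""
    lockout = (3 * spb) // 4
    bits, i = [], 1
    while i < len(samples):
        if samples[i] != samples[i - 1]:   # mid-bit transition found
            bits.append(samples[i])        # low->high edge decodes as 1
            i += lockout                   # suppress the inter-bit edge
        i += 1
    return bits

# Round trip at a comfortable oversampling ratio:
assert manchester_decode(manchester_encode([1, 0, 0, 1])) == [1, 0, 0, 1]
```

At 8x the lockout lands safely between the inter-bit edge and the next mid-bit edge; the thread's whole argument is about what happens to that margin at 3x and 4x.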

Reply to
rickman

Reply to
Peter Alfke

I don't agree. By ignoring the signal for 2 clock edges, you are actually delaying by up to 3 clock edges. You may have just missed catching the transition on edge 1, then you wait until edge 2 and 3. So the jitter in the measurement will put you (worst case) right at the next transition on your input signal and you may just miss that. If all clocks were perfect and there was no worry about transition timing on the edges, etc, then yes, I would agree you could do it with 3 clocks. But in the real world with imperfect clocks and signals you can miss an edge when the timing is borderline.

The same issue prevents you from working properly with 4 clocks because by waiting 2 clock edges you can improperly "see" the transition between cells and by waiting 3 clock edges you can miss a valid transition just as above. With 5 clock cycles to a data bit, you will be able to wait 3 clock edges and will always be assured of ignoring the interbit transition and "seeing" the mid-bit transition. 3/5 of a bit to 4/5 of a bit delay is always in the first half of the next bit to within 10% of the bit period or 1/2 a clock cycle. Not many systems will have that much jitter or timing distortion.

As to how much jitter is tolerable, I think this comes down to your metastability figures. I don't recall the exact number, but I believe it is the neighborhood of low 10's of ps. I have never worked with a system where the clock was that accurate or stable and the signals were reproduced that accurately. Heck, just a couple of extra pF on a signal line can skew the edges enough to add 100 ps of jitter and screw up a 3x or 4x clocked decoder.

I guess you could go to a fractional ratio between the clock and the data. At 3.5x, waiting 2 edges will give you 2 to 3 clocks of delay, which is 0.25 of a clock beyond the inter-bit transition to be ignored and 0.5 clock cycles before the next mid-bit transition. It is just the cases that try to be integer multiples of the bit rate that are problematic until you get to 5x. There really is no reason to run at integer multiples since you can never match the frequency exactly. So even 4.1x will give you enough margin unless the skew and jitter are pretty bad, which can happen depending on the transmission medium.
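The 3.5x margins quoted above check out with exact arithmetic. This is just the post's numbers restated; the variable names are mine:

```python
from fractions import Fraction

ratio = Fraction(7, 2)          # 3.5 clocks per bit
lockout = 2                     # clocks ignored after a detected edge

# Edge detection carries up to one clock of uncertainty, so the edge
# search resumes between 'lockout' and 'lockout + 1' clocks after the
# true mid-bit transition.
resume_min = lockout            # 2 clocks
resume_max = lockout + 1        # 3 clocks
inter_bit = ratio / 2           # inter-bit edge: 1.75 clocks after mid-bit
next_mid = ratio                # next mid-bit edge: 3.5 clocks after

assert resume_min - inter_bit == Fraction(1, 4)   # 0.25 clocks of margin
assert next_mid - resume_max == Fraction(1, 2)    # 0.5 clocks of margin
```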
Reply to
rickman

The specific approach (a digital differentiator looking at particular samples of the input stream, with a few cycles ignored) may be confusing, but if you don't wait to acquire the signals and instead analyze what you got, the concept flows.

If you can find a sampling such that you can guarantee at least one unambiguous sample in any half bit period - including jitter - then you can recover the data. The worst case for the 3x clock is when one clock samples the midpoint of one half-bit section of the data, leaving the front and back edges of the next half bit perilously close to the edges.

(view in fixed space font:)

      __        _____       __
   __/  \______/     \_____/

   Sample Points
      ^    ^    ^    ^

Depending on the jitter it might not work but the sample points are 1/12 of a full bit period from the data transition. If one of the sample points is on the hairy edge, the value *will* stabilize through standard handling of asynchronous signals.
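The 1/12 figure above follows from the geometry: at 3x the samples are 1/3 of a bit apart, and a half-bit section's edges sit 1/4 of a bit from its center. A quick exact check (variable names are mine):

```python
from fractions import Fraction

bit = Fraction(1)                  # one bit period
sample_period = bit / 3            # 3x sampling clock
half_bit = bit / 2

# Worst case sketched above: one sample sits dead-center in a half-bit
# section. Its neighbors land one sample period away, while that
# section's edges are only half_bit/2 away from its center.
edge_gap = sample_period - half_bit / 2
assert edge_gap == Fraction(1, 12)   # 1/12 of a bit period, as stated
```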

The minimum sampling frequency for guaranteed operation is determined by the minimum pulse width (less than half a bit period) degraded by the jitter of the edges (both leading and trailing) as well as jitter in the sampling clock. The DCMs may produce a large amount of (calculated) jitter that would be included in any DCM-based sampling clock analysis. But for a 100 Mbit/s data rate, the 5 ns period won't be degraded *that* much. While 300 MHz should work flawlessly, even a 250 MHz sampling clock might work if the duty cycle is well controlled. 5 ns half period, 750 ps pk-to-pk jitter: at least one of the sample points 4 ns apart should hit the meat of the half period. Fractional multipliers are just such a bother.
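Plugging in the numbers from the post (the 750 ps peak-to-peak jitter budget is the post's working assumption, not a measured figure):

```python
bit_rate = 100e6                     # 100 Mbit/s Manchester data
half_period = 1 / (2 * bit_rate)     # 5 ns minimum pulse width
jitter_pp = 750e-12                  # assumed peak-to-peak jitter budget

for f_sample in (250e6, 300e6):
    t_sample = 1 / f_sample          # 4 ns and 3.33 ns between samples
    # To guarantee at least one sample inside every half-bit pulse, the
    # sample spacing must fit within the jitter-degraded pulse width.
    assert t_sample < half_period - jitter_pp
```

With 4 ns between samples and 4.25 ns of usable pulse, the 250 MHz case squeaks by; 300 MHz has clearly more headroom.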

Metastability windows are now sub-picosecond, so all that needs to be worried about is standard synchronization for those one-in-a-million events that happen once every 5 seconds at 2x 100 Mbit/s.

Sample everything, let the logic figure it out.

The clock is extracted from the data rather than the data extracted from the clock.

There's even a Xilinx app note that describes how to extract data from wide busses above 600 Mbit/s without Rocket I/O. There's no 1.5x clock available at those rates!

Reply to
John_H

I don't agree that this is the worst case. The worst case is when the transition of the signal is at the same time as the clock edge. The signal can be sampled either in the leading or the trailing state. So the edge might be sampled right at the edge or 1/3 of the way into the bit time (2/3 of the first half of the bit). Your decoder has to sample either 0, 1 or 2 clocks after the edge is detected. It then has to wait a defined number of clocks before it recognizes another edge. This is to prevent the detection of the edges that can occur between bit times. If you select 1 clock for the lock out, the worst case is when you detect the mid-bit edge as soon as it happens. Then the lock out is too short. If you wait 2 clocks after detecting the mid-bit edge the worst case is when the edge is just missed by a clock and is not detected until the next clock. Then waiting 2 more clocks puts you at the middle of the next bit and the slightest amount of jitter will cause you to miss the mid-bit edge.

So any amount of jitter will cause failures with a 3x clock. You can use a 3.1x clock and will have less margin (0.06 clock periods) at the one extreme and 0.1 clock period margin at the other. But a 3x clock is not workable.

The problem is not that the signal becomes metastable, the problem is when *both* the edge detection and the next edge detection are close to metastable. Then you can miss an edge because of even microscopic amounts of jitter.

Your statements are not supported by the facts.

I agree that the metastability windows are small. But metastability has nothing to do with this analysis. I brought it up to show that the difference between an edge being detected in this clock cycle and the next is a very small amount of time and much smaller than the jitter in a typical clocked system.

Don't they use that logic in the Rangers?

Please tell how they do it!

Here are two example bit streams that I collected from my white board. Please decode them for me and explain the algorithm you used. Each one is clocked at 3x the bit rate...

0001101110010000 0000101111010000

An algorithm should be able to correctly decode these streams.

Reply to
rickman

Detection of edges that can occur between bit times was not understood (by me) as a necessary condition for Manchester decoding. If this is the case, the situation is more difficult. If the requirement did not include glitch filtering (as should be the case for 100 Mbit/s Manchester encoding) then sampling at the edge would leave the samples on either side fully 1/3 into the half bit period; only the edge sample would be ambiguous and would either widen the previous half bit or the next one in the new timing domain.

Jitter is not an issue. Glitches are an issue. If the logic levels are well behaved, my suggestions should hold. If the logic levels are not well behaved, it appears that finding and holding on to error-free manchester decoded data is an elusive task.

You aren't dealing with raw edge detection; you sample the signals and let any metastability settle out first, and edge detection is done on this sampled data. The constraint is that the sampling period MUST be less than the smallest half period degraded by both the transmit jitter and the receive clock jitter, adjusting for the sub-picosecond metastability window (which is smaller than any jitter consideration).

If there are glitches sampled in either half, you are correct that my statements are not supported by facts. If the signal is well behaved though degraded by jitter and duty cycle I stand by my understanding of sampling theory to suggest that I am correct. As long as you can get one unambiguous sample of the half period, you can extract the original data stream. Understanding when two samples are both the same half period versus two half periods from adjoining bits is where a tracking NCO or similar clock extraction is needed. If there is no NCO, just the immediate neighborhood of a few samples, then no - unambiguous decoding is not possible at 2.5x rates.

The difference of being on one side of the metastability window or the other is having one more sample in one half the bit period or the other. The edge detection is just moved by one sample, not ignored entirely.

formatting link

With this small sequence of bits, unambiguous decoding is still available:

0001101110010000
 10 01 01 10 10 01
  1  0  0  1  1  0

0000101111010000
 10 01 01 10 10 01
  1  0  0  1  1  0

The half bit periods are always identified by a 101 or a 010. The half bit periods from two adjacent bits are always identified by 111 or 000. On average, the bit rate is two half-bit pairs every three clocks.

Q.E.D. yet?

Reply to
John_H

I may have inverted my single-bit results. It looks like Manchester has the logic level in the second half of the bit, not the first like I thought. I checked wikipedia.org and saw my inversion; otherwise the idea still holds.

Without knowing the clock rate, Manchester encoding has pulses that are either 1/2 bit period or 1 whole bit period. You should get two tight ranges of count values that correspond to these intervals. In the 3x clock case, those two ranges overlap. It's by knowing when the pulses are definitely half or whole bit periods that the alignment can be determined and the frequency extracted without ambiguity.
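The run-length overlap claimed above can be demonstrated with a small phase-sweep model. This is my own sketch, not from the thread: the 0.3-sample jitter figure is an illustrative assumption, and the model simply counts how many sample points fall inside a pulse of a given width for every sampling phase.

```python
import math

def sample_counts(width, jitter):
    """Set of run lengths (consecutive identical samples) that a pulse
    of the given width, in sample periods, can produce - sweeping the
    sampling phase and applying +/- `jitter` to the pulse width."""
    counts = set()
    for i in range(200):
        phase = i / 200
        for w in (width - jitter, width, width + jitter):
            counts.add(math.floor(phase + w) - math.floor(phase))
    return counts

jitter = 0.3                                 # assumed, in sample periods

# 3x clock: a half-bit pulse is 1.5 samples wide, a full-bit pulse 3.
half_3x = sample_counts(0.5 * 3.0, jitter)   # {1, 2}
full_3x = sample_counts(1.0 * 3.0, jitter)   # {2, 3, 4}
assert half_3x & full_3x                     # ranges overlap -> ambiguity

# 5x clock: 2.5-sample and 5-sample pulses.
half_5x = sample_counts(0.5 * 5.0, jitter)   # {2, 3}
full_5x = sample_counts(1.0 * 5.0, jitter)   # {4, 5, 6}
assert not (half_5x & full_5x)               # ranges separate cleanly
```

Under this jitter assumption the count of 2 is reachable from both pulse widths at 3x, which is exactly the ambiguity being argued about, while at 5x the two ranges no longer touch.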

Reply to
John_H

Ok, that is what you are not getting. Time sampling does not produce an even distribution of pulses in the time-sampled domain with a 3x clock. The above sequences both need to be decoded to the same sequence, 1001. The first sequence assumes "ideal" sampling with no timing ambiguities. The second sequence is what you might get if the sampling is done right on the important edges and a small amount of jitter messes up your data.

000_1101110010000
   ^ First bit = 1
00011_01110010000
     ^ edge between bits, ignore
000110_1110010000
      ^ Second bit = 1
000110111_0010000
         ^ Third bit = 0
00011011100_10000
           ^ edge between bits, ignore
000110111001_0000
            ^ Fourth bit = 0

If you don't understand how this was decoded, you need to go back to Wikipedia.

The second sequence has two places where the signal capture was on the wrong side of the edge because of jitter. So it is now not possible to decode and recover the correct data. The same thing can happen with a 4x clock. But as the clock gets faster than 4x there are no more multiples where you can't find a correct delay for ignoring the inter-bit edge.
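The failure mode argued here can be made concrete with a small model of the lockout decoder running directly on the 3x-sampled streams from the thread. The function name, the 2-sample lockout, and the "new level = bit value" convention are my assumptions, chosen to reproduce the walkthrough above:

```python
def decode_3x(samples, lockout=2):
    """Lockout decoder on an already-sampled stream (3x clock): each
    detected edge yields a bit (the new level is taken as the bit
    value), then the next `lockout` sample positions are ignored so
    the inter-bit edge is skipped."""
    bits, i = [], 1
    while i < len(samples):
        if samples[i] != samples[i - 1]:
            bits.append(samples[i])
            i += lockout
        i += 1
    return bits

clean    = [int(c) for c in "0001101110010000"]
jittered = [int(c) for c in "0001100110010000"]  # one sample displaced

print(decode_3x(clean))      # [1, 1, 0, 0] - matches the walkthrough
print(decode_3x(jittered))   # [1, 1, 1]    - one moved sample breaks it
```

One sample landing on the other side of an edge costs an entire bit, which is the crux of the argument against integer 3x and 4x clocks.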
Reply to
rickman

You may not notice but I did decode exactly what you show BUT I included the preceding and following bits as well. The inversion is the issue. Some references suggest the bit value is in the first half, some the second. This is pointed out later in the wikipedia article. I used the first half of the bit period for the data but you can see my bit half pairs are correct.

Of COURSE the sampling doesn't produce an evenly PACED distribution of pulses, it produces counts of 2 to 4 for the two half bits from adjacent values and counts from 1-2 for the isolated half bits. It happens that these overlap as I described. If you only have an all-zero or all-one pattern then no, you cannot unambiguously extract the data. Once you get any other data, the alignment is guaranteed.

You mentioned the sequences need to be decoded to 1001, yet you decoded 1100. At the 2x transmit output, the encoded sequence would be either 10100101 or 01011010 depending on your polarity. Is that what you were attempting to show? Or was it 1100?

Your "go back to Wikipedia" comment was uncalled for. If you don't understand what I'm trying to describe or vice-versa, it's a problem with the communication medium (to include the inaccuracies of English) and not that I'm brain challenged.

Embedded clocks are used all over. Successfully. But you don't see it. Are you right?

Reply to
John_H

Please let me be moderator, for this is interesting. Let's simplify the discussion by assuming no jitter and no glitches. Let's also agree that Manchester code is ambiguous in an all-1 or all-0 transmission, since they look identical. The decoder needs a 0-1 or 1-0 bit transition in order to start on the "right foot". That is all well-known. The question then is: can Manchester code be decoded with the help of a local clock that is, for example, exactly 3 times the bit rate, or are there severe limitations on the local clock frequency? The simple approach that triggers on a transition and then suppresses the next one, if it occurs at half-bit time, has a problem with a local 3x asynchronous decoding clock. John suggests a solution, and Rickman does not accept it. Round 2:

Peter Alfke

Reply to
Peter Alfke

First let me say that I am not trying to be rude in any way. If you read my posts and see something that you find offensive, I did not intend that. My comment below about reviewing Wikipedia was meant as a simple statement, not an insult. So I apologize for anything that is perceived as offensive. Please keep in mind that writing is very different from speaking. Since tone cannot be conveyed readily, words can be interpreted very differently depending on the tone you perceive.

For the technical issues... The inversion is not the relevant issue. If you had an algorithm that would decode the stream I gave you as the inverted data I would have accepted that. The problem is the timing. The way Manchester is decoded is to trigger a timer (it was a one shot back when I first worked on this problem) that will ignore any following transitions for approx 3/4 of a bit time. This gives you +/-1/4 of a bit time to allow for distortion and jitter in the signal. When you sample the incoming signal with a 3x clock or a 4x clock there are degenerate cases where the signal is sampled at the time it is changing, which adds a full clock period to the jitter. In both of these cases there is not enough margin to allow for this and you can get erroneous decoding.

Your analysis, if I understood it correctly, produced 6 bits of data when there were only four. I am also interested in the algorithm you used. It would be instructive if you gave us the detail of how you decode the bit stream.

Ok, I think I understand where the extra 2 bits came from. Somehow you assumed that the initial and final zeros were adjacent to ones and added extra edges that produced data. So we can ignore those edges and the other data looks good. But what was your algorithm? You need to have a method that can be implemented in logic. I am pretty confident that no matter what algorithm you choose, I can find a case where it won't work.

I think you are referring to the initial alignment. Manchester encoding is typically used in systems where the signal is broadcast over a radio or other analog medium which can have timing and amplitude distortions which can introduce erroneous bits. That is typically handled by sending a synchronization sequence of alternating 1s and 0s. This produces a pattern of transitions only at the bit center to assure proper alignment. The sequences I sent did not include any medium-induced distortions, so the initial transition was a bit center and was an appropriate transition to start your process.

Yes, the second bitstream produces a wrong pattern because of the jitter introduced. That is my point. You can decode the first bitstream because there is no distortion. But the second bitstream shows that the distortion introduced by sampling on the transition will give errors and cannot be avoided with a 3x or 4x clocking scheme.

As I posted above, I was not trying to insult you. I was suggesting that you do not understand how to decode a Manchester encoded signal and should check the references. Sorry that it sounded like an insult. I have no reason to insult anyone here and I apologize.

I don't understand what you are saying with this.

Reply to
rickman

I appreciate that you recognize the ineffectiveness of communication and that you're not intending to be rude. That helps.

If you choose to use a one-shot for the decoding, you are limited to a higher clock rate. There is more than one way to do a decode. The degenerate cases - all 1s, all 0s, repeating 0011 - can keep the data from *starting* a proper decode but cannot confuse the system once data *has* started.

For your sampling challenged stream using the second half of the bit pair for data:

0000101111010000

The first bit came from backing up the extraction in a sense suggested by Brian Drummond in the embedded clocks thread - retroactive decoding - along with the knowledge that the last half of the Manchester bit pair is 0. Similarly the first half of the bit pair at the end of your sequence absolutely *starts* with a zero. These known quantities weren't covered explicitly in the algorithm I've demonstrated but could have been.

The two bitstreams decoded identically in my example. You said 1001 but you showed 1100.

I'll have Verilog ready later. It's specifically for the 2x-4x (exclusive) case and - like any Manchester decoder - will have a lock delay based on the data and sampling conditions. This wide range means something about the rate must be known, but no precision in that knowledge. Simple RC oscillators could be used at both ends for a 3x sampler and work with this algorithm.

Manchester decoding with greater than 3x allows the sampling to be split into distinct halves where the error for sampling of N/2 and N-wide pulses in an Nx sampling scheme do not overlap. They may abut at lower values and higher distortion but they don't overlap, allowing simpler decoding schemes.

Embedded clocks work.

- John_H

Reply to
John_H

I can't say I understand your algorithm exactly, but try it on this example

00001100110010000

Can you describe your decoding in a way that can be implemented in logic? Even if it is a lookup table, it should be definable in logical terms.


Reply to
rickman

000  realign
000  realign
001  Manchester pair 0.1
100  Manchester pair 1.0
110  Manchester pair 1.0
010  Manchester pair 01.
000  realign
000  realign
------------
001 (assumed)  Manchester pair 0.1

1 0 0 1 1
------------

------- The rest of the post is just code ----------------------- Before simulation:

module Manchester (
    input            clk,
    input            reset,
    input            datIn,
    output reg [1:0] ManchesterPair,
    output reg       usePair
);

reg       r_reset = 1'b1;
reg       startup;
reg [4:0] rcv;
reg       short;
reg       long;
reg [1:0] bitStart;

always @(posedge clk) begin
    r_reset

Reply to
John_H

Ok, I thought this would fail and it did. This sequence is the same bit pattern as before, with one edge detected differently from jitter.

0001101110010000
0001100110010000
______^ - this should be pointing to the second zero after the 1->0 transition

Can you fix your algorithm to deal with this case? Do you see what I am referring to?

Reply to
rickman

 01 01 10 10
?00?10?11?01?00?
0001100110010000
 01\__/10 10
    ambiguous

1001

Thank you, rickman. Your persistence has shown me that a 3x solution probably cannot unambiguously decode a Manchester encoded signal. I would hope that running through the simulations would have pointed this out quickly and clearly. For the 3x case, the unambiguous pairs determined by the runs of three or more constant values decode fine in the first case but cannot guarantee a decode in the second, at least if the run of 4 is discounted for the moment; I feel ignoring it is necessary for the general case since phases will slip between the sampler and the transmit clock.

In the second example you gave above, which has the one key bit sampled on the other side of the edge, the pairs (also working backward from the end of the pattern) end up with a gap for which I cannot determine the appropriate bit.

Despite my early confidence, you appear correct.

My background dealt a lot with the CMI (Complementary Mark Inversion) encoding for 140 Mbit/s telecom data. This format requires a rising edge at mid-bit for a zero (always a 01 pair of half bits), while a one has no mid-bit transition at all (either a 00 or 11 half-bit pair). Transitions from a one to a one switch polarity, guaranteeing at least one transition per bit period, where falling edges *only* occur at the edge of the bit period. This encoding scheme appears to be easier to decode than Manchester at lower oversampling rates.
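The CMI rules described here are simple enough to sketch as an encoder. This is a minimal model of the description above; the starting level for the first one is an assumption (the real line code fixes it by the prior signal history):

```python
def cmi_encode(bits, start_level=0):
    """CMI as described above: a 0 is always the half-bit pair (0, 1),
    giving the rising mid-bit edge; a 1 holds one level for the whole
    bit, alternating between (0, 0) and (1, 1) on successive ones.
    start_level (the level of the first 1) is an assumption here."""
    out, level = [], start_level
    for b in bits:
        if b == 0:
            out += [0, 1]
        else:
            out += [level, level]
            level ^= 1          # alternate the level for the next 1
    return out

# A 0 always ends high and every bit starts low or holds, so falling
# edges can only land on bit boundaries - the property that makes CMI
# easier to frame than Manchester at low oversampling rates.
print(cmi_encode([0, 1, 1, 0]))   # [0, 1, 0, 0, 1, 1, 0, 1]
```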

I don't easily find a solution to the Manchester decoder that will work simply for a very wide frequency range (such as 10x) while handling small multiplier values without confusion. The limits for widest half pulse versus narrowest full pulse sample periods aren't so easy to define when the ideal multiplier is unknown.

I appreciate the "fun" in delving deeper into this subject than I normally would go. If you want something "like" Manchester but with better behavior, perhaps CMI is an encoding to consider.

- John Handwork

Reply to
John_H

I think CMI has a problem with four times oversampling. From G.703:-

CMI is a 2-level non-return-to-zero code in which binary 0 is coded so that both amplitude levels, A1 and A2, are attained consecutively, each for half a unit time interval (T/2). Binary 1 is coded by either of the amplitude levels A1 or A2, for one full unit time interval (T), in such a way that the level alternates for successive binary 1s.

For binary 0, there is always a positive transition at the midpoint of the binary unit time interval.

For binary 1: a) there is a positive transition at the start of the binary unit time interval if in the preceding time interval the level was A1; b) there is a negative transition at the start of the binary unit time interval if the last binary 1 was encoded by level A2.

So, a falling edge is the start of a symbol, no problem. The difficulty comes when there's a binary one symbol coded as (b) above, followed by a symbol of either type where the sampling point coincides with the rising edge, followed by a long string of binary zeroes. Until the next binary one comes along, it's not possible to resolve what the mystery bit is. I think if you wait for the next binary one you can resolve the mystery bit, but you'd need a fifo of depth equal to the longest string of zeroes you can receive.

With four times oversampling:

| 0 | 1 | X | 0 | 0 | 0 | 0 | 0 |

00110000011100110011001100110011
        ^^
Which one of the marked bits is wrong?

BTW, excellent thread guys. It's interesting that NRZ and RZ are much easier to recover with low sample rates. This is because with NRZ a transition means the start of a symbol, and ONLY occurs at the start of a symbol. With RZ, a rising edge means the start of a symbol, a falling edge the middle, and ONLY those places. Manchester coding, and CMI to a lesser extent, suffer from the problem that certain transitions can be either at the start or in the centre of a bit. If only they didn't have DC, NRZ and RZ would be better choices! It's easy to see why AMI coding is so popular.

Cheers, Syms.

Reply to
Symon
