Virtex 4 Cameralink DCM Limitation

- ees3dc

Thu, Jun 14, 2012 11:11 AM

I'm trying to capture data from a Camera Link (LVDS SERDES) source using a Virtex-4 in a mature product. I have ported the XAPP485 deserializer to V4 primitives (slightly different from Spartan-3A) and configured the DCM to run at 32 MHz.

The problem is that the LVDS-to-TTL receivers on the PCB cannot run at the 32 MHz x7 rate. The slow rise time means the signal barely reaches the 2 V '1' threshold at the Xilinx input.

I therefore need to reduce the incoming clock rate, but the DCM minimum frequency is 32MHz....

OK, I could reduce the Camera Link clock to 20 MHz, giving a bit period of 1/(20e6*7) = 7.14 ns. I have a 200 MHz clock on the board for IDELAY, so I could use both edges to oversample the Camera Link data (and the 20 MHz subclock, to ease data recovery), sampling every 2.5 ns. But this is going to be a nightmare to piece together (it can be done offline) and will require lots of storage...
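To put numbers on the oversampling idea, here is a quick back-of-the-envelope check in Python. It uses only the figures quoted in the post; the variable names are just for illustration.

```python
# Figures from the post: 20 MHz Camera Link clock, 7 bits per clock period,
# 200 MHz reference clock sampled on both edges.
pixel_clock_hz = 20e6
bits_per_word = 7
ref_clock_hz = 200e6

bit_period_ns = 1e9 / (pixel_clock_hz * bits_per_word)  # about 7.14 ns
sample_period_ns = 1e9 / (ref_clock_hz * 2)             # 2.5 ns using both edges
samples_per_bit = bit_period_ns / sample_period_ns      # about 2.86

print(f"bit period      : {bit_period_ns:.2f} ns")
print(f"sample period   : {sample_period_ns:.2f} ns")
print(f"samples per bit : {samples_per_bit:.2f}")
```

The ratio is not an integer (a little under three samples per bit), which is one reason reassembling the stream from asynchronous samples is awkward.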

Is there a more elegant way of capturing the data please?

The application is to take an image of a satellite on separation from the launcher in space. It's a nice-to-have, but I'd really like to make this work.

- Gabor

Thu, Jun 14, 2012 2:21 PM

Well it's more rework to your board, but the real way to do this is to bring the LVDS right into the V4 instead of using receivers. I've never used V4, but at least in V5 I found that I needed to use PLL's instead of DCM's for a reliable 7:1 deserialization. If you were going to convert signals to TTL levels, then it would make a lot more sense to use the National DS90CR288A chips instead of just receivers. Sampling with an asynchronous clock sounds like a nightmare.

-- Gabor

- carltonnbd

Thu, Jun 14, 2012 8:01 PM

ees3dc,

Some thoughts:

If you are at the frequency limit of your LVDS translation buffers, you will have no choice but to lower the clock frequency on the transmit input side so that the buffers can handle a Camera Link bit rate that is 7x the transmit-side and receive-side clocks. That is true regardless of whatever approach you take within the FPGA. If you must push it, and you want all the margin on the logic levels you can get, then because the IOB at the FPGA is a receiver, you could make the input buffer LVCMOS25, which has a VIH of 1.7 V instead of 2.0 V.

Check out the datasheet: the 32 MHz lower bound is for the clkoutx. For clkdv, it is 2 MHz. Hence you should be able to derive a clock which is 1:7 from the Camera Link clock using the DCM. Higher input frequency can be an issue, as the DCM jitter then comes into play, so be careful. I suspect it is one (of possibly many) reasons Xilinx had to introduce PLLs in the successors to the Virtex-4 when bit rates got beyond the ~300-500 Mbps range.

An issue you will have is that while this derived receive-side clock is for all intents and purposes the same frequency as the transmit-side clock, there is no (easy) way to guarantee they are in phase. As such, you will need to double-buffer the de-serializers for each of the 4 streams on the Camera Link so that nothing is missed. This is of course then followed by synchronizing to the derived receive clock.

Regards, Carlton

- Gabor

Fri, Jun 15, 2012 10:16 PM

snipped-for-privacy@gmail.com wrote: [snip]

I'm not sure I follow you on this one. If you look at the transmit diagram in the DS90CR287 data sheet, the clock signal looks like any other data line with a word of 1100011, i.e. it is high for 4 bit periods and low for three, and a data word starts in the middle of a clock high period. All of my Camera Link receiver designs treat the clock like a 5th data line. I deserialize it and use the 0->1 transition to predict a word starting two cycles later. In a Virtex 5 I have to play some games to route the clk pair to a PLL as well as the deserializer. I'm not sure if V4 has any similar restrictions.
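A minimal Python model of the alignment idea described above may help: the clock lane is deserialized like a data lane, and the 0->1 transition marks a word boundary two bit periods later. This is only a software sketch, not Gabor's actual HDL; the function names are made up for illustration.

```python
CLOCK_WORD = [1, 1, 0, 0, 0, 1, 1]  # clock lane pattern as described in the post

def find_word_start(clock_bits):
    """Return the index of the first word boundary after a 0->1 clock transition.

    The word is taken to start two bit periods after the rising edge seen on
    the deserialized clock lane.
    """
    for i in range(1, len(clock_bits)):
        if clock_bits[i - 1] == 0 and clock_bits[i] == 1:
            return i + 2
    raise ValueError("no rising edge found on clock lane")

def split_words(data_bits, start, width=7):
    """Cut one data lane into `width`-bit words beginning at index `start`."""
    usable = len(data_bits) - start
    return [data_bits[start + k * width: start + (k + 1) * width]
            for k in range(usable // width)]
```

For example, if capture begins three bits into a word, `find_word_start` on the clock lane returns the offset at which `split_words` can begin slicing each data lane into aligned 7-bit words.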

-- Gabor

- carltonnbd

Mon, Jun 18, 2012 4:01 PM

Hi Gabor,

If I understand your approach correctly, you are using the higher clock rate further into the chain. As such, your approach with detecting the rising edge makes perfect sense. There is no need for the second register set.

What I was referring to had the presumption that it is/was desired to transition to the slower clock. For example, perhaps right after the de-serializer. Because the de-serializer is always working when the link is active and because there is a phase shift relative to the slower clock, it is necessary to additionally register right after the de-serializer to permit alignment to the slower clock while at the same time not dropping or losing anything.

Also note that my comment about the DCM was backwards. I meant to mention a 1:7, not 7:1. For some reason I got on a 7:1 track. Whoops. To pull off a 1:7 will require 2 DCMs. The first operates in a mode to permit a lower (than 32 MHz) CLKIN frequency but outputs (a) a higher-frequency CLKFX clock compatible with the second DCM and (b) a regenerated slower clock. The second DCM receives the now compatible clock from the first DCM and creates the final overall 1:7.

Regards, Carlton

- Gabor

Mon, Jun 18, 2012 7:31 PM

It's been a while since I did that part of the design, but my recollection is that I gave up on the ISERDES and just ended up using input DDR flops at 3.5x the word rate. It was easier than trying to decipher the bit-slip business when I have no guaranteed pattern on the inputs (except the clock). The original design was implemented on Lattice ECP2 with their 4x gearbox and an output clock rate of 1.75x the word rate. In any case I put the data into a long (56-bit) register 2 or 4 bits at a time in the incoming clock domain (3.5x or 1.75x) and read it out 7 bits at a time with an unrelated clock guaranteed to exceed the word rate (a sort of FIFO).

My design had six of these inputs, so I needed to conserve clock resources as much as possible. In V5, one PLL does the work of 2 DCMs, and does it better with more jitter tolerance and less output jitter. In my case I needed to deal with the full frequency range of Channel-Link, 20 to 85 MHz, which also requires using the DRP port of the PLL to switch between high and low frequency range settings. At the low end, the PLL works down to 19 MHz. In any case I didn't have the OP's problem of slow receivers because I put the Camera Link input directly to the V5 with only some ESD protection diodes in between.
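The 56-bit register described above (narrow writes in, 7-bit reads out) can be modelled in a few lines of Python. This is a toy sketch only: the class and method names are invented for illustration, and it ignores the clock-domain crossing that a real FPGA implementation has to handle.

```python
class BitGearbox:
    """Toy model of a 56-bit buffer: 2- or 4-bit writes in, 7-bit words out."""

    CAPACITY = 56  # register length quoted in the post

    def __init__(self):
        self.bits = []  # oldest bit first

    def write(self, chunk):
        """Append 2 or 4 bits captured in the incoming (3.5x or 1.75x) domain."""
        if len(chunk) not in (2, 4):
            raise ValueError("chunk must be 2 or 4 bits")
        if len(self.bits) + len(chunk) > self.CAPACITY:
            raise OverflowError("reader is not keeping up with the word rate")
        self.bits.extend(chunk)

    def read_word(self):
        """Return the oldest 7 bits, or None if a full word is not yet buffered."""
        if len(self.bits) < 7:
            return None
        word, self.bits = self.bits[:7], self.bits[7:]
        return word
```

Because the read side is guaranteed to run faster than the word rate, it will sometimes find fewer than 7 bits waiting; the model returns `None` in that case rather than a partial word.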

Regards, Gabor

- langwadt

Mon, Jun 18, 2012 9:30 PM

Can't you just generate the 7x clock with the clkfx of a single DCM?

Shift data and clock into a register on that 7x clk, look for the transition on the clock, move the right bits from the shifter to a register, and generate your slow clock with a divider aligned with the data update.

-Lasse

- Gabor

Tue, Jun 19, 2012 6:59 PM

First, you really only need 3.5x rather than 7x if you use both clock edges. Second, a DCM is a bad choice because it cannot both multiply the clock by a number (other than 1 or 2) and also phase shift the clock to line up with the data eye. A PLL can multiply and phase shift at the same time. Cascading DCM's to get the phase shift is a poor choice because the FX output of the DCM adds a lot of jitter, making the second DCM prone to losing lock. You might be able to work around the phase shift problem using the IDELAY components of each input, though.

-- Gabor

- langwadt

Tue, Jun 19, 2012 10:01 PM

Agreed. My thinking was that, assuming the eye doesn't move around too much, you would only have to calibrate the IDELAY once.

But I'm not sure there's a guarantee that the fx output will always have the same phase alignment with respect to the input when you don't have feedback.

-Lasse

- carltonnbd

Wed, Jun 20, 2012 5:08 PM

Lasse,

The preferred way to do the clock derivation would be as Gabor suggests: use a PLL. Unfortunately for the OP, V4 only has DCMs.

The reason for the 2 cascaded DCMs is that the OP has a desired clock relationship which is out of bounds for a single DCM. The OP's desired link clock frequency is something less than 32 MHz, which means that the first DCM needs to be a maximum-range setup DCM. The problem is that the maximum-range DCM cannot directly derive a 3.5x or 7x clock based upon the limited input clock frequency. So for the first DCM, the clk2x output is used to double the link clock frequency.

The clk2x clock frequency is now of a value whereby a maximum-frequency setup DCM can be used to derive the overall 3.5x or 7x clock by using the clkfx and setting M and D to values which provide the overall desired frequency.

As far as phase relationship on this second DCM, if clk0 is piped back to clkfb, clkfx is supposed to be aligned to clk0, but with the twist that it is every D clkin cycles. If clk0 is aligned to clkin, clkfx should also be aligned to clkin.

Gabor is right that, depending upon the arrangement of the 2 cascaded DCMs, it is possible that the jitter can be out of limits. Generally speaking, it is when clkfx is cascaded or clkfx is part of the feedback path that problems can begin to arise. I believe, however, that the above implementation is OK with respect to jitter being within limits, because clkfx is only used as a final output and is not being fed back into the cascaded DCMs.
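The frequency arithmetic behind the two-DCM cascade can be checked with a short Python sketch. It assumes only what the post states (a CLK2X stage followed by a CLKFX stage with multiplier M and divider D); the function name is hypothetical, and whether a given M, D, or output frequency is legal for a particular DCM mode still has to be checked against the data sheet.

```python
from fractions import Fraction

def cascade_ratios(link_clock_hz, overall_multiple):
    """Return (clk2x_hz, M, D, clkfx_hz) for CLK2X followed by CLKFX = CLKIN * M / D.

    overall_multiple is the desired ratio to the link clock, e.g. 3.5 or 7.
    No device limits are checked here; this is arithmetic only.
    """
    clk2x_hz = 2 * link_clock_hz
    # The first stage already contributes a factor of 2, so the second
    # stage must supply overall_multiple / 2 as M / D in lowest terms.
    ratio = Fraction(overall_multiple).limit_denominator(32) / 2
    return clk2x_hz, ratio.numerator, ratio.denominator, clk2x_hz * ratio

for multiple in (3.5, 7):
    clk2x, m, d, fx = cascade_ratios(20_000_000, multiple)
    print(f"{multiple}x: clk2x = {clk2x} Hz, M/D = {m}/{d}, clkfx = {int(fx)} Hz")
```

With the 20 MHz link clock mentioned earlier in the thread, this gives M/D = 7/4 (70 MHz) for the 3.5x case and M/D = 7/2 (140 MHz) for the 7x case.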

Regards, Carlton

- langwadt

Wed, Jun 20, 2012 6:56 PM

But why not just use the clkfx directly? As far as I can tell it can do 7x.

You might even go crazy and try 14x and use both edges via input DDR flops. That way you get 4x data rate sampling and can realign at every edge, which is how most full-speed USB devices do it.
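The "realign at every edge" idea can be sketched in Python as follows. This is a simplified software model under the assumption of 4 samples per bit, not a description of any particular USB or FPGA implementation; the function name is invented for illustration.

```python
def recover_bits(samples, oversample=4):
    """Recover bits from an oversampled stream, resyncing on every transition.

    A counter tracks the position within the current bit. Any change in the
    sampled level resets the counter to zero, and a bit is taken when the
    counter reaches the middle of the bit period.
    """
    bits = []
    phase = 0            # assumed position of samples[0] within its bit
    prev = samples[0]
    mid = oversample // 2
    for s in samples[1:]:
        if s != prev:
            phase = 0    # transition: this sample starts a new bit
        else:
            phase = (phase + 1) % oversample
        if phase == mid:
            bits.append(s)
        prev = s
    return bits
```

Bits that arrive before the first transition may be missed or misplaced because the initial phase is only a guess; once an edge has been seen, each later edge corrects any drift of up to a sample or so.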

-Lasse

- carltonnbd

Mon, Jun 25, 2012 3:37 PM

Hi Lasse,

The problem is that the OP desires an incoming frequency less than 32 MHz. This means the maximum-range type of DCM has to be used in order to accept the sub-32 MHz clock input. The tradeoff is that clkfx is restricted in what frequencies it can generate. In this case the upper limit on clkfx for a maximum-range type DCM is below 3.5x, and therefore also below 7x, the input clock.

To illustrate the limit, try creating a DCM which uses the clkfx in the core generator for a V-4 target. Use an input frequency < 32 MHz.

Regards, Carlton