Virtex 4 Tapped Delay Lines

Hi

I was wondering if anybody could help. I'm looking for a way to create tapped delay lines on a Xilinx Virtex 4 without having to specify which logic slices should be used. I'm trying to create a time interval analyser to measure accurately (to within approx 500 picoseconds) the length of time between a start and a stop pulse.

If anybody has any simpler ideas of how to do this instead of using tapped delay lines I'd be very grateful. The alternative I'd been thinking of was to up the clock speed (currently at 10MHz) to say 250MHz and then have 8 phase shifts of this clock. When the start (or stop) pulse arrives it would only AND correctly with one set of phases, specifying which part of the original clock pulse it was in, but this might be prone to strange delays etc. through the AND gate.

Thanks a lot

Alastair

Reply to
al99999

Xilinx has a tapped delay line in each IOB for skewing the data path, and this can be between the input pad and the input FF.

"Virtex-4 modules have an IDELAY module in the input path of every user I/O. IDELAY allows the implementation of deskew algorithms to correctly capture incoming data. IDELAY can be applied to data signals, clock signals, or both. IDELAY features a fully-controllable, 64-tap delay line. Each tap delay is carefully calibrated to provide an absolute delay value of 78 ps independent of process, voltage, and temperature variations."

So the range is 5 ns, with 78 ps resolution.
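The arithmetic behind those figures can be checked quickly. This is only a sketch: it assumes the 200 MHz calibration clock mentioned later in this thread and that the 64 taps together span one period of that clock. The function names are made up for illustration.

```python
from fractions import Fraction

TAPS = 64  # taps in one IDELAY line, per the quoted text

def tap_delay_ps(ref_clock_hz):
    """Delay of one tap in ps, assuming 64 taps span one reference clock period."""
    period_ps = Fraction(10**12, ref_clock_hz)
    return period_ps / TAPS

def idelay_ps(tap, ref_clock_hz=200_000_000):
    """Nominal delay in ps for a tap setting in the range 0..63."""
    if not 0 <= tap < TAPS:
        raise ValueError("tap must be in 0..63")
    return float(tap * tap_delay_ps(ref_clock_hz))

print(float(tap_delay_ps(200_000_000)))  # 78.125
print(idelay_ps(7))                      # 546.875
```

The data sheet rounds the tap value to 78 ps, which is why 7 taps is usually quoted as 546 ps.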

So you are on the right track. I think with IDELAY and your example numbers, you would bring the input signal in on 8 input pins and set the delays from 0 to 3.5 ns. All the input flip-flops are clocked by your 250 MHz clock; you then decode the resulting 8-bit result to get your fine timing, and count the 250 MHz clock for coarse timing.

You can either double your resolution or halve the number of inputs if you also take advantage of the DDR capability:

At 250 MHz, your cycle time is 4 ns, and half cycle time is 2 ns, so a DDR input FF-pair, with IDELAY set to 0, gives you sampling at 0 and 2 ns. With IDELAY set to 7 (546 ps) you get sampling at 0.546 and 2.546 ns.
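To make the DDR sampling times concrete, here is a small sketch that lists the sampling instants within one clock period for a set of IDELAY tap settings. It follows the convention used above, where the IDELAY value is treated as a shift of the sampling instant, and it assumes the nominal 78 ps per tap. The function name and defaults are illustrative only.

```python
def ddr_sample_times_ns(tap_settings, clock_mhz=250.0, tap_ps=78.0):
    """Sampling instants (ns) within one clock period for DDR inputs.

    Each input contributes two instants: one per clock edge, half a period apart.
    """
    period_ns = 1000.0 / clock_mhz
    half_ns = period_ns / 2.0
    times = []
    for tap in tap_settings:
        offset_ns = tap * tap_ps / 1000.0
        times.append(round(offset_ns % period_ns, 3))
        times.append(round((offset_ns + half_ns) % period_ns, 3))
    return sorted(times)

print(ddr_sample_times_ns([0, 7]))
```

With taps 0 and 7 this reproduces the four instants given above; adding more inputs with other tap values fills in the remaining phases.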

I would also add some sort of calibration capability to the design such as being able to drive all the inputs from a reference signal (make the IOBs bidirectional), and then go through a calibration process to figure out the phase relationship to the effective 250 MHz clock edge, and the sampling time of the inputs. The result might feed a barrel shifter on the 8 bit code to handle the phase correction.

A totally different approach might be to use the SerDes blocks if you have them, with all the 8B/10B decoding and other protocol logic disabled. You would then get a stream of 20 or 40 bit words with the SerDes receiver running from the transmitter clock and have it set up for 2.000 gigabits per second. Again, you would need some sort of calibration process to correct for the arbitrary phase relationship between the TX clock and the deserialized RX data. Once you figure this out, it should remain stable till you reset/cycle the power.
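As a rough illustration of the SerDes idea, the sketch below scans a stream of deserialized words for transitions and converts bit positions to time, at 0.5 ns per bit for 2 Gb/s. It assumes 20-bit words with the most significant bit received first, which is an assumption and not something taken from the device documentation, and it ignores the calibration step described above.

```python
def find_edges(words, word_bits=20, bit_time_ns=0.5):
    """Return (time_ns, 'rise' or 'fall') for each transition in the bit stream.

    Assumes the MSB of each word is the earliest bit (hypothetical ordering).
    """
    edges = []
    previous = None
    index = 0
    for word in words:
        for shift in range(word_bits - 1, -1, -1):
            bit = (word >> shift) & 1
            if previous is not None and bit != previous:
                edges.append((index * bit_time_ns, "rise" if bit else "fall"))
            previous = bit
            index += 1
    return edges

edges = find_edges([0x000FF, 0xFFF00])
print(edges)                        # [(6.0, 'rise'), (16.0, 'fall')]
print(edges[1][0] - edges[0][0])    # pulse width in ns: 10.0
```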

Have fun,

Philip

Reply to
Philip Freidin

Quote: Xilinx has a tapped delay line in each IOB for skewing the data path, and this can be between the input pad and the input FF.

The range is 5 ns, with 78 ps resolution.

Ok, looking at the datasheet for IDELAY in fixed delay mode, the only output is 'O' which is the data output. Do I not need to be able to access the output of the tap multiplexer?

Quote: So you are on the right track. I think with IDELAY and your example numbers, you would bring the input signal in on 8 input pins, and set the delay from 0 to 3.5 nS , all the input flipflops are clocked by your 250 MHz clock, and then you decode the resulting 8 bit result to get your fine timing, and count the 250 MHz clock for coarse timing.

What do you mean bring the input signal on to 8 input pins? Physically wire up the input pulse to 8 of the virtex 4 IO pins?

Sorry, just a little bit confused!! I'd be grateful for any more detail you could provide on how to go about doing this.

Thanks!

Alastair

Reply to
al99999

Al, Philip gave you good advice: For each input pin, you can specify a delay from the pad to the O. The granularity (given a 200 MHz calibration frequency) is 78.125 ps, but each tap has its own non-cumulative error of about 15 ps. I would improve your accuracy by using 16 inputs, each having a different IDELAY value, so that you divide the 5 ns into 16 steps of 312 ps each (give or take a 15 ps non-accumulative error). The tap delays are unaffected by any jitter of the 200 MHz clock.

You interconnect all 16 inputs. When an edge comes in, it will be delayed differently in each IDELAY, and you use your 200 MHz clock to register a 16-bit input word which has ones on one end, and zeros on the other. It's then your job to find the transition point (look-up-tables are good for that), and that 4-bit binary value identifies the time as a fraction of your 5 ns timing (200 MHz).

This means you have an absolute time for the rising as well as for the falling edge, and the difference is your pulse width. Worst-case error is thus +/- one tap.

Peter Alfke, from home
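The decoding step in the 16-input scheme can be sketched in software to show the arithmetic; in the FPGA this would be a look-up table plus a coarse counter. The sketch assumes the captured bits are ordered from the smallest IDELAY to the largest, that `coarse_count` is the number of the 200 MHz clock edge that captured the word, and it ignores the case where the edge falls exactly on the boundary between two clock periods. All names are hypothetical.

```python
CLOCK_PERIOD_PS = 5000.0          # 200 MHz sampling clock
STEP_PS = CLOCK_PERIOD_PS / 16    # 312.5 ps between adjacent inputs

def decode_thermometer(bits):
    """Return (index, polarity) of the transition, or None if the word is uniform.

    bits[0] is the least delayed input, so it shows the new level first.
    """
    first = bits[0]
    for index, bit in enumerate(bits):
        if bit != first:
            return index, ("rise" if first == 1 else "fall")
    return None

def edge_time_ps(coarse_count, bits):
    """Approximate edge time: capture time minus the fine offset."""
    decoded = decode_thermometer(bits)
    if decoded is None:
        return None
    index, _polarity = decoded
    return coarse_count * CLOCK_PERIOD_PS - index * STEP_PS

rise = [1] * 5 + [0] * 11    # rising edge seen by the 5 least delayed inputs
fall = [0] * 12 + [1] * 4    # falling edge seen by the 12 least delayed inputs
start = edge_time_ps(10, rise)
stop = edge_time_ps(14, fall)
print(stop - start)          # pulse width in ps: 17812.5
```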
Reply to
Peter Alfke

Are the pin-captures within this 15ps window, or is that just the error of the delay elements themselves ?

This sounds like a good app-note...., Peter ?

Such an app note could also cover : a) If you use just one FPGA pin (eg existing PCB design), what are the alternatives ?

b) Trickiest portion of this, I can see, will be crossing the 'phase boundary' between the delay line capture, and the counter-capture. Edge detect flag could be as simple as Sample.0 Sample.15.

For the Calibrate Philip mentions, and this ease of edge detect, the delay block should be toleranced to be always greater than the clock - ie maybe 6ns for 200MHz.

The Clock can be scaled, to match the FPGA's ability to count/capture the edges - which will be related a little to the max time between edges - longer counters are slower.

c) Pattern detect might need to be single sample error tolerant. ie a pattern of 111110100000000 might occur ?
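One possible way to tolerate a single stray sample in point c) is to count the samples at the leading level instead of looking for the first change. This is just one option among several (a majority filter would be another), and the sketch below only compares the two decoders on the example pattern. The function names are made up.

```python
def first_change(pattern):
    """Index of the first sample that differs from the leading one, or None."""
    leading = pattern[0]
    for index, sample in enumerate(pattern):
        if sample != leading:
            return index
    return None

def count_leading_level(pattern):
    """Number of samples equal to the leading level.

    For a clean pattern this equals first_change(); an isolated swapped
    pair near the transition does not move the result.
    """
    return pattern.count(pattern[0])

clean = "111111000000000"
bubbled = "111110100000000"
print(first_change(clean), count_leading_level(clean))      # 6 6
print(first_change(bubbled), count_leading_level(bubbled))  # 5 6
```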

-jg

Reply to
Jim Granville

The 15 ps are what I remember as the difference between the ideal delay from pin to O vs the measured delay, because the taps are not perfectly equal. As a difference between further non-adjacent taps, this statistical error actually gets smaller. The total delay over the 64 taps is exactly 5 ns = one period of the 200 MHz clock. It is servo-controlled. The 200 MHz are allowed to vary by +/-10% (causing of course an inversely proportional change in tap delay), although that is not described in the data sheet. I could imagine calibrating this with a variable frequency input of

Reply to
Peter Alfke

Thanks for all your help. One quick last question: is it possible to internally connect the pins, or do I need to physically wire them up external to the FPGA? Thanks again

Alastair

Reply to
al99999

You need to look a bit further (which I admit is not easy, as the data sheet has to cover a massive amount of information, and if you don't know what you are looking for, it can be hard to find)

In the user's guide (ug070.pdf)

(Just documenting the I/O is from page 215 through 384)

on page 309 is figure 7-1 and it shows some of what is in the input part of the I/O tile (it doesn't show ISERDES or I/O standards selection, for example), but it does show the IDELAY and the DDR structure. Note that all the little muxes are config bits. The figure shows that the output from IDELAY can be used directly (O), and that it can also feed the input flip-flops.

Right, you physically wire the input pulse to 8 pins, or to 4 inputs if you use my IDELAY + DDR suggestion. Be careful of signal skew on your PCB.

This is tricky stuff. Unfortunately, getting the most out of what these chips have to offer requires a lot of study. 169 pages of user guide is a lot to read when looking for one specific detail, but the payoff of investing in learning this stuff is that it is easier to find next time :-)

Cheers, Philip

Reply to
Philip Freidin

You could bring the signal in on 1 pin, then set up 8 other I/Os as bidirectional, send the signal out on all 8, and bring it back in on those 8 with the IDELAY stuff. Doing this will make the external pins wiggle, so they would all have to be "no connection" externally.

Overall, I would not recommend this structure, as you will not have good control of the delay to each of the output circuits, and this would therefore add to the error in timing.

I think it is best to distribute the signal on your PCB.

Philip

Reply to
Philip Freidin

..but be aware of the load you're placing on the signal. The FPGA's pins have considerable capacitance. If you wire the signal to 8 pins, you could have 100pF loading the end of your line. This gives you a rise time of the order of 5ns if driven from 50 ohms. The timing might be skewed as the signal rises through the thresholds of the inputs.

Cheers, Syms.

p.s. I'm wondering if you could use some spare unbonded IOBs for this? Take the signal in. Distribute it with low skew (easy to say!) to the outputs of 4 or 8 unbonded IOBs. Use the input delay thingies? The same as previously suggested, but without using up real IOBs?
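For a feel of the numbers in the loading argument, here is a quick estimate that treats the driver and the pin load as a single lumped RC. That is a simplification (a real trace is a transmission line), and the function name is made up.

```python
import math

def rc_rise_time_ns(r_ohms, c_farads):
    """Return (time constant, 10-90% rise time) in ns for a lumped RC load."""
    tau_ns = r_ohms * c_farads * 1e9
    rise_10_90_ns = math.log(9) * tau_ns   # about 2.2 * RC
    return tau_ns, rise_10_90_ns

tau, rise = rc_rise_time_ns(50, 100e-12)
print(round(tau, 2), round(rise, 2))
```

With 50 ohms and 100 pF the time constant is 5 ns, and the 10-90% rise time comes out at about 11 ns under this model, so the skew concern is real at these timing resolutions.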
Reply to
Symon

After some thinking:

  1. You can divide the input load by using "zero-delay buffer chips" with up to 8 outputs and very little skew between them. And you can even compensate for the (assumed constant) skew between the outputs (see below).
  2. You can drive all IDELAYs from the fabric, using internal fan-out. Again, you can compensate away the routing delay differences.

The compensation is done by setting all IDELAY values to the same value, and then observing the parallel captured word. It should always be either all zeros or all ones. If it's different, change the responsible IDELAY accordingly.

Obviously, this compensation does not cover drift with temperature and Vcc.
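The compensation idea can be tried out in a toy simulation. Note that this sketch uses a measure-then-set variant, not the adjust-on-mismatch loop described above: it sweeps a reference edge in small steps with all taps equal, estimates each input's path delay from the captures, then picks taps that line every input up with the slowest path. The skew values, the 5 ps sweep step and all function names are invented for illustration.

```python
TAP_PS = 78.125  # nominal IDELAY tap with a 200 MHz reference

def captured_word(arrival_ps, skews_ps, taps):
    """Input i reads 1 if the rising edge reaches its flip-flop before the clock edge at t = 0."""
    return [1 if arrival_ps + skew + tap * TAP_PS < 0 else 0
            for skew, tap in zip(skews_ps, taps)]

def measure_delay_ps(index, skews_ps, taps, sweep_ps):
    """Estimate one path delay: the latest swept arrival time that is still captured as 1."""
    latest = None
    for arrival in sweep_ps:
        if captured_word(arrival, skews_ps, taps)[index]:
            latest = arrival
    if latest is None:
        raise ValueError("input never captured the edge; widen the sweep")
    return -latest

def compensate(skews_ps, step_ps=5):
    """Return per-input tap settings that align all inputs to the slowest path."""
    taps = [0] * len(skews_ps)           # all IDELAY values equal while measuring
    sweep = range(-3000, 1, step_ps)     # reference edge arrival times in ps
    measured = [measure_delay_ps(i, skews_ps, taps, sweep)
                for i in range(len(skews_ps))]
    slowest = max(measured)
    return [round((slowest - m) / TAP_PS) for m in measured]

skews = [0, 120, 200, 40]    # hypothetical routing skews in ps
taps = compensate(skews)
print(taps)                  # [3, 1, 0, 2]
```

After compensation the residual mismatch is limited by the tap granularity, roughly half a tap plus the sweep step in this model.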

Peter Alfke, Xilinx Applications

Reply to
Peter Alfke

Hi,

Are these two different approaches or two steps of one process? For 2) above, I connected the input pin to 16 IDELAY blocks but got this error:

FATAL_ERROR:Pack:pktv4iob.c:737:1.24.2.1 - Input buffer CH1_IBUF drives multiple DELAYCHAIN symbols. The implementation tools can not pack the design. Process will terminate. To resolve this error, please consult the Answers Database and other online resources at

formatting link
If you need further assistance, please open a Webcase by clicking on the "WebCase" link at
formatting link

How can I fan out the one input without getting this?

Thanks

Al

Reply to
al99999

"al99999" wrote in message news: snipped-for-privacy@o13g2000cwo.googlegroups.com...

Hi Al,

I think Peter did suggest the impossible: there are no connections in the FPGA that would allow a single signal to be routed to multiple IDELAY elements. The only possibility would be to use unbonded IOBs as route-throughs, but I have not found an option that allows the use of unbonded IOBs in a user design :(

Antti

Reply to
Antti Lukats

Don't you just love "SW that knows best", and tries to outthink the user! :( This should be allowed, with a warning - but I can think of one caveat - possibly Xilinx do not have test coverage on unbonded IOs, and so give no guarantee they actually work?

Peter/Austin ? - comments on user access to unbonded IO resource ?

-jg

Reply to
Jim Granville

So, to use unbonded IOBs in my V2PRO design I use something like:-

NET "all_your_base" LOC="UNB700";

in my UCF file. You need to turn off the DRC check in the "Generate Programming File" properties. Anyone able to try this in V4? FPGA editor is a good way to get the names of the unbonded IOBs.

Cheers, Syms.

Reply to
Symon

"Symon" wrote in message news:438e3273 snipped-for-privacy@x-privat.org...

Hi Symon,

yes it works in V4 too, but the correct LOC syntax is

NET iopad LOC="UNB_X2Y53";

and with 7.1 tools for V2/P the UCF should be

NET iopad LOC="NOPAD45";

not UNBxxx

There is an IOB-based on-chip oscillator that can be used in an unbonded IOB :)

Cheers, Antti

Reply to
Antti Lukats

Hi Antti, It seems there are two types of unbonded IOBs. One set called NOPADxx, another set called UNBxxx. Both seem to work in this type of application. My guess is that the UNBxxx type have a pad on the silicon for a bond wire which can be used in a package with a lot of balls (if you see what I mean!), whereas the NOPADxx ones are not used in any package so don't have this pad. Guesswork though! Cheers, Syms.

Reply to
Symon
