DIFF_OUT buffer example

Following up on John Providenza's question about the DIFF_OUT buffer feature, I've put together a small example which builds a complementary clock input buffer out of two normal IBUFGDS's.

Also for reference, I've copied my original notes about this handy feature of the V2 & S3 families.

Brian

All the V2-ish differential input buffers have a complementary output available that can be used to create a 180 degree clock without needing a DCM.

These can also be used just to invert a differential input without needing any other logic (or board cuts & jumps).

Look at the DIFFS component in fpga_editor to see what's going on; besides the normal 'phantom' route from the DIFFS to the DIFFM, there's also a route from the DIFFM to a differential receiver in the DIFFS that outputs the complement signal.

I first spotted these when they showed up in early versions of XAPP622 as a hard macro.

Support for these, and the tool bugs affecting them, have varied from version to version; see Answer Record 21958 for recent problems.

I've banged into various other problems in using them over the years; if I get a chance this weekend, I'll try to dig up some old webcase code showing how to create one out of two normal IBUF{G}DS's as a workaround.

These can be used on regular IOB inputs as well as global clock inputs, but you've generally needed to LOC the global input buffer and bufg's to allowed sites to get this to work.

Search for ibufgds_diff_out and ibufds_diff_out.

--
--
-- diff_out buffer example
--
-- shows how to create complementary internal clocks using
-- IBUF{G}DS's with neither a DCM nor local inversion required
--
-- forwards a global clock input to output, output/2
--
-- substitutes two ibuf{g}ds's for ibuf{g}ds_diff_out component;
-- various tool revs have choked when using attributes on those
--
-- intended for V2-ish family members
--
-- !!! example LOC constraints specific to XC3S200-FG256 !!!
--
-- COMPLETELY UNTESTED; SYNTHESIZED WITH 6.3 & EXAMINED IN FPGA_EDITOR
--
-- Input Clocking:
-- this example doesn't use the resulting clock for DDR inputs,
-- but best (or at least easier to analyze) DDR input timing may
-- result when using CLB registers rather than DDR IOB input regs
--
library ieee;
  use ieee.std_logic_1164.all;
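
The listing above is cut off in this archive. As a stand-in, here is a minimal sketch of the two-buffer idea described in the post: two IBUFGDS instances share one differential pad pair, with the input nets swapped on the second so that its output is the complement. This is not the original code. The entity, port, and signal names are made up for illustration, and the I/O standard attributes and the LOC constraints that the post says are generally needed are left out.

```vhdl
-- Sketch only: complementary clocks from two IBUFGDS buffers on one pad pair.
-- Entity, port, and signal names are illustrative, not from the original post.
-- I/O standard and LOC constraints are omitted and would be needed in practice.
library ieee;
  use ieee.std_logic_1164.all;
library unisim;
  use unisim.vcomponents.all;

entity diff_out_sketch is
  port (
    clk_p  : in  std_logic;   -- differential clock input, true side
    clk_n  : in  std_logic;   -- differential clock input, complement side
    clk0   : out std_logic;   -- buffered clock, 0 degrees
    clk180 : out std_logic    -- buffered clock, 180 degrees
  );
end entity diff_out_sketch;

architecture rtl of diff_out_sketch is
  signal clk0_in   : std_logic;
  signal clk180_in : std_logic;
begin

  -- normal-sense differential input buffer
  ibuf_true : IBUFGDS
    port map ( I => clk_p, IB => clk_n, O => clk0_in );

  -- second buffer on the same pads with I and IB swapped,
  -- so its output is the complement of the first
  ibuf_comp : IBUFGDS
    port map ( I => clk_n, IB => clk_p, O => clk180_in );

  -- each phase gets its own global buffer
  bufg_0   : BUFG port map ( I => clk0_in,   O => clk0 );
  bufg_180 : BUFG port map ( I => clk180_in, O => clk180 );

end architecture rtl;
```

Whether a given tool version places this cleanly is exactly the kind of thing the notes above warn about, so the result is worth checking in fpga_editor.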
Brian Davis

Hi Brian, I'm struggling a little to see why I'd require a complementary clock. The DDR output IOBs have inversion control on both clock inputs, so why not just connect the normal clock to both pins and invert the appropriate one? Are you saying that a local inversion affects the skew? I have seen big clock nets' mark/space get affected by a lot of loads, is this the problem you're addressing? AFAIK, all the clocked resources have programmable inversion, so what am I missing? Cheers, Syms.

Symon

Exactly, the local inversion feature introduces quite a bit of skew, which can be avoided by using complementary internal clocks. (excluding from this discussion V4 with internal diff clock nets)

The DIFF_OUT feature lets you get a low jitter complementary DDR clock on-board without needing a DCM (with its inherent jitter).

It also can be used to invert a differential input right at the pad, which I didn't show in the example, but that same input net swap trick works with IBUFDS buffers too.

From my past measurements of internal clocks (using clock forwarding) it looks like the internal clock net rise/fall is quite asymmetric; so, it's best to use the same edge sense of complementary clock phases.

If you use a DCM with duty cycle correction, it pre-skews the driver so that both the threshold crossings line up again near 50%, but now you're stuck with the DCM jitter and other baggage (and at higher input frequencies, eventually the duty cycle correction makes the clock pulse sallying forth from the driver extremely narrow).

Yes, that too; the other trick shown in the example is how to keep the two IOB DDR clock nets identically loaded by splitting the internal logic clock loads out onto another BUFG net.
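
A rough sketch of that loading trick, under stated assumptions: the two complementary nets drive only the DDR output register (the V2/S3 FDDRRSE primitive here), while a third BUFG fed from the same input buffer carries the internal logic loads. The entity and signal names are hypothetical, and this is meant as a sub-block fed by the two input buffers, not as the code from the example.

```vhdl
-- Sketch only: forward a clock through an IOB DDR register using
-- complementary clock nets, with logic loads moved to a third BUFG.
-- Names are illustrative; placement constraints are omitted.
library ieee;
  use ieee.std_logic_1164.all;
library unisim;
  use unisim.vcomponents.all;

entity ddr_fwd_sketch is
  port (
    clk0_in   : in  std_logic;  -- from the normal-sense input buffer
    clk180_in : in  std_logic;  -- from the swapped-input buffer
    clk_logic : out std_logic;  -- clock net for internal (non-IOB) logic
    clk_fwd   : out std_logic   -- forwarded clock output pad
  );
end entity ddr_fwd_sketch;

architecture rtl of ddr_fwd_sketch is
  signal clk0_io   : std_logic;
  signal clk180_io : std_logic;
  signal fwd_q     : std_logic;
begin

  -- these two nets see only the IOB DDR clock pins, so they stay
  -- identically loaded
  bufg_io0   : BUFG port map ( I => clk0_in,   O => clk0_io );
  bufg_io180 : BUFG port map ( I => clk180_in, O => clk180_io );

  -- separate net for everything else
  bufg_logic : BUFG port map ( I => clk0_in,   O => clk_logic );

  -- DDR output register: both clock pins use a rising edge,
  -- so no local inversion is involved
  fwd_ff : FDDRRSE
    port map (
      Q  => fwd_q,
      C0 => clk0_io,
      C1 => clk180_io,
      CE => '1',
      D0 => '1',   -- output high after the clk0 rising edge
      D1 => '0',   -- output low after the clk180 rising edge
      R  => '0',
      S  => '0'
    );

  fwd_obuf : OBUF port map ( I => fwd_q, O => clk_fwd );

end architecture rtl;
```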

Brian

Brian Davis

Symon wrote

from XAPP462, page 37:

The CLKx clock signal precisely triggers the DDR flip-flop's C0 input at the start of the clock period. Similarly, the CLKx180 clock signal precisely triggers the DDR flip-flop's C1 input halfway through the clock period. The cost of this approach is an additional global buffer and global clock line, but it potentially reduces the potential duty-cycle distortion by approximately 300 ps.

Tim

Thanks Brian, that makes sense. I've seen similar clock skew effects before, but never bad enough yet to need separate clocks. If I do, I'll remember your neat solution! In fact, I've just remembered something that I had to fix with a DCM doubler, I'll try this on it when I get time. BTW, as it comes for free, I guess it's a complimentary complementary clock! I'll get me coat... Cheers, Syms.

Symon

Hi Tim, OK, I guess that's why I've not had problems with using just one clock. Even at >600Mbps I've got enough slack in my timing budget to cope with 300ps. Thanks for the reference! Cheers, Syms.

Symon

Thanks for pointing out that link.

One caution on XAPP462 v1.1 : the novice at Xilinx who wrote the "Skew Adjustment" section (pp 32-34) got the descriptions and figures completely backwards, and confused the terms 'skew' and 'delay'.

Pages 4-5 of XAPP259 give a much better description of the delay element.

------

DESKEW_ADJUST = SYSTEM_SYNCHRONOUS :

Inserts a delay into the DCM FEEDBACK path, which makes the output clock happen EARLIER. ( not later, as depicted in XAPP462 )

This increases setup, guarantees zero hold, and adds a temp and VCCAUX affected delay element into the DCM deskew path.

DESKEW_ADJUST = SOURCE_SYNCHRONOUS :

Removes delay element from the DCM FEEDBACK path, which makes the output clock happen LATER. ( not earlier, as depicted in XAPP462 )

This reduces setup time, increases hold time, but results in a smaller overall input sampling window.

------

For DDR input applications, or for cascaded DCMs, you generally want to be in SOURCE_SYNCHRONOUS mode (the latest few revisions may do that automatically for DCM cascades).

See also Answer Records 12406, 18079
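
For reference, a minimal sketch of selecting the mode on a DCM instance. It assumes the unisim DCM component with its DESKEW_ADJUST and CLK_FEEDBACK generics; depending on the tool version, the same setting may have to be applied as an attribute or a UCF constraint instead. The entity and signal names are hypothetical, and clk_in is assumed to come from a clock input buffer already.

```vhdl
-- Sketch only: DCM with DESKEW_ADJUST set for source-synchronous use.
-- Names are illustrative; some tool versions may need this as an
-- attribute or UCF constraint rather than a generic.
library ieee;
  use ieee.std_logic_1164.all;
library unisim;
  use unisim.vcomponents.all;

entity dcm_deskew_sketch is
  port (
    clk_in  : in  std_logic;   -- already through an input clock buffer
    rst     : in  std_logic;
    clk_out : out std_logic;
    locked  : out std_logic
  );
end entity dcm_deskew_sketch;

architecture rtl of dcm_deskew_sketch is
  signal clk0_dcm : std_logic;
  signal clk0_buf : std_logic;
begin

  dcm_i : DCM
    generic map (
      CLK_FEEDBACK  => "1X",
      DESKEW_ADJUST => "SOURCE_SYNCHRONOUS"  -- no delay element in the feedback path
    )
    port map (
      CLKIN  => clk_in,
      CLKFB  => clk0_buf,
      RST    => rst,
      CLK0   => clk0_dcm,
      LOCKED => locked
    );

  -- feedback is taken after the global buffer
  bufg_i : BUFG port map ( I => clk0_dcm, O => clk0_buf );

  clk_out <= clk0_buf;

end architecture rtl;
```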

Brian

Brian Davis

Does this run into skew problems between the main clock and the IOB clock?

Hal Murray

Brian,

I posted a question about this technique in a response to a separate thread (multiphase data extraction question).

I'm using this to gain access to the IDDRs associated with both pads of a diff pair, so I can sample the input on four phases of a clock, with very low skews in the data paths.

Do you recommend separate ibufds primitives, or a single ibufds_diff_out primitive?

Andy Jones

Andy

The only reason I started using two IBUFDS's instead of one IBUFDS_DIFF_OUT was to avoid various tool bugs that dropped placement, I/O standard, and termination attributes when applied to the IBUFDS_DIFF_OUT components.

The IBUFDS_DIFF_OUT is really just two IBUFDS's in disguise for V2/S3, but I haven't looked at the V4 implementation.

Brian

Brian Davis

The output DDR nets traverse loaded->unloaded, which shouldn't be a problem ( except for the usual caveat about perhaps clocking the falling edge data with a falling edge clock ahead of the IOB ).

DDR inputs traverse unloaded->loaded, which might require opposite edge or 90/270 phasing.

IIRC, for fast V2 DDR inputs I used two differential local clock inputs (to work around limited local clock routing resources), DDR registers implemented in CLBs (published IOB timing at the time was obfuscated by the inclusion of DCM jitter in IOB setup/hold numbers), and a global clock input driving a DCM to generate 90/270 phases to help reclock the two-wide data path phases into the global clock domain.

maybe I should have used input latches instead :)

Brian

Brian Davis

Hi,

Maybe you all know this already, but if not, take a look at /Xilinx/vhdl/src/unisims/unisim_VITAL.vhd; it contains the VHDL VITAL source code for the unisim library used for simulation.

DIFF_OUT, IDDR, and almost everything else are in there.

Sandro

Sandro
