Is there any advantage to using BUFIOs/BUFRs for driving IOB FFs
versus a BUFG? After looking through that section in the V4 user guide,
I'm not sure I really see an advantage other than reduced usage of the
global clock buffers.

Normally I just use the typical IBUFG -> DCM -> BUFG setup and use the
output of the BUFG to drive everything...


The BUFRs have less delay than the BUFGs, and the BUFIOs have less
delay than the BUFRs.  If you are going through a DCM, this may not
matter to you, but if you are not using a DCM, it can matter quite a
bit.  The difference in speed depends on the size of the device.  The
BUFRs and BUFIOs only go to three clock regions, while the BUFGs can
drive a clock to the entire chip.

For example, I use the BUFRs and BUFIOs for interfacing to PCI because
with them I can make the timing without using a DCM, and if I use
BUFGs, I cannot.  I do not use DCMs on the PCI clock for several
reasons, including that it may be spread spectrum and may change
frequency, and I want to save them for other uses.


John McCaskill

The BUFIO and BUFR are really meant for use with source-synchronous data
inputs. The BUFIO can only be driven from clock-capable input pins. The
BUFR can be driven from the BUFIO or the fabric, but there is not much
advantage if you're driving it from the fabric. The typical use is to
have the BUFIO clock the input SERDES at the fast clock rate, and use
the BUFR with its divider to clock the SERDES outputs to the fabric.

Joe Samson
Pixel Velocity

As stated, BUFIO/BUFR is intended for source-synchronous applications.
There are significant advantages to using BUFIO/BUFR with ISERDES for
source-sync data capture: the sampling window is smaller, and the setup
time is negative. One can simply instantiate BUFIO/BUFR/ISERDES to
capture data without doing any type of clock/data alignment, provided
that the data eye > (setup+hold).
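
A minimal sketch of the condition above (data eye > setup+hold), in
Python. All numbers are hypothetical placeholders, not Virtex-4
datasheet values, and the function names are made up for illustration.

```python
def data_eye_ns(bit_period_ns, uncertainty_ns):
    """Usable data eye: bit period minus everything that closes the eye
    (jitter, skew, and so on, lumped into one hypothetical number)."""
    return bit_period_ns - uncertainty_ns


def capture_window_ns(setup_ns, hold_ns):
    """Sampling window the capture register needs: setup + hold.
    A negative setup time shrinks this window."""
    return setup_ns + hold_ns


def eye_is_wide_enough(eye_ns, setup_ns, hold_ns):
    """True when the data eye is larger than the required window."""
    return eye_ns > capture_window_ns(setup_ns, hold_ns)


# Hypothetical example values, NOT taken from any datasheet.
eye = data_eye_ns(bit_period_ns=2.0, uncertainty_ns=0.8)
ok = eye_is_wide_enough(eye, setup_ns=-0.2, hold_ns=0.6)
print(f"eye = {eye:.2f} ns, window = {capture_window_ns(-0.2, 0.6):.2f} ns, ok = {ok}")
```

For a real design, the setup and hold figures would come from the
device data sheet and the eye from the transmitting part's timing.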

Also, if you'd like to forward a clock for an SDR application at a
clock frequency > 500MHz, BUFIO is the only buffer you can use to
drive it.


Joseph Samson wrote:
Ok, well I'm pretty sure I don't have any need to implement a SERDES.
I'm not too familiar with source synchronous design, but based on

I'm pretty sure I won't have to worry about using BUFIOs/BUFRs. The
only reason I asked this question originally is that in the past
I've only had experience with a Virtex 2 COTS board, and now I'm
using a Virtex 4 board. I stumbled upon some examples that had
BUFIO/BUFRs, but was sort of clueless as to why they used them. The
board I'm using has some ADCs (250 MSPS) and a FIFO for data capture,
yet they are using the BUFIO/BUFR combo. Do you suppose they are using
this to achieve higher clock rates in the front end? I can meet
front-end timing with BUFG's just fine...

Is ISE smart about dealing with the BUFR's? What if you have too many
slices for a given BUFR region and they can't fit? Will it burp?

Here is my current processing chain. Maybe you would have some
recommendations for clock schemes? It would be much appreciated.

--> [ADC IN] -- 250 MHz--> [FIFO] -- X MHz --> [SEQ PROCESSING] --> X/2
MHz --> [OUT]

Currently, I have X = 200 MHz, but obviously, I'm not meeting timing.
This is okay though, because I have a differential programmable
oscillator coming from off chip to drive all of my sequential
processing, so I've been using 160 MHz due to the constraint failure
below:

Slack:                  -1.136ns (requirement - (data path - clock path
skew + uncertainty))
wideangle_drpp2_inst/qrdc_ins/multre_mux_reg_ins/d_r_12 (FF)
wideangle_drpp2_inst/qrdc_ins/doutre_reg_ins/d_r_4 (FF)
  Requirement:          5.000ns
  Data Path Delay:      6.071ns (Levels of Logic = 1)
  Clock Path Skew:      -0.005ns
  Source Clock:         logic_clk rising at 0.000ns
  Destination Clock:    logic_clk rising at 5.000ns
  Clock Uncertainty:    0.060ns
  Data Path: wideangle_drpp2_inst/qrdc_ins/multre_mux_reg_ins/d_r_12 to
    Delay type         Delay(ns)  Logical Resource(s)
    ----------------------------  -------------------
    Tcko                  0.291
    net (fanout=6)        0.835
    Tdspdo_APL            3.913
    net (fanout=1)        0.783
    Tdick                 0.249
    ----------------------------  ---------------------------
    Total                 6.071ns (4.453ns logic, 1.618ns route)
                                  (73.3% logic, 26.7% route)
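
The arithmetic in the report above can be re-checked with a short
Python sketch. The numbers are copied from the report; the function
names are made up for illustration, and the frequency estimate assumes
skew and uncertainty stay the same at a different clock period.

```python
# Values copied from the timing report above, all in nanoseconds.
REQUIREMENT = 5.000
DATA_PATH_DELAY = 6.071
CLOCK_PATH_SKEW = -0.005
CLOCK_UNCERTAINTY = 0.060


def min_period_ns(data_path, skew, uncertainty):
    """Shortest period this path supports: data path - skew + uncertainty."""
    return data_path - skew + uncertainty


def slack_ns(requirement, data_path, skew, uncertainty):
    """Slack = requirement - (data path - clock path skew + uncertainty)."""
    return requirement - min_period_ns(data_path, skew, uncertainty)


slack = slack_ns(REQUIREMENT, DATA_PATH_DELAY, CLOCK_PATH_SKEW, CLOCK_UNCERTAINTY)
period = min_period_ns(DATA_PATH_DELAY, CLOCK_PATH_SKEW, CLOCK_UNCERTAINTY)
fmax_mhz = 1000.0 / period  # 1000 / ns gives MHz

# Logic vs. routing split, from the per-element delays in the report.
logic = 0.291 + 3.913 + 0.249   # Tcko + Tdspdo_APL + Tdick
route = 0.835 + 0.783           # the two net delays

print(f"slack      = {slack:.3f} ns")
print(f"min period = {period:.3f} ns")
print(f"est. fmax  = {fmax_mhz:.1f} MHz")
print(f"logic      = {logic:.3f} ns ({100 * logic / (logic + route):.1f}%)")
```

The minimum period works out to 6.136 ns, roughly 163 MHz, which is
in line with running the logic at 160 MHz. Most of the delay is the
3.913 ns Tdspdo_APL term rather than routing.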

Is there any way to improve timing with any of the Virtex 4


markus wrote:
I'm not familiar with ADC applications (i.e. whether they use source
sync data capture etc). I think I may know why they use BUFIO/BUFR.

I think what they are doing is to deserialize the incoming data. If you
were to capture 250 MSPS data in an SDR fashion and no FIFOs or
deserialization, you effectively force the FPGA fabric to operate at
250MHz. Although this is a valid and possible clock frequency to
achieve, it becomes more and more difficult to achieve this frequency
with a really packed FPGA. Perhaps the example deserializes the incoming
data so that the fabric doesn't have to operate at 250MHz.
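
To put rough numbers on that trade-off, here is a small Python sketch.
The 250 MHz sample clock is from this thread; the 14-bit sample width
and the list of ratios are hypothetical, since neither the ADC
resolution nor the deserialization ratio is given here.

```python
def deserialized_clock_mhz(sample_clock_mhz, ratio):
    """Fabric clock after 1:ratio deserialization of an SDR sample stream."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return sample_clock_mhz / ratio


def deserialized_width_bits(sample_width_bits, ratio):
    """Width of the parallel word the fabric sees each slow clock cycle."""
    return sample_width_bits * ratio


SAMPLE_CLOCK_MHZ = 250.0   # from the thread (250 MSPS, SDR capture)
SAMPLE_WIDTH_BITS = 14     # hypothetical; not stated in the thread

for ratio in (1, 2, 4):    # hypothetical ratios, for illustration only
    clk = deserialized_clock_mhz(SAMPLE_CLOCK_MHZ, ratio)
    width = deserialized_width_bits(SAMPLE_WIDTH_BITS, ratio)
    print(f"1:{ratio}  fabric clock = {clk:.1f} MHz, word width = {width} bits")
```

The throughput is unchanged; the fabric simply trades clock rate for
a wider data path.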

In regard to your question about BUFR: yes, the tool should notify the
user when the logic clocked by a BUFR does not fit in the clock regions
it can reach. Basically, what you want to do is to use the BUFR for the
deserialized clock domain in the data capture process, not for
processing (unless you really need to). What I have personally done in
the past is to immediately transfer the deserialized data from the BUFR
domain to the BUFG domain using FIFO16s.

In regard to your question about meeting timing, here are the options
I see:

1). Go to a higher speed grade. This is the easiest solution, but the
least economical.
2). Check the data path that's failing, and insert a pipeline register.
3). Deserialize the data from 250MHz to a lower frequency.

Hope this helps,


Brandon Jasionowski wrote:
You would just get an error saying the design is unroutable if that

You can try to see if you can absorb some of the registers into the
DSP48. If DSP48s are inferred, the first thing to check is whether you
have an asynchronous reset. Registers cannot be pushed into the DSP48
if you do.

Jim /

Interesting.. I didn't know that fact about DSP48's. Indeed I am using
asynchronous resets, so I'll try changing the registers before and
after the multiplier to synch reset.

Much thanks to all.

Jim Wu wrote:
Several reasons I can think of to use BUFIO/BUFR:

* BUFIO is faster (>700MHz) and the clock skew is smaller.
* BUFIO/BUFR not only save global clock buffers, but also the global
routing resources (in V4, each clock region can ONLY have max 8 global
* The input clock can be easily divided with BUFR, which is very useful
for SERDES designs.

Jim /

Brandon Jasionowski wrote:
