Usage of BUFIO in Virtex 4?

Is there any advantage to using BUFIOs/BUFRs for driving IOB FFs versus a BUFG? After looking through that section in the V4 user guide, I don't really see an advantage other than saving global clock buffers.

Normally I just use the typical IBUFG -> DCM -> BUFG setup and use the output of the BUFG to drive everything...

Thanks,

-Brandon

Brandon Jasionowski

The BUFRs have less delay than the BUFGs, and the BUFIOs have less delay than the BUFRs. If you are going through a DCM, this may not matter to you, but if you are not using a DCM, it can matter quite a bit. The difference in speed depends on the size of the device. The BUFRs and BUFIOs only go to three clock regions, while the BUFGs can drive a clock to the entire chip.

For example, I use the BUFRs and BUFIOs for interfacing to PCI because with them I can make timing without using a DCM, whereas with BUFGs I cannot. I do not use DCMs on the PCI clock for several reasons: the clock may be spread spectrum, it may change frequency, and I want to save the DCMs for other uses.

Regards,

John McCaskill

John McCaskill

The BUFIO and BUFR are really meant for use with source-synchronous data inputs. The BUFIO can only be driven from clock-capable input pins. The BUFR can be driven from the BUFIO or the fabric, but there is not much advantage if you're driving it from the fabric. The typical use is to have the BUFIO clock the input SERDES at the fast clock rate, and use the BUFR with its divider to clock the SERDES outputs to the fabric.

--
Joe Samson
Pixel Velocity
Joseph Samson

As stated, BUFIO/BUFR is intended for source-synchronous applications. There are significant advantages to using BUFIO/BUFR with ISERDES for source-synchronous data capture: the sampling window is smaller, and the setup time is negative. One can simply instantiate BUFIO/BUFR/ISERDES to capture data without doing any clock/data alignment, provided that the data eye > (setup + hold).

Also, if you'd like to forward a clock for an SDR application, only a BUFIO can drive clock frequencies above 500 MHz.

-M

markus

Ok, well I'm pretty sure I don't have any need to implement a SERDES. I'm not too familiar with source synchronous design, but based on these,

formatting link
formatting link

I'm pretty sure I won't have to worry about using BUFIOs/BUFRs. The only reason I asked this question originally is that in the past I've only had experience with a Virtex 2 COTS board, and now I'm using a Virtex 4 board. I stumbled upon some examples that had BUFIOs/BUFRs, but was sort of clueless as to why they used them. The board I'm using has some ADCs (250 MSPS) and a FIFO for data capture, yet they are using the BUFIO/BUFR combo. Do you suppose they are using this to achieve higher clock rates in the front end? I can meet front-end timing with BUFGs just fine...

Is ISE smart about dealing with the BUFR's? What if you have too many slices for a given BUFR region and they can't fit? Will it burp?

Here is my current processing chain. Maybe you would have some recommendations for clock schemes? It would be much appreciated.

--> [ADC IN] -- 250 MHz --> [FIFO] -- X MHz --> [SEQ PROCESSING] -- X/2 MHz --> [OUT]

Currently, I have X = 200 MHz, but obviously I'm not meeting timing. This is okay though, because I have a differential programmable oscillator coming from off chip to drive all of my sequential processing, so I've been using 160 MHz because of the constraint failure below:

Slack:                -1.136ns (requirement - (data path - clock path skew + uncertainty))
  Source:             wideangle_drpp2_inst/qrdc_ins/multre_mux_reg_ins/d_r_12 (FF)
  Destination:        wideangle_drpp2_inst/qrdc_ins/doutre_reg_ins/d_r_4 (FF)
  Requirement:        5.000ns
  Data Path Delay:    6.071ns (Levels of Logic = 1)
  Clock Path Skew:    -0.005ns
  Source Clock:       logic_clk rising at 0.000ns
  Destination Clock:  logic_clk rising at 5.000ns
  Clock Uncertainty:  0.060ns

  Data Path: wideangle_drpp2_inst/qrdc_ins/multre_mux_reg_ins/d_r_12 to wideangle_drpp2_inst/qrdc_ins/doutre_reg_ins/d_r_4
    Delay type       Delay(ns)  Logical Resource(s)
    ---------------  ---------  -------------------
    Tcko             0.291      wideangle_drpp2_inst/qrdc_ins/multre_mux_reg_ins/d_r_12
    net (fanout=6)   0.835      wideangle_drpp2_inst/qrdc_ins/multre_mux_reg_ins/d_r
    Tdspdo_APL       3.913      wideangle_drpp2_inst/qrdc_ins/multre_ins/Mmult_p
    net (fanout=1)   0.783      wideangle_drpp2_inst/qrdc_ins/multre_p
    Tdick            0.249      wideangle_drpp2_inst/qrdc_ins/doutre_reg_ins/d_r_4
    ---------------  ---------  -------------------
    Total            6.071ns (4.453ns logic, 1.618ns route)
                              (73.3% logic, 26.7% route)
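For anyone reading the report, the slack formula in its first line can be checked by hand. The snippet below is only a rough sketch in Python that plugs in the numbers from the report; the function names are made up for illustration, and it assumes the clock uncertainty stays at 0.060 ns when the clock period changes, which may not hold exactly.

```python
def setup_slack_ns(requirement, data_path, clock_skew, uncertainty):
    """Slack as defined in the report header:
    requirement - (data path - clock path skew + uncertainty)."""
    return requirement - (data_path - clock_skew + uncertainty)


def min_period_ns(data_path, clock_skew, uncertainty):
    """Smallest clock period for which the slack above is zero."""
    return data_path - clock_skew + uncertainty


# Values copied from the timing report (all in ns).
REQUIREMENT = 5.000   # 200 MHz constraint on logic_clk
DATA_PATH = 6.071
CLOCK_SKEW = -0.005
UNCERTAINTY = 0.060   # assumed constant across clock frequencies
DSP_DELAY = 3.913     # Tdspdo_APL entry in the data path

slack = setup_slack_ns(REQUIREMENT, DATA_PATH, CLOCK_SKEW, UNCERTAINTY)
period = min_period_ns(DATA_PATH, CLOCK_SKEW, UNCERTAINTY)
fmax_mhz = 1000.0 / period
dsp_share = DSP_DELAY / DATA_PATH

print(f"slack      = {slack:.3f} ns")
print(f"min period = {period:.3f} ns (about {fmax_mhz:.1f} MHz)")
print(f"DSP48 share of data path = {dsp_share:.1%}")
```

Under those assumptions the path needs a period of about 6.136 ns, i.e. a little under 163 MHz, which is consistent with the design working at 160 MHz. Most of the data path delay is the single Tdspdo_APL entry, so that is the place to look for improvement.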

Is there any way to improve timing with any of the Virtex 4 capabilities?

Thanks,

-Brandon

Brandon Jasionowski

I'm not familiar with ADC applications (i.e. whether they use source sync data capture etc). I think I may know why they use BUFIO/BUFR.

I think what they are doing is deserializing the incoming data. If you were to capture 250 MSPS data in an SDR fashion with no FIFOs or deserialization, you would effectively force the FPGA fabric to operate at 250 MHz. Although this is a valid and achievable clock frequency, it becomes more and more difficult to reach in a really packed FPGA. Perhaps the example deserializes the incoming data so that the fabric doesn't have to operate at 250 MHz.
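The trade-off can be put in numbers with a quick Python sketch. This is only an illustration: the 250 MSPS rate is from the thread, but the 14-bit ADC width, the list of factors, and the function names are assumptions, not details of the actual board.

```python
def fabric_clock_mhz(sample_rate_mhz: float, deser_factor: int) -> float:
    """Clock the fabric must run at after 1:deser_factor deserialization."""
    if deser_factor < 1:
        raise ValueError("deserialization factor must be >= 1")
    return sample_rate_mhz / deser_factor


def parallel_width_bits(adc_bits: int, deser_factor: int) -> int:
    """Width of the data word handed to the fabric on each slow clock."""
    return adc_bits * deser_factor


ADC_RATE_MHZ = 250.0  # from the thread
ADC_BITS = 14         # hypothetical width, for illustration only

for factor in (1, 2, 4):
    clk = fabric_clock_mhz(ADC_RATE_MHZ, factor)
    width = parallel_width_bits(ADC_BITS, factor)
    print(f"1:{factor} -> fabric clock {clk:.1f} MHz, {width}-bit words")
```

The throughput is the same in every row; deserializing just trades a slower fabric clock for a wider datapath.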

Regarding your question about BUFR: yes, the tool should notify the user when there are not enough logic resources in the BUFR clock domain. Basically, what you want to do is use the BUFR for the deserialized clock domain in the data capture process, not for processing (unless you really need to). What I've personally done in the past is immediately transfer the deserialized data from the BUFR domain to a BUFG domain using FIFO16s.

Regarding your question about meeting timing, here are the options I see:

1). Go to a higher speed grade. This is the easiest solution, but the least economical.
2). Check the data path that's failing, and insert a pipelining register.
3). Deserialize the data from 250 MHz to a lower frequency.

Hope this helps,

-M

markus

Several reasons I can think of to use BUFIO/BUFR:

  • BUFIO is faster (>700MHz) and the clock skew is smaller.
  • BUFIO/BUFR save not only global clock buffers but also global routing resources (in V4, each clock region can have at most 8 global clocks).
  • The input clock can be easily divided with BUFR, which is very useful for SERDES designs.

Cheers, Jim

Jim Wu

You would just get an error saying the design is unroutable if that happens.

You can try to see whether you can absorb some of the registers into the DSP48. If DSP48s are inferred, the first thing to check is whether you have an asynchronous reset. Registers cannot be pushed into a DSP48 if you do.

Cheers, Jim

Jim Wu

Interesting.. I didn't know that fact about DSP48's. Indeed I am using asynchronous resets, so I'll try changing the registers before and after the multiplier to synch reset.

Much thanks to all.

-Brandon

Brandon Jasionowski
