How To Synchronize FPGAs

Hello newsreaders,

For a while I have been facing the following task, which I find quite challenging and unfortunately have not managed to solve yet. I want to use 2-4 FPGAs (Xilinx Virtex-II Pro) together on one printed circuit board (PCB). They are used to process a large amount of incoming serial data (data rates of several GHz). My idea is to process that data in parallel across the 2-4 FPGAs. The problem is how to split the data adequately and, in particular, how to synchronize the FPGAs with one another. Is it possible, or even realistic, to synchronize multiple FPGAs in the GHz range? How can this be done without much protocol overhead? I would like to do it without adding an extra transfer protocol between the FPGAs just for that purpose! So far I haven't found a proper solution. Maybe someone can give me a hint? Any ideas on how to solve this problem?

Regards, Leroy Tanner

Reply to
Leroy Tanner

Maybe I am missing something, but wouldn't you just drive all the chips with one onboard clock then in your code trigger the processes on the rising edge?

Don

Reply to
Don Golding

It gets tricky when you have multiple FPGAs clocked at hundreds of MHz. I don't have any direct experience there, but I think it is worth looking for appnotes on vendor sites that address "Board Level De-skew" (using FPGA clocking resources to account for clock distribution headaches) and, specifically for Xilinx, "Channel bonding" (using multiple RocketIO transceivers to receive data in parallel). The RocketIO transceivers are difficult beasts, at least if you're not using a standard protocol. I'm not sure if the channel bonding can span multiple V2Pro devices, but I know it can span multiple transceivers.

Not sure on your budget, or application requirements, but it may be worthwhile going to a single, larger part that contains the resources you need. It at least partially removes the headache of high-speed PCB design/layout.

--Josh Model

Reply to
Josh Model

Reply to
Symon

--
Rick "rickman" Collins

rick.collins@XYarius.com
Reply to
rickman

I believe the most important thing is to first latch the signals in the IOB to minimize clock skew problems. Otherwise, use an external shift register to generate bit-parallel signals for input to the FPGA.

-- glen

Reply to
glen herrmannsfeldt

"Symon" :

OK, I agree with that, and it might be a good approach to minimize skew in the first section. Nevertheless, I must synchronize the other FPGAs to each other, not at a rate of several GHz but at, say, about 300 MHz. In my opinion a central clock isn't an appropriate solution!?

Reply to
Leroy Tanner

Think about what a central clock entails from purely a routing perspective. Let's assume you're an SI wizard, and have no issues there.

300 MHz would be ~3.3 ns per clock cycle. If I remember my rule of thumb, you've got about 6 inches per 1 ns for the speed of an electrical signal in FR-4 material. So the worst-case length mismatch between all your data lines and all clock lines across all the FPGAs becomes skew that eats into your timing budget.

Just as an example (I'm not really a layout person, so it's my posterior speaking), matching all lines to 4 FPGAs +/- 3 inches seems relatively tricky, but not completely unreasonable. That is up to 6 inches of mismatch, or about 1 ns, so ~1/3 of your entire clock cycle is wasted (more, if you were assuming DDR) before you even get to the FPGA fabric. It makes laying out your design that much more tricky.
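The arithmetic behind that estimate can be written out as a small Python sketch. It uses only the rule-of-thumb numbers from the post (about 6 inches per ns in FR-4, a 300 MHz clock, +/- 3 inch matching). The function names are made up for illustration, and reading "+/- 3 inches" as a worst-case difference of 6 inches is an assumption.

```python
# Rough rule of thumb from the post: about 6 inches per nanosecond in FR-4.
PROPAGATION_IN_PER_NS = 6.0


def clock_period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def skew_ns(length_tolerance_in: float) -> float:
    """Worst-case skew for traces matched to +/- length_tolerance_in inches.

    Assumption: one trace may be longer by the tolerance while another is
    shorter by it, so the worst-case length difference is twice the tolerance.
    """
    return 2.0 * length_tolerance_in / PROPAGATION_IN_PER_NS


def skew_fraction(freq_mhz: float, length_tolerance_in: float,
                  ddr: bool = False) -> float:
    """Fraction of one bit period consumed by trace-length skew alone."""
    bit_period = clock_period_ns(freq_mhz) / (2.0 if ddr else 1.0)
    return skew_ns(length_tolerance_in) / bit_period


if __name__ == "__main__":
    print(f"period:         {clock_period_ns(300):.2f} ns")
    print(f"skew:           {skew_ns(3):.2f} ns")
    print(f"fraction (SDR): {skew_fraction(300, 3):.0%}")
    print(f"fraction (DDR): {skew_fraction(300, 3, ddr=True):.0%}")
```

This covers trace-length mismatch only. Jitter, reflections, and crosstalk come out of the same budget and are not modeled here.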

Now, in the slightly more real world you've got to throw in the jitter present on a 300 MHz clock, impedance mismatches causing reflections, crosstalk on your board with all that data zipping around (because GHz and even 300 MHz lines are really antennae) and you've got a lot to deal with.

Anyhow, synchronizing dataflow at those speeds on a PCB is not nearly as simple as just plopping down a clock. It's a hard design, but you get to choose where to place the burden. If you've got really good PCB people, maybe they can match and terminate the traces really well. If you've got the DCM/DLL (or their Altera, or "insert brand", counterpart) hardware to de-skew the board clock, you could let the FPGA do it (though I don't recall at what frequencies the DCMs top out). If you've got neither, you might want to consider going to a single-chip serial interface, because you're going to get into trouble otherwise.

--Josh

Reply to
Josh Model

Hi Leroy, Say you've got 4 FPGAs A, B, C & D. Each gets fed the 300MHz clock, so on the fabric of each FPGA is CLK_A, CLK_B etc. When you send data from (say) FPGA B to FPGA D, send a clock with the data, generated by FPGA B from its internal CLK_B, called (say) CLK_B_TO_D. Use this source synchronous clock with a DCM in FPGA D to get the data into a BRAM FIFO inside FPGA D. Get the data out from this FIFO into the fabric of FPGA D using CLK_D. Repeat for all the other paths. Any good? Cheers, Syms.
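The FIFO hand-off described in this post can be illustrated with a purely behavioral Python model. This is not HDL and is not taken from any Xilinx documentation. It assumes both clocks have the same period (both derive from the one board clock) but an unknown phase offset. The function name, the `start_level` parameter, and the example phases are hypothetical, and metastability, pointer synchronization, and the DCM are not modeled.

```python
from collections import deque
from typing import List, Tuple


def simulate_fifo_crossing(num_words: int, period_ns: float,
                           write_phase_ns: float, read_phase_ns: float,
                           start_level: int = 2) -> Tuple[List[int], int]:
    """Model words crossing from a write clock (CLK_B_TO_D) to a read clock (CLK_D).

    Both clocks share one period but have different phases. The reader begins
    popping once the FIFO holds start_level words. Assumes
    num_words >= start_level. Returns the words read and the peak occupancy.
    """
    WRITE, READ = 0, 1
    events = []
    for k in range(num_words):
        events.append((k * period_ns + write_phase_ns, WRITE, k))
    # Provide enough read edges to fill to start_level and then drain.
    for k in range(num_words + start_level + 2):
        events.append((k * period_ns + read_phase_ns, READ, -1))
    # Process in time order; on an exact tie the write goes first.
    events.sort(key=lambda e: (e[0], e[1]))

    fifo = deque()
    received: List[int] = []
    max_occupancy = 0
    reading = False
    for _, kind, word in events:
        if kind == WRITE:
            fifo.append(word)
            max_occupancy = max(max_occupancy, len(fifo))
        else:
            if not reading and len(fifo) >= start_level:
                reading = True
            if reading and fifo:
                received.append(fifo.popleft())
    return received, max_occupancy


if __name__ == "__main__":
    words, peak = simulate_fifo_crossing(16, 3.333, 0.0, 1.2)
    print("in order:", words == list(range(16)), "peak occupancy:", peak)
```

In this model the phase offset only changes when the reader starts, not the order of the data, and the occupancy stays small because both sides run at the same rate. A real design would also need to handle reset and any start-up alignment between the FPGAs.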

Reply to
Symon

There are two ways to approach this problem: (1) have each FPGA perform a part of the process on the entire data stream or (2) have each FPGA perform the entire process on part of the data stream. We once implemented (2) for a bandwidth expander where each chip did the complete process (one-clock-cycle Huffman decoding, translation of the code to a value, then arithmetic processing) for a portion of the incoming data stream. Each chip was provided a chunk of the incoming data (e.g., in a two-chip system, chip one processed chunks 1, 3, 5, ... of the data and chip two processed chunks 2, 4, 6, ... of the data). We actually used two on the board because of I/O bandwidth limitations, but the chip was designed to allow for 1-, 2-, 4-, or 8-chip operation.
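Approach (2) amounts to dealing chunks out round-robin and interleaving the results again afterwards. A minimal Python sketch of that bookkeeping follows. The function names and the chunk size are hypothetical, and the per-chip processing itself is left out.

```python
from typing import List, Sequence


def split_into_chunks(data: Sequence[int], chunk_size: int) -> List[List[int]]:
    """Cut the incoming stream into consecutive chunks of chunk_size items."""
    return [list(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]


def distribute(chunks: List[List[int]], num_chips: int) -> List[List[List[int]]]:
    """Deal chunks out round-robin: chip 0 gets chunks 0, N, 2N, ... and so on."""
    return [chunks[chip::num_chips] for chip in range(num_chips)]


def reassemble(per_chip: List[List[List[int]]]) -> List[int]:
    """Interleave the per-chip chunk lists back into the original order."""
    out: List[int] = []
    longest = max((len(c) for c in per_chip), default=0)
    for i in range(longest):
        for chip_chunks in per_chip:
            if i < len(chip_chunks):
                out.extend(chip_chunks[i])
    return out


if __name__ == "__main__":
    stream = list(range(10))
    per_chip = distribute(split_into_chunks(stream, 2), num_chips=2)
    print(per_chip[0])  # chunks handled by the first chip
    print(per_chip[1])  # chunks handled by the second chip
    print(reassemble(per_chip) == stream)
```

Because each chip sees whole chunks, the chips need to agree only on chunk boundaries and ordering, not on every bit time.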

-=Dave=-

Reply to
dave
