Multi-FPGA Interconnection: latest techniques

Hi Experts,

In FPGA prototyping/emulation flows, multi-FPGA partitioning limits performance because of the limited number of I/O pins.
What are the latest multi-FPGA interconnection techniques available today? How much performance improvement can be expected from using multi-gigabit transceivers?

Thanks in Advance
Parth

Re: Multi-FPGA Interconnection: latest techniques

How much performance do you want?  There are transceivers upwards of 56 Gbps
these days.  Questions:

How many transceivers can you get at that speed?
How do you route an nn Gbps signal from one place to another?
How many transceivers can you successfully route, and at what speed?
How do you make that reliable in the face of bit errors, packet loss and other errors?
What end-to-end bandwidth can you actually achieve?  (see the sketch below)
What latency impact does all that extra processing have?
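
To put rough numbers on the bandwidth question, here is a minimal Python sketch; the lane count, line rate and efficiency factors are illustrative assumptions, not figures for any particular part:

# Back-of-the-envelope effective bandwidth of a multi-lane serial link.
LANES = 4                 # assumed number of transceivers routed per FPGA pair
LINE_RATE_GBPS = 25.0     # assumed raw line rate per lane
ENCODING_EFF = 64 / 66    # 64b/66b line-code efficiency
PROTOCOL_EFF = 0.90       # assumed framing/CRC/retransmission overhead

effective_gbps = LANES * LINE_RATE_GBPS * ENCODING_EFF * PROTOCOL_EFF
print(f"Effective end-to-end bandwidth: ~{effective_gbps:.1f} Gbps")
# ~87.3 Gbps against 100 Gbps raw: the gap is the cost of making the
# link reliable and framed, which is exactly the reliability question.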

Relevant paper of mine:
https://www.cl.cam.ac.uk/~atm26/pubs/FPL2014-ClusterInterconnect.pdf

Theo

Re: Multi-FPGA Interconnection: latest techniques
On Thursday, September 24, 2020 at 8:16:21 PM UTC+5:30, Theo wrote:

Hi Theo,  

Thanks a lot for the reply.

On a Xilinx UltraScale board with 8 FPGAs, using automatic FPGA partitioning tools (which use muxes for pin multiplexing, i.e. HSTDM multiplexing), the maximum system performance achieved is only 10-15 MHz.
An individual FPGA may run at up to 100 MHz, but overall performance is limited to 10-15 MHz because the tool inserts pin muxes with ratios of 8:1, 16:1 and so on.

Is there any interconnect technology that can achieve 70-100 MHz on a 4-8 FPGA board?
If partitioning is done manually or by an auto-partitioning tool, can the Bluelink interconnect or GTX transceivers achieve 70-100 MHz speeds? Interconnect logic area overhead can be tolerated.
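
For intuition on why the mux ratio caps the system clock, here is a rough Python model; the I/O toggle rate and overhead cycles are illustrative assumptions, and real flows (extra routing hops, synchronization) land lower, consistent with the 10-15 MHz reported above:

# Rough model of system clock vs. TDM (pin-mux) ratio.
# Each pin carries tdm_ratio signals per system clock cycle, so the
# pin must toggle at roughly tdm_ratio * f_system plus some overhead.

IO_TOGGLE_RATE_MHZ = 800   # assumed usable I/O toggle rate per pin
OVERHEAD_CYCLES = 4        # assumed sync/framing cycles per word

def max_system_clock_mhz(tdm_ratio: int) -> float:
    """Optimistic upper bound on the system clock for one mux ratio."""
    return IO_TOGGLE_RATE_MHZ / (tdm_ratio + OVERHEAD_CYCLES)

for ratio in (8, 16, 32):
    print(f"TDM {ratio:2d}:1 -> at best ~{max_system_clock_mhz(ratio):.1f} MHz")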

Re: Multi-FPGA Interconnection: latest techniques
On Thursday, September 24, 2020 at 12:08:03 PM UTC-4, partha sarathy wrote:

Are you sure you aren't doing something wrong?  The purpose of pin muxing would seem to be to increase the data rate.  But I assume this will incur pipeline delays.  Or do I not understand how this is being used?  

--  

  Rick C.

Re: Multi-FPGA Interconnection: latest techniques
On Thursday, September 24, 2020 at 11:24:39 PM UTC+5:30, snipped-for-privacy@gmail.com wrote:
Hi Rick,  
Thanks for the reply with details.
Does the pipeline delay inserted by the gigabit transceiver amount to more than 20 ns, say, for a 50 MHz FPGA clock?


Best Regards
Parth

Re: Multi-FPGA Interconnection: latest techniques
On Friday, September 25, 2020 at 10:25:38 PM UTC-4, partha sarathy wrote:

Sorry, I'm not at all clear about what you are doing.

Maybe I misunderstood what you meant by pin muxing.  Are they using fewer pins and sending data for multiple signals over each pin?  That would definitely slow things down.

Using SERDES (the gigabit transceiver you mention) should speed that up, but might include some pipeline delay.  I'm not that familiar with their operation, but I assume you have to parallel load a register that is shifted out at high speed and loaded into a shift register on the receiving end, then parallel loaded into another register to be presented to the rest of the circuitry.  If that is how they are working, it would indeed take a full clock cycle of latency.
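
To put rough numbers on that pipeline delay, here is a small Python sketch; the word width and the ~30-cycle pipeline depth are assumptions (the latter is the figure quoted later in this thread), not values from a specific GTX datasheet:

# Rough SERDES latency: parallel load -> serial shift -> parallel unload.
LINE_RATE_GBPS = 10.0   # assumed serial line rate
WORD_BITS = 64          # assumed parallel (gearbox) word width
PIPELINE_CYCLES = 30    # assumed TX+RX pipeline depth in fast-clock cycles

fast_clock_mhz = LINE_RATE_GBPS * 1000 / WORD_BITS    # 156.25 MHz
latency_ns = PIPELINE_CYCLES * 1000 / fast_clock_mhz  # ~192 ns

print(f"Fast (word) clock: {fast_clock_mhz:.2f} MHz")
print(f"End-to-end SERDES latency: ~{latency_ns:.0f} ns")
# ~192 ns is roughly ten cycles of a 50 MHz (20 ns) system clock, so
# the answer to the 20 ns question above is: yes, far more.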

--  

  Rick C.

Re: Multi-FPGA Interconnection: latest techniques

That's right - you get a parallel FIFO interface.  There's no guarantee that what you put in will get to the other end reliably (if the BER is 10^-9, say, and your bit rate is 10 Gbps, that's one error every 0.1 s).  So for these kinds of links to be reliable you need some kind of error correction or retransmission.  In the Bluelink case, that added hundreds of ns.

Basically you end up with something approaching a full radio stack, just
over wires.
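
To put a number on the reliability point, a one-line Python check using the BER and bit rate quoted above:

# Mean time between bit errors on a raw serial link.
BER = 1e-9           # bit error rate quoted above
BIT_RATE_BPS = 10e9  # 10 Gbps

seconds_per_error = 1.0 / (BER * BIT_RATE_BPS)
print(f"One bit error every ~{seconds_per_error:.1f} s")  # ~0.1 s
# Ten errors per second: any emulation run of useful length will see
# corruption unless the link adds CRC + retransmission or FEC.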

Theo

Re: Multi-FPGA Interconnection: latest techniques
On Saturday, September 26, 2020 at 8:55:16 AM UTC+5:30, gnuarm.del...@gmail.com wrote:
Hi Rick,
Thanks for the clarifications.  It is clear now that SERDES is not suitable for pin muxing.

Regards
Parth

Re: Multi-FPGA Interconnection: latest techniques
On Sunday, September 27, 2020 at 9:31:42 AM UTC+5:30, partha sarathy wrote:

Multi-Gigabit Transceiver (MGT): configurable hard-macro MGTs are implemented for inter-FPGA communication.  The data rate can be as high as ~10 Gbps [MGT, 2014].  Nevertheless, the MGT has a high latency (~30 fast clock cycles) that limits the system clock frequency, and only a few are available.  When the TDM ratio is 4, the system clock frequency is ~7 MHz [Tang et al., 2014].  In addition, the communication between MGTs is not error-free; MGTs come with a non-null bit error rate (BER).  Therefore, at this moment, the MGT is not used as an inter-FPGA communication architecture in multi-FPGA prototyping.
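
The ~7 MHz figure follows from a simple latency calculation.  A rough Python sketch; the word width is an assumption, and the result lands in the same ballpark as (not exactly at) the cited number:

# Why MGT latency caps the system clock in TDM-based partitioning.
# Every inter-FPGA signal must arrive within one system clock cycle,
# so the system period must cover the full MGT pipeline plus the time
# to serialize TDM_RATIO words.
LINE_RATE_GBPS = 10.0     # MGT line rate quoted above
WORD_BITS = 64            # assumed gearbox word width
MGT_LATENCY_CYCLES = 30   # fast-clock cycles, quoted above
TDM_RATIO = 4

word_time_ns = WORD_BITS / LINE_RATE_GBPS                    # 6.4 ns
period_ns = (MGT_LATENCY_CYCLES + TDM_RATIO) * word_time_ns  # ~218 ns
print(f"Min system period ~{period_ns:.0f} ns -> ~{1000/period_ns:.1f} MHz")
# ~4.6 MHz with these assumed widths: the same order as the ~7 MHz
# from [Tang et al., 2014].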

Re: Multi-FPGA Interconnection: latest techniques

It really depends on what you mean by 'prototyping'.  If you have
interconnect which is tolerant of latency, such that the system doesn't mind
that messages take several cycles to get from one place to another (typical
of a network-on-chip implementing say AXI), then using MGT with a
reliability layer is fine for functional verification.
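
As a toy illustration of what "tolerant of latency" means, here is a small Python model of a latency-insensitive channel (a sketch, not RTL; all names are illustrative): the receiver acts only on valid data, so the delivered results are identical whatever the link latency.

# A latency-insensitive channel: correctness is independent of how
# many cycles a word spends in flight; only ordering matters.
from collections import deque

def run(latency_cycles: int, payload):
    in_flight = deque()            # (deliver_at_cycle, word) pairs
    pending = list(payload)        # producer's queue of words to send
    received, cycle = [], 0
    while len(received) < len(payload):
        if pending:                # producer: send one word per cycle
            in_flight.append((cycle + latency_cycles, pending.pop(0)))
        while in_flight and in_flight[0][0] <= cycle:
            received.append(in_flight.popleft()[1])   # consumer side
        cycle += 1
    return received

data = list(range(8))
assert run(1, data) == run(17, data) == data
print("same data delivered whether latency is 1 cycle or 17")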

If you mean dumping a hairball of an RTL netlist across multiple FPGAs and
slowing the clock until everything works in a single cycle, then they're
not right for that job.

They're both prototyping, but at different levels of abstraction.

Theo
