How to avoid losing channel bonding when using Rocket IO?

Hi

I'm using RocketIO with 4 channels, but they lose bonding when the data flow becomes fast. What can cause a loss of channel bonding? And how can I avoid this problem?

Many thanks!

king


Your Majesty, Google this:- "channel bonding" site:xilinx.com

Answer Record 15050? Or some of the many other answers?

Or maybe rogue clowns* have desoldered the FPGA? Seriously, a little more information is required! :-)

Your humble subject, Syms.

* formatting link :-)

I have no idea why you think this would happen. If you have four channels and you clock them identically at the source, they will arrive at the destination at different times, and data on different channels won't be synchronised with each other. Channel bonding simply transmits a marker on all four channels on the same clock, so that when they arrive at the destination the very small elastic buffer adjusts itself and data from the four channels comes out on the same clock, correctly aligned. In theory you only ever need to do it once, though noise could confuse it. It doesn't matter how much data you then send: the system knows that data a, b, c & d was all sent on clock x, and the receiver presents a, b, c & d out on the same clock.
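
To illustrate, here's a toy model in Python (a sketch of the idea only, not the real RocketIO hardware; the "CB" marker name is made up): each lane arrives with a different skew, and aligning every lane's buffer on a shared marker removes that skew.

from collections import deque

MARKER = "CB"  # made-up stand-in for the channel bonding sequence

def make_lane(payload, skew):
    # A lane delivers `skew` idle symbols, then the marker, then data.
    return deque(["IDLE"] * skew + [MARKER] + list(payload))

def bond(lanes):
    # Drop symbols until each lane's marker reaches the head of its
    # buffer -- a stand-in for the elastic buffer pointer adjustment.
    for lane in lanes:
        while lane[0] != MARKER:
            lane.popleft()
        lane.popleft()  # consume the marker itself
    return lanes

lanes = bond([make_lane("abcd", skew) for skew in (0, 2, 1, 3)])

# After bonding, symbol N of every lane was sent on the same TX clock.
for tick in range(4):
    print(tick, [lane.popleft() for lane in lanes])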

Colin


to colin:

But it really happened :-( The frequency is 155 MHz, and I do channel bonding once, at the start. While the data flow is slow there is nothing wrong, but as soon as the data flow becomes fast (still running at 155 MHz), the 4 channels cannot receive the special marker at the same time, so I think they are losing channel bonding.

I wonder what happens when the speed of the data flow changes, and why this can result in the loss of channel bonding.

Thanks a lot!!

king


You need to rephrase your problem in a way that makes sense so that someone can help you. The number of bits being transmitted and received per second is always the same, so you can't simply say "the data flow becomes fast"; you need to elaborate on this so that others can understand what is happening.

You also haven't mentioned which device you are using or the protocol that you are using, so it is next to impossible to help out.

Ed

-- Xilinx Inc.


to McGettigan:

Thank you for the reminder.

I use a Virtex-II Pro XC2VP20 device. There are 8 GTs, and I use 4 of them. The protocol is custom. The clock is 155 MHz and the data width is 2 bytes, so these 4 bonded channels should handle about 10 Gb/s of data flow in theory.
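
For reference, the arithmetic behind the 10 Gb/s figure (my rough numbers; the 8b10b line overhead is an assumption about the GT configuration):

# Back-of-the-envelope check of the 10 Gb/s figure (my arithmetic;
# the 8b10b line coding is an assumption about the GT configuration).
clk_hz        = 155e6   # fabric clock per channel
bytes_per_clk = 2       # 2-byte data path
channels      = 4

payload_bps = clk_hz * bytes_per_clk * 8 * channels
line_bps    = clk_hz * bytes_per_clk * 10 * channels  # 8b10b: 10 line bits/byte

print(f"payload:   {payload_bps / 1e9:.2f} Gb/s")  # 9.92 Gb/s, i.e. ~10 Gb/s
print(f"line rate: {line_bps / 1e9:.2f} Gb/s")     # 12.40 Gb/s across 4 lanes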

Unfortunately, however, it only works at 1.5 Gb/s in the field. At 1.5 Gb/s everything looks right: the FSMs on both the TX and RX sides transition correctly, with no invalid states and no invalid transitions. The Rocket IO also works well: no rxlossofsync, no rxnotintable... The RX side receives the 4 SOPs at the same time (channel bonding OK).

As soon as the flow rate increases to 2 Gb/s and above, the RX side no longer receives the 4 SOPs simultaneously, which is one of the transition conditions of my RX FSM, so the FSM stops. From this I know that channel bonding has been lost. To work around this error, I send the CBS again after every 65535 data packets on the TX side. The CBS lets the RX FSM continue to move, but it looks weird: some invalid transitions happen, and some Rocket IO error signals appear, such as rxlossofsync and rxnotintable.

That is my problem.

Many thanks!

king


Ok, this has more information on your problem, but I still need to make some assumptions such as these:

1) You are communicating between 2 separate boards
2) Each board has its own clock source and thus some frequency PPM difference
3) Your custom protocol transmits 8b10b characters
4) Your custom protocol transmits a clock correction (CC) sequence when in IDLE

My initial thought is that your custom protocol has not been designed to properly handle the necessary amount of clock tolerance.

When each link is transmitting at 1.5 Gb/s with IDLE/clock correction characters between bursts, the receivers are able to insert or delete enough of the IDLE/CC characters to handle the frequency difference between your receiver and transmitter. When you increase the burst size and the time between the IDLE/CC transmissions, you exceed the size of the elastic FIFO in the receiver, get an underflow/overflow condition, and your link goes bad (loss of channel bonding, invalid characters, etc.).

You mentioned transmitting 65535 data packets followed by a CBS (Channel Bonding Sequence). Even if a data packet were only 2 bytes, this would definitely create problems, as most protocols generate a CC every 4-8K bytes.
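
To put rough numbers on this (assumed figures only: +/-100 ppm oscillators on each board, so up to 200 ppm relative offset; check the real elastic buffer depth in the RocketIO User Guide):

# Rough drift estimate between clock corrections. Assumed numbers:
# +/-100 ppm oscillator on each board => up to 200 ppm relative offset.
ppm_offset = 200e-6   # worst-case relative frequency error
word_bytes = 2        # 2-byte fabric word per channel

def slip_words(bytes_between_cc):
    # Words by which the elastic buffer fills or drains between CCs.
    return bytes_between_cc * ppm_offset / word_bytes

print(slip_words(8 * 1024))    # CC every 8 KB: ~0.8 words, easily absorbed
print(slip_words(65535 * 2))   # CBS after 65535 2-byte packets: ~13 words,
                               # likely more than the free margin a small
                               # elastic buffer has before under/overflow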

Instead of creating your own custom protocol, I would suggest using the Aurora protocol, which we developed and deliver as part of CoreGen. It will handle everything that you want to do, including channel bonding, in a nice simple package.

Ed

-- Xilinx Inc.


to Ed McGettigan,

Thank you very much, and sorry for the late reply. I was on vacation from May 1st to May 7th.

All your assumptions are correct!

As for CC, I send one after every 5 data packets; each packet has a constant length of 256 bytes.

And now I have found a new phenomenon. The data flow is TX_fifo -> (TX_logic -> RIO -> RX_logic) -> RX_fifo. When there is no operation on the rx_fifo (rx_fifo_wr is '0'), the RIO seems to be OK and the RX FSM is all right. As soon as I use the rx_fifo, the whole data flow goes bad, just as mentioned.

I am very surprised by this. Since there is no feedback from the rx_fifo to the RIO, why would operating the FIFO influence the RIO?

Do you think this is caused by PAR?

Thank you!

king

