Virtex-4FX embedded MAC and RocketIO data corruption?

Hi,

After a very enjoyable few days trying to sort out a test design involving Ethernet and Virtex-4 I thought it time to ask some advice.

I have an FX60's embedded MAC using RocketIO to talk 1000Base-X to an SFP module (Cat5 copper), which is then connected by crossover cable to an Intel E1000 network card. There are no switches involved in the link. The design is based on the 1.4.1 reference design that comes with CoreGen 8.1 for the embedded MAC. I am not using the Host Interface or suchlike, just some basic logic hooked on the end of the MAC, with it statically configured using the tie-offs.

I can get the PC with the Intel card to send raw Ethernet frames to my Virtex-4's MAC address and they seem to get through about 40% of the time, when I see the "frame good" signal; the rest of the time I see a "bad frame". On looking at more signals I am seeing a large number of disparity errors and NotInTable errors coming from the RocketIO module, and what looks like corruption or bit shifts on the data.

I even see this when there is no traffic on the link and only idles are being sent. It seems semi-repetitive, with, say, on the order of a few hundred good bytes between small bursts of bad ones.

Anyone got any ideas?

I have checked the network card and the cable (long ones, short ones, good quality ones and bad ones). All cards and cables pass high-statistics back-to-back testing at 1 Gbit/s between PCs with very few dropped frames.

My initial suspect was the reference clock being fed to the RocketIO (which I think is okay; it's the ref clock from a PCI Express socket going via an ICS9DB202 to clean it up and make it 125 MHz). I have considered heat too, so we bolted a larger heatsink and fan onto the chip...

Words of wisdom will be gratefully received...

-- /\/\arc Kelly ..Just your average physicist trying to get by in a world full of normal people...

Reply to
Marc Kelly

Have you seen the release notes and applied all the fixes?

Also, I don't know about 1000Base-X, but with SGMII you need an MDIO clock, either internal (by configuring the divider through the host interface) or external (via the MDIO clock input pin). Without it, I never could get the SGMII to autonegotiate with the PHY...

I'm working with EMAC and SGMII, and I had errors because my SGMII board already has AC coupling capacitors in the RX path (at the PHY end), so with the on-chip AC coupling as well, the signal was just too attenuated. Switching off the internal AC coupling did the trick.

If it's from a PCI Express socket, are you sure its spread spectrum is deactivated? If not, even with a PLL, for a few ms at a time it will be quite far off 125 MHz...
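To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the usual PCI Express down-spread of up to 0.5% (an assumption; I don't know what this particular board or host is set to) and the +/-100 ppm clock tolerance that Gigabit Ethernet specifies. The constant and function names are just made up for the illustration.

```python
# Rough comparison of a spread-spectrum reference against the
# Gigabit Ethernet clock tolerance. Values below are assumptions.
NOMINAL_HZ = 125e6          # reference clock the MGT expects
GBE_TOLERANCE_PPM = 100.0   # Gigabit Ethernet clock tolerance (+/- ppm)
SSC_DOWNSPREAD = 0.005      # assumed worst-case 0.5% down-spread


def ppm_offset(actual_hz, nominal_hz=NOMINAL_HZ):
    """Frequency error of actual_hz relative to nominal_hz, in ppm."""
    return (actual_hz - nominal_hz) / nominal_hz * 1e6


# Lowest frequency the clock reaches at the bottom of the spread.
worst_case_hz = NOMINAL_HZ * (1.0 - SSC_DOWNSPREAD)
offset = ppm_offset(worst_case_hz)

print(f"worst-case offset: {offset:.0f} ppm")
print(f"within tolerance:  {abs(offset) <= GBE_TOLERANCE_PPM}")
```

Under those assumptions the clock dips to roughly 5000 ppm below nominal, far outside what the link partner is required to track.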

Sylvain

Reply to
Sylvain Munaut

Marc, did you design the board with the Xilinx on it in house? After checking the quality of the reference clock I would look at the power supply to the MGTs and the quality of the PCB routing between the Xilinx and the SFP. Any way you can look at the RocketIO signals and check the eye pattern?

/MikeJ (x-physicist)

Reply to
MikeJ

It's actually a prototyping board bought from PLDA (their XpressFX60 board) with the SFP sockets already on it. Their main selling point is its PCI Express capability, but we're after it for the high-speed IO currently.

It's possible, I think; I would have to get our hardware people onto it, as I do mostly firmware and so they keep all the very high-spec scopes hidden from me :)

--
/\/\arc Kelly
..Just your average physicist trying to get by in a world full of normal
people...
Reply to
Marc Kelly

Yes, I had been hoping that one of them would magically fix things, sadly not.

I believe things are negotiating, although I may be wrong. The fact that I am seeing real frames that work, and that the system can echo them back to the PC as well, gave me some confidence. I will check the MDIO clock issue, however.

Ah, that does sound interesting. I will have to check the schematics for the board(s) tomorrow at work and see. It does seem to be the kind of thing that might be causing it; if so, then I owe you many beers...

Yeah, we turned off all the spread spectrum settings in a moment of inspiration; sadly it didn't seem to have any effect. I even tried clocking it from a DCM-generated 125 MHz clock, just to see what happened: same effect as the proper ref clock.

--
/\/\arc Kelly
..Just your average physicist trying to get by in a world full of normal
people...
Reply to
Marc Kelly

Ok, I know of that board - they should know what they are doing and I assume they have looked at the eye pattern.

I know that feeling - having had some colleagues break/misplace expensive probes :)

Maybe it's worth replacing the reference clock with a quality low-jitter differential oscillator to see if it makes any difference? You can get away with some carefully length-matched twisted-pair mod wire (my company does it quite often), but make sure the oscillator power supply is good, and put an SMD cap across the pins of the oscillator at least.

/MikeJ

Reply to
MikeJ

Just to clarify, that is a cap across the power pins of the oscillator! /Mike

Reply to
MikeJ

The board actually has a spare footprint for mounting an oscillator that feeds into one of the MGT reference clocks. I need to check whether it feeds the correct column to drive the RocketIO I need.

'tis the joys of playing with such fun hardware I guess..

--
/\/\arc Kelly
..Just your average physicist trying to get by in a world full of normal
people...
Reply to
Marc Kelly

Well, this is a long shot, but is it possible that one end is fixed at full duplex (no negotiation) while the other is trying to negotiate and falls back to half duplex? This is a fairly common problem: the link appears to work for simple pings, but any real traffic has massive numbers of errors. This is because the full-duplex end transmits while receiving, and the half-duplex end sees this as a collision.

Just a possibility... and I'm only SURE that this happens with 10BASE-T and 100BASE-TX, not with 1000BASE-X.

-bh


Reply to
bh

I have tried with both ends forced to full duplex, just in case, and made a new crossover cable too, just in case. I had some good success with turning off the internal AC-coupling caps as someone mentioned, and things look more sane. For small packets (~65-128 bytes long) I get good transmission with maybe 1-2% packet loss; larger packets seem to be a problem, however.
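As a rough sanity check on why larger packets suffer more, here is a small sketch that backs out a bit error rate from the small-packet loss and extrapolates to bigger frames. It assumes independent, uniformly distributed bit errors, which is almost certainly too simple given that the errors seem to come in bursts, and the 1.5% loss at 96 bytes is just a mid-point picked from the figures above. Function names are made up for the illustration.

```python
# Frame loss versus frame size, assuming independent bit errors.
# A frame is lost if any one of its bits is corrupted.

def bit_error_rate(loss_fraction, frame_bytes):
    """Bit error rate implied by a given frame loss at a given frame size."""
    bits = frame_bytes * 8
    return 1.0 - (1.0 - loss_fraction) ** (1.0 / bits)


def frame_loss(ber, frame_bytes):
    """Expected fraction of frames lost for a given bit error rate."""
    bits = frame_bytes * 8
    return 1.0 - (1.0 - ber) ** bits


# Assumed starting point: 1.5% loss on 96-byte frames.
ber = bit_error_rate(0.015, 96)
print(f"implied BER: {ber:.2e}")

for size in (64, 128, 512, 1500):
    print(f"{size:5d} bytes: {frame_loss(ber, size) * 100:5.1f}% lost")
```

With those assumptions a full-size 1500-byte frame would be lost something like a fifth of the time, so the trend at least points the same way as what I see.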

With an idle link I see the /K28.5/D16.2/ idle pattern fine, but sometimes the /D16.2/ is corrupt and gives a NotInTable error from the MGT, always with what seems to be the same pattern.
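For checking captured 10-bit code groups offline, a quick plausibility test can be sketched as below. It only tests two properties that every valid 8b/10b code group has (four, five or six ones, and no run of more than five identical bits), so it is weaker than the MGT's real table lookup: a group that passes may still not be in the table. The function name and the bit-string format (bits in transmission order, abcdei fghj) are assumptions for the illustration.

```python
# Necessary-condition check for a 10-bit 8b/10b code group.
# Passing does not prove the group is in the code table.

def plausible_code_group(bits):
    """Return True if a 10-character '0'/'1' string could be valid 8b/10b."""
    if len(bits) != 10 or set(bits) - {"0", "1"}:
        raise ValueError("expected a 10-character string of 0s and 1s")

    # Valid groups have a disparity of 0 or +/-2, i.e. 4, 5 or 6 ones.
    if bits.count("1") not in (4, 5, 6):
        return False

    # No valid group contains more than five identical bits in a row.
    longest = run = 1
    for prev, cur in zip(bits, bits[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest <= 5


K28_5_RD_NEG = "0011111010"   # K28.5 sent with negative running disparity
D16_2_RD_POS = "1001000101"   # D16.2 that follows it in the idle pattern

print(plausible_code_group(K28_5_RD_NEG))
print(plausible_code_group(D16_2_RD_POS))
print(plausible_code_group("1001000100"))  # D16.2 with its last bit flipped
```

The two idle code groups pass, while the single-bit-flipped D16.2 has only three ones and is rejected.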

I need to get Synplicity's Identify_debugger to play nicely tomorrow with a nice long sample memory to check how regularly this is happening. The external logic analyser I have access to currently doesn't have the depth when running at a decent speed.

Maybe a possible issue with the MGT itself? I can move to another one, I think, and test that.
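Once a long capture exists, something like the following sketch could show whether the bursts are periodic. It assumes the capture has been exported as one error flag per sample (for example the NotInTable signal); the function names, the export format and the `min_gap` merging threshold are all made up for the illustration.

```python
from collections import Counter


def burst_starts(error_flags, min_gap=4):
    """Sample indices where a new error burst begins.

    Errors separated by at most min_gap samples count as one burst.
    """
    starts = []
    last_error = None
    for index, flag in enumerate(error_flags):
        if flag:
            if last_error is None or index - last_error > min_gap:
                starts.append(index)
            last_error = index
    return starts


def gap_histogram(starts):
    """Count how often each spacing between consecutive bursts occurs."""
    return Counter(b - a for a, b in zip(starts, starts[1:]))


# Tiny synthetic capture: bursts beginning at samples 100, 400 and 700.
flags = [0] * 1000
for i in (100, 101, 400, 402, 700):
    flags[i] = 1

starts = burst_starts(flags)
print(starts)
print(gap_histogram(starts).most_common(3))
```

A single dominant spacing in the histogram would point towards something periodic (a clock or supply artefact, say) rather than random noise.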

--
/\/\arc Kelly
..Just your average physicist trying to get by in a world full of normal
people...
Reply to
Marc Kelly
