sysACE load vs bootloader load of vxWorks on ML310

I have a problem booting vxWorks on my ML310. If I boot from an ACE file that has my vxWorks image embedded, loaded through the sysACE controller, it works about 95% of the time. If I boot from an ACE file with the same HW download.bit in it but a boot loader in place of the vxWorks ELF, it fails about 95% of the time: vxWorks begins to boot and then hangs on its first access to the compact flash (e.g. an fopen() call).

We are seeing the same problems on 2 different ML310s, and a Xilinx FAE was able to replicate the problem on his ML310 as well (although maybe not the intermittent failure of the sysACE load method). The boot loader method is >95% likely to hang/crash, whereas the failure with the sysACE load method is harder to reproduce.

The boot loader method still almost always fails, even though before jumping to the vxWorks RAM image I verify a checksum of the actual file against the image copied to RAM and refuse to jump on a mismatch. Worse yet, it now seems that vxWorks also hangs fairly often (anywhere between 2% and 30% of tries) even when booted via the sysACE controller load method (i.e. the JTAG). I am beginning to suspect a Xilinx driver issue, or a driver interaction issue with our CF card (the same behavior is also observed on the Xilinx-provided CF that came with the ML310).
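For readers following along, the checksum gate described above can be sketched roughly as follows. This is only an illustration: the post does not say which checksum algorithm the boot loader uses, so the additive sum and the names `image_checksum` and `image_is_intact` are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Additive 32-bit checksum over a byte buffer.
 * ASSUMPTION: the real boot loader's algorithm is not given in the post. */
static uint32_t image_checksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        sum += buf[i];
    }
    return sum;
}

/* Returns 1 if the RAM copy matches the checksum computed from the file,
 * 0 otherwise. The caller jumps to the image only when this returns 1. */
static int image_is_intact(const uint8_t *ram_copy, size_t len,
                           uint32_t expected)
{
    return image_checksum(ram_copy, len) == expected;
}
```

A passing check only shows that the bytes in RAM match the file. It says nothing about the state the CF or the sysACE controller was left in by the reads that produced the copy.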

Some data points to note:

1) When we hang, it appears the /wait and rdy/-busy lines on the CF are permanently low, which is most assuredly not a good thing.

2) Sometimes the hang lights the sysACE error LED. Even when the LED is not lit, the /wait and /busy signals on the CF are always 0.

3) I NEVER hang on CF accesses at the boot loader (non-vxWorks) code level. However, I find myself asking "is vxWorks crashing because some CF FAT lib operations at the boot loader level have left the CF or sysACE in a bad state?" I am using Xilinx's FAT Fs library on a CF that is formatted FAT16. I also memset the first 64MB of DDR to 0 to ensure there are no leftovers from previous boot attempts.

4) I caught one of these sysACE error LED conditions and then used XMD to query the CF registers. The following is the dump (our sysACE core is at base address 0xCF000000):

XMD% mrd 0xCF000000 32 b

CF000000: 00 CF000001: 00
CF000002: 00 CF000003: 00
CF000004: 34 4 CF000005: 42 B
CF000006: 35 5 CF000007: 00
CF000008: 80 ? CF000009: 00
CF00000A: 00 CF00000B: 00
CF00000C: 00 CF00000D: 00
CF00000E: 00 CF00000F: 00
CF000010: 6B k CF000011: 00
CF000012: 00 CF000013: 00
CF000014: 16 ? CF000015: 03 ?
CF000016: 0C ? CF000017: 10 ?
CF000018: 0A CF000019: 08
CF00001A: 00 CF00001B: 00
CF00001C: 02 ? CF00001D: 00
CF00001E: 00 CF00001F: 00

In particular the bits of interest in this dump are:

STATUSREG @ 0x4-0x7 offset: bit2: CFGERROR "error has occurred in the Compact Flash controller"

ERRORREG @ 0x8-0xB offset: bit7: CFGREADERR "an error occurred while reading configuration information from Compact Flash"

CONTROLREG @ 0x18-0x1B offset: LOCKREQ is set true and RESETIRQ is true.
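The reading of the dump above can be reproduced with a few lines of C. This sketch assumes the bytes are assembled least-significant byte first, which is what makes bit 2 of the byte at 0x4 and bit 7 of the byte at 0x8 line up with the names quoted. The positions used for LOCKREQ (bit 1) and RESETIRQ (bit 11) are my assumption about the register map and should be checked against the System ACE data sheet; the macro names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions named in the post. */
#define SR_CFGERROR    (1u << 2)   /* STATUSREG bit 2 */
#define ER_CFGREADERR  (1u << 7)   /* ERRORREG bit 7 */
/* ASSUMED positions, not stated in the post; verify against the data sheet. */
#define CR_LOCKREQ     (1u << 1)
#define CR_RESETIRQ    (1u << 11)

/* Assemble four dump bytes into one 32-bit value, lowest address first
 * as the least-significant byte (ASSUMPTION about byte order). */
static uint32_t le32(const uint8_t *b)
{
    assert(b != 0);
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}
```

With the bytes from the dump, offsets 0x4-0x7 (34 42 35 00) give 0x00354234, offsets 0x8-0xB (80 00 00 00) give 0x00000080, and offsets 0x18-0x1B (0A 08 00 00) give 0x0000080A.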

5) When the error occurs, it's as if the PPC is no longer executing code.

So to summarize:

1) Why does the behavior of the vxWorks boot vary depending on whether the binary executable was loaded by the PPC using the Xilinx FAT Fs library or by the sysACE controller (as an image appended to the ACE file)? What is sysACE doing to the CF or to itself after loading that the boot code, or the sysACE load of the boot code, is not?

2) Why does vxWorks hang (intermittently) on standard file I/O operations?

3) Why, even with errors reported in the sysACE controller registers, does the system hang permanently? Put another way, "shouldn't the drivers recover gracefully when errors occur on the sysACE controller/core?"
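To make the third question concrete, here is a minimal sketch of what "recover gracefully" could mean at the lowest level: a wait loop that gives up on an error bit or after a bounded number of polls instead of spinning forever. This is not the Xilinx driver's or vxWorks' code; the function name, the result codes, and the idea of passing the masks in as parameters are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum wait_result { WAIT_READY = 0, WAIT_ERROR = -1, WAIT_TIMEOUT = -2 };

/* Poll one status byte until every bit in ready_mask is set.
 * Returns early if any bit in error_mask is seen, and gives up
 * after max_polls reads so the caller can report a failure. */
static enum wait_result wait_ready(const volatile uint8_t *status,
                                   uint8_t ready_mask, uint8_t error_mask,
                                   unsigned max_polls)
{
    assert(status != 0);
    for (unsigned i = 0; i < max_polls; i++) {
        uint8_t s = *status;
        if (s & error_mask) {
            return WAIT_ERROR;      /* controller reported an error */
        }
        if ((s & ready_mask) == ready_mask) {
            return WAIT_READY;
        }
    }
    return WAIT_TIMEOUT;            /* bounded wait expired */
}
```

A caller that gets WAIT_ERROR or WAIT_TIMEOUT could log the register state and fail the file operation, which would at least turn a silent hang into a visible error.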

At first I thought the issue had to be a problem with the Xilinx FAT calls or the loader itself, but after verifying checksums of the actual file against the RAM copy of the vxWorks binary, I would hope to have ruled this possibility out. Now that sysACE loads are also hanging, albeit much less frequently, I'm thinking it is a HW or driver issue.

Are there any other signals I should be looking at? Does anyone have any idea of other things to try, or the name/email of someone at Xilinx who could shed light on this issue? I've tried everything I can think of. Unfortunately it seems the FAEs are extremely overburdened, and finding someone knowledgeable about both EDK and vxWorks is not easy to begin with. (Is someone from Xilinx corporate listening?)

Thanks,

Paul


Paul, I saw something somewhere on the net (I looked but could not find it again) from people who were having lots of trouble writing to the CF. They thought the sysACE interrupt was firing all the time, and they could not turn it off or handle it correctly. I had some problems getting my little program going. I ended up having the EDK generate the sysACE file and dragging the file onto the compact flash rather than doing it via Impact. I also had another problem where sector 0 on the CF got trashed, and I had to redo things with fdisk on a Linux system to get it "write" again. Excuse the pun. In a former life, I did some VxWorks work and filed a problem report. They pretty much called me every few days to ask whether I had made any progress solving the problem. ;,) It sounds like you are at a higher level of sophistication than I am. I was thinking that, hey, sysACE ain't so bad, but it sounds like more fun is waiting for me.

regards

-Newman


Newman,

Perhaps I was unclear about something... when I said 'JTAG' in my post, what I was referring to is the sysACE controller ASIC's JTAG interface to the FPGA and its auto-load of a given file to the FPGA at boot. Not Impact or a GUI JTAG load.

I'm trying to get the case elevated in Xilinx support somehow. I'm puzzled as to how a problem like this is not impacting more/all ML310 users. We sort of suspected that the CF is somehow getting trashed, but if that is the case, why can I retry a boot and have it succeed on the very next power cycle? That does not compute, nor does the fact that low-level, non-vxWorks boot code can access the compact flash EVERY time. Purely conjecture, but I think perhaps the Xilinx FAT Fs libraries are leaving the CF in some quasi-bad state after I close the file and then crank up vxWorks, and vxWorks just flat out does not like the state the boot-level lib calls left it in.

If you come across anything.. let me know.

Thanks,

Paul


OK, I was referring to generating the ACE file via Impact or generating it via the EDK tool menu selection. I've had luck generating the file with the EDK method, but have not gone back to verify the Impact method works for me.

My system is a Memec UltraController with a mezzanine memory card and an add-on SysAce card.

That is tough to figure out. My problem was repeatable from a hard sysACE reset start-up. I think it had a different failure scenario from a soft (XMD) reset. My guess was that the BRAMs were not reloaded on a soft reset.

I did a lot of work adding debug code to the local libraries and spewing results out the serial port. I had to recompile the local libraries manually because EDK, by default, overwrites the local files with fresh copies when you recompile from the pull-down menu. Perhaps the status of the Compact Flash could be sent out the serial port before entering and after exiting the problem code area. A major problem I had with VxWorks was that I had to beg them to supply the source code for the area in which I suspected a problem, so I could add debug code to it and isolate the cause. Sometimes the cause was a bug in their source code. My information is several years old.
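As a rough sketch of the before/after status idea, something like the following could capture the three registers Paul quoted and format them for the serial port. The register offsets (0x04, 0x08, 0x18) are taken from his dump; the least-significant-byte-first assembly, the struct, and the function names are assumptions for illustration, and how the line actually reaches the serial port is left to whatever print routine is available.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

struct ace_snapshot {
    uint32_t status;   /* STATUSREG,  offset 0x04 */
    uint32_t error;    /* ERRORREG,   offset 0x08 */
    uint32_t control;  /* CONTROLREG, offset 0x18 */
};

/* Read four consecutive bytes, lowest address as least-significant byte
 * (ASSUMPTION about byte order). */
static uint32_t read_le32(const volatile uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Capture the registers from a byte-addressed base. On the target this
 * would be the sysACE base address; on a host it can be a plain array. */
static struct ace_snapshot ace_capture(const volatile uint8_t *base)
{
    struct ace_snapshot s;
    assert(base != NULL);
    s.status  = read_le32(base + 0x04);
    s.error   = read_le32(base + 0x08);
    s.control = read_le32(base + 0x18);
    return s;
}

/* Format one tagged line, e.g. "before: STATUS=... ERROR=... CONTROL=...". */
static int ace_format(char *out, size_t n, const char *tag,
                      const struct ace_snapshot *s)
{
    return snprintf(out, n,
                    "%s: STATUS=%08" PRIX32 " ERROR=%08" PRIX32
                    " CONTROL=%08" PRIX32,
                    tag, s->status, s->error, s->control);
}
```

Calling ace_capture once with a "before" tag ahead of the suspect call and once with an "after" tag following it would show whether the controller state changes across that code.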

- Good luck, Newman
