sysACE load vs bootloader load of vxWorks on ML310

I have a problem booting vxWorks on my ML310. If I boot from an ACE file
with my vxWorks image embedded, loaded through the sysACE controller, it
works about 95% of the time. If I boot from an ACE file with the same HW
download.bit in it, but with a bootloader in place of the vxWorks ELF, it
fails 95% of the time. VxWorks begins to boot and then hangs on its first
access to the compact flash (i.e. an fopen() call).

We are seeing the same problems on 2 different ML310s and a Xilinx FAE was
able to replicate the problem on his ML310 as well (although maybe not the
intermittent sysACE load method). The boot loader method is >95% likely to
hang/crash whereas the sysACE loader is more troublesome to replicate.

The bootloader method still almost always fails, even though before jumping
to the vxWorks RAM image I verify a checksum of the actual file against the
image copied to RAM, and won't jump on a mismatch. Worse yet, vxWorks now
also hangs fairly often (anywhere between 2% and 30% of tries) even when
booting via the sysACE controller load method (i.e. the JTAG). I am
beginning to suspect a Xilinx driver issue or a driver interaction issue
with our CF card (the same behavior is also observed on the Xilinx-provided
CF that came with the ML310).

Some data points to note:

1) When we hang, it appears the /wait and rdy-/busy lines on the CF are
permanently low, which is most assuredly not a good thing.

2) Sometimes the hang lights the sysACE error LED. Even if it is not lit,
the /wait and /busy signals on the CF are always 0.

3) I NEVER hang on CF accesses at the booter (non-vxWorks) code level.
However, I find myself asking "is vxWorks crashing because some CF FAT
lib operations at the booter level have left the CF or sysACE in a bad
state?" I am using Xilinx's FAT Fs Library on a CF that's formatted FAT16. I
also memset the first 64MB of DDR to 0 to ensure there are no leftovers from
previous boot attempts.

4) I caught one of these sysACE error LED conditions and then used XMD to
query the sysACE registers. The following is the dump (our sysACE core is at
base address 0xCF000000):


XMD% mrd 0xCF000000 32 b

CF000000:   00        CF000001:   00
CF000002:   00        CF000003:   00
CF000004:   34   4    CF000005:   42   B
CF000006:   35   5    CF000007:   00
CF000008:   80   ?    CF000009:   00
CF00000A:   00        CF00000B:   00
CF00000C:   00        CF00000D:   00
CF00000E:   00        CF00000F:   00
CF000010:   6B   k    CF000011:   00
CF000012:   00        CF000013:   00
CF000014:   16   ?    CF000015:   03   ?
CF000016:   0C   ?    CF000017:   10   ?
CF000018:   0A        CF000019:   08
CF00001A:   00        CF00001B:   00
CF00001C:   02   ?    CF00001D:   00
CF00001E:   00        CF00001F:   00

In particular, the bits of interest in this dump are:

STATUSREG @ offset 0x4-0x7: bit 2: CFGERROR "error has occurred in the
Compact Flash controller"

ERRORREG @ offset 0x8-0xB: bit 7: CFGREADERR "an error occurred while
reading configuration information from Compact Flash"

CONTROLREG @ offset 0x18-0x1B: LOCKREQ and RESETIRQ are both set.

5) When the error occurs, it's as if the PPC is no longer executing
code.
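
As a rough aid for reading the dump in item 4, here is a minimal host-side Python sketch that parses the pasted XMD output and reports the flags mentioned above. It is only an illustration under stated assumptions: the byte at the lowest address is taken as the least-significant byte of each register, and the bit positions used for LOCKREQ (bit 1) and RESETIRQ (bit 11) are not stated in the post and should be checked against the System ACE data sheet. The bit positions for CFGERROR and CFGREADERR are the ones quoted in the post. The names parse_dump, read_reg and FLAGS are made up for this example.

```python
import re

BASE = 0xCF000000  # sysACE base address from the post

# Byte values copied from the XMD dump (ASCII column omitted).
DUMP = """
CF000000: 00  CF000001: 00  CF000002: 00  CF000003: 00
CF000004: 34  CF000005: 42  CF000006: 35  CF000007: 00
CF000008: 80  CF000009: 00  CF00000A: 00  CF00000B: 00
CF00000C: 00  CF00000D: 00  CF00000E: 00  CF00000F: 00
CF000010: 6B  CF000011: 00  CF000012: 00  CF000013: 00
CF000014: 16  CF000015: 03  CF000016: 0C  CF000017: 10
CF000018: 0A  CF000019: 08  CF00001A: 00  CF00001B: 00
CF00001C: 02  CF00001D: 00  CF00001E: 00  CF00001F: 00
"""

def parse_dump(text):
    """Map each address in the dump to its byte value."""
    pairs = re.findall(r"\b([0-9A-Fa-f]{8}):\s+([0-9A-Fa-f]{2})\b", text)
    return {int(addr, 16): int(val, 16) for addr, val in pairs}

def read_reg(mem, offset, nbytes=4):
    """Assemble a register, ASSUMING the lowest address holds the LSB."""
    return sum(mem[BASE + offset + i] << (8 * i) for i in range(nbytes))

# register name -> (offset, {flag name: bit position})
# LOCKREQ / RESETIRQ bit positions are assumptions, not from the post.
FLAGS = {
    "STATUSREG": (0x04, {"CFGERROR": 2}),
    "ERRORREG": (0x08, {"CFGREADERR": 7}),
    "CONTROLREG": (0x18, {"LOCKREQ": 1, "RESETIRQ": 11}),
}

mem = parse_dump(DUMP)
for name, (offset, bits) in FLAGS.items():
    value = read_reg(mem, offset)
    set_flags = [flag for flag, bit in bits.items() if (value >> bit) & 1]
    print(f"{name} = 0x{value:08X} set: {', '.join(set_flags) or 'none'}")
```

Under these assumptions the script reports CFGERROR, CFGREADERR, LOCKREQ and RESETIRQ as set, which agrees with the reading of the dump given above.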


So to summarize:

1) Why does the behavior of the vxWorks boot vary depending on whether the
binary executable was loaded by the PPC using the Xilinx FAT Fs library or
by the sysACE controller (as an image appended to the ACE file)? What is
sysACE doing to the CF or to itself after loading that the boot code, or
the sysACE load of the boot code, is not?

2) Why does vxWorks hang (intermittently) on standard file I/O operations?

3) Why, even with errors reported in the sysACE controller registers, does
the system hang permanently? Put another way, shouldn't the drivers recover
gracefully when errors occur on the sysACE controller/core?

At first I thought the issue had to be a problem with the Xilinx FAT calls
or the loader itself, but after verifying checksums of the actual file
against the RAM copy of the vxWorks binary, I would hope to have ruled that
possibility out. Now that sysACE loads are also hanging, albeit much less
frequently, I'm thinking it is a HW or driver issue.

Are there any other signals I should be looking at? Does anyone have any
idea of other things to try, or the name/email of someone at Xilinx who
could shed light on this issue? I've tried everything I can think of.
Unfortunately it seems the FAEs are extremely overburdened, and finding
someone knowledgeable about both EDK and vxWorks is not easy to begin with.
(Is someone from Xilinx corporate listening?)



Thanks,

Paul



Re: sysACE load vs bootloader load of vxWorks on ML310


Paul,
  I saw something somewhere on the net (I looked but could not find it
again) from people who were having lots of trouble writing to the CF. They
thought the sysACE interrupt was firing all the time, and they could not
turn it off and/or handle it correctly. I had some problems getting my
little program going. I ended up having EDK generate the sysACE file and
dragging the file onto the compact flash rather than doing it via iMPACT.
I also had another problem where sector 0 on the CF got trashed, and I had
to redo stuff with fdisk on a Linux system to get it "write" again. Excuse
the pun. In a former life, I did some VxWorks stuff and filed a problem
report. They pretty much called me every few days to ask whether I had made
any progress solving the problem. ;,)  It sounds like you are at a higher
level of sophistication than I am. I was thinking that, hey, SysAce ain't
so bad, but it sounds like more fun is waiting for me.

regards
-Newman



Re: sysACE load vs bootloader load of vxWorks on ML310
Newman,

Perhaps I was unclear about something. When I say 'JTAG', what I am
referring to is the sysACE controller ASIC's JTAG interface to the FPGA and
its auto-load of a given file to the FPGA at boot, not iMPACT or a GUI JTAG
load.

I'm trying to get the case elevated in Xilinx support somehow. I'm puzzled
as to how a problem like this is now impacting more/all ML310 users. We sort
of suspected that the CF is somehow getting trashed, but if that is the
case, why can I retry a boot and have it boot on the very next power cycle?
That does not compute, nor does the fact that low-level, non-vxWorks boot
code can access the compact flash EVERY time. Purely conjecture, but I think
perhaps the Xilinx FAT Fs libraries are leaving the CF in some quasi-bad
state after I close the file and then crank up vxWorks, and vxWorks just
flat out does not like the state the boot-level lib calls left it in.

If you come across anything.. let me know.

Thanks,

Paul




Re: sysACE load vs bootloader load of vxWorks on ML310


OK, I was referring to generating the ACE file via Impact or generating it
via the EDK tool menu selection.  I've had luck generating the file with the
EDK method, but have not gone back to verify the Impact method works for me.

My system is a Memec UltraController with a mezzanine memory card and an
add-on SysAce card.

That is tough to figure out.  My problem was repeatable from a hard SysAce
reset start up.  I think it had a different failure scenario from a soft
(XMD) reset.  My guess was that the BRAMs were not reloaded on a soft reset.

I did a lot of work adding debug code to the local libraries, and spewing
results out the serial port.  I had to recompile the local libraries
manually because EDK, by default, overwrites the local files with fresh
copies when you recompile from the pull-down menu.  Perhaps the status of
the Compact Flash could be output on the serial port before entering and
after exiting the problem code area.
A major problem I had with VxWorks stuff was that I had to beg VxWorks to
supply a section of source code of the area in which I suspected a problem
so I could add debug code into it to isolate the cause.  Sometimes the cause
was a bug in their source code.  My information is several years old.
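
If the status registers are printed over the serial port before and after the suspect code, as suggested above, the two snapshots can be compared on the host. The following Python sketch lists which bits flipped between two 32-bit values. The "before" value is made up for illustration, and the "after" value is STATUSREG from Paul's dump assembled least-significant byte first (an assumption); changed_bits is a hypothetical helper name.

```python
def changed_bits(before, after, width=32):
    """Return (bit, old, new) for every bit that differs between snapshots."""
    diff = before ^ after
    return [
        (bit, (before >> bit) & 1, (after >> bit) & 1)
        for bit in range(width)
        if (diff >> bit) & 1
    ]

# Hypothetical snapshots of one status register, as logged over serial.
status_before = 0x00354230  # invented value, for illustration only
status_after = 0x00354234   # value from the dump, assuming LSB-first bytes

for bit, old, new in changed_bits(status_before, status_after):
    print(f"bit {bit}: {old} -> {new}")
```

With these two example values only bit 2 changes, going from 0 to 1.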

- Good luck,
 Newman



