Xilinx Virtex-4 OCM Usage Issues

Hello to the group!

I've been struggling to get the Xilinx IOCM and DOCM modules working with the PPC405 in my current design and I'm starting to run out of ideas. The first iteration of the design uses cached SDRAM via the PLB to store/load the boot code and runs without issue. Since the design is starting to get full (running out of LUTs, but BRAMs are available), it was decided that it might be worthwhile to use the OCM interface to cut down on logic. The OCM should be the perfect solution because the boot code is currently only called once and then code executes out of SDRAM.

I have been able to get the OCM modules connected and currently the boot code makes it through without issue. The problem occurs just after the conditional jump to SDRAM: the first instruction out of SDRAM is completed and then the PPC405 stops. I have been able to connect via a debugger and everything appears to be fine at the stopped location, but if I step the processor, it gets lost and never returns. I am not sure why it stops at this point (see code below).

I have gone through the OCM/Virtex-4 information and everything seems ok:

  • The errata 212/213 fix is in place
  • Cache is disabled (cache was working fine on the first design)
  • I have compared all the PPC registers at the point of failure with the values from the first iteration and have found no major unexplainable differences
  • I have added a PLB_IBA core and observed that the OCM design loads the first four instructions from SDRAM after completing the boot code. I have verified that these instructions are correct. After this point, however, there is no more activity on either the PLB or the OCM busses
  • An ILA on the OCM bus showed that the instructions stop being executed by the IOCM following the jump (and it looks like a few extra instructions are loaded from IOCM because of the conditional branch)
  • Neither the ILA nor the PLB_IBA cores showed an error/abort occurring. The debugger did not indicate that any exceptions had occurred
  • I have tried changing the OCM values (range checking/fixed latency/auto-detect clocking), but this seems to make no difference

  • The memory map used:
      SDRAM  0x0000 0000 - 0x01FF FFFF
      DOCM   0x2080 0000 - 0x2080 1FFF
      IOCM   0xFFFF C000 - 0xFFFF FFFF

  • The Assembly code is nothing fancy:
    ...
    0xffffca2c main+0x258: 7c0903a6 mtctr r0
    0xffffca30 main+0x25c: 4e800421 bctrl
    0x2000: 7ca62b78 mr r6, r5
    0x2004: 7c852378 mr r5, r4
    ...
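For anyone following along, the memory map above can be written as a small address classifier. This is only a sketch for reasoning about which bus an address should decode to; the enum and the function name `classify_address` are made up for illustration and are not part of the design or of any Xilinx library.

```c
#include <stdint.h>

/* Hypothetical region tags for the memory map quoted in the post. */
enum region { REGION_SDRAM, REGION_DOCM, REGION_IOCM, REGION_UNMAPPED };

/* Map a 32-bit address onto the regions listed in the memory map. */
static enum region classify_address(uint32_t addr)
{
    if (addr <= 0x01FFFFFFu)
        return REGION_SDRAM;                 /* 0x0000 0000 - 0x01FF FFFF */
    if (addr >= 0x20800000u && addr <= 0x20801FFFu)
        return REGION_DOCM;                  /* 0x2080 0000 - 0x2080 1FFF */
    if (addr >= 0xFFFFC000u)
        return REGION_IOCM;                  /* 0xFFFF C000 - 0xFFFF FFFF */
    return REGION_UNMAPPED;                  /* nothing decodes here */
}
```

With this map, the branch at 0xffffca30 sits in IOCM and its target 0x2000 sits in SDRAM, so the jump crosses from the OCM interface to the PLB.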

Basically it seems like the bus is hooked up correctly, but that maybe a register bit or mode is not correct. I am wondering if anyone in this forum has the IOCM/DOCM working and also executes code out of SDRAM (or anyone else who has a comment) - are there any register bits that I might have left out? Does software have to do anything differently now that the design is non-cached (I have tried initializing all cached registers...)?

Thanks,

-Charles Eddleston

charles.eddleston

I assume that the end of your bootloader looks something like this:

    copy_program_to_sdram();
    jump_to_sdram();

Do you issue a sync and isync instruction before jumping to sdram? Personally I've had the experience that the ppc405 can act very weird if you don't issue a sync and isync instruction before you jump to recently modified code.
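A minimal sketch of what I mean, assuming a GCC-style compiler. The function name `code_sync_barrier` is made up for illustration. On a PowerPC build it emits `sync` followed by `isync`; on any other host it is only a compiler barrier so that the snippet still compiles. With caches enabled you would normally also need the cache-management instructions (dcbst/icbi) for the copied range, which this sketch leaves out.

```c
/* Hypothetical helper: order the preceding stores and discard prefetched
 * instructions before branching to freshly written code.
 * Returns 1 if real PowerPC instructions were emitted, 0 otherwise. */
static int code_sync_barrier(void)
{
#if defined(__powerpc__) || defined(__PPC__)
    /* sync: wait for earlier storage accesses to complete.
     * isync: refetch any instructions that were already prefetched. */
    __asm__ volatile ("sync\n\tisync" ::: "memory");
    return 1;
#else
    /* Host fallback: compiler-level barrier only, no hardware effect. */
    __asm__ volatile ("" ::: "memory");
    return 0;
#endif
}
```

You would call this after copy_program_to_sdram() and before the jump.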

/Andreas

Andreas Ehliar

Yeah the boot code pretty much copies the program to SDRAM, loads the jump address and jumps. I modified the code to include the isync/sync instructions:

    0xffffca30 main+0x25c: 4c00012c isync
    0xffffca34 main+0x260: 7c0004ac sync
    0xffffca38 main+0x264: 7c0903a6 mtctr r0
    0xffffca3c main+0x268: 4e800421 bctrl

The code still halts as it did before. Thanks for the idea.

charles.eddleston

Hi Charles,

Sounds like a bit of a puzzle. I do recall hitting a slight problem with the OCM before: the instruction-side OCM is completely hidden from the processor's data read/write path. So if your program tries to read 0xFFFFC000 - 0xFFFFFFFF it will not see its own code.

I can't see how this would be causing you problems though... particularly as you say you've seen the first packet of instructions being fetched from the SDRAM, so the OCM should really be out of the picture at that point. (I can't remember exactly why this caused *me* problems; I think it was something to do with the way the standard library was compiled that it had some data stored in the code space, which could never be accessed.)

So my suggestion is, why not dump the OCM entirely and use a small chunk of initialized BRAM attached to the PLB as your boot ROM instead? Perhaps there is a little overhead in terms of decode logic vs. OCM, but probably not so very much.

Cheers,

-Ben-

Ben Jones

Ben - Thanks for the suggestion; my explanation wasn't very clear. Our first iteration used a small BRAM on the PLB bus to boot, loaded SDRAM (which was defined as cached memory space), and then ran out of SDRAM. With all the PLB arbitration and overhead logic, this used about 4% of our Virtex-4 (FX20). Currently, we're around 85% and would like to save this 4% to help build time and allow for future flexibility.

Thanks,

-Charles

charles.eddleston

Hi Charles,

Thanks for the clarification. This is quite puzzling!

Are you using any of the memory protection features of the PowerPC, or is this all running in a flat memory space in supervisor mode? Just to rule out any odd effects due to TLB etc.

What are you using to debug the bootloader (GDB/source level, or raw XMD)? Is the bootloader written in ASM or C? Where is the application code that is written to SDRAM coming from (flash, I guess)? Does your bootloader verify that the code has been transferred successfully?

If in XMD you then do a "stop", what happens? Do you get a "processor stopped at 0xwha73v3r", or something like "target cannot perform the operation"? Where do your interrupt vectors live (if in fact you have any)?

So your "bctrl" does the Right Thing (sets PC from CTR [which ==0x2000] and stores PC+4 in LR), and then the "mr r6, r5" happens OK, and then the whole thing goes up the creek, right? It sounds for all the world like it's taking some interrupt vector into the middle of nowhere.
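To spell out what I'm assuming the branch does, here is a small host-side model. It decodes the XL-form fields of the 0x4e800421 word from your listing and applies the "PC from CTR, PC+4 into LR" rule. The struct and function names are made up for this sketch; it models only this one instruction, not the processor.

```c
#include <stdint.h>

/* Fields of an XL-form branch (e.g. bcctr/bcctrl). Hypothetical struct. */
struct xl_branch { unsigned opcd, bo, bi, xo, lk; };

/* Pull the fields out of a 32-bit instruction word. */
static struct xl_branch decode_xl(uint32_t insn)
{
    struct xl_branch f;
    f.opcd = (insn >> 26) & 0x3Fu;   /* primary opcode                */
    f.bo   = (insn >> 21) & 0x1Fu;   /* branch options                */
    f.bi   = (insn >> 16) & 0x1Fu;   /* CR bit to test                */
    f.xo   = (insn >> 1)  & 0x3FFu;  /* extended opcode               */
    f.lk   = insn & 0x1u;            /* 1 = update the link register  */
    return f;
}

/* Just the registers this model touches. */
struct regs { uint32_t pc, lr, ctr; };

/* Model of an unconditional bctrl: LR = PC + 4, then PC = CTR with the
 * low two bits cleared. */
static void exec_bctrl(struct regs *r)
{
    r->lr = r->pc + 4u;
    r->pc = r->ctr & ~UINT32_C(3);
}
```

For 0x4e800421 this gives opcode 19, BO 20 (branch always), extended opcode 528 and LK set. Starting from PC = 0xffffca30 with CTR = 0x2000, the model ends with PC = 0x2000 and LR = 0xffffca34.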

Sorry for the deluge of questions - you don't have to answer them all! I'm just trying to get a feel for what your system looks like.

Cheers,

-Ben-

Ben Jones

Yeah, it's been quite puzzling - I had the local Xilinx FAE out here the other day and we weren't able to get anywhere on it (besides agreeing that the PLB and OCM busses look like they should). So here are the answers to your questions, followed by the more recent developments:

1) We are just running out of flat memory space in privileged mode.

2) We have a Green Hills probe to debug - I haven't tried using the XMD program. The boot code is C. Code is copied from FLASH into SDRAM and using the Green Hills probe, I am able to verify the contents of SDRAM.

3) When I stop the code the first time, it goes to 0x2004 and I can still read/write registers. When I do a step or press run again, I get something like a "Processor not stopped after single step" and then I cannot read/write any register from the Green Hills probe command line. From the Green Hills MULTI debugger, I get "timeout waiting for cre to stop in read of GPR 30. Single Step Failed."

4) Yeah, the link register increments by 4, the PC/CTR registers are correct and then register 5 is copied into register 6. I don't know anything about how interrupts work on the PPC, so I'll have to read up on that.

So the FAE suggested trying to break the problem down to eliminate the boot code. To that end, I have taken a short program that prints "entering code()" and "exiting code()" and then stops. I boot from this code and then reset using the debugger, so at this point none of the registers or memory have been set up. I use the debugger to load SDRAM 0x2000-0x200C (the first four instructions) and program the PC to 0x2000. If I load the design that uses the PLB RAM instead of the OCM RAM and load the registers from the last example (none are set up) and locations 0x2000-0x200C in SDRAM, I am able to step through all of these instructions.

Thus, it seems like nothing is wrong with the software, with the exception of not setting up a register value to enable the device to switch from OCM to SDRAM. The fact that it boots correctly out of IOCM and jumps to SDRAM at all, seems to indicate that the hardware logic is in place.

I hope that helps clarify what we've got in place and points out what I'm missing.

Thanks,

-Charles

charles.eddleston

Hi Charles,

Thanks for the answers.

But, if you load the design that uses the OCM RAM and follow the same procedure, it doesn't work? In that case I agree, it looks like your bootloader is OK.

Thinking of reasons why the PowerPC might just hang up on you: it's unlikely, but it could be a Machine check - did you monitor the status of C405XXXMACHINECHECK?

My next question: what version of EDK are you using, and if it's less than 8.1i, did you see this answer record?


Admittedly (a) this is the worst-written load of gibberish I've seen in ages and (b) it doesn't immediately look like it's relevant to you, but it does in a roundabout way give a possible explanation for what you're seeing.

The default settings of the C_APU_CONTROL parameter allow certain extra instructions, including floating-point instructions, to be decoded and executed via the APU controller. To cut a long story short, certain invalid instruction forms may get interpreted as being destined for an external co-processor, which isn't there and so doesn't respond and thus hangs up the system. In particular, executing instruction 0xffffffff may have this effect, which is quite possible if one ends up reading from memory that isn't really there. And this might happen as the result of an unexpected interrupt, for example (if you don't have a vector table defined).

It's a long shot, but do check the C_APU_CONTROL parameter on your PowerPC core. It might not solve your problem, but it might just stop the system from hanging up and thus let you see what's actually happening to the registers & program counter.

I'll keep thinking...

-Ben-

Ben Jones

Since Ben mentioned interrupts a couple of times, and you say you don't know much about how interrupts work on the PPC, let me point out one "gotcha" ... well it caught me out anyway...

The register pointing to the interrupt vector table holds ONLY the 16 MSBs of the vector table address.

Thus putting the table at 0xffff4000 (as we did) doesn't work. You'd think the EDK tools might have warned about this, but nooo... they built the code just fine.

Then the CPU took its first interrupt, and looked for a vector offset from ffff0000, and couldn't find one. So it took an "illegal instruction" interrupt, and went to ffff0700 to find the handler...

Moving the vector table to a 64 KB boundary (ffff0000 in our case) solved the problem.

IMO this limitation COULD have been more clearly documented, as well as trapped by the tools...
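The kind of check I'd have liked the tools to make can be sketched in a few lines of C. The function names here are made up for illustration; the only fact relied on is the one above, that only the 16 MSBs of the vector table address are kept, so the low 16 bits are effectively treated as zero.

```c
#include <stdint.h>
#include <stdbool.h>

/* True if the requested vector table base survives being truncated to
 * its 16 most significant bits, i.e. it sits on a 64 KB boundary. */
static bool vector_base_ok(uint32_t base)
{
    return (base & 0xFFFFu) == 0;
}

/* The base the CPU will actually use after truncation. */
static uint32_t effective_vector_base(uint32_t base)
{
    return base & 0xFFFF0000u;
}

/* Address fetched for a given vector offset (e.g. 0x0700). */
static uint32_t vector_address(uint32_t base, uint32_t offset)
{
    return effective_vector_base(base) | (offset & 0xFFFFu);
}
```

With our original placement, vector_base_ok(0xffff4000) is false and the offset-0x0700 handler is looked up at 0xffff0700 instead of 0xffff4700, which is exactly what we saw.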

- Brian

Brian Drummond

Hi Brian,

Good point. As far as documentation goes it took me a while to find where this is explained (PowerPC Processor Reference Guide, Chapter 7 "Exceptions and Interrupts", section "Interrupt-Handling Registers", EVPR, page ~204, and also mentioned in the OS and Libraries Documentation, Standalone PowerPC BSP section). :-) Do you have a suggestion as to where else you think this should be mentioned, to make it more obvious?

The EDK tools do the Right Thing when generating the default linker script - and Base System Builder will notify you if you don't have room for the vector table in your project due to this limitation (e.g. you have < 64KB of BRAM). I guess the linker could be modified to check what it's being asked to do with the "vectors" section and barf (or at least issue a warning) if the alignment is wrong, but this would be a bit of an ugly "special case"...

Cheers,

-Ben-

Ben Jones

I'm out of good ideas but I'm not quite out of ideas :)

Perhaps you could use the icread instruction to figure out exactly what the ppc has in its instruction cache at this point. (I seem to remember that you mentioned that caches were enabled in your design.)

/Andreas

Andreas Ehliar

Wow - this topic has got lots of activity. Ben, thanks for pointing out the APU bits - I was aware of these and do have this value already in place (I am running 7.1, although I keep thinking of trying 8.1).

As for interrupts, I just think that if I were hitting one, the OCM or PLB busses would show the processor jumping to that location and trying to fetch the instruction at the interrupt handler, on top of the fact that a simple "move register" instruction shouldn't cause an interrupt to begin with. I'll keep the interrupt possibility on the board to come back to in case my current train of testing doesn't pan out.

At this point, I have disabled cache in order to remove one more possible issue from the design.

The current train of thought I'm working on is getting an SDRAM/OCM Virtex-4 system working on a Xilinx ML403 development board and comparing it to my SDRAM/OCM system. So far I have got the LED test to work out of SDRAM when running the core at 2x the PLB clock speed (my system does 3x, but the dev board has issues running at 3x). So here are the things I am going to try first:

1) Try my system running at a 1:1 ratio
2) Try our customized PLB controller on the dev board

Hopefully, I can get some information out of these two steps...

charles.eddleston

I confess I haven't been through every page of documentation ... I'd have to think to recreate the places I looked unsuccessfully. XAPP778 page 14 was where I found it. It took a while searching the support site on things like "PPC interrupts".

Good to hear ... 7.1 didn't (at least, not in my hands).

I had a 32K plb_bram at ffff8000, and a 16k one at ffff4000. I inherited the project (so no BSB phase), and in ignorance, moved the lower from ffff0000 to make them contiguous, and wrongly assigned segments in "Generate Linker Script". This dialog could warn; but that wouldn't cover hand edited scripts.

IMO it *is* an ugly special case already. I don't believe telling the user about it is uglier than linking non-functional code.

- Brian

Brian Drummond

Update: Well, I have gotten past the SDRAM execution issue! The board still does not function correctly, but it appears that this is related to our code and just needs more time to debug.

I got the development board up and running the same code from the same location and compared this to our board (we have our own PLB memory controller because we have a 16-bit SDRAM interface instead of the 32-bit used by the Xilinx memory controller). I noticed that our Sl_ssize was set incorrectly (not sure why). Changing this to specify a 64-bit bus width results in code executing out of SDRAM.
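For reference, here is how I understand the 2-bit Sl_ssize field to map onto bus widths. Treat the encoding as an assumption taken from my reading of the PLB documentation (00 = 32-bit, 01 = 64-bit, 10 = 128-bit) and check it against the spec for your PLB version; the helper names are made up for this sketch.

```c
/* Assumed encoding of the 2-bit slave size field; verify against the
 * PLB specification before relying on it. Returns the width in bits,
 * or -1 for the value this sketch treats as reserved. */
static int sl_ssize_to_bits(unsigned sl_ssize)
{
    switch (sl_ssize & 0x3u) {
    case 0x0: return 32;
    case 0x1: return 64;
    case 0x2: return 128;
    default:  return -1;
    }
}

/* Inverse lookup: the field value for a given width, or -1 if the width
 * has no encoding in the table above (e.g. a 16-bit device bus). */
static int bits_to_sl_ssize(int bits)
{
    switch (bits) {
    case 32:  return 0x0;
    case 64:  return 0x1;
    case 128: return 0x2;
    default:  return -1;
    }
}
```

Note that the value describes the width of the slave's PLB data interface, not the width of the SDRAM device behind it, which is why our 16-bit SDRAM has no entry here.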

Thanks for everyone's input - if I learn anything else from this issue, I'll post under this topic.

charles.eddleston
