Hello,
I am looking for some help with a particularly nasty problem I have run into.
Out of our 10 prototype Virtex-4-FX20 (CES2 stepping) boards, roughly half are exhibiting an issue with the PPC405 starting up out of reset. After power-up, the bit file is loaded, DONE goes high, and the current load kicks in, but the PPC never boots. Other logic on the chip is running.
When the device boots properly, there are no issues booting from BRAM, loading DDR-DRAM from flash, or executing from DRAM. Everything works well.
Using ChipScope, I can see the data from address 0xfffffffc being returned correctly on the PPC405 PLB-I-Master side of the PLB arbiter. However, the second address put out is garbage (0x100600), resulting in a bus error. The boot code is held in a BRAM off of the PLB. During a successful boot, the second address is 0xffffc000, which is correct. The reset sequence and first PLB bus cycle look identical in the failing and non-failing cases.
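One way to cross-check the ChipScope capture is to decode the instruction word returned from 0xfffffffc and confirm which address the core should fetch next. The sketch below is only an illustration: it assumes the reset vector holds an unconditional I-form branch (primary opcode 18), and the word 0x4BFFC004 is a hypothetical encoding of a relative branch from 0xfffffffc to 0xffffc000, not a value taken from the actual boot BRAM. The function name decode_branch is likewise made up for this example.

```python
def decode_branch(word, fetch_addr):
    """Decode a PowerPC I-form branch (b, ba, bl, bla) and return its target.

    Returns None if the word is not an I-form branch (primary opcode 18).
    """
    if (word >> 26) != 18:
        return None
    li = word & 0x03FFFFFC      # LI field with the two low zero bits appended
    if li & 0x02000000:         # sign-extend the 26-bit displacement
        li -= 0x04000000
    aa = (word >> 1) & 1        # AA=1: absolute target, AA=0: relative to fetch_addr
    base = 0 if aa else fetch_addr
    return (base + li) & 0xFFFFFFFF


RESET_VECTOR = 0xFFFFFFFC
EXAMPLE_WORD = 0x4BFFC004       # hypothetical word, assumed for illustration only

target = decode_branch(EXAMPLE_WORD, RESET_VECTOR)
print(hex(target))
```

If the word captured on the bus decodes to 0xffffc000 on both good and bad boots, the fetched data is fine and the bad second address is being produced inside the core rather than by the BRAM or the PLB.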
Observations:
- Freeze spray (now known around here as 'FPGA programming spray') will, without exception, make this problem go away. (Suggests a timing or power issue?)
- Warm resets (through the EDK reset controller) have no effect. The only way to make this problem go away is to reload the device.
- Reloading the device does not always work. Some boards will always boot fine on the second try, while others will only boot once cooled.
- The emulator (tried both XMD and the Green Hills probe) cannot talk to the processor when it is in this state.
- Clocks, DCM locks, reset signals, and debug/JTAG signals all look normal.
- The PPC is in an unrecoverable state which is a little disturbing regardless of how it got there.
What else I have tried (none of these made a difference):
- Clocking the PPC405 slower, at the same clock as the PLB.
- JTAG loading vs. SelectMAP loading.
- Booting from the OCM bus instead of the PLB.
- Removing all other logic from the design except the PPC and an OCM BRAM.
- Looked closely at the power supplies / grounding.
- I have already successfully played 'Stump the Xilinx FAE/factory'.
- Spent hours in Timing Analyzer looking at any unconstrained nets.
- Looked closely at the errata.
Angles still left to explore:
- I am 95% convinced this is either the result of an external condition, or a chip defect.
- So, I am working up a power-supply change to delay VCCO from VCCINT. I don't believe that is it, but I am running out of things to try.
...................
Has anyone ever seen an issue like this (V4, or 2VPro)? I have done many FPGA designs over the years (although this is our first PPC-based design) and have rarely been this stumped.....
Any and all advice is welcome. Email me or post here.
Thanks, Chris '<