I've run into a problem with a little project I've been working on. The project is a flash loader utility for the TMS470 that runs from RAM and programs the flash with data coming in over a serial port. The micro is a TMS470R1A288, which is based on an ARM7TDMI core. There is 16K of on-chip SRAM starting at address 0x00400000; the flash loader code runs entirely from this SRAM.
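For reference, the SRAM window described above can be written down as a small address check. This is only a sketch: the base address and the 16K size come from the description above, while the macro and function names are made up for illustration and are not part of any TI header.

```c
#include <stdbool.h>
#include <stdint.h>

#define SRAM_BASE 0x00400000u     /* start of on-chip SRAM, as stated above */
#define SRAM_SIZE (16u * 1024u)   /* 16K of on-chip SRAM, as stated above */

/* Hypothetical helper: returns true if addr lies inside the SRAM window. */
bool addr_in_sram(uint32_t addr)
{
    return addr >= SRAM_BASE && (addr - SRAM_BASE) < SRAM_SIZE;
}
```

With these numbers, the SRAM spans 0x00400000 through 0x00403FFF, so the instruction address 00401F4C that comes up later in this post falls inside it, which is consistent with the loader running entirely from RAM.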
The flash loader is very straightforward: it receives data over a serial port, erases the affected flash sector(s), programs the received data into the flash, and then verifies that the programming took place correctly by reading the data back out of the flash and comparing it to what was written. The first thing I noticed was that the verification step was failing.
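To make the verification step concrete, here is a minimal sketch of a word-by-word readback compare. It is not the TI library routine; the function name and signature are assumptions for illustration, and only the idea (read each 32-bit word back from flash and compare it to the buffer that was programmed) comes from the description above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical verify routine (not TI's code): compares n_words 32-bit
   words read back from flash against the RAM buffer that was programmed.
   Returns false at the first mismatch, true if every word matches. */
bool verify_words(const volatile uint32_t *flash,
                  const uint32_t *buff,
                  size_t n_words)
{
    for (size_t i = 0; i < n_words; i++) {
        uint32_t j = flash[i];      /* word read back from flash */
        if (j != buff[i]) {
            return false;           /* readback differs from what was written */
        }
    }
    return true;
}
```

The flash pointer is marked volatile so the compiler cannot skip or merge the readback loads.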
In the example shown in the screen capture (at
Note: the flash library code which exhibits the bug was provided by TI and is being used as-is with only minimal changes. AFAICT, this code has worked correctly when I have used it in other projects (on different flavors of the TMS470).
Although the verification routine is returning FALSE, indicating verification failure, the routine should *not* be failing; in actual fact, the programming was entirely successful. Indeed, if I disable the calls to the verification routine, the code I programmed into the flash runs correctly, and spot-checks of the flash contents show that the flash contains the correct data.
The specific line of code that is failing is highlighted in green in the C code pane. The buff[] array contains the data we just programmed, and j contains the first 32-bit word read out of the newly-programmed flash. If the programming was successful, these two values should match, and indeed they do: if you look at the watch window, you'll see that buff[0] and j both contain the value 0xea0017ee. So if the values match, why does this test fail?
The answer lies in the disassembly pane. In the screenshot, we have just executed the ldr instruction at address 00401F4C. r3 contains the address of slot 0 of the buff[] array, and the ldr instruction we just executed should take the 32-bit word from the memory address pointed to by r3 and load it into r2. However, after executing this instruction, r2 does not contain the expected value (0xea0017ee), but rather 0xea000010. This causes the compare instruction at 00401F54 to fail, and we bail out of the verification routine with a FALSE result.
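One way to gather more evidence about a load like this is to read the same location twice, once with a single 32-bit load (what the ldr does) and once as four separate byte loads, and record both results. The sketch below is only a diagnostic aid under my own assumptions: the struct and function names are hypothetical, and it does not by itself explain why r2 ends up with the wrong value.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical diagnostic record: what two different ways of reading the
   same 32-bit location returned. */
typedef struct {
    uint32_t word_load;        /* result of a single 32-bit load */
    uint32_t byte_load;        /* same word assembled from four byte loads */
    bool     loads_agree;      /* true if both reads returned the same value */
    bool     matches_expected; /* true if the word load equals 'expected' */
} load_check;

load_check check_load(const uint32_t *addr, uint32_t expected)
{
    load_check r;
    const volatile uint8_t *p = (const volatile uint8_t *)addr;
    uint8_t bytes[4];

    /* Single word-sized read; volatile keeps the compiler from eliding it. */
    r.word_load = *(const volatile uint32_t *)addr;

    /* Four byte-sized reads of the same location, kept in memory order so
       the result does not depend on the target's endianness. */
    for (int i = 0; i < 4; i++) {
        bytes[i] = p[i];
    }
    memcpy(&r.byte_load, bytes, sizeof r.byte_load);

    r.loads_agree = (r.word_load == r.byte_load);
    r.matches_expected = (r.word_load == expected);
    return r;
}
```

Calling something like check_load(&buff[0], j) just before the failing compare would show whether the two kinds of read disagree with each other, or whether both return a value other than the one shown in the watch window.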
Now, I could be wrong, but this doesn't appear to be a compiler bug: the code that is being generated looks reasonable and correct. (CrossStudio for ARM uses gcc, BTW.) ISTM that this is a bug in the microcontroller itself - it is not executing the instructions correctly for some reason. I did check the silicon errata sheet for the chip, but nothing there seemed relevant to this issue.
So is my conclusion correct, or is there some other cause for this bug which I'm missing? And, of course, what the hell do I do about it? How can I trust a microcontroller that doesn't execute code correctly??