Continous eeprom checksum microcontroller - Page 4

Re: Continous eeprom checksum microcontroller
I did some radiation effects testing on PICs a while ago. From
memory, on the particular PICs I tested, random single-bit errors
(soft errors) were just as likely in SRAM as in registers.

PROM/EPROM is more reliable against soft errors, but it develops
hard errors over time (though it took a lot of exposure).

The test involved doing running checksum tests on program PROM,
registers, checkpoints to catch flipped PC bits, and testing
for watchdog resets. One version of the test also tested serial
EEPROM. The chips were put in a beam line and exposed until none
of them worked.
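
To make the "running checksum" idea concrete, here is a minimal sketch
in C. It is not the actual test code: the names (running_sum_t,
running_sum_step) and the plain 16-bit additive checksum are
assumptions for illustration. The region is verified a few bytes per
call, so the check can be interleaved with normal work in the main
loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state for an incremental ("running") checksum over a
 * read-only region such as program memory. */
typedef struct {
    const uint8_t *base;   /* start of the region to verify */
    size_t len;            /* region length in bytes */
    size_t pos;            /* next byte to process */
    uint16_t sum;          /* checksum accumulated in the current pass */
    uint16_t expected;     /* reference value, e.g. computed at build time */
} running_sum_t;

void running_sum_init(running_sum_t *rs, const uint8_t *base,
                      size_t len, uint16_t expected)
{
    rs->base = base;
    rs->len = len;
    rs->pos = 0;
    rs->sum = 0;
    rs->expected = expected;
}

/* Process up to `chunk` bytes of the region.
 * Returns 0 while a pass is still in progress, 1 when a pass has
 * completed and matched, and -1 when a pass completed with a mismatch.
 * After a completed pass the state is reset so the next call starts over. */
int running_sum_step(running_sum_t *rs, size_t chunk)
{
    while (chunk-- > 0 && rs->pos < rs->len) {
        rs->sum = (uint16_t)(rs->sum + rs->base[rs->pos++]);
    }
    if (rs->pos < rs->len) {
        return 0;
    }
    int ok = (rs->sum == rs->expected);
    rs->pos = 0;
    rs->sum = 0;
    return ok ? 1 : -1;
}
```

A caller would invoke running_sum_step() once per main-loop iteration
and treat -1 as a fault, for example by forcing a reset.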

For high reliability applications, it's a good idea to program
very defensively and use the watchdog reset. Basically, expect
single-bit errors in calculations, and flipped bits in registers,
SRAM and the PC.
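
One common defensive technique for flipped bits in SRAM (not
something the tests described here used, as far as this post says) is
to keep three copies of a critical variable and take a bitwise
majority vote on every read. A minimal sketch in C, with hypothetical
names tmr_u8_t, tmr_write and tmr_read:

```c
#include <stdint.h>

/* Three redundant copies of one critical byte. On real hardware the
 * copies would ideally sit in separate memory locations. */
typedef struct {
    volatile uint8_t a;
    volatile uint8_t b;
    volatile uint8_t c;
} tmr_u8_t;

/* Store the same value in all three copies. */
void tmr_write(tmr_u8_t *v, uint8_t x)
{
    v->a = x;
    v->b = x;
    v->c = x;
}

/* Return the bitwise majority of the three copies and write it back,
 * so a single flipped bit is repaired before a second one can pile up. */
uint8_t tmr_read(tmr_u8_t *v)
{
    uint8_t a = v->a;
    uint8_t b = v->b;
    uint8_t c = v->c;
    uint8_t voted = (uint8_t)((a & b) | (a & c) | (b & c));
    v->a = voted;
    v->b = voted;
    v->c = voted;
    return voted;
}
```

The vote corrects any bit position where at most one copy is wrong.
It does nothing for a corrupted PC, which is what the checkpoints and
the watchdog are for.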

There was an article reference (I think on slashdot or embedded.com)
that talked about programming the space shuttle computer. Much
of the effort is lots of code reviews and talking about what-ifs,
e.g. if I detect an error here, what do I do?

See ya, -ingo


--
/* Ingo Cyliax, snipped-for-privacy@ezcomm.com, Tel: 812-391-0895 */


Re: Continous eeprom checksum microcontroller
On 5 Jul 2004 21:18:06 -0700, vishalpatil snipped-for-privacy@yahoo.co.in (Vishal)

If it is at all possible to corrupt micro registers via external means
(that is, not by a software-related fault) then I wouldn't be
attempting to mitigate the fault itself, as God knows what other
aspects of the micro would be questionable. Instead I would mitigate
against the resultant hazard. For instance, if the micro is controlling
the transmitted output power of the device (as for a CAT device) then I
would be looking at putting a power limiter in the hardware.

Also, if stored values need to be validated by some means then an
appropriate software architecture should be adopted. Possibly you
could adopt a scheme to CRC or checksum critical stored parameters
before use, rather than performing continuous refreshes -- just a
thought.
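
A rough sketch of the validate-before-use idea in C. Everything
specific here is an assumption for illustration: the params_t layout
and its field names are made up, and CRC-16/CCITT-FALSE (polynomial
0x1021, initial value 0xFFFF) is just one reasonable choice of CRC.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF,
 * no reflection, no final XOR. */
uint16_t crc16_ccitt(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000) {
                crc = (uint16_t)((crc << 1) ^ 0x1021);
            } else {
                crc = (uint16_t)(crc << 1);
            }
        }
    }
    return crc;
}

/* Hypothetical parameter block; the CRC covers every field before it. */
typedef struct {
    uint8_t tx_power_limit;
    uint8_t channel;
    uint16_t crc;
} params_t;

/* Recompute and store the CRC after the parameters are changed. */
void params_seal(params_t *p)
{
    p->crc = crc16_ccitt((const uint8_t *)p, offsetof(params_t, crc));
}

/* Check the stored CRC; returns nonzero if the block is intact. */
int params_valid(const params_t *p)
{
    return p->crc == crc16_ccitt((const uint8_t *)p, offsetof(params_t, crc));
}
```

The application would call params_valid() immediately before acting
on a parameter and fall back to a safe default (or reset) on failure,
instead of refreshing the stored values continuously.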

The fact that you are even discussing such issues suggests to me
that a proper hazard analysis needs to be performed on the system to
determine where you stand and to get a handle on the type of
mitigations you need to put in place.

Ken.


+====================================+
I hate junk email. Please direct any
genuine email to: kenlee at hotpop.com
