EEPROM/Flash data loss: probability & patterns?

Hi group, I have been asked to rewrite a stable-storage scheme on a Motorola HC12 (in its 4K EEPROM). The module currently in use keeps an array of small (16-byte) records, each with a checksum and a 1-bit correction code. The fixed-size records are fine (they avoid fragmentation), but I strongly doubt that 1-bit correction gains anything against faulty data in an electrically rewritable memory. What are the most common patterns of corrupted data for the following three error sources?

- data loss due to wear: towards the end of the part's lifetime errors will increase, but do they appear one bit at a time? How reliable is a cell once it has shown its first error? Is it better to retire it after that?

- data loss due to power-down while erasing/writing: I suspect that whole words of memory come out wrong in that case, hardly ever a single bit alone.

- wrong data due to spikes on the data/address lines: this sounds like the most probable source of single-bit errors, but it should be less of a problem for single-chip parts like the HC12. Any hints on that?

My impression is that for the kinds of errors that actually show up in an EEPROM, you have to invest a considerable number of error-correcting bits to catch a significant percentage of them. I will gladly stand corrected if someone with the expertise says otherwise...

regards, Mark

PS: the persistent data will of course not rely on the correction code but on the right storage strategy; the correction is only meant to improve the odds of recovering the "youngest" data.
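PPS: for the sake of discussion, the record layout I'm talking about is roughly the following (the field names are made up). Note that a real single-error-correcting Hamming code over 128 data bits already needs 8 check bits, i.e. a whole extra byte per record:

#include <stdint.h>

/* Illustrative layout only - names are not from the real module. */
typedef struct {
    uint8_t payload[16];  /* application data                        */
    uint8_t checksum;     /* simple 8-bit sum/XOR over the payload   */
    uint8_t ecc;          /* 1-bit-correcting code; a Hamming code
                             over 128 data bits needs at least 8
                             check bits (2^8 >= 128 + 8 + 1)         */
} nv_record_t;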

-- Mark Piffer MCU and DSP programming & software design

Reply to
Mark Piffer

Check out this article

formatting link

Reply to
jtp

I agree. I don't think a checksum plus a single correction bit is a feasible data-correction strategy. A plain checksum is barely reliable enough to detect anything beyond a single-bit fault in a packet. If you can spare the CPU overhead, a CRC is more desirable. You could adopt a full error-correction scheme, but that would be governed by the criticality of the device and the cost of the extra EEPROM usage.
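For what it's worth, a bitwise CRC-16/CCITT is small enough even for a part like the HC12; a minimal sketch (not taken from any particular vendor library):

#include <stdint.h>
#include <stddef.h>

/* Bitwise CRC-16/CCITT (poly 0x1021, init 0xFFFF): no lookup table,
   so it costs cycles rather than code/data space - acceptable for
   16-byte records that are written infrequently. */
static uint16_t crc16_ccitt(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFF;

    while (len--) {
        crc ^= (uint16_t)(*data++) << 8;
        for (uint8_t bit = 0; bit < 8; bit++) {
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
        }
    }
    return crc;
}

Store the CRC alongside each record and recompute it on every read; a mismatch then marks the record as invalid rather than pretending it is correctable.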

I would think that once a wear-induced error is detected, it becomes a degenerative fault for that location.

Basically this fault will never happen unless the hold-up capacitors and/or the low-power detection circuitry are faulty, in which case you have a systematic fault and the device is operating out of spec. I'm assuming the device was designed from the start to cope with power being removed.

In most cases the power supply of the device will filter out the spikes. In a quiescent state it's unlikely that the EEPROM would be affected; obviously it is most vulnerable while it is writing. Exposure of the micro's interfaces to the outside world (that is, outside its box) also increases the risk. As I see it, mitigation takes two forms: 1) a watchdog or similar mechanism for program-execution recovery, and 2) a reliable EEPROM fault-detection scheme with a reset back to "safe" default values.
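A rough sketch of point 2, assuming the CRC routine above and a hypothetical eeprom_read() driver call (both names are placeholders, not from any real library):

#include <stdint.h>
#include <string.h>   /* memcpy */

#define REC_SIZE 16u

/* Placeholder for whatever EEPROM driver routine the application provides. */
extern void eeprom_read(uint16_t addr, uint8_t *dst, size_t len);

static const uint8_t safe_defaults[REC_SIZE] = { 0 };

/* Read a record and its stored CRC; on mismatch fall back to known-good
   defaults and tell the caller the data was reset. */
static int load_record(uint16_t addr, uint8_t buf[REC_SIZE])
{
    uint16_t stored_crc;

    eeprom_read(addr, buf, REC_SIZE);
    eeprom_read(addr + REC_SIZE, (uint8_t *)&stored_crc, sizeof(stored_crc));

    if (crc16_ccitt(buf, REC_SIZE) != stored_crc) {
        memcpy(buf, safe_defaults, REC_SIZE);
        return 0;
    }
    return 1;
}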

This is my take on it also. It all depends on how valuable the data is.

Ken..

+====================================+ I hate junk email. Please direct any genuine email to: kenlee at hotpop.com
Reply to
Ken Lee

Also check out my ESP articles "Forget Me Not" and "A Version Therapy", which cover a double-buffering technique and some tips on ensuring that your EEPROM contents remain valid when the s/w is upgraded.

The articles are available online at

formatting link
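Very roughly, one common way to do the double buffering looks like the sketch below (names are made up and the articles have the full treatment, so treat this only as the general idea): two slots, each with a sequence counter and a CRC; writes always go to the stale slot, and on power-up the newer valid slot wins.

#include <stdint.h>
#include <stddef.h>

/* Two copies of the data; an interrupted write can at worst destroy
   the copy that was already stale. */
typedef struct {
    uint8_t  seq;          /* incremented (mod 256) on every write    */
    uint8_t  version;      /* layout version, bumped on s/w upgrades  */
    uint8_t  payload[16];
    uint16_t crc;          /* CRC over seq, version and payload       */
} nv_slot_t;

/* Placeholder: re-checks the CRC (and version) of one slot. */
extern int slot_valid(const nv_slot_t *s);

/* Return the valid slot with the newer sequence number, or NULL if
   neither copy is usable; (int8_t)(a - b) handles counter wrap-around. */
static const nv_slot_t *newest_valid(const nv_slot_t *a, const nv_slot_t *b)
{
    int a_ok = slot_valid(a);
    int b_ok = slot_valid(b);

    if (a_ok && b_ok)
        return ((int8_t)(a->seq - b->seq) > 0) ? a : b;
    return a_ok ? a : (b_ok ? b : NULL);
}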

=========================== See the User Interfaces for Embedded Systems Page at

formatting link

Reply to
Niall Murphy
