Bit rot in micro controllers? (2023 Update)

Just repaired our fridge when, according to Murphy's law, the next appliance became shaky. Our pellet stove has twice refused to be turned off. Unfortunately, instead of analog it's all buttons that are operated via port pins of a micro controller. Pressing several of those willy-nilly made the on/off button work, at least long enough to turn it off. When the circuit board is cold the button always works, but not when it is warm after running the stove overnight.

The micro controller is a Winbond W78E52BF-24 running on a 12MHz crystal. It is based on what they call electrically erasable MTP-ROM, by which I assume they mean EEPROM. Date code is 2001 and that is also when we had that pellet stove installed.

Can these things develop loss of flash memory (bit rot) this soon, after only two decades? Any remedy short of reprogramming, or is it toast?

Reply to
Joerg

Reply to
Martin Rid

+1

Don't know why the EEPROM would be suspect first. Wouldn't that produce a more permanent glitch? I suppose it could become temperature sensitive. Maybe the stove manufacturer has a note on this issue or even a recall.

Reply to
Rick C

I don't think so because several other buttons also fail when warm and they aren't set in a matrix. The solder job on the board looks very good and it all ohms out well.

That's why I am asking, to see whether this is a sign of impending "permanent" bit rot. I don't know much about this EEPROM business.

I doubt they really care about that. There was another issue with this stove right after installation. I really had to rock the boat until they admitted it and actually sent out a correction note to installers with my mod in there (exhaust temp sensor).

Worst case I may have to "analogize" the whole enchilada but that would require a lot of reverse engineering and work. Of course, then it would last forever.

Reply to
Joerg

Maybe the switch contacts are getting corroded. This happened to my Maytag front loader washing machine. The power switch is directly above the soap tray and gets corroded from the bleach fumes. I put a drop of mineral oil on the button shaft, where it worked its way into the switch. After a couple of applications the mineral oil reached the contacts and cleaned them. Now the switch works perfectly with no sign of degradation. I think the mineral oil is protecting the contacts from the fumes.

Reply to
Jonathan Winter

MTP-ROM is supposed to be good for up to 1 million write cycles:

formatting link
However, after only 1,000 erase/write cycles, the erase/write voltages begin to change. See Fig 15:
formatting link
"After 1,000 erasing/writing cycles at a program voltage of ±6 V with a write time of 5 ms, the VT of a programmed cell is lowered from 2.92 V to 2.9 V and the VT of an erased cell is raised from -1.3 V to -0.5 V." If the MTP-ROM device initially worked properly, and then slowly started failing to recognize button pushes, that might be the problem. Assuming you run the pellet burner for half the year and turn it on/off once per day, that would be: 180 days * 1 cycle/day * 20 years = 3,600 erase/write cycles which might be sufficient to see the problem.
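The estimate above can be reproduced as a quick sketch in Python. The season length, daily cycle count, and years in service are the guesses from this post, not measurements, and the whole calculation assumes the firmware writes the MTP-ROM once per on/off cycle, which has not been confirmed. The function name is hypothetical.

```python
# Rough erase/write cycle estimate. Every input here is an assumption
# taken from the post above; none of it is measured on the stove.
DAYS_PER_SEASON = 180      # assumed: stove runs about half the year
CYCLES_PER_DAY = 1         # assumed: one on/off cycle per day
YEARS_IN_SERVICE = 20      # assumed: installed around 2001
RATED_CYCLES = 1_000_000   # endurance figure quoted above for MTP-ROM


def estimated_cycles(days=DAYS_PER_SEASON, per_day=CYCLES_PER_DAY,
                     years=YEARS_IN_SERVICE):
    """Return the estimated number of erase/write cycles so far."""
    return days * per_day * years


total = estimated_cycles()
print(total)                 # 3600
print(total / RATED_CYCLES)  # fraction of the rated endurance used
```

Changing `per_day` or `days` shows how sensitive the estimate is to the usage guess.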

Note: I don't have any experience with the Winbond W78E52BF-24. Therefore, I don't know if it has this problem.

Reply to
Jeff Liebermann

It doesn't seem likely that the system is writing its flash every time it is turned on, or off. Indeed, it doesn't sound like it has a reason to write its flash at all, with flash being used merely because it was cheaper than getting a custom ROM.

Sylvia

Reply to
Sylvia Else

Maybe you're right. I guess I have to dig deeper.

The other symptoms (sensitivity to heat and other inputs having problems) seem (to me) to point to a chip problem. The data sheet indicates that the 8 KBytes of electrically erasable/programmable MTP-ROM is for program memory.

formatting link
The device also has 256 bytes of scratch pad RAM. Therefore, there is no need to use the MTP-ROM as a non-volatile scratch pad, unless the chip is also doing something write intensive, such as data logging. The chip is capable of addressing up to 64 KBytes of external RAM. If this is static RAM, then it could be used for saving system status when the power is cycled. If it's volatile (dynamic) RAM, it would require a battery. If only a few things need to be stored (auger position, burn time, power status, etc), it could probably be done in the chip's built-in 256 byte scratch pad RAM. If most data is being stored, it would be very tempting to save it to the MTP-ROM area.

I still like my first guess(tm), but to be certain, I would need to know more about the controller board and what the chip is doing.

Reply to
Jeff Liebermann

Quite probably yes - I remember a series of TFT monitors some years ago where that happened to the on-board controller. Raising/lowering VCC may help a bit (if it is not read-protected, you might try to read the memory at different VCC levels and see if you can get correct data, and then re-program).

cu Michael

Reply to
Michael Schwingen

My instinct would be there is a failing electrolytic capacitor somewhere that is allowing the CPU to see glitches that blind it to the on/off button. Some polling algorithms are a bit stupid so another button stuck down might also have the same effect. You might have hoped that there would be a failsafe emergency stop button on something that makes fire!
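To illustrate how a stuck or glitching input can blind a simple polling routine, here is a minimal simulation in Python. It is not the stove's firmware: the button bit masks, `naive_scan`, and `EdgeScanner` are hypothetical names made up for this sketch, and the real code may poll its port pins quite differently.

```python
# Hypothetical bit masks for two buttons read from one port.
ON_OFF = 0x01
MENU = 0x02


def naive_scan(pins):
    """Accept a press only when exactly one known button is down."""
    if pins == ON_OFF:
        return "on_off"
    if pins == MENU:
        return "menu"
    return None  # any other pattern, including two buttons, is ignored


class EdgeScanner:
    """Report buttons that went from released to pressed since last scan."""

    def __init__(self):
        self.previous = 0

    def scan(self, pins):
        newly_pressed = pins & ~self.previous
        self.previous = pins
        return newly_pressed


# Simulate MENU reading as stuck down (leakage, corrosion, or a glitch).
stuck = MENU
print(naive_scan(stuck | ON_OFF))            # None: on/off press is lost

scanner = EdgeScanner()
scanner.scan(stuck)                          # stuck button seen once
print(scanner.scan(stuck | ON_OFF) == ON_OFF)  # True: on/off still detected
```

The naive scanner matches the reported symptom, where pressing other buttons sometimes frees up the on/off button, but that only shows the explanation is plausible, not that it is the cause.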

I never trust CPUs for safety interlocks! There is good reason.

Reply to
Martin Brown

It could be storing thermostat settings in eeprom, but that seems unrelated to the fault reported.

If it doesn't like the heat, the fault is probably some heat sensitive part, an electrolytic capacitor for example.

Reply to
Jasen Betts

Such things are done in some instruments, to save the current control settings between sessions.

Reply to
Tom Gardner

I'd look for other solutions before considering bit rot.

Obviously the connectors and switches need to be examined and possibly IPA or DeOxit applied.

Electrolytic capacitors are the next failure point. Sometimes they are obviously "distressed", sometimes subtly faulty.

Check the PSU rails for voltage and transients, of course.

I had an SMPS where something would:
- take a minute to start
- turn off then on, immediately start
- turn off for an hour then on, 50s to start
- turn off for 12 hours then on, 60s to start

Debugging that was one /good/ use case for a DSO.

Turned out to be a dicky electrolytic fed by a high value resistor; it was almost as if the capacitor was so "dry" it had to reform the barrier before it would work.

Reply to
Tom Gardner

Even where the device stores some information, I would expect it to do so only when the information changes.

I'm inclined to agree that that's more likely.

Sylvia.

Reply to
Sylvia Else

My HP vector network analyzer developed "bit rot" when it was around 25 years old. One bit in one of the many uv-erasable eproms changed state. I found it by varying the supply voltage and looking for the first bit to change. I programmed a new eprom with the edited data and it has been working fine ever since.

John
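For anyone trying the same approach, comparing dumps read at different supply voltages comes down to a bitwise diff. The sketch below assumes the dumps have already been read out with a device programmer and loaded as `bytes` (which requires that the part is not read-protected). The function name `differing_bits` and the sample data are made up for illustration.

```python
def differing_bits(reference, other):
    """Return (byte_offset, bit_index) for every bit that differs."""
    if len(reference) != len(other):
        raise ValueError("dumps must be the same length")
    diffs = []
    for offset, (a, b) in enumerate(zip(reference, other)):
        changed = a ^ b  # set bits mark positions that disagree
        for bit in range(8):
            if changed & (1 << bit):
                diffs.append((offset, bit))
    return diffs


# Made-up three-byte dumps standing in for reads at two VCC levels.
dump_nominal = bytes([0x75, 0x90, 0x02])
dump_low_vcc = bytes([0x75, 0x80, 0x02])

print(differing_bits(dump_nominal, dump_low_vcc))  # [(1, 4)]
```

A bit that flips as the voltage moves is a candidate for a marginal cell. Which of the two readings is the correct one still has to be decided by other means, for example by disassembling the code around that address.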

Reply to
John Walliker

The capacitors can also silently go short. This can be temperature dependent.

Reply to
Johann Klammer

In the 60's & 70's 20 years was mentioned as a life span for diffused silicon. Somebody's diffusion rate.

Hul

Reply to
Hul Tytus

OTOH adding wood to the stove is a pleasant chore, even in the middle of the night.

Reply to
Wond

Diffusion rate is *very* temperature dependent. I can't see a small microcontroller at a modest clock rate running particularly hot.

Reply to
Martin Brown

More likely something analog -- power supply sag/spikes, bad caps, a floating configuration pin, etc.

Can you verify that the processor is actually *running* when it "misbehaves"? (i.e., activity on ANY pins?)

Reply to
Don Y
