Power surges and modern electronics.

John Keiser 15 years ago

I am in Hawaii where power surges are unfortunately common. Having lost several PC power supplies, I now use line-conditioning battery backup units to protect my PCs. Several months ago I salvaged a Westinghouse flat screen [LTV-32w6 HD] that had been abandoned because the tuner failed. As the tuner is part of the motherboard, I didn't fix it; it was easier to use an external VCR as the tuner. The TV functions fine. I assume the tuner died in a power surge. Yesterday, the power failed and I awoke to find that the TV had turned itself on but had no sound. Toggling mute and adding external speakers did not work. I assumed the worst but guessed that maybe this was a microprocessor locked into mute. I unplugged the set and tried again after 15 minutes. Sound was restored. Whew! I wonder how many consumers would be so lucky? Is this type of sensitivity common? [I have a nice old 32-inch CRT that has been immune from these problems and provides a great picture.] I'll probably add a line conditioner, but, really, are consumers expected to be that cautious?

William Sommerwerck 15 years ago

I've experienced this sort of failure with several items -- including an LV player, a TV set, a PDA, and a DVD player -- over the past 25 years. I call it "CMOS lockup", though whether that's the actual cause of the problem, I don't know.

Basically, the product "misbehaves" in some way -- including apparent "death". Removing the batteries or unplugging it, and then letting it sit for a while, causes it to be "recalled to life" (Dickens). Sometimes you need to yank the power cord while the device is running.

I don't know the exact cause, but I suspect it happens often enough that people discard products that are otherwise perfectly good.

PlainBill47 15 years ago

There are big differences between the 32-inch CRT and the 32-inch flat panel. First, the CRT is the product of more than 50 years of refinement. Comparatively speaking, the flat screen is in its infancy.

Second, the CRT set is almost certainly primarily analog. The flat screen has several processors, and is much more susceptible to interference.

Third, most flat panel sets are put together out of crappy components, especially the capacitors. While the power supply might have been adequate when new, the caps have deteriorated and are allowing noise from the SMPS onto the power supply rails.

PlainBill

Bob Villa 15 years ago

"... especially the capacitors."

Dell is hoping there are no more of those crappy-caps around!

D Yuniskis 15 years ago

Most modern devices use microprocessors or microcontrollers. Often *several* (VCRs typically had three or four; my cassette deck has one just to count capstan revolutions!).

If power fails -- wholly or partially -- it is possible for any of those processors to reset. Or, *partially* reset. Or, "get confused" (and head off to never-never land doing something unintended -- like "executing" *data*).

If the processor doesn't have brownout detection (or it isn't implemented properly), a processor can get stuck in one of these confused states. If the processor doesn't have a hardware watchdog (or it isn't implemented properly), once confused it can fail to get OUT of that state.

Regardless, sloppy code in one or more of these can result in them running, but failing to move to their *correct* operational state. This is a common sort of bug -- the code makes assumptions (implicitly or explicitly) that, suddenly, are not valid (because something unforeseen has happened -- e.g., a power glitch between instructions 102,678,993 and 102,678,994). Because the programmer made those assumptions, he didn't code to protect *against* them being incorrect.

So, for example, if there are two processors in the set, they *always* are powered up at the same exact instant (there is ONLY one power button, right?). And, the code that they execute never changes. So, 23.0257 milliseconds after RESET, processor #1 has finished its startup work and can now send the "OK, I am ready to fire up the display" message to processor #2. Meanwhile, 24.6802 milliseconds after RESET, processor #2 goes looking for the "OK, I am ready" message, *sees* it (since it was delivered more than a millisecond earlier) and correctly fires up the display.

Now, if something happens that causes processor #1 to come out of RESET a bit later -- perhaps, 3 milliseconds (e.g., maybe power at *its* reset circuit glitched a bit more than at #2's; or, its brownout detector fired *twice* instead of once) -- then it (#1) might not issue the "OK" message until 26.0257 milliseconds (since its RESET was 3 milliseconds later than #2's). Meanwhile, at 24.6802 milliseconds, processor #2 went looking for the message AND IT WASN'T THERE!!

Had the programmer NOT *assumed* the message *would* be there, he would have told #2 to wait for it -- for some amount of time. Instead, he might just crash; or, *incorrectly* (buggy) "wait"; or, fire up the display prematurely causing some other fault (that shuts him down -- or, something *else* gets shut down), etc.
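The difference between assuming the message is there and waiting for it can be sketched in a few lines. This is only a toy simulation: the two timing figures come from the example above, while the function names, the 100 ms timeout, and the 0.5 ms polling interval are made up for illustration and do not describe any real TV firmware.

```python
"""Toy simulation of the two-processor startup handshake (times in ms)."""

READY_NOMINAL_MS = 23.0257  # when #1 normally posts "ready" (from the example)
CHECK_AT_MS = 24.6802       # when #2 goes looking for it (from the example)


def fragile_check(p1_reset_delay_ms: float) -> bool:
    """#2 looks exactly once and ASSUMES the message is already there."""
    ready_at = READY_NOMINAL_MS + p1_reset_delay_ms
    return ready_at <= CHECK_AT_MS


def robust_wait(p1_reset_delay_ms: float,
                timeout_ms: float = 100.0,  # hypothetical timeout
                poll_ms: float = 0.5):      # hypothetical polling interval
    """#2 keeps polling until the message arrives or the timeout expires.

    Returns the time at which #2 saw the message, or None on timeout
    (at which point real code would report a fault instead of hanging).
    """
    ready_at = READY_NOMINAL_MS + p1_reset_delay_ms
    now = CHECK_AT_MS
    deadline = CHECK_AT_MS + timeout_ms
    while now <= deadline:
        if ready_at <= now:
            return now
        now += poll_ms
    return None


if __name__ == "__main__":
    print("normal start, fragile:", fragile_check(0.0))
    print("3 ms late,    fragile:", fragile_check(3.0))
    print("3 ms late,    robust :", robust_wait(3.0))
    print("#1 never up,  robust :", robust_wait(1e9))
```

With no delay both versions work. With #1 three milliseconds late, the fragile version misses the message, while the polling version picks it up a little after 26.0257 ms and still has a defined outcome if #1 never comes up at all.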

The heart of almost ALL software bugs is one or more bad assumptions. When the world behaves in ways programmers don't expect, you get "anomalous behaviors" -- things that seem unexplained and that change without any action on your part to "fix" them.

Makes you wonder how *anything* works properly! :-/

(of course, there can also be hardware issues that are causing this -- like a cap not completely discharging so the circuit it connects to never sees the "reset")

John Keiser 15 years ago

Thanks for the details. Doesn't instill consumer confidence in buying new expensive electronics!

D Yuniskis 15 years ago

I'm not claiming that this is what you *actually* experienced. But, it is the sort of thing that is commonplace -- increasingly so, nowadays.

E.g., I have a pair of Nakamichi Dragons (high end cassette decks, now very "dated"). They are "autoreverse" decks -- when the tape reaches the end of side A, side B is played (i.e., AS IF the tape had been "flipped" -- though this is done without any mechanical motion).

The tape counter, on reaching the end of side A, should start counting *backwards* as it begins playing side B (i.e., the counter should end up wherever it originally started once side B is complete -- assuming you started at the beginning of side A).

This is, in fact, how it works. There are two "play" buttons on the deck -- "play forward" (side A) and "play backwards" (side B). While play forward is active, you will see the counter increasing. If you press "play backwards", the counter will *decrease*.

*BUT*... if you stop the tape just as it reaches the end of side A, open the tape door, remove the tape, flip it over (so, now side B is "in front"), close the door and press "play forward", the tape will COUNT BACKWARDS (i.e., as if the tape was still installed in the deck playing "side B" BACKWARDS). So, the tape MOVES "forwards" (the machine has no way of knowing that this is "side 2" of the tape... it may be a completely DIFFERENT tape!) while the counter counts BACKWARDS!

If you had stopped the tape a second BEFORE it reached the end of side A, ejected it, flipped it, reinstalled it and pressed "play forward", the counter would NOT count backwards.

I.e., this is a bug. (technically, a "race") Would you expect that sort of thing in a $2K device? On something so *trivial*?? :<

Makes you wonder next time you get on an aircraft ("fly by wire"), have a surgical procedure ("Doctor, the patient's blood pressure is 9843 over 2"), etc. :-/

Maybe the Luddites were onto something, after all! :>

D Yuniskis 15 years ago

The problem is that consumers don't *want* to know how things work or *should* work. And, they don't "vote with their wallet". They get a crappy product and they either live with it (possibly not even knowing how crappy it is!) or toss it out and buy another (probably EQUALLY crappy).

Google "Therac".

Unfortunately, there are no real safeguards in place to *prevent* this sort of thing happening. There are "practices" that *should* minimize the chance of it happening. But, there were "practices" in place that should have prevented "Three Mile Island", etc.

Unfortunately, the folks designing these things have less and less time, less and less *motivation* and less and less *capability* for making "robust" products.

My DTV tuner shows *two* "9-1" channels.

Years ago, we would design devices that were (comparatively speaking) *smart*. They could diagnose their own faults. They could assist the technician in troubleshooting (set up scope loops, etc.). Now, everything is reduced to the equivalent of an idiot light "Service Required" -- and, often, that light isn't even present! The device just "acts funny". And, since users often don't know how it truly *should* work, they can't AUTHORITATIVELY complain/deduce that it *is* malfunctioning.

(how many VCRs blink 12:00? Do you have to be a rocket scientist to set the clock on a VCR???)

I have a Zune media player (movies, music, etc.). I am convinced the hard drive inside it is dying. Instead of a diagnostic message to that effect appearing on the LARGE, COLOR, FULL GRAPHIC DISPLAY, the device sits there trying to read from the disk endlessly (locking up in the process). "Um, if it can't get the data on the first, second or even three hundredth attempt, what makes you think it will get it on the 9 millionth attempt two days from now???"

Instead of helping the consumer determine that he has a problem (or, better yet, RECOVERING from that problem), it sits there frustrating the user and leaving him with no alternative other than to:

- call tech support (in some third-world country, no doubt)

- google for similar symptoms

- discard it in frustration
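The "9 millionth attempt" complaint above comes down to a retry loop with no upper bound. A bounded version that gives up and says so might look roughly like this. It is a sketch only: `ReadError`, `read_with_limit`, and the flaky test reader are hypothetical names invented here, and nothing in it reflects how the Zune firmware is actually written.

```python
class ReadError(Exception):
    """Raised by the (hypothetical) low-level read routine on failure."""


def read_with_limit(read_block, max_attempts=3):
    """Try a read a few times, then give up with a message a user can act on."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return read_block()
        except ReadError as exc:
            last_error = exc
    # Surface the failure instead of spinning forever.
    raise RuntimeError(
        f"Storage read failed after {max_attempts} attempts: {last_error}"
    )


def make_flaky_reader(failures_before_success):
    """Test helper: a reader that fails N times and then returns data."""
    state = {"calls": 0}

    def read_block():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise ReadError("sector unreadable")
        return b"data"

    return read_block, state


if __name__ == "__main__":
    reader, _ = make_flaky_reader(1)
    print(read_with_limit(reader))       # recovers on the second attempt
    dead, _ = make_flaky_reader(10**9)
    try:
        read_with_limit(dead)
    except RuntimeError as err:
        print("diagnostic:", err)        # what the display could show
```

The retry count of three is arbitrary. The point is only that the loop ends and the failure reaches something the user can see.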

D Yuniskis 15 years ago

You are more "generous" than I! :>

But, you needn't "see the future" to code against these things! All you have to do is step back (figuratively) and look at your design and ask yourself: "What have I taken for granted, here?" Then, go back and "UN"-take it for granted.

Of course, there are some things that you *have* to "assume". But, far less than you actually usually *do* assume (at least if you want a robust design!).

E.g., if you loan someone money, do you *assume* you will be paid back AND HAVE NO CONTINGENCY PLANS FOR THE POSSIBILITY OF *not* BEING PAID BACK? (if so, I'd like to speak to you about a loan... :> )

Jeff Liebermann 15 years ago

Dell was not using "crappy" capacitors. What they were doing is the same thing that almost every other manufacturer is currently also doing. They are rating their electrolytics as close to the bitter edge of failure as possible. That saves a few pennies in cost by using a lower voltage electrolytic but shortens the capacitor life. My guess(tm) is that Dell's OEM supplier in China selected the capacitors based upon faulty calculations, where it was designed to blow up in about 5 years, instead of the 1-2 years specified in the class action suit.

"Determining end-of-life, ESR, and lifetime calculations for electrolytic capacitors at higher temperatures"

At the bottom of the paper, note the various ways in which the ESR can climb as a result of varying conditions. At 105C (rated max temp), a typical capacitor will have its ESR increase 5 times (and therefore 5 times the dissipation) after 3500 hrs of normal operation. For 24hr/day operation, that's only about 5 months of continuous operation.
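The arithmetic behind those figures is easy to check. The 3500-hour and 5x numbers are the ones quoted above. The dissipation step assumes the ripple current through the capacitor stays the same, so that heating follows P = I^2 * ESR; that assumption and the 1 A / 0.1 ohm example values are mine, not the paper's.

```python
HOURS_TO_5X_ESR = 3500          # figure quoted above, at 105C
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 365.25 / 12    # average month length

days = HOURS_TO_5X_ESR / HOURS_PER_DAY
months = days / DAYS_PER_MONTH


def dissipation_w(ripple_current_a: float, esr_ohm: float) -> float:
    """Heat dissipated in the capacitor's ESR: P = I^2 * ESR."""
    return ripple_current_a ** 2 * esr_ohm


# Hypothetical example values: 1 A ripple, 0.1 ohm ESR when new.
new_w = dissipation_w(1.0, 0.1)
aged_w = dissipation_w(1.0, 0.1 * 5)   # ESR has climbed 5x

print(f"{days:.0f} days, about {months:.1f} months")
print(f"dissipation ratio aged/new: {aged_w / new_w:.1f}")
```

So 3500 hours works out to roughly 146 days, a little under five months, and at constant ripple current a fivefold ESR means fivefold heating, which in turn ages the capacitor faster.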

This has nothing to do with the original question, but I thought it might be interesting.

# Jeff Liebermann 150 Felker St #D Santa Cruz CA 95060 # 831-336-2558 # http://802.11junk.com jeffl@cruzio.com # http://www.LearnByDestroying.com AE6KS

D Yuniskis 15 years ago

Yes. And, most of those things aren't designed to *encourage* comprehension. How many things have *no* power switch -- leaving you pushing buttons wondering which will "turn it on"?

I do volunteer work at a place that recycles "stuff". I think we processed 1000 tons last year. Depressing to see the things people just "abandon" (effectively) that still work.

Somewhere (and I'll be damned if I can recall where!), I saw an article discussing natural resources. The point of the article was "whatever is here is *all* there will ever *be* of these things" (unless the alchemists succeed!). I.e., all of the Copper on (in) the planet was formed when the planet was formed; we don't "grow" copper to replace what we use.

For each of these resources (copper sticks in my mind), the article described where it "was", currently.

IIRC, for copper, 1/4 of it is "in use"; 1/4 of it is "in landfills"; 1/4 of it is "waiting to be mined/harvested"; 1/4 is unharvestable. I.e., one way of looking at this is: we have used 2/3 of the copper available to us, already (in the past ~100? years) and half of that is "in the trash".

There are other interpretations that are more pessimistic or less; but, the bottom line is "there is only so much"...

[really... do it!]

Correct. OTOH, there is no motivation *to* be concerned about those things. "It's someone else's problem"...

But why *two* of them (in addition to 9-2, etc.)...

Again playing devil's advocate: what is the alternative? Anything I fix "for myself" is "affordable" (for me). But, if I have to fix something for someone *else*, it quickly becomes prohibitively expensive to do so (I don't work for free). The "system" assigns no cost to discarding items. So:

discarding + replacing

William Sommerwerck 15 years ago

In some cases, yes. Some had setting procedures that went beyond unbelievable.

Jeff Liebermann 15 years ago

I live in the forest in the Santa Cruz Mountains. When the wind blows or it rains, the branches and the power lines tend to meet, resulting in power glitches.

If it makes you feel any better (probably not, but worth a try), after every storm, I get a few calls from customers with hung network and entertainment equipment. DSL modems, cable modems, routers, switches, IP phones, wireless, computahs, security systems, DVRs, printers, TIVO, etc. Just about anything with a microprocessor inside can be made to hang. D Yuniskis covered races and hazards so I won't go there.

Add to that the joy of memory (RAM) glitches. When the power fluctuates, one of the most sensitive components is the serial or dynamic RAM found in almost everything. A momentary magnetic pulse from a nearby power xformer is usually sufficient to produce a large enough field to flip a few bits. You may not even notice that a few bits have been flipped until perhaps days after the power glitch, when the operating system decides to use those memory cells, and finds them in a bizarre state. This is why many servers have ECC (error correcting) RAM.

The problem of unpredictable processor operation is well known, as are some of the band-aids. For low end hardware, usually nothing is done. Just power cycle the box if it hangs. Some clever programmers add in a watchdog timer, which monitors the state of some manner of commonly updated register (e.g., the RTC) and reboots the device if it goes comatose. While clever, it's not very reliable as the dead-man's timer is part of the same processor that it's trying to protect. An external watchdog timer works much better. It usually receives a 1 PPS (one pulse per sec) signal from the processor. If that disappears, it's reboot time.
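The external watchdog idea can be modelled in a few lines. This is a simulation, not driver code for any real watchdog chip: the class and function names are invented, and the 2-second timeout is an assumed margin over the 1 PPS heartbeat mentioned above.

```python
class ExternalWatchdog:
    """Resets the processor if no heartbeat pulse arrives within timeout_s."""

    def __init__(self, timeout_s: float = 2.0):  # hypothetical margin over 1 PPS
        self.timeout_s = timeout_s
        self.last_pulse_s = 0.0
        self.resets = 0

    def pulse(self, now_s: float) -> None:
        """Called by the processor once per second while it is healthy."""
        self.last_pulse_s = now_s

    def tick(self, now_s: float) -> bool:
        """Called by the watchdog's own clock; returns True if it fired."""
        if now_s - self.last_pulse_s > self.timeout_s:
            self.resets += 1
            self.last_pulse_s = now_s  # the reboot restarts the count
            return True
        return False


def simulate(hang_at_s: int, duration_s: int) -> int:
    """Run a processor that hangs at hang_at_s; return how many resets occurred."""
    wd = ExternalWatchdog()
    hung = False
    for now in range(duration_s + 1):
        if now == hang_at_s:
            hung = True                # processor goes comatose
        if not hung:
            wd.pulse(float(now))       # healthy: send the 1 PPS heartbeat
        if wd.tick(float(now)):
            hung = False               # watchdog rebooted it; pulses resume
    return wd.resets


if __name__ == "__main__":
    print("resets with a hang at t=5 s:", simulate(5, 10))
    print("resets with no hang:        ", simulate(100, 10))
```

The watchdog keeps its own notion of time and only listens for pulses, so it keeps working no matter what state the processor is in. That independence is what the internal variety lacks.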

It isn't just power line glitches that cause hangs. Cosmic rays, alpha particles from radioactive components, external fields, and bit rot all contribute to the general lack of uptime.

Anyway, try not to worry too much. Features and functions are added faster than bugs get fixed, so reliability and uptime rapidly approach some minimum acceptable value. This value is usually set by when the support phone starts ringing. When the customer complaints arrive, it's probably time to fix the problem. Otherwise, few people complain about occasional hangs, crashes, and reboots.

D Yuniskis 15 years ago

I don't think that applies to anything manufactured in the last 20 years... :<

Rather, I think it is a tradeoff of value for effort. E.g., my TV has a clock in it. I've never set it. Reasoning: it's not normally visible (I would have to "turn it on" to see it, and then it would interfere with the picture displayed); setting it requires navigating through four or five screens of settings (i.e., a bit of effort); and, it doesn't offer me any value (that I can't get just by looking over my shoulder to the clock that displays time REGARDLESS of whether or not the TV is on!).

I suspect most VCRs were used for watching movies instead of timeshifting. In that case, there is no value to having the correct time set (it may not even be visible while the movie is playing!). And, since most VCRs wouldn't *retain* their time settings in the face of power interruptions (power outages, unplugging the set, etc.), it doesn't take long for a clock to fall into the "ignored" category.

Finally, too many timepieces in a home end up relegating most of them to "un-maintained" -- how many of us have *a* clock that we consider The Authority in our homes (i.e., we expect some amount of error in all the others -- intentional or otherwise)?

William Sommerwerck 15 years ago

D Yuniskis 15 years ago

Dunno. I haven't used a VCR in more than 20 years :-/ (The one that still is in use here is only used for playing prerecorded tapes)

I had one here -- but it always kept losing signal. So, you get a false sense of security *thinking* it is telling the correct time -- only to discover it wasn't. I guess they are sensitive to where they are located/oriented. Given how "unattractive" this one was (think: functional not decorative), the choices for where it could acceptably be sited were limited. So, it got relocated -- to the trash. :<

(It *was* fun, though, to watch it go into "set" mode... minute hand sweeping across the face of the clock as if it was a *second* hand...)

William Sommerwerck 15 years ago

Brenda Ann 15 years ago

D Yuniskis 15 years ago

I doubt there are any such spots in this house! :-/

This had "hands". I guess driven by a little stepper motor. As I said, the minute hand would race around as if it was a second hand when it was "setting" the clock. (I suspect it only gets feedback from the "o'clock" (straight up) position. So, it runs the minute hand to that position, then counts "steps" from there to get to the desired position.)
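If that guess about the mechanism is right, the setting routine would amount to "home, then count". A rough model follows, with the caveat that everything in it is assumed: the real clock's step count, sensor, and firmware are unknown, and 60 steps per revolution is chosen only to keep the numbers readable.

```python
STEPS_PER_REV = 60  # hypothetical: one motor step per minute mark


class MinuteHand:
    """A hand the controller can move but cannot read, except at 12 o'clock."""

    def __init__(self, unknown_position: int):
        self._pos = unknown_position % STEPS_PER_REV  # hidden from the controller
        self.steps_taken = 0

    def step(self) -> None:
        self._pos = (self._pos + 1) % STEPS_PER_REV
        self.steps_taken += 1

    def at_home(self) -> bool:
        """The only feedback available: is the hand pointing straight up?"""
        return self._pos == 0


def set_minute(hand: MinuteHand, target_minute: int) -> None:
    # Phase 1: sweep until the home sensor trips (the "racing" minute hand).
    while not hand.at_home():
        hand.step()
    # Phase 2: count open-loop steps from the known position to the target.
    for _ in range(target_minute % STEPS_PER_REV):
        hand.step()


if __name__ == "__main__":
    hand = MinuteHand(unknown_position=37)
    set_minute(hand, 15)
    print("steps taken:", hand.steps_taken)
```

Starting from minute 37, the hand needs 23 steps to reach the top and 15 more to reach the target, which is why a clock built this way would have to sweep most of the dial every time it sets itself.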

D Yuniskis 15 years ago

The frequency of AC power is tightly controlled. Some short term variations in frequency are allowed. But, long term it has to be very accurate (precisely for this reason: its use as a timebase).

I always wondered how wristwatches could be so damned accurate considering how cheap they are (especially these disposable ones). But, then realized they operate at a constant temperature, etc.

By contrast, look at how poorly clocks in cars keep time...
