Watch Dog Fail Safe

This is a feature, not a bug. If a technician visits a site but forgets to restart the app, it reboots.

What kind of OS are you running? If it supports multitasking, the watchdog application can be a different process than the main application. It can also respond to a dead main application with a different delay than the hardware WD timer.

Is there any sort of comm link between a central office and these sites?

--
Paul Hovnanian     mailto:Paul@Hovnanian.com
------------------------------------------------------------------
Hanlon's Razor:
        Never attribute to malice that which is adequately explained by
        stupidity.
Reply to
Paul Hovnanian P.E.

I have about 125 remote PCs (Advantech PCM9575, 667 MHz) in the field, which can be hundreds of miles away from anyone. Every once in a while, when powered up, they fail to boot. They sit there with a black screen, no BIOS or anything, yet the 5 V going to them is fine. Sometimes I can cycle the power hundreds of times and not see it, other times it takes maybe 20 cycles, and sometimes it happens on the first power cycle. Very unpredictable.

That tells me it's the SBC hardware, not the OS or the BIOS, causing the problem.

We have checked the inrush currents and have booted hundreds of times with a 100 MHz scope measuring currents and the 5 V power-up time (< 2.55 ms). It is so random that it is hard to pinpoint. The SBC manufacturer claims they haven't seen it. We have tried several different DC/DC converters; we are currently using a 40 W, 5 VDC supply, which is twice the size needed.

1) Any Ideas what could be happening?

Unless I can pinpoint the problem, I will have to build a cheap, but effective fail safe.

The SBC has a BIOS watchdog which does not detect this, and the motherboard watchdog does not catch it either. So I'm thinking about building a simple external watchdog. It would work just like my test-bench power cycler.

//Boot-Up Watchdog// The watchdog is set for 3 minutes. Normal PC boot time is 70 seconds until the application is loaded, running, and toggling an output. The application on the SBC will toggle a digital output every xx seconds. If an on/off pulse is not received within 3 minutes, an external PIC will toggle a relay, breaking the 5 V supply to the SBC. This will force a cold boot.

If the system does not boot, a pulse will never reach the watchdog, and it will force a power cycle.
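
A rough host-side model of the boot watchdog described above, in Python, meant only for checking the timing logic before writing any PIC firmware. The 3-minute window comes from the post; the 5-second relay-open time, the one-second tick, and all names (`BootWatchdog`, `simulate`) are assumptions made for illustration.

```python
from dataclasses import dataclass

BOOT_TIMEOUT_S = 180   # 3-minute window, from the post
OFF_TIME_S = 5         # assumed relay-open time; not specified in the post


@dataclass
class BootWatchdog:
    """Model of the external watchdog, stepped once per second."""
    timeout_s: int = BOOT_TIMEOUT_S
    off_time_s: int = OFF_TIME_S
    elapsed_s: int = 0         # seconds since the last heartbeat edge
    off_remaining_s: int = 0   # seconds left with the relay open
    power_cycles: int = 0      # forced cold boots so far

    @property
    def rail_on(self) -> bool:
        """True while the 5 V supply to the SBC is connected."""
        return self.off_remaining_s == 0

    def pulse(self) -> None:
        """Heartbeat edge from the SBC's digital output: restart the window."""
        self.elapsed_s = 0

    def tick(self) -> None:
        """Advance the model by one second."""
        if self.off_remaining_s > 0:
            # Relay is open; count down until power is restored.
            self.off_remaining_s -= 1
            return
        self.elapsed_s += 1
        if self.elapsed_s >= self.timeout_s:
            # No heartbeat within the window: open the relay (cold boot)
            # and start a fresh window for the next boot attempt.
            self.power_cycles += 1
            self.off_remaining_s = self.off_time_s
            self.elapsed_s = 0


def simulate(duration_s: int, heartbeat_every_s: int = 0) -> int:
    """Run the model; return the number of forced power cycles.

    heartbeat_every_s == 0 models a board that never boots.
    """
    wd = BootWatchdog()
    for t in range(duration_s):
        if heartbeat_every_s and t % heartbeat_every_s == 0:
            wd.pulse()
        wd.tick()
    return wd.power_cycles
```

Calling `simulate(3600, heartbeat_every_s=10)` models a healthy board, and `simulate(3600)` models one that never comes up, so the watchdog keeps cycling power until a heartbeat finally arrives.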

Dilemma: this works pretty well unless a user has shut down the remote application for maintenance. So what if I stretch the timer out to 60 minutes? Sure, 60 minutes is enough for maintenance, but it also allows the system to be down for an hour before the reboot occurs. This would probably be acceptable, since it's much quicker than driving out there to cycle the power.

Any other cool suggestions or solutions?

Richard

----== Posted via Newsfeeds.Com - Unlimited-Uncensored-Secure Usenet News==----

formatting link
The #1 Newsgroup Service in the World! 120,000+ Newsgroups

----= East and West-Coast Server Farms - Total Privacy via Encryption =----

Reply to
Richard

Give the application a "maintenance mode" shutdown option that sends a different pulse sequence to the PIC, enabling a one-time 2-hour window.
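
One way this could look, sketched as a Python model rather than PIC code. The 2-hour window is from the suggestion above and the 3-minute normal timeout is from the original post; the "different pulse sequence" is assumed here to be three pulses within two seconds, which is purely a made-up encoding, as are all the names.

```python
NORMAL_TIMEOUT_S = 180.0          # normal 3-minute window (from the thread)
MAINT_TIMEOUT_S = 2 * 60 * 60.0   # one-time 2-hour maintenance window
BURST_PULSES = 3                  # hypothetical: pulses that mean "maintenance"
BURST_WINDOW_S = 2.0              # hypothetical: burst must fit in this span


class MaintenanceAwareWatchdog:
    """Watchdog model with a one-time extended window for maintenance."""

    def __init__(self, start_s: float = 0.0) -> None:
        self.deadline_s = start_s + NORMAL_TIMEOUT_S
        self.in_maintenance = False
        self._recent: list[float] = []   # timestamps of recent pulses

    def pulse(self, now_s: float) -> None:
        """Record a pulse seen on the SBC's digital output at time now_s."""
        # Forget pulses too old to belong to the same burst.
        self._recent = [t for t in self._recent if now_s - t <= BURST_WINDOW_S]
        self._recent.append(now_s)
        if len(self._recent) >= BURST_PULSES:
            # Burst recognised: grant the long window once.
            self.in_maintenance = True
            self.deadline_s = now_s + MAINT_TIMEOUT_S
            self._recent.clear()
        else:
            # Ordinary heartbeat: normal window, maintenance (if any) is over.
            self.in_maintenance = False
            self.deadline_s = now_s + NORMAL_TIMEOUT_S

    def should_power_cycle(self, now_s: float) -> bool:
        """True once the current window has expired; re-arms the normal window."""
        if now_s < self.deadline_s:
            return False
        self.in_maintenance = False
        self.deadline_s = now_s + NORMAL_TIMEOUT_S
        self._recent.clear()
        return True
```

The window is "one-time" in the sense that it ends either when it expires or as soon as an ordinary heartbeat shows up again, so a technician who forgets to restart the application still gets a reboot after two hours.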

Richard

Reply to
Richard H.

Richard,

One thing you didn't mention was whether or not pressing the reset button would clean things up. If it will, then your watchdog timer only needs to pulse the reset line.

--
James T. White
Reply to
James T. White

Cool Idea

Reply to
Richard

Thanks, but there is no reset button, just 5 VDC on or off.

Reply to
Richard

The 5 V rail rises from 0 to 5 VDC in 2.66 ms. This is an off-the-shelf Pentium single-board computer (embedded system), so I don't have schematics for it.

Reply to
Richard

Hello Richard,

The first thing I would look for is the schematic and how the power-on reset is done. Check all devices that need one, because they may not all be hooked up to a central reset. Sometimes designers fully trust data sheet statements such as "internal brown-out reset". I never do ;-)

I have found some rather primitive ones, like a simple resistor and cap. Needless to say, those caused grief. Then there were "professional" reset chips which didn't always behave predictably. So I usually ended up doing my own discrete one. But nowadays there are better reset chips (NCP series?).

So look at the schematics and see what the reset does. Is it a good reset with proper brown-out behavior? Is it a clean pulse or some slobbery slope? Is it long enough? Does it properly work when 5V comes up in a less than stellar fashion, with a stagger or so that could mimic a brown-out? How fast does the 5V come up?

Regards, Joerg

Reply to
Joerg

Does it have a "Power good" input? If so, you may be able to reset it that way. The PC/XT/AT computer boards used the power good signal to reset the board on power up, or when you gave the keyboard the three finger salute.

--
Link to my "Computers for disabled Veterans" project website deleted
after threats were telephoned to my church.

Michael A. Terrell
Central Florida
Reply to
Michael A. Terrell

Yes sir, that is one reason I decided to use a PIC instead of building a 555 missing-pulse counter: I wanted the ability to change the logic.

I'm honestly surprised the onboard hardware watchdog does not catch any of this; after all, there is power even at the USB port when this happens.

Richard

Reply to
Richard

Richard,

You might find it useful to download and read the PCM-9575 user manual from the Advantech web site. Jumpering CN22 pins 1 and 2 together momentarily should reset the board.

--
James T. White
Reply to
James T. White

Richard,

Have the watchdog just as you say, but make it "one shot": power the PIC up at the same time as the PC, then look for a single event from the application (say, two or three minutes later). The PIC is then dormant until the next power-up cycle.
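
A minimal sketch of that one-shot behaviour as a Python model, not PIC firmware. It assumes a 3-minute wait (the suggestion says "two or three minutes"), a one-second tick, and that the PIC re-arms itself whenever it cycles the SBC's power; the class and method names are made up for illustration.

```python
from enum import Enum


class State(Enum):
    WAITING = "waiting"   # powered up, still expecting the first event
    DORMANT = "dormant"   # event seen; ignore everything until next power-up


class OneShotBootWatchdog:
    """Checks for a single 'application is alive' event after power-up."""

    def __init__(self, timeout_s: int = 180) -> None:
        self.timeout_s = timeout_s   # assumed; the post suggests 2-3 minutes
        self.power_up()

    def power_up(self) -> None:
        """Called at power-up (and after each forced power cycle)."""
        self.state = State.WAITING
        self.elapsed_s = 0

    def pulse(self) -> None:
        """Event from the application: boot succeeded, go dormant."""
        if self.state is State.WAITING:
            self.state = State.DORMANT

    def tick(self) -> bool:
        """Advance one second; return True if power should be cycled now."""
        if self.state is not State.WAITING:
            return False
        self.elapsed_s += 1
        if self.elapsed_s >= self.timeout_s:
            # Boot failed: cycle the supply and wait for the event again.
            self.power_up()
            return True
        return False
```

Because the watchdog goes dormant after the first event, shutting the application down later for maintenance never triggers a power cycle; the trade-off is that a hang after a successful boot is not caught either.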

Andrew

Reply to
Andrew Wade

I've seen similar symptoms when ROM was slower than spec, so that it wasn't being read reliably to actually boot up the processor. That *shouldn't* be a problem, but you might check the settings for the BIOS ROM.

I'm not familiar with that particular SBC, but does it have multiple DC power inputs? If so, the order of them coming up may be critical. I seem to recall such a warning with an Advantech SBC I've previously played with.

Cheers.

Ken

Reply to
Ken Taylor

Thanks James. I have the user manual and I know about the reset pins; however, as I mentioned, our system does not use them. I'm not saying that it can't be done, and it is a good test, but we have systems already deployed in their own enclosures, and it's much easier for us to cycle the main power.

Reply to
Richard

I'd take a serious look at your power-on reset circuit. This sounds exactly like what happens when the system isn't getting a proper reset.

Good Luck! Rich

Reply to
Rich Grise

We're saying you should. Gin up a little daughterboard, maybe with a flying lead to pick up +5 somewhere, that gives a good, solid, 10-20 ms reset pulse there.

Good Luck! Rich

Reply to
Rich Grise

Richard,

Perhaps I'm missing something. You are going to have to build a new board to go in the case to do this new watchdog function. How is using a relay to cycle incoming power easier than just shunting a pin to ground for 20 ms or so?

Good luck.......

James T. White

Reply to
James T. White

Hello Richard,

That's ok but the reset still needs to be much longer than that.

No pun intended but without a schematic and proper documentation I would not use it in any critical application. Anyway, you could try to make out the different chips they use and look up the data sheets. That tells you which ones need reset at which pin(s). Then look at that reset with a scope, on all those pins. It should be a nice clean edge and hopefully asserted for a few hundred msec.

Regards, Joerg

Reply to
Joerg

When the system fails to boot up, there is power at the USB ports and such, but the system never even wakes up the monitor or loads the BIOS or anything. When this occurs, pressing the reset does absolutely nothing; the power must be removed and reapplied, and then it comes right up.

So, using the remote on/off input on the DC/DC converter would be a way to do it with a single transistor and a small PIC, and avoid using a relay.

Love to hear some more ideas....

Richard

Reply to
Richard
