watchdog

Inspired by another thread here I decided to experiment with the watchdog. Luckily not on my datacenter Pi yet.

I read the docs and installed the package watchdog.

It has two config files: /etc/default/watchdog and /etc/watchdog.conf

In /etc/default/watchdog I changed watchdog_module to "bcm2708_wdog". In /etc/watchdog.conf I first enabled lots of checks, but when that did not work I started over with only:

interface = eth0
max-load-1 = 24
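To rule out the load check as the cause of the reboots, the 1-minute load average can be compared against the same limit by hand. This is only a rough sketch: the helper name load_over_limit is made up for illustration, and treating "strictly above the limit" as the trigger is an assumption, not something taken from the watchdog documentation.

```python
import os


def load_over_limit(limit, load1=None):
    """Return True if the 1-minute load average is above `limit`.

    `load1` can be passed in for testing; otherwise it is read from the OS.
    Assumption: the check trips when the load is strictly above the limit.
    """
    if load1 is None:
        # os.getloadavg() returns the (1 min, 5 min, 15 min) load averages.
        load1 = os.getloadavg()[0]
    return load1 > limit


# Same value as max-load-1 in the config above.
LIMIT = 24
print("1-minute load over limit:", load_over_limit(LIMIT))
```

On a mostly idle Pi this should print False, which would point away from max-load-1 and towards the interface check.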

My problem: whenever I start the watchdog process, the system is either rebooted quickly or at most after a minute or two.

One time I even managed to get it into a reboot loop which could only be fixed by putting the flash card in a PC and destroying the configuration (renaming /etc/default/watchdog).

Something is going wrong but it is unclear to me what it is.

I tried strace of the process and the first thing I noticed is that the "interval" setting is not in seconds (as you would assume) but in half-seconds. I set it to 20 to get a 10-second check interval.

The default is 1, so a 0.5-second check interval. That is way too short, as I cannot guarantee there will be network traffic in every 0.5-second interval. However, 10 seconds should be OK. (I checked a tshark -p trace and there is regular ARP and broadcast traffic.)
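A way to double-check the traffic claim without tshark is to poll the receive counters in /proc/net/dev at the same interval. This is a sketch under an assumption: that the daemon's interface check looks at whether the receive counters change between two checks, which the documentation does not actually say. The function names parse_rx_bytes, read_rx_bytes and watch are made up for this example.

```python
import time


def parse_rx_bytes(proc_net_dev_text):
    """Map interface name -> received byte count from /proc/net/dev text."""
    counters = {}
    # The first two lines of /proc/net/dev are column headers.
    for line in proc_net_dev_text.splitlines()[2:]:
        if ":" not in line:
            continue
        name, fields = line.split(":", 1)
        values = fields.split()
        if values:
            # The first column after the interface name is RX bytes.
            counters[name.strip()] = int(values[0])
    return counters


def read_rx_bytes(iface, path="/proc/net/dev"):
    """Return the current RX byte counter for `iface`, or None if absent."""
    with open(path) as f:
        return parse_rx_bytes(f.read()).get(iface)


def watch(iface="eth0", interval=10.0, rounds=6):
    """Report for each interval whether the RX counter of `iface` moved."""
    previous = read_rx_bytes(iface)
    for _ in range(rounds):
        time.sleep(interval)
        current = read_rx_bytes(iface)
        status = "traffic seen" if current != previous else "NO traffic"
        print(f"{iface}: {status} in the last {interval:g} s (rx bytes {current})")
        previous = current
```

Calling watch("eth0", 10.0) for a while on the Pi shows whether any 10-second window passes without the counter changing; running it again with 0.5 shows how often the shorter interval would have come up empty.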

The documentation of the program is very lacking. It has the usual "obvious" comments for each setting, like:

interval = Set the interval between two writes to the watchdog device. The kernel drivers expects a write command every minute. Otherwise the system will be rebooted. Default value is 1 second. An interval of more than a minute can only be used with the -f command-line option.

It does not even mention the units of the interval... it suggests seconds, but that is not true, it is half-seconds.

interface = Set interface name for network mode. This option can be used more than once to check different interfaces.

Not a word about what is really happening in "network mode".

Before I get into trouble again, and worse, before I get problems with my colocated Pi (for which I would have to have them return the flash card to me, leaving it down for a week): does anyone have experience with this beast, and know how to make it behave correctly?

It of course should reboot/reset the Pi only when things are really wrong, but it seems a bit too eager.

Rob
