Testing watchdog

I usually enable watchdog during boot init code. I usually use the  
internal watchdog that is available in the MCU I'm using. I know an  
external watchdog can be more secure, but internal watchdog is usually  
better than nothing.

Some MCUs can be programmed to enable the watchdog immediately after reset
and keep it enabled forever. Other times I enable the watchdog timer
immediately after some initialization code.

During my previous thread "Power On Self Test", Richard Damon said:

 > "Yes, testing watchdogs is tricky."

Tricky? Why?

What I usually do to test what happens if the watchdog timer expires is
to add a while(1); loop in a certain part of the code, for example when
a button is pressed or a command is received from a serial line. It is
very simple to see the watchdog action (the reset) when I force the
execution flow to pass through the added test code.

Of course, after testing I remove the while(1); loop.

In this way I'm sure the watchdog is enabled and configured correctly.

Do you think it is not enough?

PS: I know there is some risk that the watchdog gets disabled by
unpredictable, buggy code; indeed, many MCUs require a precise
sequence of operations to disable the watchdog.
In this post, I'm not referring to these issues, but only to the
behaviour of the watchdog when it expires. I think it's not so difficult
to test it.
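
As an illustration only, the test described above can be sketched on a host PC. Everything here is hypothetical: `wdt_kick`, `wdt_tick`, the five-tick timeout and the `hang_requested` hook are stand-ins for the real watchdog registers and for the `while(1);` triggered by a button or a serial command. The simulated hang simply stops kicking, which is all the watchdog sees when the real code spins forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a hardware watchdog: counts ticks since the
 * last kick and latches a "reset" once the timeout is reached. On a real
 * MCU these would be register accesses and an actual chip reset. */
#define WDT_TIMEOUT_TICKS 5u

static uint32_t wdt_counter;
static bool     wdt_reset_fired;

static void wdt_kick(void) { wdt_counter = 0; }

static void wdt_tick(void)
{
    if (++wdt_counter >= WDT_TIMEOUT_TICKS)
        wdt_reset_fired = true;
}

/* Test hook: on a target this would be set by a button press or a
 * serial command, and the code would then enter while(1);. */
static bool hang_requested;

/* One pass of the main loop: kicks the watchdog unless "hung". */
static void main_loop_iteration(void)
{
    if (hang_requested)
        return;             /* simulated hang: no more kicks */
    wdt_kick();
}

/* Runs the simulated system for `ticks` ticks and triggers the hang hook
 * at tick `hang_at` (UINT32_MAX means "never").
 * Returns true if the watchdog fired. */
static bool simulate(uint32_t ticks, uint32_t hang_at)
{
    assert(ticks > 0u);     /* a zero-length run would prove nothing */
    wdt_counter = 0;
    wdt_reset_fired = false;
    hang_requested = false;
    for (uint32_t t = 0; t < ticks && !wdt_reset_fired; t++) {
        if (t == hang_at)
            hang_requested = true;
        main_loop_iteration();
        wdt_tick();
    }
    return wdt_reset_fired;
}
```

With these assumptions, `simulate(100, UINT32_MAX)` reports no reset and `simulate(100, 10)` reports one. This only confirms that the expiry path works once it is reached; it says nothing about whether the kicks are placed sensibly in the real code.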

Re: Testing watchdog
On 27/09/2020 17:56, pozz wrote:

Certainly it is not hard to test a watchdog trigger by simply not
kicking it for a while.

But think about what the watchdog is for - what are you actually trying
to do with it?  Are you trying to find where you have an infinite loop
in your program, that you have failed to find with other debugging?  The
watchdog /might/ help as a last resort, but you typically have no idea
what has gone wrong.

The "good" reason for a watchdog is to handle unexpected hardware issues
that have led to a broken system - things like out-of-spec electrical
interference, cosmic ray bit-flips, power glitches, etc.  How do you
test that the watchdog does its job in such situations?  If these were
faults that were easily provoked, you'd already have hardware solutions
in place to deal with them.


Re: Testing watchdog
On Sun, 27 Sep 2020 18:38:13 +0200, David Brown



You could enable a while loop that just goes when a pin is pulled high
or low and then the processor would reboot after you pull that pin one
way or the other.

I always try to have the code go through at least two areas
and set a flag so that more than one piece of code has to be run in
order to feed the watchdog,
for example one in main and one in a timer interrupt (at least).






Re: Testing watchdog
On 27/09/2020 18:38, David Brown wrote:

Yes, you're right, but I don't think the watchdog is a solution for every
issue that could happen in the field. It is a solution for big problems
with the execution flow of the program, because normal flow feeds the
watchdog and unexpected flow doesn't.



I didn't think watchdog was used for hardware issues, but software bugs.  
For example, when you use a wrongly initialized function pointer and  
jump to a wrong address in the code... or similar things.


Re: Testing watchdog
On Monday, September 28, 2020 at 2:38:40 AM UTC-4, pozz wrote:

I recall a software team who thought it was a good idea to use a timer-driven interrupt routine to tickle the watchdog.

--  

  Rick C.

Re: Testing watchdog
On 28/09/2020 09:17, Rick C wrote:

That is fine - /if/ the kicking (I kick my watchdogs, rather than tickle
them!) is conditional on other checks.  For example, each other task in
the system sets a boolean flag when it runs its loop, and the watchdog
check in the timer interrupt checks that all flags are on before kicking
the watchdog and resetting all the flags.  It's a standard way to get
the effect of having multiple watchdogs.
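
A minimal sketch of that flag scheme, under stated assumptions: the task names, `task_checkin` and `wdt_timer_isr` are made up for illustration, and `wdt_kick` just counts kicks instead of touching a real watchdog register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical set of tasks that must all be alive. */
enum { TASK_MAIN, TASK_COMMS, TASK_SENSOR, TASK_COUNT };

static volatile bool task_alive[TASK_COUNT];
static unsigned kick_count;     /* stands in for the hardware kick */

static void wdt_kick(void) { kick_count++; }

/* Each task calls this once per pass of its own loop. */
static void task_checkin(size_t task_id)
{
    assert(task_id < TASK_COUNT);
    task_alive[task_id] = true;
}

/* Called from the periodic timer interrupt. Kicks only if every task has
 * checked in since the previous kick, then clears all the flags.
 * Returns true if the watchdog was kicked. */
static bool wdt_timer_isr(void)
{
    for (size_t i = 0; i < TASK_COUNT; i++) {
        if (!task_alive[i])
            return false;       /* a task is stuck: let the watchdog expire */
    }
    for (size_t i = 0; i < TASK_COUNT; i++)
        task_alive[i] = false;
    wdt_kick();
    return true;
}
```

In this sketch the interrupt keeps running even when a task hangs, but the kick stops, so the hardware timeout still trips. The slowest task has to check in at least once within the watchdog timeout, which constrains how the timeout is chosen.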



Re: Testing watchdog
On Monday, September 28, 2020 at 5:29:29 AM UTC-4, David Brown wrote:

Yes, but they weren't doing that.  Also, you need to vet the watchdog software very carefully.  A "sometimes" watchdog is about the same as none.  

--  

  Rick C.

Re: Testing watchdog
On 28/09/2020 17:34, Rick C wrote:


Then I agree with your criticism.  It sounds a lot like "The software
spec said implement a watchdog routine.  We made one" without actually
thinking about what would be /useful/.


Agreed.

Re: Testing watchdog
On 28/09/2020 08:38, pozz wrote:

You are not supposed to use a watchdog for that - you are supposed to
use good software development techniques and comprehensive testing.
(Yes, I know the real world is not perfect.)

In embedded development, treat function pointers like dynamic memory -
something you /really/ want to avoid unless there is no other option,
because it is often simple to get wrong, it is far more difficult to
analyse than static alternatives, and it is also less efficient.


Re: Testing watchdog
On 28/09/2020 11:26, David Brown wrote:

Yes I know, but Good Software Development Techniques aren't perfect, so
the watchdog is one of the last chances to put the system in a safe state
(restarting in my case).

I admit it can be useful for hardware issues too.


<ot>


Why? Poor code (the responsibility of the developer) or an intrinsic
characteristic of function pointers?

Even null-terminated strings can be risky if you process a string  
without the null char at the end. Your "Good Software Development  
Techniques" should help to avoid such errors.

I think function pointers are a good solution in many cases. For  
example, I don't like a long if/else:

   if (me->type == LAMP1) lamp1_func();
   else if (me->type == LAMP2) lamp2_func();
   else if (me->type == LAMP3) lamp3_func();

I prefer to put a function pointer in the struct and call it directly:

   me->func();

This helps to mimic OOP in C too.


This is true.


Do you think my example above is less efficient with function pointer?

Re: Testing watchdog
On 28/09/2020 12:12, pozz wrote:

As I said - they are easy to get wrong, and difficult to analyse.  They
make a mockery of call chain analysis, stack size checking, and other
things that depend on a static code flow.  They make it difficult or
impossible to follow code flows backwards, or simply to answer the
question "where can this code be called from?".

Sometimes function pointers /are/ the best way to organise code.  But
they should be used with care and consideration.

(I even recommend minimising the use of data pointers, for similar
reasons, though obviously data pointers have more essential uses.)


Making mistakes in programming is certainly easy.



That, to me, is a very bad design choice.

typedef enum { lampType1, lampType2, lampType3 } lampTypes;

Whatever the "me" struct is, have its "type" field (after a name change)
of type "lampTypes".  Then you have:

    switch (me->lamp_type) {
        case lampType1 : lamp1_func(); break;
        case lampType2 : lamp2_func(); break;
        case lampType3 : lamp3_func(); break;
    }


This way you have a clear structure in the call graphs - you know
exactly what can be called, from where, by what.  You have flexibility -
the lampX_func functions don't have to have the same signature.  You
have better optimisation, as the compiler can combine parts of these (if
they are visible or you are using LTO).  You can't get bad or
uninitialised function pointers.  You can't miss out on anything,
because your compiler or third-party static analysis tool will tell you
if an enumeration is missed in the switch.
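
A self-contained sketch of the switch-based dispatch, for illustration. The `struct lamp`, its `last_call` field and the `channel` parameter are assumptions added so the example compiles and can be checked; they are not from the original code.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { lampType1, lampType2, lampType3 } lampTypes;

struct lamp {
    lampTypes lamp_type;
    int       last_call;    /* records which handler ran (demo only) */
};

static void lamp1_func(struct lamp *me) { me->last_call = 1; }
static void lamp2_func(struct lamp *me) { me->last_call = 2; }

/* With a switch, the handlers need not share a signature. */
static void lamp3_func(struct lamp *me, int channel)
{
    me->last_call = 30 + channel;
}

void lamp_update(struct lamp *me)
{
    assert(me != NULL);
    /* No default label: with GCC or Clang, -Wswitch (part of -Wall)
     * can then report an enumerator that has no case. */
    switch (me->lamp_type) {
        case lampType1: lamp1_func(me);    break;
        case lampType2: lamp2_func(me);    break;
        case lampType3: lamp3_func(me, 0); break;
    }
}
```

Adding a `lampType4` enumerator without a matching case should then show up as a compiler warning at every such switch, which is how the "can't miss out on anything" point can be checked in practice.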


That is not a good thing.  If you want to use OOP, use C++.  (C++
compilers and analysis tools know far more about virtual functions than
about unrestricted function pointers.)


It is not unlikely, but it depends on the rest of the source code and
organisation.  If the source of these lampX_func functions is known to
the compiler when it generates the switch statement, and they are
declared "static" (or if you are using LTO), then yes, the switch
solution will probably be more efficient.

But efficiency arguments are secondary to making code easier to write
correctly, harder to write incorrectly (these are not the same thing),
easier to analyse, and easier to debug.


Re: Testing watchdog
On 28/09/2020 12:56, David Brown wrote:

Sure, but I imagine this is true for every code.



As a function pointer can assume a bad or uninitialized value, don't you  
think the lamp_type variable could assume a bad or uninitialized value?



But you need to write the possibly-long switch statement every time you  
have to call a function that depends on the type of me.
The code will be cluttered with those switch(), instead of a simple  
function call:

void lamp_on(struct mystruct *me) { me->on(); }
void lamp_off(struct mystruct *me) { me->off(); }
void lamp_dim(struct mystruct *me, uint8_t level) { me->dim(level); }
void lamp_rgb(struct mystruct *me, uint32_t rgb) { me->rgb(rgb); }

Suppose you need to create a new lamp type: you need to touch many parts
of the code, instead of initializing function pointers in one place only.
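
To make the "one place only" argument concrete, here is a small sketch of the function-pointer layout. The `gpio_lamp_*` functions, `gpio_lamp_init` and the `gpio_state` variable are hypothetical; only the `lamp_on`/`lamp_off`/`lamp_dim` wrappers follow the snippet above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mystruct {
    void (*on)(void);
    void (*off)(void);
    void (*dim)(uint8_t level);
};

static int gpio_state;      /* pretend hardware state (demo only) */

static void gpio_lamp_on(void)           { gpio_state = 255; }
static void gpio_lamp_off(void)          { gpio_state = 0; }
static void gpio_lamp_dim(uint8_t level) { gpio_state = level; }

/* The single place where the pointers for this lamp type are set.
 * A new lamp type would add its own handlers and its own init. */
void gpio_lamp_init(struct mystruct *me)
{
    assert(me != NULL);
    me->on  = gpio_lamp_on;
    me->off = gpio_lamp_off;
    me->dim = gpio_lamp_dim;
}

/* Callers never need to know the lamp type. */
void lamp_on(struct mystruct *me)                 { me->on(); }
void lamp_off(struct mystruct *me)                { me->off(); }
void lamp_dim(struct mystruct *me, uint8_t level) { me->dim(level); }
```

Note that nothing in the wrappers can tell whether `me->on` was ever initialized; a call through a struct that skipped `gpio_lamp_init` jumps to whatever happens to be in memory.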



I'm not an expert and I don't claim to be a programming guru.
However I usually study embedded programming by reading online blogs,
newsgroups (like this one) and books, and I see function pointers
mentioned many times.

For example, in Grenning's book "Test Driven Development for Embedded C"
there's a chapter "SOLID[1] C Design Models" where function pointers are
used to reduce (or completely avoid) switch statements.


[1] Here SOLID stands for
- S Single Responsibility Principle
- O Open Closed Principle
- L Liskov Substitution Principle
- I Interface Segregation Principle
- D Dependency Inversion Principle


Re: Testing watchdog
On 29/09/2020 08:44, pozz wrote:

True.

But some code techniques have higher "cost" than others, where cost
includes risk of mistakes, risk of misunderstandings (other people may
be maintaining the code), portability cost (a function pointer is
efficient on an ARM but /very/ inefficient on an 8051), challenges in
debugging, difficulty in analysis, and run-time efficiency cost.

The cost/benefit trade-offs for small embedded systems are often very
different from "big" system programming.  Function pointers might be the
right choice in a big system program, but wrong in a small embedded system.


No.

How could it have an uninitialised value?  That would be a fault in the
code that first sets up the lamp object, and when this is done in one
place, it's hard to get wrong.  How could it have a bad value?  You've
made this an enumerated type - stick to your programming standard that
says you only use elements of the type, combine it with compiler
warnings or third-party linters and only assign valid lampTypes values.
 (If you are using C++, use an enum class here.)

Of course you could have stack corruption, or runaway pointers, that
result in the memory for lamp_type being overwritten with rubbish - but
then you have serious problems anyway.  (And avoiding function pointers
means you can analyse your stack usage more realistically, and
minimising data pointers makes runaway pointers less of a risk.)

You can also add a clause

    default : panic("lamp_type is %i", (int) me->lamp_type);

if you think it is an issue.  Note that this will catch /any/ bad value,
unlike any check on a function pointer.
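
A minimal sketch of that default clause in a compilable form. The `panic` function here is a hypothetical stand-in that only logs and counts; on a real system it would more likely record the fault and force a reset. `lamp_dispatch` and its return values are also made up for the demo.

```c
#include <assert.h>
#include <stdio.h>

typedef enum { lampType1, lampType2, lampType3 } lampTypes;

static int panic_count;     /* demo only */

/* Hypothetical panic handler: logs the offending value and counts it. */
static void panic(const char *what, int value)
{
    assert(what != NULL);
    fprintf(stderr, "PANIC: %s is %i\n", what, value);
    panic_count++;
}

/* Returns the number of the handler that would run, or -1 after a panic. */
int lamp_dispatch(lampTypes lamp_type)
{
    switch (lamp_type) {
        case lampType1: return 1;
        case lampType2: return 2;
        case lampType3: return 3;
        default:
            /* Any value outside the enumerators lands here. */
            panic("lamp_type", (int) lamp_type);
            return -1;
    }
}
```

One trade-off to be aware of: with GCC, a `default` label suppresses the `-Wswitch` warning about missing enumerators, so `-Wswitch-enum` is needed to keep that check when a default is present.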


Yes.


You'd rather clutter it with lots of function pointers in your
"mystruct" struct (wasting ram, code, run-time, source code, and opening
new risks for errors) ?

A good clue that your design choices are questionable (I won't say
"wrong" - there are no absolutes here) is when you have far more
flexibility than you need.  Your arrangement means a lamp can have an
"on" function that uses a GPIO and an "off" function that uses SPI.
With my method, if your lamp_type is "lampGPIO" then that is used for
off(), on(), and everything else.


Yes.  It is a small price to pay.


Certainly function pointers /are/ used in this way.  But I am telling
you that /I/ do not think they are a good idea in such contexts - I
don't think they are a good cost/benefit trade-off.  I have also read a
lot of online blogs, websites, etc., and while some are good, a lot of
them are crap.  Some dangerously so.


Based /solely/ on those comments (and I know doing so is grossly
unfair), that book would make an excellent stand for a pot plant.

If you want to do something like this, use C++.  Not C.  Making
half-arsed sort-of-OOP code in C leaves you with something that is the
worst of both languages.  (That doesn't mean you can't do TDD in C - but
it is not the same as doing it in C++.)

If you program in decent, clean C, your code flow is clear and can be
followed in the code, by analysis programs, in call-graphs, by the
debugger, by stack size checkers.  But you have to write your switches
and you have to use manual rules about enums and other restrictions.  If
you program in solid, modern C++, your code flow is harder to track,
your debugging is tougher as there is a wider separation between source
code and object code, but the language and the compiler enforce rules
for you.  You don't need to worry about bad or uninitialised data
because the language won't let you create such objects.

Re: Testing watchdog
On 9/28/20 6:12 PM, pozz wrote:

What makes you so sure that the same guy who messed up the real software
gets the watchdog right?

