How to avoid a task not executed in a real-time OS? - Page 2

Re: How to avoid a task not executed in a real-time OS?
On 29/01/19 17:30, StateMachineCOM wrote:

How about "log error and enter a safe state"?



The _bugs_ do disappear :)


Re: How to avoid a task not executed in a real-time OS?
On 1/29/19 12:30 PM, StateMachineCOM wrote:

Depends.  For instance, in a clusterized EM simulation code from a dozen
years ago, I have an assert-like macro called 'carefully'.  A use
example is in the simulation half-step that calculates the H field from
the E field, which goes

carefully(HfromE());

That's typically fairly time-consuming--tens of milliseconds to tens of
seconds depending on the size of the simulation and the size of the cluster.

The code has a tree-structured distributed supervisor scheme, a bit like
a simple version of ganglia.  So if some thread on some box finds a NaN
or runs out of memory or something, I have to make sure all threads exit
on all boxes, or the thing will hang forever waiting for the dead
thread.  (That's what CIS.CCom.SetStat() does.)


  #define carefully( x ) Carefully( (x), __FILE__, __LINE__)
  ...
     // We can't just abort the simulation when we run into some problem
     // in a subsidiary thread--we need to tell thread 1 before ending
     // the thread
     // NB: Don't use this function in the main thread!
     //
     // This is called via the Carefully macro, which expands into
     // Carefully(somefunctioncall(),__FILE__,__LINE__);

     void Carefully( int what, const char * fn, int Line) {
        if ( EMERROR_OK != what ) {
           if (quiet < 2) {
              char buf[512];
              // snprintf bounds the write in case __FILE__ is a long path
              snprintf(buf, sizeof buf, "%s(%d): Chunk error %d--dying....\n",
                       fn, Line, what);
              pemerror(what,buf);
           }
           CIS.CCom.SetStat(dead);   // tell the supervisor so all threads exit
           _endthread();             // fatal error--only destructors do
                     // cleanup
        }   /* End if */
     }

Cheers

Phil Hobbs

--  
Dr Philip C D Hobbs
Principal Consultant
Re: How to avoid a task not executed in a real-time OS?
On 30/1/19 4:30 am, StateMachineCOM wrote:

I know they're not. But if you're leading a team of 40 software  
"engineers" the best you can do is to make it easier to do things right.  
Poor error management is probably the biggest cause of user  
dissatisfaction over the entire history of the IT industry, and I did  
what I could to improve that, in my small corner.


No. It's forcing the programmer to think about the situations where the  
error might occur, and make decisions about how to notify the user (or  
the calling code) about the problem, the reason, and the possible  
solutions. I.e. avoid just saying "Unknown error 0x80000000".

Clifford Heath.

Re: How to avoid a task not executed in a real-time OS?
On 29/01/19 22:45, Clifford Heath wrote:

Related, but slightly different...

A good feature of checked exceptions in Java is that they force the programmer
to either catch an exception thrown by a "library function" or declare that it
could be thrown to whatever calls this code. Thus the possibility of errors
has to be explicitly addressed (even if, as I've seen, the exception is caught  
and ignored).

But that's "too much of a burden", so the modern practice is to only throw
unchecked exceptions that aren't declared and checked by the compiler.

Re: How to avoid a task not executed in a real-time OS?
On Tuesday 29 January 2019 05:22, Phil Hobbs wrote:


The hammer might be much too big. Would you like the engines in an aircraft
to just stop because an assertion stopped the engine controller?
I always have a lot of assertions or assertion-like constructs in my code to
make sure during development that any errors get caught. But production
code is without any such things. They all get removed and then the _changed_
software gets verified again.
How can you otherwise make sure that every code path is tested during
software integration and verification? This testing and verification has to
be performed with the same software as will be deployed.
The assertions should never fire, so you would have untested code in the
deployed software.

--  
Reinhardt


Re: How to avoid a task not executed in a real-time OS?
Den 2019-01-28 kl. 21:31, skrev StateMachineCOM:

Not really.
Here is an example of an assertion which can be removed.
We have an FPGA which contains registers.
For various reasons, we may want to change the address or definition of  
registers.
The FPGA tools + scripts will automatically generate a header file with
register addresses.

In my code I want to access the FPGA registers as a struct.

I have assertions checking that the offset of each register
in the struct matches the #defines in the automatically
generated headers.

If all assertions are OK then recompiling without those
assertions will not cause a problem in a production system.
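
As a hedged illustration of the offset checks described above: in C11 they
can be written as compile-time assertions, so there is nothing to remove for
production. The macro names, register names and offsets below are made up
for the sketch (the generated header is not shown in the post), and the post
does not say whether its own checks are compile-time or run-time.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Stand-ins for the auto-generated header (hypothetical names and values). */
#define FPGA_REG_CTRL_OFFSET   0x00u
#define FPGA_REG_STATUS_OFFSET 0x04u
#define FPGA_REG_DATA_OFFSET   0x08u

/* Hand-written struct view of the register block (hypothetical layout). */
struct fpga_regs {
    volatile uint32_t ctrl;
    volatile uint32_t status;
    volatile uint32_t data;
};

/* If the FPGA scripts move a register, the build breaks here. */
static_assert(offsetof(struct fpga_regs, ctrl) == FPGA_REG_CTRL_OFFSET,
              "ctrl offset does not match generated header");
static_assert(offsetof(struct fpga_regs, status) == FPGA_REG_STATUS_OFFSET,
              "status offset does not match generated header");
static_assert(offsetof(struct fpga_regs, data) == FPGA_REG_DATA_OFFSET,
              "data offset does not match generated header");
```

Because the checks are evaluated by the compiler, they generate no code and
cannot fire in the field.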

If, however, I write the following code:

int getnumber(void)
{
    int c = getchar();
    assert(c >= '0');
    assert(c <= '9');
    return c - '0';
}

Then I have totally misunderstood what assertions are all about.
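
For contrast, a minimal sketch of the same function with a run-time check
instead of assertions. Input from getchar() can legitimately be a non-digit
or EOF, so it is validated and reported to the caller. The helper name
digit_value() and the -1 error return are assumptions for illustration.

```c
#include <stdio.h>

/* Returns the digit value 0..9, or -1 if c is EOF or not a decimal digit. */
int digit_value(int c)
{
    if (c < '0' || c > '9') {
        return -1;   /* bad input is an expected event, not a program bug */
    }
    return c - '0';
}

/* Reads one character from stdin and validates it. */
int getnumber(void)
{
    return digit_value(getchar());
}
```

The caller has to handle -1, and the check stays in the production build.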

Here is another assertion that can be removed.

// Check that table.var can accept all values of char
#ifdef __ASSERT
     for c in char'range do:
       store c in table.var;
       assert(table.var == c)
#endif

// And here is another
#ifdef __ASSERT
     for c in int'range do:
       store c in table.var;
       assert(table.var == c);  // will fail on c == 256
#endif

The latter example triggers an error, so the code needs changing.
The code is not needed in production.
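
In C the same round-trip check might look like the sketch below. It assumes
table.var is an 8-bit unsigned field, which is what the "will fail on
c == 256" comment suggests; the struct and function names are hypothetical.
The range is passed in as parameters so the loop never has to count up to
INT_MAX.

```c
#include <assert.h>

struct table_t { unsigned char var; };   /* hypothetical: 8-bit unsigned field */

/* Returns 1 if every value in [lo, hi] reads back unchanged from table.var,
   0 at the first value that does not.  Requires hi < INT_MAX. */
int roundtrips(int lo, int hi)
{
    struct table_t table;
    for (int c = lo; c <= hi; c++) {
        table.var = (unsigned char)c;   /* store c in table.var */
        if (table.var != c) {
            return 0;
        }
    }
    return 1;
}

/* Development-only check, compiled out when NDEBUG is defined. */
#ifndef NDEBUG
void check_table_var(void)
{
    assert(roundtrips(0, 255));   /* every 8-bit value must fit */
}
#endif
```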





The Watchdog allows you to reset or interrupt the processing, which can
be used to invoke recovery functions, including logging the error,
allowing you to resume normal operation, and also to pinpoint the
problem later.  A hanging device will make no one happy.




AP

Re: How to avoid a task not executed in a real-time OS?

You guys cannot have it both ways.

If you truly believe that your production code is so good that assertions
will NEVER fire in the field, then why are you so afraid of leaving them in
the code? Of course, assertions cause some overhead, but they give you that
last line of defense to do SOMETHING when the system goes out of control.

But you apparently ARE afraid that your assertions will fire too often in
the field. (Otherwise you would not be talking of "the hammer being too
big"). In this case, your solution is to disable assertions? And to do WHAT?
Pray that the system will somehow miraculously recover and correct itself?
What are the chances for this to work?

This strikes me as backwards in the software business. Your code is either  
in complete control of the machine or it isn't. There is really nothing in  
between.


Who said that a failing assertion must always stop the system? In the end
YOU are designing the "hammer" (assertion handler), and so it is YOUR job to
design it correctly for the circumstance: put the system in the fail-safe
mode, whatever that means. And yes, I would prefer that an aircraft engine
controller would reset and perhaps re-start the engine mid-air as opposed
to blowing up the engine.
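
A minimal sketch of such a "hammer": an assertion-like macro whose failure
action is an application-supplied handler instead of abort(). CHECK,
check_handler_t, enter_safe_state and the flag are hypothetical names, not
from any library, and the "safe state" is only simulated by setting a flag.

```c
#include <stdio.h>

/* Hypothetical hook type: the application decides what a failed check does. */
typedef void (*check_handler_t)(const char *expr, const char *file, int line);

static int g_safe_state_entered = 0;   /* illustration only */

/* Example handler: log the failure, then enter the fail-safe mode. */
static void enter_safe_state(const char *expr, const char *file, int line)
{
    fprintf(stderr, "%s(%d): check failed: %s\n", file, line, expr);
    g_safe_state_entered = 1;   /* real code might disable outputs or reset */
}

static check_handler_t g_check_handler = enter_safe_state;

/* Unlike assert(), this stays active regardless of NDEBUG. */
#define CHECK(cond) \
    ((cond) ? (void)0 : g_check_handler(#cond, __FILE__, __LINE__))
```

Note that this handler returns, so execution continues after a failed
CHECK; a handler that resets the processor would not return.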


What you do in your development version is your business. If you choose to  
use "assertion-like" constructs, which aren't checking for conditions that  
should NOT happen (like array index out of bounds or de-referencing a NULL  
pointer), then of course you should remove them for production.

But by removing ALL assertions in the final code you are throwing the baby
out with the bathwater. You lose your last line of defense to do damage
control and to protect your system and all its users.

Miro Samek
state-machine.com

Re: How to avoid a task not executed in a real-time OS?
On 1/30/19 3:31 PM, StateMachineCOM wrote:

< Attempt to generate controversy over an old article noted.> ;)

Afraid?  Who's afraid?



assert() is too blunt an instrument to use as a general solution,
because calling abort() or doing a hard reset is not necessarily a good
defense in all situations.  It might fit some situations, but it can
make things worse in others.

A more nuanced assert()-like approach, such as the carefully() macro I
posted upthread, is another matter.  It was specifically designed to be
left in the production code.


Because sometimes it is.  We've supplied examples.


Straw man alert.

The choices are not limited to "leave all assert()s in production code"
and "all the children will die!"  There are lots of ways of doing
runtime error checking.

For instance, unit tests are loaded with checks for correct program
logic.  Maguire talks about a MC68000 disassembler whose error checking
system included a complete alternative disassembler (logic- vs.
table-driven iirc).  Once you've exercised all the code paths, you don't
need the second disassembler.


There's a lot of daylight between wanting the software to be in control
and issuing an edict forbidding #define NDEBUG.

In the event of failure, the C assert() macro calls abort().


'When I make a word do a lot of work like that,' said Humpty
Dumpty, 'I always pay it extra.'

If by 'assertion', you merely mean 'appropriate error handling code
hidden by a macro', we're in violent agreement.  But to me at least,
'assertion' means the C library assert() facility or a close  
replica--something that applies the biggest available hammer--abort() or  
a hard reset in the event of failure.


Nobody is advocating for removing all error checking in production code.
There are certainly lots of cases where abort() or a hard reset are
the right response, but that's far from universal.

I use a whole lot of hard assertions in debug code, as I've said--more  
than I could stand in production, especially in an embedded system where  
code space is often at a premium.  And of course since assertions have  
branches, they frequently mess up compiler optimizations.

The least-bad bugs are the ones you find by code reading; next are ones  
the compiler catches, third are the ones assert() finds; fourth are  
runtime error checks in production; and the worst are the ones that just  
make the system crash in some uncontrolled way.

Thus I like careful coding, static analysis, tight compiler flag  
settings, assert(), and carefully designed runtime error checking, in  
that order.

Cheers

Phil Hobbs

--  
Dr Philip C D Hobbs
Principal Consultant
Re: How to avoid a task not executed in a real-time OS?
Den 2019-01-30 kl. 21:31, skrev StateMachineCOM:

Because assertions add code size and reduce performance.


If you keep an assertion which fires in the field, then you have a bug  
in your program.
The bug is that you did an assertion, instead of validating data.
You basically misused the assertion for something it was not designed to  
handle.

I do. Assertions should only continue execution if it would be dangerous
to stop.


Since assertions are supposed to be removed in production code, your  
design is flawed. You used assertions, instead of run-time checks.

Learn the difference between assertions and run time checks.


AP

Re: How to avoid a task not executed in a real-time OS?

You need to realize your assertions can have bugs just like your code.
In fact, I'm sure the bug rate for assertions is at least an order of
magnitude higher than for other code, possibly several orders of magnitude.

Say you have a routine which returns a pointer to a structure, and a size of
that structure (so it can get bigger in the future).

    struct Thing *my_thing;
    ret = get_a_thing(&my_thing, &size);
    assert(size == sizeof(struct Thing));

In all of your testing, this will succeed, since Thing will match the
size being returned for now.

But then an upgrade occurs, and the size returned is bigger.  Now, the
code would work fine (the interface is designed to support this by adding
new fields in the future), but now your assert goes off.  This is bad--you
basically have made your code fail on a system upgrade.

This is a trivial example, but it gets at the issue--asserts can test for
finicky details which actually don't matter.  And for testing, this is fine,
actually probably a good thing, since it makes code assumptions clear.

You even point out that assertions make the code less robust--you're more
likely to crash.  And you expose yourself to crashing in cases where ignoring
the "error" would be innocuous.  I get very frustrated at old programs
which now fail due to stupid checks which they should not be doing.
I mean, I'd prefer a program to keep running and think it was the year 1900
than to hard fail once year 2000 hit.

Plus, the dirty secret no one wants to say out loud--buggy code that keeps
running is often more useful than code which "stops".  We've got a lot of
experiential evidence that pretty massive bugs can often be worked around
to keep a system running.  One downside, of course, is buggy code is often
a giant security hole.  I don't think there's a magic wand which easily
balances the need to make the code useful and robust, and secure.

So, what people are clearly saying is: you need at least 2 levels of
assertions, and the assert I gave as an example above definitely should not
be in production code (but is actually fine in your test builds).
In fact, I sometimes use assertions for things I don't know are true, just
to see if testing hits the case.  These should not be in production code.
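
A rough sketch of what two levels could look like, applied to the size
example above. DEBUG_ASSERT, RUNTIME_CHECK, accept_thing() and the layout of
struct Thing are hypothetical names chosen for illustration; the debug level
is simply the standard assert(), which is compiled out when NDEBUG is
defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct Thing { int id; int flags; };   /* hypothetical layout */

/* Level 1: development only; disappears when NDEBUG is defined. */
#define DEBUG_ASSERT(cond) assert(cond)

/* Level 2: always compiled in; logs and yields 0 so the caller can react. */
#define RUNTIME_CHECK(cond) \
    ((cond) ? 1 : (fprintf(stderr, "%s(%d): check failed: %s\n", \
                           __FILE__, __LINE__, #cond), 0))

/* Returns 0 if the reported size is usable, -1 otherwise. */
int accept_thing(size_t reported_size)
{
    /* A structure that is too small is a real error in any build. */
    if (!RUNTIME_CHECK(reported_size >= sizeof(struct Thing))) {
        return -1;
    }
    /* An exact match is only a development-time expectation. */
    DEBUG_ASSERT(reported_size == sizeof(struct Thing));
    return 0;
}
```

With NDEBUG defined, a larger structure from an upgraded system is accepted;
in a test build the same input aborts and flags the changed assumption.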

Kent

Re: How to avoid a task not executed in a real-time OS?
On 2/2/19 11:44 AM, Kent Dickey wrote:

I think the religious war over assertions is mostly due to sloppy use of  
language, specifically whether "assertion" means "the C assert() macro  
or some near relative that calls abort() or does a hard reset" or  
"appropriate runtime error checking".  All of our positions are closer  
than they appear.

Cheers

Phil Hobbs

--  
Dr Philip C D Hobbs
Principal Consultant