Spirit rover OS problems

I'm kinda surprised not to have seen discussion here of the Flash memory/OS
problems suffered by the Spirit rover. It seems noteworthy that several $100
million's worth of kit was crippled for so long by what was reported as a
priority-inversion issue. And you can't get much more hard-realtime than the
Spirit rover.

NASA have said that their biggest problem with the WindRiver OS is
comprehension. For me, this neatly underlines the problems I've had with
RTOSs... and why I avoid them as far as possible.

Discuss ;).

Steve
http://www.fivetrees.com
http://www.sfdesign.co.uk



Re: Spirit rover OS problems
The subject of RTOSs is a religious issue, and I make it a rule never to
discuss religion.  ;-)

Tanya



Re: Spirit rover OS problems

Heh ;).

Steve
http://www.fivetrees.com
http://www.sfdesign.co.uk



Re: Spirit rover OS problems
On Sat, 07 Feb 2004 13:37:55 GMT, "news.bigpond.com"

Those that use an RTOS do, and those who don't do not, and never the two
shall meet.


Re: Spirit rover OS problems

There are some of us who like to think that we use C and/or RTOSes when
they're the right tool for the job, and don't when they're not.  When all
you've got is a hammer, everything looks like a nail; when you swear off
hammers, pounding in a nail can hurt your fist.

Cheers,
--
Alf Katz
snipped-for-privacy@remove.the.obvious.ieee.org
Re: Spirit rover OS problems

Priority inversion (and Wind River) also glitched Pathfinder.  Now that
this enigma clearly has a price tag associated with it, someone may catch on
that configuring priority inheritance is a lot cheaper than a
110,000,000-mile control-alt-delete.

After all, it's not rocket science..
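
For anyone who hasn't watched priority inversion happen, here is a toy Python simulation of it. This is not VxWorks code and not what flew on either mission; the three tasks, their tick counts, and the `simulate` function are all invented for illustration. It models a fixed-priority preemptive scheduler, one shared mutex, and an optional priority-inheritance rule.

```python
def simulate(inherit):
    """Run three hypothetical tasks on one CPU; return {task name: finish tick}.

    'C' is one tick of work inside the shared mutex, 'R' is one tick outside it.
    A larger "prio" number means a more urgent task.
    """
    tasks = {
        "low":    {"prio": 1, "release": 0, "steps": list("CCC")},
        "high":   {"prio": 3, "release": 1, "steps": list("CR")},
        "medium": {"prio": 2, "release": 2, "steps": list("RRRR")},
    }
    owner = None      # name of the task currently holding the mutex
    finish = {}
    tick = 0
    while len(finish) < len(tasks):
        ready = [n for n, t in tasks.items()
                 if t["release"] <= tick and n not in finish]
        # A task is blocked if it needs the mutex while another task holds it.
        blocked = [n for n in ready
                   if tasks[n]["steps"][0] == "C" and owner not in (None, n)]
        runnable = [n for n in ready if n not in blocked]

        def effective(n):
            prio = tasks[n]["prio"]
            # Priority inheritance: the mutex owner borrows the priority
            # of the most urgent task it is blocking.
            if inherit and n == owner:
                prio = max([prio] + [tasks[b]["prio"] for b in blocked])
            return prio

        if runnable:
            name = max(runnable, key=effective)
            steps = tasks[name]["steps"]
            if steps.pop(0) == "C":
                owner = name
            # Release the mutex once the critical section is over.
            if owner == name and (not steps or steps[0] != "C"):
                owner = None
            if not steps:
                finish[name] = tick + 1
        tick += 1
    return finish


print("no inheritance:  ", simulate(False))
print("with inheritance:", simulate(True))
```

In this made-up task set the high-priority task finishes at tick 9 without inheritance, because the medium task starves the low task that holds the mutex, and at tick 5 with inheritance. On a real RTOS the fix is normally a mutex creation option rather than anything hand-rolled.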




Re: Spirit rover OS problems
On Sat, 07 Feb 2004 15:28:50 GMT, "Ian McBride"

My company used VxWorks on 68Ks back in 1994 and found priority
inversion problems in the VxWorks file system code which resulted in
cross-linked files and, occasionally, system lockup.  We ended up fixing it
ourselves because Wind River was too slow in responding, but we
submitted our patch code to them and the next version of the 68K RTOS
worked properly.

I haven't been following the story that closely, but if these
spacecraft really used VxWorks and indeed were crippled by priority
inversion in the file system, it means that WindRiver has never paid
much attention to detail because they have been aware of the
(potential for) problem in VxWorks for at least a decade.

George

==============================================
Send real email to GNEUNER2 at COMCAST dot NET

Re: Spirit rover OS problems
On Tue, 10 Feb 2004 14:32:32 -0500, George Neuner

Wow.

This is the kind of posting I'd love to read in comp.os.vxworks...


--
Ignacio G.T.

Re: Spirit rover OS problems
On Wed, 11 Feb 2004 06:40:08 GMT, snipped-for-privacy@evomer.yahoo.es

The company hasn't used VxWorks for a number of years and I wasn't
part of the board support effort at the time - I was doing application
work.   I don't know if the archives go back that far, but you can
search for posts by Mark Fisher (mfisher) re: dosFs - he was quite
active in the VxWorks ngs for a while.  

George
==============================================
Send real email to GNEUNER2 at COMCAST dot NET

Re: Spirit rover OS problems

First I've heard of a priority-inversion problem there.  I
understood them to be having the equivalent of a buffer overrun in
stored data, but that is very hazy.  I haven't seen anything with
any details of the actual problem.

If they are using an RTOS without having sources, that seems
incredible.  Source is always the ultimate documentation.  I would
assume suitable non-disclosure agreements.  Unfortunately this
use-it-blindly attitude infests many areas, especially including
medical devices.  I would expect WindRiver to gladly supply source
(with non-disclosure) for the promotional value of operating on
Mars.

cross-posted to c.l.c, because at least some WindRiver people hang
out there.  FUPs set to take c.l.c off again.

--
Chuck F ( snipped-for-privacy@yahoo.com) ( snipped-for-privacy@worldnet.att.net)
   Available for consulting/temporary embedded and systems.
Re: Spirit rover OS problems

Life is too short to roll everything by hand for every single project,
though. My book is really about this topic, to a large degree. I don't
want to trust an "RTOS" with ultimate control over my ship, but I want
some of the high-level services it can provide. So I do the real trusty
stuff in external OSless micros and the sloppy stuff (cameras,
networking, bulk data storage) is done on an SBC.

For applications with few or no consequences for forced watchdog
reboots, user-intervention-reboots, etc (consumer electronics for
instance), I'd say damn the torpedoes and full speed ahead - use the
RTOS, follow the vendor's suggested best practices, and point the finger
back at them if there's a fatal problem.

Gee, maybe I'm qualified to be a PHB.


Re: Spirit rover OS problems
I agree with you.

Another big problem with RTOSs is driver support. If you are using a
PC based platform you will only find drivers for Windows and Linux.
The drivers for RTOSs have to be handcrafted.

Is there any RTOS that would work directly with drivers written for
Windows/Linux?

Sandeep
--
http://www.EventHelix.com/EventStudio
EventStudio 2.0 - Real-time and Embedded System Design CASE Tool

Re: Spirit rover OS problems

Linux with realtime extensions.


Re: Spirit rover OS problems
snipped-for-privacy@hotmail.com (EventHelix.com) writes:

Given the required level of experience or understanding of the
hardware and the RTOS, adapting a driver from another OS is usually
pretty straightforward.

Sure: use Linux as RTOS.

Best regards,

Wolfgang Denk

--
See us @ Embedded World, Nuremberg, Feb 17 - 19,  Hall 12.0 Booth 440
Phone: (+49)-8142-4596-87  Fax: (+49)-8142-4596-88   Web: www.denx.de
Re: Spirit rover OS problems

I am personally surprised and saddened by the news that NASA is using:

A. C, rather than a more reliable language, for the rovers.

B. An off-the-shelf RTOS with priority-based management, rather than a
system with schedule-based management.

Although there were several discussions here of the rover using Java,
the news from NASA mentions "C code downloads".

Bottom line is that the software industry is in a very poor state
right now for reliability. Most projects are done in C, which, in
the words of its own authors, was designed for system and low-level
tasks, and done without concern for reliability, hence the abysmal
reliability record of off-the-shelf and even embedded software.

Although the COTS (commercial off-the-shelf) program has yielded
cost benefits, that is no excuse for inhaling the worst practices
of the software industry today.

So the result now is that systems costing the major portion of
a billion dollars of our money are more likely to stop because of
a software glitch than a hardware one. It's ridiculous that such
an expensive vehicle that has survived radiation and heat with
amazing hardware reliability is felled by such simple nonsense.

What would I do? I would use a language, any language, with type
security. Java, Ada, Pascal, virtually any language but C or C++.
C has no type security whatever, and C++ only has security if you
refrain from using C constructs within it, which nobody does.

Second, priority-based scheduling is NOT deterministic. In fact, it
basically amounts to putting a random number generator in charge
of your scheduling. There have been MUCH better systems detailed
in the literature, such as "deadline"-based scheduling, that ARE
deterministic. I will admit that I have been evangelistic on this
subject, but the industry's determination to stick with a scheme that
has so many demonstrable flaws truly stuns me.
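
For readers who haven't met deadline-based scheduling, here is a small Python sketch comparing earliest-deadline-first (EDF) with fixed-priority rate-monotonic scheduling on one CPU. The two-task set, the one-tick time base, and the `missed_deadlines` function are assumptions chosen for illustration; nothing here is taken from the rover software. It shows one task set where the fixed-priority scheme misses a deadline and EDF does not, which is narrower than the general claim above.

```python
def missed_deadlines(tasks, horizon, policy):
    """Simulate periodic tasks for `horizon` ticks on a single CPU.

    tasks:  list of (name, execution_time, period); each job's deadline
            is the end of its period.
    policy: "edf" picks the job with the earliest deadline,
            "rm"  picks the job whose task has the shortest period.
    Returns a list of (task name, tick) for every missed deadline.
    """
    jobs, missed = [], []
    for t in range(horizon):
        # Unfinished jobs whose deadline has arrived have missed it; drop them.
        for job in jobs:
            if job["deadline"] == t:
                missed.append((job["name"], t))
        jobs = [j for j in jobs if j["deadline"] > t]

        # Release a new job of every task whose period starts now.
        for name, wcet, period in tasks:
            if t % period == 0:
                jobs.append({"name": name, "left": wcet,
                             "deadline": t + period, "period": period})

        # Run the chosen job for one tick; finished jobs leave the list.
        if jobs:
            key = "deadline" if policy == "edf" else "period"
            job = min(jobs, key=lambda j: j[key])
            job["left"] -= 1
            if job["left"] == 0:
                jobs.remove(job)
    return missed


# Hypothetical task set: total CPU utilisation is 2/5 + 4/7, about 0.97.
TASKS = [("T1", 2, 5), ("T2", 4, 7)]
print("rate-monotonic misses:", missed_deadlines(TASKS, 35, "rm"))
print("EDF misses:           ", missed_deadlines(TASKS, 35, "edf"))
```

With these numbers the rate-monotonic run misses T2's first deadline at tick 7, while the EDF run gets through the whole 35-tick hyperperiod without a miss.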

My 2 cents.



Re: Spirit rover OS problems

For an answer to any question like this, consider why the broadcast
industry in North America stuck with NTSC despite the fact that it is a
system with numerous demonstrable flaws. It solved a problem when it was
introduced, a big investment in knowledge and equipment was made, it's a
huge job to uproot it.

The cost benefits of using COTS stuff are very real. Consider it like
this: If it's going to cost $200 million to send a COTS mission with a
75% likelihood of success, or $600 million to send a proprietary mission
with a 95% likelihood of success, what makes better sense? You could
send three of the cheap missions and get a much better target coverage,
even if one of the cheapo probes is a total failure. Especially since
the definition of "success" in this context is pretty vague.
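
To put rough numbers on that, here is a back-of-envelope calculation in Python using the figures above. It assumes the three cheap missions succeed or fail independently of each other, which a shared software bug would obviously violate, so treat it as a sketch.

```python
# Figures from the paragraph above; costs are in millions of dollars.
p_cots, n_cots, cost_cots = 0.75, 3, 200
p_custom, cost_custom = 0.95, 600

# Same total budget either way.
assert n_cots * cost_cots == cost_custom

# Assumption: mission outcomes are independent of one another.
p_all_fail = (1 - p_cots) ** n_cots
p_at_least_one = 1 - p_all_fail
expected_successes = n_cots * p_cots

print(f"P(at least one COTS mission succeeds) = {p_at_least_one:.4f}")
print(f"Expected successful COTS missions     = {expected_successes:.2f}")
print(f"Expected successful custom missions   = {p_custom:.2f}")
```

Under that independence assumption the three-mission plan has about a 98.4% chance of at least one success and an expected 2.25 successful missions, against 0.95 for the single expensive one.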

NASA is under big pressure to spend less per mission. And at the end of
the day, this COTS problem cost us... what? Seven or eight days out of a
nominal 90-day designed mission life, which is really (according to the
reports I've read) going to be a 180-day life? Do you think there is
some specific scientific goal that we're now only ALMOST going to reach
- the arm is going to be stretching out for the rock with the Martian
trilobite fossil in it when the mission runs out of time and the rover
fails?

I have no problem seeing my tax dollars going into C and COTS projects
in "el cheapo" interplanetary missions. Hell, if NASA lost Federal
funding and had to rely on donations, I'd give what I could freely.


Re: Spirit rover OS problems

I don't have a problem with COTS. The error is in taking the most popular
techniques of the industry as is, to wit C and priority-based management
(I left out relying on watchdogs, which are akin to helping a heart
attack patient by hitting him in the head with a hammer once a minute).
NASA needed to take the "best of" the industry practices. The software
industry is a mess. We are reaching new lows in reliability on a daily
basis, so finding the "best of" reliability practices is certainly a
challenge, but it IS doable.

No, it killed the last lander. This time they dodged the bullet. The
Odyssey has had software problems as well. The Pathfinder died entirely.
That's a 100% failure rate in Mars missions. I would fire anyone with that
rate of failure.


Re: Spirit rover OS problems

Like another poster said, it's not the tools, it's the craftsman. You're
not appreciating the full sweep of my assertion above: Using /common/
tools and techniques is a substantial part of cost saving. Anti-C
hysteria is not going to achieve any increase in reliability.

More like whacking him in the chest if he doesn't answer "yes" when you
ask him if he's OK.

For manned missions, yes. For robots, the criteria are cheap and fast.
There's a bit of "and it has to work enough" mixed in there, so the
public doesn't lose interest. It's a balance of money vs. PR. There are
no lives at stake. Do it fast, do it cheap, try to get it as right as
possible.

Viking lasted a couple of years on the surface. MER-A will last a few
months. The difference isn't the software, it's the fact that Viking had
RTGs to keep it warm all the time, and MER-A has to rely on solar cells,
NiMH backups, and presumably the outpourings of bleeding-heart
environmentalists. Remember what killed the last Viking lander, by the
way - ground operator error. And VO/VLs weren't programmed in C.


Re: Spirit rover OS problems
On Sat, 07 Feb 2004 22:27:35 GMT, "Lewin A.R.W. Edwards"

Good answer.  It seems to me that some languages are regarded as
magically immune to any screwups the programmer makes, and therefore
must be used in any reliability-critical project.

I was told you could liken C to carrying around a loaded .45 in your
holster with the safety off.  You risk shooting yourself in the foot if
you aren't careful, but when you need absolute power, little will
beat it.  Safety-critical languages would be a spud gun with triple
redundant interlocks on the safety catch.  Less risk of accidents, but
a disappointing "phut" when fired.

Mike

Re: Spirit rover OS problems

Because trapeze artists are good is not an excuse to remove the net.
There is nothing wrong with adding an extra layer of checking by
using a secure programming language.

Again, here is the standard idea that C is somehow capable of more than
other languages. There is no net difference in what each language can
accomplish. Even TCL can push applications. It all comes down to ease
of use and fitness of purpose.


