NASA proves once again that, for it, the impossible is not even difficult.

- Robert Myers

Wed, Feb 9, 2011 4:53 PM

In this case, proving a negative about unexpected acceleration of Toyota cars:

Sudden acceleration in Toyota vehicles not an electronic issue, U.S. study finds

I had thought of calling this post a Car Talk question for comp.arch.

When I got my car back from the dealer after its last servicing, it did something weird and a little scary, like suddenly deciding that it didn't want to go anywhere right after I had pulled out of a parking spot. The service manager offered to examine the car again, but I told him I'd rather see how difficult it was to reproduce the problem.

The problem occurred just twice and hasn't happened again in three months. My suspicion is that the "servicing" put my stateful vehicle into some weird state that was not easy to duplicate, and that, having gone twice through some critical section of code, the car has forgotten all about the experience.

The only people I know of who do enough testing with machines that are stateful in ways that are hard even to enumerate would be people who work with microprocessors. I wonder if anyone at a water cooler at NASA has noted that it wasn't Intel, but someone else, who discovered the problem with the Sandy Bridge chipset.

Now, I'm a bit of a cranky old carmudgeon, and my instinct is to shrug my shoulders at my car dealer and at NASA and say, "Oh, well, I never thought you knew what you were doing, anyway." Intel doesn't know what it's doing either, but by now it either knows it or should know it. NASA has proven beyond a reasonable shadow of a doubt that it is uneducable.

My question is this: does anyone in this business believe that it is possible to uncover all corner cases by any plausible testing protocol that someone would actually spend the money for? By "all corner cases," I mean enough that corner cases would never be discovered in the wild after having escaped discovery in the testing labs.

Robert.

- Terje Mathisen

Wed, Feb 9, 2011 5:44 PM

"cranky old CARmudgeon"? :-)

[snip]

The only way to do this is to manually inspect the code and design test inputs to trigger every possible permutation of internal states.

The "slight" problem is that the permutation count quickly becomes rather unmanageable. :-)
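
To put a rough number on that growth, here is a minimal sketch in Python. It assumes, purely for illustration, that the internal state is nothing more than a handful of independent 16-bit words and that a test rig can visit one billion states per second. Both figures are made up and are not taken from any real controller.

```python
# Rough size of an exhaustive state sweep (illustrative assumptions only).
BITS_PER_WORD = 16                 # assumed width of one state variable
TESTS_PER_SECOND = 1_000_000_000   # assumed (generous) test throughput
SECONDS_PER_YEAR = 365 * 24 * 3600


def state_count(words: int) -> int:
    """Number of distinct states for `words` independent state words."""
    return 2 ** (BITS_PER_WORD * words)


def years_to_sweep(words: int) -> float:
    """Years needed to visit every state once at the assumed throughput."""
    return state_count(words) / TESTS_PER_SECOND / SECONDS_PER_YEAR


for words in (1, 2, 4, 8):
    print(f"{words} words: {state_count(words):.3e} states, "
          f"{years_to_sweep(words):.3e} years")
```

Under these assumptions, four state words already take centuries to sweep, and real programs hold far more state than that.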

Terje

PS. I recently hit on this very problem during Facebook's "Hacker Challenge": All the real problem inputs were designed to flush out any attempts to brute force a solution by trying all permutations.

PPS. I registered with an alias, you won't find me on fb.com. :-)

--
- 
"almost all programming can be viewed as an exercise in caching"

- EricP

Wed, Feb 9, 2011 6:10 PM

The horse's mouth...

formatting link

The NASA report is 177 pages and details the analysis; the NHTSA report is 77 pages and appears to be a summary.

Eric

- George Neuner

Wed, Feb 9, 2011 6:56 PM

My first thought upon hearing the NHTSA announcement was that the spokesman was quite selective in saying "electronics" were not the cause while at the same time carefully avoiding any mention of software.

All along my suspicion has been buggy control software. I haven't yet looked at the NASA findings, but (what I perceive to be) the selective wording of the announcement makes me wonder. It's perfectly possible to prove an electro-mechanical design in isolation ... but a number of the mechanisms in question are software controlled. Did NASA also prove Toyota's software? And if so, how?

George

- EricP

Wed, Feb 9, 2011 7:55 PM

"Extensive software testing and analysis was performed on TMC 2005 Camry L4 source code using static analysis, logic model testing, recursion testing, and worse case execution timing. With the tools utilized during the course of this study, software defects that unilaterally cause a UA were not found." [page 152 of NASA analysis report]

So there... obviously your concerns were unfounded.

Eric

- MitchAlsup

Wed, Feb 9, 2011 7:56 PM

Yes, but only for the case where the program can be printed in its entirety on a single piece of paper and read at the standard 18" distance.

For actual devices where the device enumerates a program, and also contains custom layout, no. Surveyed to death, yes; no bugs found, yes; no bugs possible, no.

Mitch

- George Neuner

Wed, Feb 9, 2011 9:57 PM

The fact that they looked at the software does not make my concerns about buggy software "unfounded". I have begun reading the report and it states that the software examined consisted of 280,000 lines of code - presumably assembler though that wasn't explicitly stated. Regardless, nobody can fully understand 280,000 lines of any kind of code.

Moreover, according to the summary (pages 13-17), the team only considered scenarios described by incident reports. People under stress, in general, make for poor witnesses and their recollections of event detail and event sequencing are questionable to begin with.

From page 17: "There is a single failure mode found that, combined with driver input, can cause the throttle to jump to 15 degrees in certain conditions and may not generate a DTC." and "For the small throttle openings, the NESC team found single failure modes with the ETCS-i that can result in throttle openings less than 5 degrees."

So, in fact, there WERE software faults. It seems that the faults that were found are annoying rather than life-threatening, but that doesn't mean that other life-threatening faults do not exist.

From page 20: "Due to system complexity which will be described and the many possible electronic hardware and software systems interactions, it is not realistic to attempt to "prove" that the ETCS-i cannot cause UAs."

I've only just started to read the analysis, but the "executive" summary already has validated my concerns!

George

- kym

Wed, Feb 9, 2011 10:12 PM

...

Yar. I saw this in various versions on google news.

The most carefully-worded version simply indicated the only known problems were mechanical. It doesn't seem electronic or software problems were ruled out -- just unproven at this point.

I also suspect there are some bugs in the system. If a car makes it onto the market and it has a tendency to catch the accelerator pedal under the floor mat, I think it's a fair bet there are also going to be "hidden" problems.

--
Of course "global temperature are rising", we're emerging from an ICE AGE!!
  -- BONZO@27-32-240-172.static.tpgi.com.au [86 nyms and counting], 8 Feb 2011
12:22 +1100

- Walter Banks

Wed, Feb 9, 2011 10:30 PM

This is a good read. The claims made by NASA are clear in that the report identifies what they specifically tested and what they were able to determine.

It is an interesting document that describes an engine controller that is a few years old.

There have been other automotive incidents as well. At least one was related to putting an automatic transmission in gear with a cold engine. The car begins to move, the brakes stop the car, and the idle control loop increases the engine speed to meet the minimum RPM; more braking is then needed to hold the car, resulting in a competition between brake and idle settings.

Other than the obvious NASA failure investigations, there have been two other NASA software situations that could have been fatal: the first moon landing and the first shuttle flight both had significant software problems.

Regards,

w..

-- Walter Banks Byte Craft Limited

- Rick Jones

Wed, Feb 9, 2011 11:03 PM

Is your main complaint the general "cannot prove a negative" or is it based more on your belief that there must be a software problem involved in the unexpected acceleration?

rick jones

-- I don't interest myself in "why". I think more often in terms of "when", sometimes "where"; always "how much." - Joubert these opinions are mine, all mine; HP might not want them anyway... :) feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...

- EricP

Thu, Feb 10, 2011 12:32 AM

I was being tongue-in-cheek wrt your concerns. The NASA summary statement on the software does not say there are no flaws, nor does it say that UA is impossible. And it certainly does not say it is "provably correct". (It wasn't NASA's job to prove the Toyota software is correct.)

It says only that the current analysis and tools did not detect a "unilateral cause" for UA. (Those tools being provided by Toyota and presumably being the same tools as used by Toyota during development.)

That is significantly weaker than unqualified statements made by the politicians, bureaucrats and journalists.

There is a nuance to NASA's statement that was lost as it traveled up the ladder.

The report says they used a tool called Coverity to do static code analysis. Their web page says they handle C/C++, Java and C#.

formatting link

I only briefly scanned the report. There are lots of things they found.

Like the keyless ignition button. You press to start, you press to stop.

***Except*** that in order to prevent accidental shutoff the button must be held for 3 seconds to turn off if the vehicle is in motion. Owners may not know this, and in a panic with the car running away just stab at the button trying unsuccessfully to stop the car.
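
The behaviour described above can be sketched as a small decision function. This is only an illustration of the rule as stated in this post (a press stops the engine when stationary; a 3-second hold is needed when moving). The function and parameter names are hypothetical and nothing here is taken from Toyota's code.

```python
HOLD_TO_STOP_SECONDS = 3.0  # hold time quoted in the post, while in motion


def button_stops_engine(held_seconds: float, vehicle_moving: bool) -> bool:
    """Return True if this button press would shut the engine off.

    Hypothetical model: when stationary any press stops the engine;
    when moving the button must be held for the full hold time.
    """
    if not vehicle_moving:
        return True
    return held_seconds >= HOLD_TO_STOP_SECONDS


# A panicked driver stabbing at the button for 0.3 s, five times in a row:
stabs = [button_stops_engine(0.3, vehicle_moving=True) for _ in range(5)]
print(any(stabs))                                     # False
print(button_stops_engine(3.0, vehicle_moving=True))  # True
```

In this model no number of short stabs ever adds up to a shutdown, which is the trap for a driver who does not know the rule.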

Eric

- Andrew Reilly

Thu, Feb 10, 2011 12:34 AM

That's not as simple as it sounds...

Precisely. In loopy, branchy, logic-style code, this is clearly a combinatorial problem of considerable magnitude, but there are code coverage and profiling tools available to make sure that you pass through all pieces of code.

In system-control or DSP code, though, all of the state variables have a very large number of potential states. Numeric issues dominate. You can have a piece of control code that works perfectly under some circumstances but fails in others, and you can't tell by looking at the code. You can't even necessarily tell by looking at the code in combination with the pages of filter coefficients and what-not. You need to be able to characterise the input(s) in terms of both dynamic range and frequency range, and you also need to keep an eye on numerical issues: rounding, can state variables drop into denormal ranges and blow up the real-time constraints (if floating point), etc... I don't believe that proof of this sort of control system can be done by looking at the code: it requires an analysis of the dynamic system as a whole.
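
A toy illustration of that point, in Python: the function below has a single straight-line path, so one test gives full line and branch coverage, yet it fails for part of its input range. The 16-bit wraparound is simulated by hand and the function names are invented for the example; it is not meant to resemble any real controller code.

```python
def wrap_int16(value: int) -> int:
    """Simulate two's-complement wraparound of a 16-bit signed register."""
    return (value + 0x8000) % 0x10000 - 0x8000


def midpoint_int16(a: int, b: int) -> int:
    """Average two 16-bit samples the naive way: (a + b) / 2.

    The intermediate sum is held in 16 bits, as it might be on a small
    fixed-point target, so it can wrap before the division.
    """
    total = wrap_int16(a + b)
    return total // 2


# One test executes every line and passes:
print(midpoint_int16(100, 200))      # 150

# Same code, same path, larger inputs: the sum wraps and the result is wrong.
print(midpoint_int16(20000, 20000))  # -12768, not 20000
```

A coverage tool reports this function as fully exercised after the first call; only knowledge of the input range reveals the second case.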

So: every possible permutation of the internal states involves every possible legal value of every word of state memory. I've never seen anyone even suggest that it was plausible to test that space exhaustively, for anything like a 280000-line control program...

Unmanageable, as you say.

Cheers,

--
Andrew

- Robert Myers

Thu, Feb 10, 2011 12:46 AM

When I first took a NASA-type job at mumble-mumble, what seemed to be going on there made absolutely no sense to me.

Now, what is really going on is perfectly clear to me, and it should be perfectly clear to anyone who has been part of such an organization.

You can file these reports with Arthur Andersen audits of Enron. They serve the same purpose and have the same level of credibility.

As to what I'm complaining about, you know me. I just complain.

Robert.

- EricP

Thu, Feb 10, 2011 1:36 AM

I think that is unfair. If you are thinking of buying a building, you hire a structural engineer to do a review. You can spend $10,000 or $20,000 or $30,000. Obviously the $20,000 review is more thorough than the $10,000 one. But there is no guarantee that spending more will find something, or that not spending $30,000 misses something.

And no one with any brains would ever say "There are no defects". Only that "Our review, as commissioned, uncovered no defects".

Eric

- Walter Banks

Thu, Feb 10, 2011 3:39 AM

In another automotive example, which I mentioned elsewhere in this thread, the error integration in a separate parallel control loop (the idle loop) dominated the system, and the engine powered up to the point that it could not be stopped except with the ignition switch.
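
A toy simulation of this kind of runaway, assuming a bare integral-only idle controller and a deliberately crude engine model in which the brakes pin the engine speed below the idle target. All constants and names are hypothetical; the point is only that an unclamped error integrator keeps growing for as long as the error persists.

```python
from typing import Optional

IDLE_TARGET_RPM = 800.0  # assumed minimum idle speed
KI = 0.00005             # assumed integral gain (throttle fraction per rpm per step)


def simulate(steps: int, held_rpm: float, clamp: Optional[float] = None) -> float:
    """Return the throttle command after `steps` control iterations.

    `held_rpm` is the engine speed the brakes allow (below target), so the
    error never closes. `clamp`, if given, limits the integrator output,
    a simple anti-windup measure. Throttle is a fraction: 0.0 is closed,
    1.0 is wide open; without a clamp nothing bounds it here.
    """
    throttle = 0.0
    for _ in range(steps):
        error = IDLE_TARGET_RPM - held_rpm
        throttle += KI * error  # integrate the persistent error
        if clamp is not None:
            throttle = min(throttle, clamp)
    return throttle


print(simulate(100, held_rpm=600.0))             # grows with every step
print(simulate(100, held_rpm=600.0, clamp=0.2))  # held at the clamp
```

The model ignores the fact that a real engine would respond to the extra throttle; it only shows why the driver ends up needing ever more brake.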

Engine controller code has improved significantly in the last few years and the testing has become a lot more exacting. Probably more important, most current controllers can be reprogrammed at service centers.

Regards,

w..

-- Walter Banks Byte Craft Limited

- Robert Myers

Thu, Feb 10, 2011 4:26 AM

You think my attitude is unfair and I think yours is indefensible.

When you design buildings (or commercial aircraft) and review structural designs, you have past experience to draw on and some a priori idea, based on experience, of how much you have to budget to have a reasonable chance of avoiding errors that have been experienced in the past.

Sometimes, as in the case of the 787 Dreamliner, you have essentially no relevant experience because no one has built such an aircraft with those kinds of materials before. In such a case, and only in such a case, the kind of analysis you propose would obtain. You simply have to make an arbitrary decision as to how much risk you are willing to take and how much you are willing to spend to mitigate the risk. It's pure art and engineering judgment. More like betting on horses than anything else.

Most engineering is not like that because there is relevant past experience. You can foresee whether a given method of analysis is likely to uncover a critical error and plan accordingly. You've designed bridges that look like that before, and you have a really good idea as to how to set up your finite element analysis to capture likely problems. The analysis budget is not "Well, this is how much we have available" but, based on experience, "This is what we have to do to have a high probability of producing a safe design."

In the case of fly-by-wire systems, you have to assume the worst. The system *will* eventually fail by some method that you never would have guessed or predicted, and you *must* have the capacity to fall back on clumsy mechanical controls that do not depend on electronics in any way. I don't understand why drive-by-wire is any different. What's the point of a study that can't prove (or is very unlikely to prove) anything? What has happened other than that a lot of money has been spent?

Robert.

We can't prove that drive-by-wire systems can be counted on, so then why are we counting on them? Until someone can prove otherwise or until we have a lot more experience than we already have, safety analysis at the government level should be focused on planning for the worst.

- Del Cecchi

Thu, Feb 10, 2011 4:40 AM

We've designed bridges for years and years and analyzed the crap out of them and the I35W bridge in Minneapolis fell into the Mississippi because a gusset plate was half thickness somehow.

As a famous guy once said, "we don't know what we don't know. We only know what we know."

However, this all reminds me of the runaway Audis. It was a design defect until they put an interlock on the transmission and that made it go away. Apparently the design defect was that folks didn't actually have their foot on the brake like they swore up and down they did, but on some other pedal.

- Robert Myers

Thu, Feb 10, 2011 5:02 AM

Yes, indeed, people will make mistakes. A truly conservative design would have had no single point of failure.

So we *do* have relevant experience, and the solution is not to go chasing unknown bogey-men but to plan for (and prevent) the worst.

Robert.

- EricP

Thu, Feb 10, 2011 5:59 AM

Wasn't that Rumsfeld?

Eric

- Didi

Thu, Feb 10, 2011 7:08 AM

I believe you are quite correct, what is quoted (I never read the whole thing) sounds exactly like that. They have tested electronics and software, and found no electronic reason for failure, how cute.

Dimiter

------------------------------------------------------ Dimiter Popoff Transgalactic Instruments

------------------------------------------------------
