Books / Articles on Embedded SW Architecture

- Steve at fivetrees
posted
17 years ago

Sat, Jun 24, 2006 11:34 PM

Hmmm.

I know the point you're making, and I sympathise, but - let me take the harsh view and say that if an addition requires a rethink of the decomposition, then there was a problem with the decomposition. Nowadays I take the view that this is a Good Thing - these kinds of Wrong Models tend to come back and bite sooner or later. Sooner == better, and most certainly cheaper.

Having said that, let me add that it's taken me *years* to get to a point where I'm (mostly) happy with my own work in this regard. It's been quite a long time now since any last-minute change, or later enhancement, has caused me to substantially restructure a design. (One of those projects has had something like 40 major additions in the last 17 years, and is still going strong. I *did* do a major restructure, mostly to decouple [1] the human interface further, about 10 years back, and it's stood me in good stead since.) It's down to what CBFalconer said: one learns from one's mistakes [2].

[1] Now there's a word: "decouple". I put a lot of effort into decoupling things nowadays. Again it's part of the "avoiding unwanted interactions" thing. Vital, in my opinion.

[2] As I get older (I'm 50 next birthday), I get more desperate to pass my hard-won skills on. Hence the book idea. Occasionally I get to do some mentoring, and I absolutely love it. At the last company I worked for, I wound up being the elder sage that the young 'uns would come to if they were stuck. I was honoured. And very pleased to find that I could almost always help - not by sermonising (as I do here ;)), but by asking awkward questions... and getting them to think differently. More, please.

Steve aka Yoda


- Steve at fivetrees
posted
17 years ago

Sun, Jun 25, 2006 2:25 AM

I'm feeling a tad guilty for hijacking your thread with my usual "bug-free code! bug-free code!" rant. So, let me try again.

I've done plenty of this - serially, where it's a system-level issue. I've yet to have a real need to share a bus or a dual-port RAM. Serial may be slower, but it does simplify things.

See above.

You mean serial (including TCP/IP) protocols? I assume so. Every project I do has at least one of these. Some golden rules which work for me:

- Always deal character by character. Never try to map a C structure onto a comms protocol (i.e. never deal in blocks).
- Use a state machine. Start in a "waiting" state, and advance from there. Almost every state needs a timeout.
- Use a "while characters pending" loop (maybe limited to a maximum to reduce busy time), with a retry flag if we need to re-use the same character (end of one state signifies start of the next etc). Get characters in one place only, and deal with any errors arising there.
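A minimal sketch of those three rules in C, for illustration only. The frame format (an STX byte, a length byte, the payload, then an additive checksum), the tick-based timeout, and every name here (`parser_poll`, `rx_source_t`, `STATE_TIMEOUT` and so on) are assumptions made up for the example, not anything from a real protocol. The `rx_source_t` array stands in for a UART receive queue so the code can run on a desktop.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STX            0x02u  /* hypothetical start-of-frame marker */
#define MAX_PAYLOAD    16u    /* hypothetical payload limit         */
#define MAX_PER_POLL   32u    /* cap on characters handled per call */
#define STATE_TIMEOUT  100u   /* ticks allowed between characters   */

typedef enum { ST_WAITING, ST_LENGTH, ST_PAYLOAD, ST_CHECKSUM } rx_state_t;

typedef struct {
    rx_state_t state;
    uint32_t   last_tick;            /* time the last character arrived */
    uint8_t    len, count, sum;
    uint8_t    payload[MAX_PAYLOAD];
    unsigned   frames_ok, errors;
} parser_t;

/* Stand-in for a UART receive queue (assumption for this sketch). */
typedef struct { const uint8_t *data; size_t len, pos; } rx_source_t;

bool    rx_pending(const rx_source_t *s) { return s->pos < s->len; }
uint8_t rx_get(rx_source_t *s)           { return s->data[s->pos++]; }

void parser_init(parser_t *p)
{
    p->state = ST_WAITING;
    p->last_tick = 0;
    p->len = p->count = p->sum = 0;
    p->frames_ok = p->errors = 0;
}

/* All protocol errors land here: count them, go back to waiting. */
static void parser_fail(parser_t *p)
{
    p->errors++;
    p->state = ST_WAITING;
}

void parser_poll(parser_t *p, rx_source_t *src, uint32_t now)
{
    bool     retry = false;   /* re-run the state machine on the same character */
    uint8_t  c = 0;
    unsigned handled = 0;

    /* Every state except "waiting" has a timeout. */
    if (p->state != ST_WAITING && (uint32_t)(now - p->last_tick) > STATE_TIMEOUT)
        parser_fail(p);

    while (retry || (handled < MAX_PER_POLL && rx_pending(src))) {
        if (!retry) {
            c = rx_get(src);          /* the only place characters are fetched */
            p->last_tick = now;
            handled++;
        }
        retry = false;

        switch (p->state) {
        case ST_WAITING:
            if (c == STX) p->state = ST_LENGTH;
            break;
        case ST_LENGTH:
            if (c == 0 || c > MAX_PAYLOAD) { parser_fail(p); break; }
            p->len = c; p->count = 0; p->sum = 0;
            p->state = ST_PAYLOAD;
            break;
        case ST_PAYLOAD:
            p->payload[p->count++] = c;
            p->sum = (uint8_t)(p->sum + c);
            if (p->count == p->len) p->state = ST_CHECKSUM;
            break;
        case ST_CHECKSUM:
            if (c == p->sum) {
                p->frames_ok++;       /* payload[0..len-1] now holds a frame */
                p->state = ST_WAITING;
            } else {
                parser_fail(p);
                retry = true;         /* this byte may start the next frame */
            }
            break;
        }
    }
}
```

In a real system `parser_poll` would be called from the main loop with the current tick count. The retry flag is only ever set on the way back to the waiting state, which never sets it again, so the loop cannot spin on one character.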

To summarise: a parser. (I keep trying to generalise enough to use a general-purpose parser, but there are so many exceptions and so many ad-hoc non-homogeneous protocols that I nowadays accept I have to write a parser for each case - the rules above keep me sane. Ish.)

I often do a RAM test (a real one; not the bluff that PCs do), and a checksum test of the ROM. If either fails, I halt. Better a non-functional product than a strangely-behaved product. If there's a display etc, I'll do a (quick) display test.

[It used to be that I could run a full pattern-independent RAM test - in the days of 128, 192, or 256-byte RAM. Nowadays, I have to aim lower - such tests take too long with 128 KB, let alone 128 MB.]
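As a rough illustration of the "aim lower" kind of start-up check, here is a sketch in C. It is deliberately simple: the RAM check only writes two complementary patterns to each byte and restores the original contents, so it catches stuck bits but not address-line faults, and it is nowhere near a pattern-independent test. The additive 16-bit checksum and all the names (`rom_checksum`, `ram_test`, `self_test_ok`) are assumptions for the example. On a real target the ROM and RAM ranges would come from the linker map rather than from ordinary arrays.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Additive 16-bit checksum over a ROM image (hypothetical scheme). */
uint16_t rom_checksum(const uint8_t *start, size_t len)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint16_t)(sum + start[i]);
    return sum;
}

/* Quick non-destructive check: each byte must hold 0x55 and 0xAA.
 * Detects stuck bits only; original contents are restored. */
bool ram_test(volatile uint8_t *base, size_t len)
{
    static const uint8_t patterns[] = { 0x55u, 0xAAu };

    for (size_t i = 0; i < len; i++) {
        uint8_t saved = base[i];
        for (size_t k = 0; k < sizeof patterns; k++) {
            base[i] = patterns[k];
            if (base[i] != patterns[k]) {
                base[i] = saved;
                return false;
            }
        }
        base[i] = saved;
    }
    return true;
}

/* Returns true only if both checks pass; the caller halts otherwise. */
bool self_test_ok(const uint8_t *rom, size_t rom_len, uint16_t expected,
                  volatile uint8_t *ram, size_t ram_len)
{
    if (rom_checksum(rom, rom_len) != expected)
        return false;
    return ram_test(ram, ram_len);
}
```

The start-up code would call `self_test_ok` before anything else and stop dead (for example in a `for (;;) {}` loop with outputs in a safe state) if it returns false.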

Ah. This is the tip of a bigger iceberg. I prefer co-operative multitasking (I think in terms of synchronous systems, so non-preemptive helps). I use a round-robin that tests for events, in a prioritised loop (I'm skipping quite a lot of detail here). If one of the tasks encounters an event it should advise others of, I explicitly distribute it - usually by semaphores, i.e. "change" flags that have one "set" source, and are cleared only when dealt with. (Again this is a very brief summary of a larger topic.)
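One possible reading of that scheme, sketched in C. The details skipped above are skipped here too, so treat this as an assumption-laden illustration rather than the actual design: the task table, the `scheduler_pass` function, and the input/display example are all invented for the purpose. Each task has a cheap "is there an event?" test and a short run-to-completion handler, and the `setpoint_changed` flag has exactly one setter and is cleared only by the task that deals with it.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool (*event_pending)(void);  /* cheap test: is there work for this task? */
    void (*run)(void);            /* short handler that runs to completion    */
} task_t;

/* "Change" flag: set only by input_task, cleared only by display_task
 * once it has dealt with the change. */
volatile bool setpoint_changed;

bool input_ready;   /* stand-in for a hardware event, e.g. a key press */
int  input_value;
int  setpoint;
int  displayed;

bool input_event(void)   { return input_ready; }
void input_task(void)
{
    input_ready = false;
    setpoint = input_value;
    setpoint_changed = true;      /* advise the other task explicitly */
}

bool display_event(void) { return setpoint_changed; }
void display_task(void)
{
    displayed = setpoint;
    setpoint_changed = false;     /* cleared only when dealt with */
}

/* Table order is priority order: earlier entries are tested first. */
const task_t tasks[] = {
    { input_event,   input_task   },
    { display_event, display_task },
};

/* Run the highest-priority task that has an event pending, then return
 * so that the next pass starts again from the top of the table. */
bool scheduler_pass(void)
{
    for (size_t i = 0; i < sizeof tasks / sizeof tasks[0]; i++) {
        if (tasks[i].event_pending()) {
            tasks[i].run();
            return true;
        }
    }
    return false;
}
```

The main loop is then just `for (;;) { (void)scheduler_pass(); }`. Nothing is pre-empted, so no handler ever sees another handler's data half-updated.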

You mean OSI?

formatting link

Nice idea, but TCP/IP (i.e. the real world) is a 4-layer model. However, layering is vital, however many you use.

I tend to think of OSI, or TCP/IP, as a starting point. In practice, I use as many layers as the application/implementation calls for. This is often more than 4 or 7; I like thinking vertically/hierarchically. And I like decoupling.

I've talked about this elsewhere. A bit more detail:

- Use macros that compile to any code at all only when "Debug" is defined, resulting in explicit trace output (via whatever means - stdio if there is one, a serial port, whatever). I.e. if "Debug" is not defined, no code is generated. Sometimes allow various levels/categories of "Debug" - maybe only one aspect of the system needs looking at.
- More generally, ensure that all run-time errors are LOUD.
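A small sketch of such a macro in C, assuming a C99 compiler (for variadic macros) and stdio as the trace channel. The names `DEBUG`, `TRACE`, `trace_emit` and the category bits are made up for the example. `DEBUG` is defined in the sketch so that it does something; a release build would simply leave it undefined, and `TRACE` then expands to nothing.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical trace categories, one bit each. */
#define DEBUG_COMMS  0x01u
#define DEBUG_UI     0x02u

/* Bitmask of categories to trace. Leave DEBUG undefined for release. */
#define DEBUG (DEBUG_COMMS)

#ifdef DEBUG
unsigned trace_count;   /* number of trace lines emitted so far */

/* Single output point: swap stderr for a serial port on a real target. */
void trace_emit(const char *file, int line, const char *fmt, ...)
{
    va_list ap;

    fprintf(stderr, "%s:%d: ", file, line);
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    fputc('\n', stderr);
    trace_count++;
}

#define TRACE(cat, ...) \
    do { \
        if ((DEBUG) & (cat)) \
            trace_emit(__FILE__, __LINE__, __VA_ARGS__); \
    } while (0)
#else
#define TRACE(cat, ...) ((void)0)   /* generates no code at all */
#endif
```

Usage is `TRACE(DEBUG_COMMS, "rx state %d", state);`. With the mask above, a `TRACE(DEBUG_UI, ...)` call is still compiled but its test is a constant false, so an optimising compiler should drop it.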

See above. *All* errors must be dealt with. Never use e.g. malloc without checking the return (better still, don't use malloc at runtime - but that's another story). Always check for (and deal with) all possible errors at every stage. We used to have a saying: "one can judge the quality of the code by its error handling". A runtime error is simply *not* allowable in embedded work. Not even slightly. Ever. At all.
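To illustrate the malloc point, here is a sketch in C of one common alternative: a fixed pool of equal-sized blocks reserved at build time, with every allocation and release reporting failure explicitly. This is an assumption about what "don't use malloc at runtime" might look like in practice, not a description of any particular system; `pool_alloc`, `pool_free`, and the sizes are invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_BLOCKS 4u    /* hypothetical: sized for the worst case at design time */
#define BLOCK_SIZE  32u

typedef struct { uint8_t data[BLOCK_SIZE]; } block_t;

static block_t pool[POOL_BLOCKS];    /* storage fixed at link time, no heap */
static bool    in_use[POOL_BLOCKS];

/* Returns a free block, or NULL if the pool is exhausted.
 * The caller must check the result, exactly as with malloc. */
block_t *pool_alloc(void)
{
    for (size_t i = 0; i < POOL_BLOCKS; i++) {
        if (!in_use[i]) {
            in_use[i] = true;
            return &pool[i];
        }
    }
    return NULL;
}

/* Returns false for anything that is not a currently allocated block,
 * so double frees and stray pointers are reported rather than ignored. */
bool pool_free(block_t *b)
{
    for (size_t i = 0; i < POOL_BLOCKS; i++) {
        if (b == &pool[i]) {
            if (!in_use[i])
                return false;
            in_use[i] = false;
            return true;
        }
    }
    return false;
}
```

Because the pool size is fixed, the worst-case memory use is known before the product ships, and exhaustion shows up as a NULL that has to be handled rather than as heap fragmentation months later.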

Errrr.... the word "benchmark" conjures up "compared to" for me. If you mean "does this system deal appropriately with the required data throughput/latency", then that's another story, and pretty basic. If not, who cares?

Aha. See above. All my systems are event-driven. Is there any other way?

Books? None, really, except when dealing with the Outside World (e.g. recently had to look into how BSD sockets were handled in terms of their possible states). I read quite a lot about the theory of s/w development (e.g. "Safer C", anything I can get my hands on re reliability), and about e.g. TCP/IP, or about the job in hand. The best book about embedded development hasn't been written yet, but I'll get around to it... ;)

How to write bug-free code, but we've done that already ;).

It is a different headstate. Desktop guys tend to think I/we are being pedantic. And we are. We have to be.

Hope this helps,

Steve


- Darin Johnson
posted
17 years ago

Sun, Jun 25, 2006 8:32 AM

That's true. But I also think that the majority of programmers spend the majority of their time working on code that someone else wrote. They don't get to do the design or the decomposition. So good programmers have to know how to deal with it and make the best of a bad situation.

So back to the original topic perhaps - it would be great if there were more books that dealt with this aspect of programming. The Mythical Man-Month is one at least. Others?

-- Darin Johnson

- Paul E. Bennett
posted
17 years ago

Sun, Jun 25, 2006 11:12 AM

The thing about decomposition to reveal the structure of the problem and develop a structure for the solution is that such structures really are just a decent framework from which the rest of the edifice of the application can be supported. Think of it in architectural terms. You wouldn't go changing the steel framework of a building to something radically different halfway through a construction project. It is, though, always possible to accommodate changes to the building within the supporting framework or by additions and extensions to the framework.

Knowing the type of application you are dealing with is important to knowing how you should structure the framework of your system so that it is identifiable where likely client requests for changes can be accommodated with relative ease. The nearest texts I have seen for component oriented development (the style I tend to employ) are those related to .NET. If there are others that are more related to embedded systems I would be interested to hear of them.

Having worked for industries where last minute changes are highly discouraged (due to the cost and complexity of re-certification efforts) I have only had one occasion in 38 years where a last minute change was demanded and implemented. This involved significant effort to put in a change that ended up affecting just 10 lines of code. When I say last minute, the request came in 6 weeks before intended ship date and the client accepted the slippage of 2 weeks in delivery before we began the change process. The change did improve the system usability though.

That was always the mantra way back. "High coherence minimum coupling". It still holds very well as a guiding principle.

Between CB-Falconer, Lewin, you and me, I think we have almost written a book on the subject just by contributing to this type of discussion in these newsgroups. At approaching 50 you are still a young whippersnapper with plenty of time ahead of you for the book writing. I have been considering it myself and have already written a few parts of several chapters. Must find some more spare time to continue with it.

and Darin Johnson wrote:-

To implement a change in any design (hardware or software) the starting point should be a full and thorough technical review of the existing system. Without this you are just prodding the parts that stick out without full knowledge of the knock-on effects for the rest of the system.

The best pair of books on general analysis and design topics I used way back are:-

"Introducing Systems Analysis" and "Introducing Systems Design" which are both by Steve Skidmore and Brenda Wroe of the NCC. The first one's ISBN is 0-85012-630-4 but as someone borrowed the other and hasn't yet returned it I do not know the ISBN of the second one.

Forth related but a good one generally is "Thinking Forth" by Leo Brodie. This is now, fortunately, on-line at:-

Also good on general problem solving technique is "How to Solve it" by George Polya. The seven I have thus far suggested in this thread have stood me in good stead and all bear re-reading once in a while. Some of them may have been old publications but I see no problem in that. The message remains the same. Doing a decent job of systems development requires a certain attitude to doing that job and can be immensely enjoyable and rewarding seeing your creations doing what they were designed to do with little fuss or on-going attention required.

--
********************************************************************
Paul E. Bennett ....................
Forth based HIDECS Consultancy .....
Mob: +44 (0)7811-639972
Tel: +44 (0)1235-811095
Going Forth Safely ..... EBA. www.electric-boat-association.org.uk..
********************************************************************

- Tom Lucas
posted
17 years ago

Mon, Jun 26, 2006 10:41 AM

Sadly these metrics are rarely used for process improvement and more for empire building and whip-cracking at the engineers. I'm sure there must be managers somewhere who use the knowledge for good but in my experience most use it for evil :-(

- Tom Lucas
posted
17 years ago

Mon, Jun 26, 2006 10:54 AM

It's things like this where I wonder about the value of redundancy when both lanes are running the same code. Sure, hardware faults are dealt with but if the code crashes then regardless of how many copies of it are running they will all crash at the same time.

- Colin Paul Gloster
posted
17 years ago

Mon, Jun 26, 2006 11:08 AM

"[..]

Of course the main blame is for not requalifying the software for the different input values of Ariane 5, but is it the only thing to blame?"

No.

Read news: snipped-for-privacy@earthlink.net by Robert I. Eachus timestamped 2000/07/23:

"[..]

[..] But then, and I'll quote the report on this:

"It is even more important to note it was jointly agreed [by CNES and its contractors] not to include Ariane 5 trajectory data in the IRS' requirements and specifications."

There is the disaster in one sentence. The new IRS for the Ariane 5, was designed for, and only for, the Ariane 4. And since EASAMS Ltd. was a major contractor on the Ariane 4, but a sub to Aerospatiale on the Ariane 5, they were not directly involved in the decision not to get the Ariane 5 data, and according to my sources, the decision was directly due to the unwillingness by the French to release the data to a British firm--even though most of the work was done in Germany. (I heard the later after two German beers. So while some of this is very well documented, I can't vouch for that. I can vouch for the fact that EASAMS never got the Ariane 5 specs. Not they were told not to use them, but they were not permitted to see them.) And I also know, and it is very well recorded, that the issue of horizontal position post launch was NOT decided by the programmers. They objected, and it went to a multi-contractor review where it was decided not to protect it. But this was at an ARIANE 4 meeting, not an Ariane 5 meeting, because the development of the new IRS was officially part of the Ariane 4 project. [..]"

- Paul E. Bennett
posted
17 years ago

Mon, Jun 26, 2006 7:32 PM

Which is why engineers should get to know how to build and present such metrics. Then they can ensure they hit at issues that grab management attention. I know we all fail in this aspect some of the time much to our own detriment.


- Paul E. Bennett
posted
17 years ago

Mon, Jun 26, 2006 7:48 PM

When deciding on a redundancy policy one has to be aware of the reasons for considering the redundancy: protecting against just hardware failures, or protecting against design failures. To do the latter requires diverse hardware and software development streams for each of the redundant systems. Hopefully you will end up with different hardware and different software both fulfilling the requirements specification.


- Paul Keinanen
posted
17 years ago

Mon, Jun 26, 2006 9:50 PM

The US space shuttle has one additional flight computer with completely different software for handling a launch abort and emergency landing.

However, as far as I know, this hardware/software has never been tested in actual flight, not even in drop tests from the B747.

Paul

- Tom Lucas
posted
17 years ago

Tue, Jun 27, 2006 9:18 AM

They can make it shiny ;-)

- Tom Lucas
posted
17 years ago

Tue, Jun 27, 2006 9:22 AM

I heard that for the Boeing 777 flight control computer they specified that the two lanes should be designed by different companies with different engineers who went to different universities, to reduce the chance of the two computers crashing at the same time. However, because they were both derived from the same specification, apparently the resultant code was very similar.

Urban legend?

- Philip Koopman
posted
17 years ago

Wed, Jun 28, 2006 1:24 PM

It's real and that is more or less the story. Section 3.2 of this paper spells it all out: Yeh, Y.C., "Design considerations in Boeing 777 fly-by-wire computers", HASE 1998.

formatting link

(requires a subscription)

Phil Koopman -- snipped-for-privacy@cmu.edu --
