Help devising software tests

Hi all,

I have to write a test specification to define tests for a piece of software (a small kernel for an embedded system) that is currently under development. What I have been given is an API specification (essentially function calls specified as C prototypes, along with textual descriptions of what these calls are supposed to do).

Now I have to devise a strategy to define tests for this software.

One obvious procedure is to go through the API functions one by one, identify requirements from the C prototypes and the textual descriptions and define a test for every single requirement.

But then there are also "cross-cutting" concepts, i.e. concepts that manifest themselves at different places throughout the API. As an example, there is the concept of a set of rights attached to an object (a task, in this particular case). The API has functions to manipulate this set of rights, and, depending on the current settings of these rights, the object may or may not be entitled to call certain other API functions.
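To make this concrete: I can't post the real API, but a hypothetical version of such a rights interface might look like this (all names invented for illustration; the real kernel differs):

/* Hypothetical sketch of a rights-based API; every name here is
 * invented for illustration only. */

typedef unsigned int rights_t;

#define RIGHT_SEND   (1u << 0)   /* task may send messages    */
#define RIGHT_RECV   (1u << 1)   /* task may receive messages */
#define RIGHT_SPAWN  (1u << 2)   /* task may create new tasks */

typedef int task_id_t;

/* Calls that manipulate the rights set of a task. */
int task_grant_rights(task_id_t task, rights_t rights);
int task_revoke_rights(task_id_t task, rights_t rights);

/* A call whose behaviour depends on the caller's rights: it is
 * supposed to fail with a permission error unless the calling
 * task currently holds RIGHT_SEND. */
int msg_send(task_id_t dest, const void *buf, unsigned int len);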

I feel that by just looking at each API function one by one I would miss these "cross-cutting" effects and that, consequently, my test spec would be incomplete.

I wonder if people here can give me some insights or recommend any books/online documents worth reading.

Thanks in advance for any help!

Arno

Reply to
Arno Nuehm

If your goal is a tested kernel, my company offers a complete set of tests for the MicroC/OS-II RTOS. It might save you a lot of time and effort over developing a new kernel.

Scott

Reply to
Not Really Me

I agree, Scott, that it may be just the thing for them, so I'm glad you posted this option. Still, at the risk of you and me talking at cross-purposes to the OP's need, I'm motivated to add a few thoughts stimulated by the idea of using a 3rd party kernel in an embedded application.

...

(1) If the product is for some critical area, such as medical use, sourcing a 3rd party 'kernel' may not actually be less effort. The OP did say "small kernel", and the kind of exhaustive testing required for some medical purposes (such as where every conditional branch edge must be exercised and tested for impact) would be *much* easier to do for a small, narrowly focused fragment of code designed and written for the specific system than for something targeted at a vague, non-specific marketplace. Making an honest effort at testing every branch edge in every part of the object code of an O/S would require understanding an often much-larger-than-needed body of code. I suppose that if the O/S were carefully crafted so that, at compile time, exactly and only those parts needed were included, then this might not be so bad. But otherwise, it adds a LOT of extra and unnecessary work on the testing side.

(2) I just recently wrote from scratch, between 9AM and 2PM of a single day, a small cooperative kernel with thread priorities, an accurate delta-time sleep queue with a guaranteed start time relative to the clock edge, non-synchronized thread messages, the ability to support exception handling local to the thread stack, and a variety of support functions. In the interim, it has undergone two walkthroughs with skilled programmers, and it has been running for two months without a line of code change since that day, other than adding support for thread-local heap storage and changing the hardware timer source. It does exactly what is required for the application and no more. Writing a small kernel tailored for embedded applications should be a personal skill, applied almost as unconsciously to a job as not having to look at the keys as one types out program code. Certainly it should not be thought of as something too big and too fearful, something one must buy from a 3rd party who really doesn't and can't have any idea what's important and what's not for the application at hand, and who must instead struggle to broaden their own market by growing feature sets (and, of course, expanding the documentation required to understand it all).

...

I wrote the above essentially because the OP mentioned a "small kernel." Some folks *want* tested TCP/IP and networking support, tested FAT file support, tested wear-leveled flash file systems, and so on. But that isn't what the OP started out saying. Small was the guide. Or, at least, that is how I first read the OP.

But on looking back again, I see more clearly that the OP is talking about the rights of tasks to the API. And this smells of a larger-than-small kernel. Worse, it isn't something I'd add to a kernel for critical applications -- there would be no clear point in doing so. The code should be right regardless of a task's rights to the API. So, in a critical application like a medical one, I see no point in providing this feature, which itself only pointlessly adds extra testing requirements.

So maybe you are right to suggest a well-tested 3rd party system.

Jon

Reply to
Jonathan Kirwan

Is this a medical or similarly critical application?

Jon

Reply to
Jonathan Kirwan

Thanks for the offer. Unfortunately, using MicroC/OS-II is not an option for this project (it simply doesn't fit the requirements). Besides, my personal part in this project is not to develop a kernel but to define tests for a given kernel. So, I guess what I'm looking for is more a methodology than a result produced by it.

If you could shed some light on the process that was used to define those tests for MicroC/OS-II, it would probably help me a lot. But I can see pretty well that you can't do that as it's probably the basis of your company's business.

Thanks

Arno

Reply to
Arno Nuehm

[snip good discussion about why not to use a 3rd party kernel, most of which hits the nail on the head -- Thanks!]

Believe me, there is very much a point in doing so....

Well, the kernel's main (only) job is to implement tasks and to provide secure execution environments (i.e. isolate the tasks from each other). The idea is to allow programs with different trust levels to coexist (i.e. potentially malicious, non-trusted programs along with trusted ones). I've seen this concept being referred to as a "separation kernel", though I'm more inclined to just call it a "microkernel". So, yes, it does implement tasks, but no, the kernel is definitely not "larger-than-small".

BTW, I would prefer not to name the application, but it is not medical, and it is security (as opposed to safety) related.

Thanks

Arno

Reply to
Arno Nuehm

Software tests are performed as black box tests and/or white box tests.

You usually perform black box tests first, where you consider the system under test as a set of functions with external access. You are interested only in externally visible behaviour; you don't explicitly test internal features. You must ensure that every function call is tested and that all possible combinations of parameters are used (you define ranges of values for this), and if state diagrams are known from the outside, you must ensure that every combination of states and conditions is tested. Black box testing is, in short, a test against external requirements.
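A minimal, table-driven way to organize such black box tests in C might look like this (the function under test, the requirement numbers, and all values are invented for illustration; the harness would be linked against the kernel under test):

#include <stdio.h>

/* Hypothetical function under test: returns 0 on success and a
 * negative value on invalid arguments (invented for illustration). */
extern int task_set_priority(int task_id, int prio);

/* One black box test case: inputs plus the expected result,
 * derived directly from a requirement in the specification. */
struct test_case {
    const char *requirement;   /* traceability back to the spec */
    int task_id;
    int prio;
    int expected;
};

static const struct test_case cases[] = {
    { "REQ-017: valid task, minimum priority", 1,   0,  0 },
    { "REQ-017: valid task, maximum priority", 1, 255,  0 },
    { "REQ-018: out-of-range priority fails",  1, 256, -1 },
    { "REQ-019: invalid task id fails",       -1,   0, -1 },
};

int main(void)
{
    for (unsigned i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        int r = task_set_priority(cases[i].task_id, cases[i].prio);
        printf("%-40s %s\n", cases[i].requirement,
               r == cases[i].expected ? "PASS" : "FAIL");
    }
    return 0;
}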

White box testing is used to ensure that every path in the code is taken. You must exercise every outgoing branch of every condition. For loops, depending on the criticality of the system, you test one or several passes. There are tools available for measuring code coverage.
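As a toy illustration (code invented for the purpose), even a short function needs several test cases before every branch outcome has been taken:

/* Invented example: three test cases are needed to exercise both
 * outcomes of each condition, not just the one "happy path". */
struct queue { int *buf; int count; int size; };

int queue_put(struct queue *q, int value)
{
    if (q == NULL)               /* test 1: NULL queue          */
        return -1;
    if (q->count == q->size)     /* test 2: queue already full  */
        return -2;
    q->buf[q->count++] = value;  /* test 3: normal insertion    */
    return 0;
}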

Reply to
Lanarcam

I see white box tests as a supplemental thing, i.e. you first do requirements-based black box testing and then, to ensure that nothing was missed, you do a coverage analysis. If the coverage analysis finds any uncovered code, it means that either the code in question cannot be made to execute, which means that it is superfluous and can be removed, or there must be some requirement that this code fulfils and that is missing from the specification or has been overlooked.

IMHO, it is important to do it this way around (i.e. black box followed by white box test). IOW, I think it is plain wrong to define test cases with the sole purpose of achieving code coverage.

My problem is that there are certain functions that manipulate the system's state and, depending on that state, some *other* functions change their behavior. Therefore, I suspect that I will not catch all necessary test cases by just exercising every single function on its own. I was hoping for some advice on how to deal with such situations.

Cheers

Arno

Reply to
Arno Nuehm

This is generally correct, except for the fact that you are almost never going to be able to test all possible combinations of parameters. In almost all cases this would take many more years than you, or even your great grandchildren, are going to be alive. You need to select an appropriate subset of possible combinations that has a reasonable likelihood of discovering any problems in the SW. You generally want to keep good track of the number of defects you are finding along the way and perform regression tests as needed. The number of defects being found should begin to approach zero as you progress. At some point, usually dictated more by contractual schedule constraints than anything else, you decide that you have reached a point of diminishing returns. In other words, you get to a point where you are putting in a lot of effort to locate a relatively small number of defects that are very unlikely to occur in a real life situation; at this point you stop.
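Equivalence partitioning and boundary values are the usual way to pick that subset. A sketch in C, with an invented limit:

#define MAX_LEN 4096u  /* hypothetical limit taken from the spec */

/* Instead of all 2^32 values of a 32-bit length parameter, test
 * one representative per equivalence class plus the boundaries
 * between classes (values invented for illustration). */
static const unsigned int len_values[] = {
    0u,           /* boundary: empty                        */
    1u,           /* boundary: smallest non-empty           */
    100u,         /* representative of "normal" lengths     */
    MAX_LEN - 1,  /* boundary: just below the limit         */
    MAX_LEN,      /* boundary: at the limit                 */
    MAX_LEN + 1,  /* boundary: just above the limit (error) */
};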

As far as how to go about developing test cases, that is pretty dependent on the system in question, and without seeing it I don't know what real advice anyone here is going to be able to give. Generally you want to define some set of incremental builds that expand upon each other, adding new components and functionalities to the system as you go. If there are interdependencies between components, the order of integration becomes a key consideration: you must fully test a component, then add the new one and, if needed, change the original component in such a way as to test the new one. This becomes difficult if there are large numbers of circular dependencies, which sounds like what you are describing. In general the best approach is to remove the circular dependencies at design time so they will not become an issue when you test. This is one reason why a test engineer or someone similar should be involved in the process early on. If problems with testability can be caught early, they are easier and cheaper to fix.

Reply to
gooch

Black box testing is not testing individual C functions one by one. You must test functionalities. If your system exhibits state, you must test all C functions in every accessible state. This seems like a lot of work, but you can't avoid it if you need to prove the functional correctness.

If you can show that some functions behave independently of the various states and that they don't modify the state, you can exclude them from the set. But in order to do so you must perform some analysis of the code and prove it.

Reply to
Lanarcam

.. and functionalities should be stated as requirements in the specification. I think I'm beginning to understand now. The problem I was facing is that my specification has some requirements stated along with the functions they correspond to, but there are also requirements which do not correspond directly to any single function. So, the obvious approach of using the list of functions as the main structure for the test document won't work here. I'll have to start from the list of requirements.

OK. I could treat the state affecting a function's behavior as if it were another "virtual" parameter to the function. Sounds good.
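Sketched in C, that idea might look something like this (all names, calls, and rights values invented for illustration; the externs would be provided by the kernel under test):

#include <stdio.h>

typedef unsigned int rights_t;
#define RIGHT_SEND (1u << 0)
#define RIGHT_RECV (1u << 1)

/* Hypothetical kernel calls, provided by the system under test. */
extern void task_set_rights(int task, rights_t r);
extern int  msg_send(int from, int to);  /* 0 on success */

int main(void)
{
    /* The "virtual parameter": every reachable rights state. */
    static const rights_t states[] = {
        0, RIGHT_SEND, RIGHT_RECV, RIGHT_SEND | RIGHT_RECV
    };

    for (unsigned i = 0; i < sizeof states / sizeof states[0]; i++) {
        task_set_rights(1, states[i]);
        int ok = (msg_send(1, 2) == 0);
        /* Expected: sending succeeds iff RIGHT_SEND is held. */
        int expect = (states[i] & RIGHT_SEND) != 0;
        printf("state %#x: %s\n", states[i],
               ok == expect ? "PASS" : "FAIL");
    }
    return 0;
}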

Right (except that (please excuse my nitpicking...) functional correctness cannot be *proven* by testing)

Thanks for your help

Arno

Reply to
Arno Nuehm

You are correct. You cannot prove correctness, but you must try to prove incorrectness.

We use this adage: "A developer's job is to make their software work. A tester's job is to try to prove that it doesn't!"

Scott

Reply to
Not Really Me

The response by Lanarcam says it pretty well. Even though it is not a safety-critical app, I highly recommend that you get copies of the RTCA specs DO-178B and DO-248. While not aimed directly at security, they do identify the steps that you will need to follow.

Simply put, the rule of thumb is "have a plan and test everything". If your company doesn't have them, generate configuration management procedures and plans, software QA procedures and plans, specs for requirements, designs and tests, test plans, and keep everything under source/document control. Test the requirements: are they complete and correct? Test the design: does it match the requirements? Test the code: does it match the design? Test the white box tests: do they do an adequate job, do they test all the low-level (design) requirements? Test the build and test procedures: can you repeat everything if necessary? Test the black box tests: do they test every requirement? Argh! Generate a traceability matrix, i.e. a matrix that correlates the requirements with the design, the code, and the tests (a tiny example follows below). Oh, and good luck. (Contact us if you need help/advice on any of the above.)
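A tiny example of such a matrix (all identifiers invented):

  Requirement | Design section | Code module | Test case(s)
  ------------+----------------+-------------+----------------
  REQ-017     | DS-3.2         | task_prio.c | TC-041, TC-042
  REQ-018     | DS-3.2         | task_prio.c | TC-043
  REQ-019     | DS-3.1         | task.c      | TC-044

An empty cell in any column is a red flag: either a requirement with no test, or code with no requirement behind it.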

Scott

Reply to
Not Really Me

You are right; my use of the verb "prove" was sloppy ;)

This leads to the nightmare of the conscientious programmer who can never get a good night's sleep while his creation is in the wild.

You stop testing sometimes because of the project schedule or because you run short of funds. In that case you can never be sure that all bugs have been removed.

When you write code that will be part of a certified system, you at least have the approval of the official certification body.

A colleague said that software is free of bugs only when the customer has gotten rid of it ;)

Reply to
Lanarcam

... snip ...

I consider that fundamentally flawed. A developer should write code that is obviously correct. That means short routines that can be described in a few sentences, with close type checking on parameters. Brinch Hansen was a master at this.

This leaves the tester checking boundary conditions, and for proper usage of those routines.

Reply to
CBFalconer

I tend to agree with the idea that testing is about looking for errors, not verifying correctness. But this is more of a philosophical debate and I should perhaps stick to facts.

I agree that what you suggest is sound advice, and if programmers respected it, there would be fewer bugs.

But even then you can never guarantee that you have checked every condition, except in functions that are trivial. You could do that, but you would have to use formal methods, and they are not practical in general. I know some people who used them in rail transport systems, but they were gurus.

Even if you have proved the correctness of individual functions, when you assemble the whole thing you can find unexpected errors. If the design is simple, they should be caught easily. But if you have several interrelated tasks of some complexity, you can't assume you have no bugs simply because you have written your code carefully.

There are what are called real-time bugs, caused by unexpected events that no one had suspected. And the problem is: when do you know you have corrected the last one?

This is true for linear calls of functions, but when you have asynchronous execution, for instance interrupts or preemptive tasks, it is not sufficient.

It is true, however, that what you suggest should be used for safety critical systems. For these systems you generally have a main loop that executes functions cyclically, and no interrupts. It is much easier to prove the correctness of such designs.

But for other types of systems this is not always practical.

Reply to
Lanarcam

... snip ...

Those can generally be characterized with loop invariants, or just plain invariants. A producer/consumer relationship is typical. The real bear is when you have to guarantee worst case delays.
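For instance, a single-producer/single-consumer ring buffer can be checked against one simple invariant. A minimal sketch, invented for illustration (it deliberately ignores the memory-ordering issues a real multiprocessor port would have to address):

#include <assert.h>

#define SIZE 16u  /* power of two, so the modulo stays consistent
                   * even when the unsigned counters wrap around  */

static int buf[SIZE];
static volatile unsigned head, tail;  /* head: producer, tail: consumer */

/* Invariant: 0 <= head - tail <= SIZE at every point, i.e. the
 * producer never laps the consumer. Checking it after every
 * operation is one way a tester hunts for invariant violations. */

int put(int v)                 /* called only by the producer */
{
    if (head - tail == SIZE)   /* full */
        return -1;
    buf[head % SIZE] = v;
    head++;
    assert(head - tail <= SIZE);
    return 0;
}

int get(int *v)                /* called only by the consumer */
{
    if (head == tail)          /* empty */
        return -1;
    *v = buf[tail % SIZE];
    tail++;
    assert(head - tail <= SIZE);
    return 0;
}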

You may note I cited Per Brinch Hansen earlier. He and his staff could write a complete multiprocessing operating system and be virtually convinced of correctness out of the gate. He had (has) a genius for this sort of organization and simplification. Yet the overall system can be highly complex and deal with many interrupts and processes.

Reply to
CBFalconer
