Style (?) issue

Hi,

(sigh) Another *@&^#($ night of "tree sitting". This gets old REALLY FAST! :-/

Anyway, between trips outside, I've been dressing up some old bits of software -- cleaning up commentary, etc.

One of the things I do to help with debugging as well as making interfaces more explicit is to place DEBUG-only invariants at the start of each function that "verify" the terms of the invocation contract. E.g., if a pointer can't be NULL, I'll do something like:

ASSERT( (src != NULL), "Pointer to source must be non-NULL" );

At run time, these go away (usually... depends on the market I am serving). So, they don't "cost anything". And, they make the interface rules *very* explicit (while the commentary could possibly be ambiguous, the conditionals in the invariants are hard to argue with!).

So, the start of each function is usually a list of several ASSERT()'s -- at least one for each parameter (usually). Then, the meat of the function begins.

One of the routines I was dressing up today had some constraints placed on a pointer passed to it. In particular, requiring the pointer to reference a valid memory location in a certain range (defined by manifest constants).

ONCE THROUGH THE INVARIANTS, "live code" would further qualify the parameters passed to the routine. In this case, enforcing some "arbitrary" alignment criteria. Conceivably, the aligned pointer could now reference something *outside* the range of memory locations enforced by the invariants. The code clearly has to test for this -- since the whole point is to ensure those "out of bounds" areas are not referenced at run-time.
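For concreteness, a sketch of that shape -- the region bounds, alignment helper, and status values here are stand-ins I've invented, not the actual code:

  /* hypothetical manifest constants and round-up helper */
  #define REGION_START  ((uint8_t *) 0x1000)
  #define REGION_END    ((uint8_t *) 0x2000)
  #define ALIGN_UP(p,a) ((uint8_t *) (((uintptr_t) (p) + ((a) - 1)) \
                                      & ~(uintptr_t) ((a) - 1)))

  Status frob(uint8_t *ptr)
  {
      /* invariants: the terms of the invocation contract */
      ASSERT((ptr != NULL), "Pointer must be non-NULL");
      ASSERT((REGION_START <= ptr) && (ptr < REGION_END),
             "Pointer must reference the valid region");

      /* live code: alignment is *not* part of the contract */
      ptr = ALIGN_UP(ptr, 64);          /* e.g., cache-line alignment */
      if (ptr >= REGION_END)
          return FAILURE;               /* aligned pointer now out of bounds */

      /* ... meat of the function ... */
      return SUCCESS;
  }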

Now, the nit pick... *technically*, the caller has complied with the contractual requirements of the function (the contract doesn't enforce alignment issues -- that is part of the role of the function itself). As such, I maintain that the original set of invariants is *all* that should be enforced (to determine if the invocation "complies"). The *further*, non-contractual alignment restriction that the function imposes should be treated as a run-time error -- just like trying to resolve "ptr[index]" when the location referenced extends outside of the valid range.

In other words, I should *not* include a second set of ASSERT's to examine the "aligned" parameters after their initial validation.

Agreed? Else, where does it end?? (it is really "clean" having the invariants clumped in one cohesive group)

Thx,

--don

Reply to
D Yuniskis

Sorry, I should clarify that routines typically return a status of SUCCESS, FAILURE, TIMEOUT, or QUALIFIED, with the last meaning "success, but with modified arguments". In this case, QUALIFIED would be signaled when the pointer is modified to enforce alignment constraints *if* it still fell within the "valid" range; FAILURE if the aligned pointer fell outside the range.
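A minimal sketch of such a status type (the type name "Status" is my invention; the values follow the list above):

  typedef enum {
      SUCCESS,     /* succeeded exactly as specified */
      FAILURE,     /* service could not be provided */
      TIMEOUT,     /* gave up waiting */
      QUALIFIED    /* succeeded, but with modified arguments */
  } Status;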

Reply to
D Yuniskis

Putting an ASSERT in the code has a couple of benefits:

- Damage control (an ASSERT may be better than an unpredictable result from unchecked code)

- Reducing the number of bugs in the final product

- More information during testing/debug, saving time.

There are also some disadvantages:

- More resources used (development time, execution time, code space)

- Cluttering of the main code path with noise

- Extra bugs due to bad asserts.

For each ASSERT, you'll have to weigh the advantages against the disadvantages, and multiply by the chance that you'll actually hit that assert.

Reply to
Arlet Ottens

One other thing you can use is static asserts. The only disadvantage these have is cluttering the code, but that is not much worse than the typical comment (i.e., you write "static_assert(sizeOfBuffer > 100);" rather than "/* sizeOfBuffer must be greater than 100 */").

Of course, until static asserts get fully ratified in the various C and C++ standards, and then implemented in compilers, you have to make them yourself with some messy macros. But the resource usage is zero.

Reply to
David Brown

If your compiler supports checking these sorts of things directly, then it is a good idea to do so (via macros if necessary to handle portability issues). If you are using some sort of lint program, then make use of that too. This will give you better compile-time checking, without run-time costs. If the compiler can do such checking, then it can even give run-time benefits as the compiler may be able to make more optimal code.

For example, with gcc you would write:

extern void foo(unsigned char *src) __attribute__((nonnull(1)));

That way the compiler will never knowingly call the function with a null pointer.

Of course, you would want to check the exact functionality of such checks, and whether you might get false positives or false negatives.
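For example, a call like the following should draw a compile-time diagnostic (via -Wnonnull, which -Wall enables):

  foo(NULL);   /* gcc: warning: null argument where non-null required */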

Where possible, you want to do the checks /before/ the function is called. If the function is called with incorrect parameters, it is the /caller/ that is wrong. Identifying the error at run-time in the called function may be better than nothing, but it is not nearly as good as identifying it at compile-time in the caller. If you can't put the requirements into the "extern" declaration (as above), consider wrapping the real function with an inline function:

extern void do_foo(unsigned char *src);

static inline void foo(unsigned char *src)
{
    ASSERT((src != NULL), "src must be non-null");
    do_foo(src);
}

If a call to foo() is compiled with a src that is known to be non-NULL, this results in zero overhead; and if src is known to be NULL, then there will likely be a compile-time error or warning when the caller's code is compiled.

For even more advanced usage, you can use something like gcc's "__builtin_constant_p" function to distinguish between compile-time known values - this would let the compiler choose between static assertions and a call to a "safe" version of the function, or a call to a dynamic assertion version of the function (which would have the ASSERTs, then call the safe version).

Of course, you would want to wrap all this in some macros to hide the messy bits.
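A rough sketch of that idea -- foo_safe() and foo_checked() are invented names, and the chosen branch folds away at compile time:

  #define FOO(src)                                                      \
      do {                                                              \
          if (__builtin_constant_p((src) != NULL) && ((src) != NULL)) { \
              foo_safe(src);     /* argument known good at compile time */ \
          } else {                                                      \
              foo_checked(src);  /* ASSERTs on src, then calls foo_safe() */ \
          }                                                             \
      } while (0)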

This I don't understand. Either it is an error to call the function with an unaligned pointer, or it is not an error. If it is an error, then it is part of the requirements for calling the function, and should be checked and enforced as soon as possible. If it is not an error, then it is not an error. The function may use alignment information to run faster, or it may give you some debugging information to help improve the code, but it is not an error.

Yes, it is "clean" having the requirements for a function listed in one place (you are talking about "requirements" here, not "invariants"). Requirements are part of the declaration of the function, even if C doesn't have a way to express them beyond the type declarations.

Reply to
David Brown

I agree with this poster: "Either it is an error to call the function with an unaligned pointer, or it is not an error. " Unless that is platform specific.

This is Design by Contract - make sure the contract is spelled out in the interface (.h file in your case).

Reply to
Marco

Implementing them as macros lets me (usually) have them "magically disappear" from production builds. E.g.,

  #ifdef DEBUG
  #  define ASSERT(requirement, message)       \
         do {                                  \
             if (!(requirement)) {             \
                 /* ... report and halt ... */ \
             }                                 \
         } while (0)
  #else
  #  define ASSERT(requirement, message)  /* this space left blank */
  #endif

So, it takes up resources when I need it to do so and not thereafter (#undef DEBUG)

Yes. Though, to my credit, I only check each criterion once.

The issue I am trying to address is not whether to add more asserts but, rather, whether to use the same mechanism that I use to enforce contractual obligations (instead of ASSERT, call it CONTRACT) when I am not *technically* enforcing contractual obligations.

I.e., once through the ASSERTs (CONTRACTs), the user has met his contractual obligations. Technically, any further tests that I use/impose are now run-time checks and/or optimizations. As such, handled as (worst case):

  if (!(condition)) {
      /* whatever */
      return FAILURE;
  }

The ASSERTs typically have much steeper consequences -- forced termination/crash -- as they are NEVER SUPPOSED TO HAPPEN (if the developer adhered to contractual obligations).
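A sketch of that steeper consequence, using the CONTRACT spelling suggested above (abort() standing in for whatever "forced termination" means on a given target):

  #ifdef DEBUG
  #  define CONTRACT(requirement, message)  \
         do {                               \
             if (!(requirement)) {          \
                 abort();   /* contract violated: force termination */ \
             }                              \
         } while (0)
  #else
  #  define CONTRACT(requirement, message)  /* compiled away */
  #endif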

Reply to
D Yuniskis

This is what I currently do (hence "ASSERT()" instead of "assert()"). However, they may or may not generate code in the production release (depending on how I define them). Sometimes they push an error to my "black box" and abort the function call (since the body of the function thereafter EXPECTS never to see input parameters that violate those stated "requirements"). Other times, they may just "vanish".

Reply to
D Yuniskis

That (the pointer being "known to be non-NULL" at compile time) is rarely the case -- hence the need for the run-time check.

The compiler can't always make decisions at compile time. I.e., there will be (many) cases where the extra code generated by the ASSERT() gets included in the build image (remember, I do this on *every* function I write). This really eats up a lot of space (especially for the strings).

The advantage it has is that it "forces" this into header files where it technically belongs (if you assume the headers to define the interfaces). *And*, it makes it more obvious where the "bug" lies (since I can use __FILE__ & __LINE__ to point me directly to the offending invocation instead of having to examine the call trace).
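A sketch of an ASSERT that reports the offending location -- assert_failed() is a hypothetical reporting routine; note that __FILE__/__LINE__ expand wherever the macro is used, so a macro wrapper (unlike an inline function) pins them to the call site:

  extern void assert_failed(const char *msg, const char *file, int line);

  #define ASSERT(requirement, message)  \
      ((requirement) ? (void) 0         \
                     : assert_failed((message), __FILE__, __LINE__))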

Not an error (in this case). Think: "align to cache-lines", etc.

The *caller* didn't "make an error". But, the actions that the function WILL TAKE with the data provided could result in those parameters being modified to what would *now* be considered an error (had the caller supplied them in that "form" originally).

Contrived example:

Caller invokes with an argument that resolves to "249". Function will ALIGN_UP to 100-byte boundaries, so 249 becomes "300". (easier for me to deal with decimal numbers at this time of night)

ASSERTs contractually constrain the parameter to the range [1,299].

So, HAD THE CALLER PROVIDED THE 300 THAT IS *EFFECTIVELY* BEING USED, this would have been a contractual violation. (but, he *didn't*).

I am claiming that this should, instead, be a run-time "error" condition (i.e., "return FAIL;") instead of a contractual violation (i.e., "fall into debugger").

(assume FAIL is a legitimate response from the function)
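In code form (ALIGN_UP again being a hypothetical round-up helper):

  ASSERT((1 <= arg) && (arg <= 299), "arg must lie in [1,299]");  /* 249: passes */

  arg = ALIGN_UP(arg, 100);    /* 249 -> 300 */
  if (arg > 299) {
      return FAIL;             /* run-time error, *not* a contract violation */
  }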

[time for bed]
Reply to
D Yuniskis

Actually no, you don't. A static assert is resolved at _compile_ time. That's not what your following description says you're doing:

An actual implementation of a static_assert these days (i.e. without special support by the language) looks roughly like this:

#define STATIC_ASSERT(condition) do{int a[1-2*(!(condition))];}while(0)

The trick is that if the condition doesn't hold, this will cause a compile-time error because it defines an array of negative length. The problems with it are:

1) if the condition holds, this creates an unused variable (which the optimizer will hopefully remove again)

2) if the condition doesn't hold, the error message you'll get from the compiler has no apparent connection to the actual problem.
Reply to
Hans-Bernhard Bröker

Use a typedef instead of an actual variable definition.

Here's what I use:

  #define __CAssert(expr,line) \
      typedef int __compile_time_asserter_line ## line [((expr) != 0)*2 - 1]
  #define _CAssert(e,l)   __CAssert(e,l)
  #define CAssert(expr)   _CAssert(expr,__LINE__)

I don't know about other compilers, but gcc's output will include the name of the typedef (IOW __compile_time_asserter_lineXX) in the error message.

You'll need to add the do{}while(0) wrapper if you want to use it where declarations aren't allowed. If you want to get fancier, you can probably "stringize" the expression parameter and coax the compiler to spit it out somehow in the error message...

--
Grant Edwards               grant.b.edwards        Yow! Psychoanalysis??
                                  at               I thought this was a nude
                              gmail.com            rap session!!!
Reply to
Grant Edwards

For me the questions are "where is this contract written", "why does the contract, as written, allow clearly erroneous operation" and "how soon are you going to fix the contract?".

--

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Do you need to implement control loops in software?
"Applied Control Theory for Embedded Systems" was written for you.
See details at http://www.wescottdesign.com/actfes/actfes.html
Reply to
Tim Wescott

Actually, I think it is often the case - code is often written in a form such as "if (p) foo(p)", because the caller function doesn't want to bother calling foo if it doesn't have to. It is also very common to write code such as "foo(&x)" - the compiler knows when calling foo that the pointer is non-NULL. Of course, this is entirely dependent on the type of code and functions.

Feel free to use a "DEBUG" define to enable or disable the run-time checks.

Correct.

If it is not an error, then it is not a problem.

Then either the caller /is/ making an error, or the called function is in error, or the specifications for the function are in error. If something is going wrong, somebody made a mistake!

Your function's specifications are wrong - your claimed requirements on the inputs don't match the actual requirements.

If this can be caught at compile time, that's good. (If it can be caught at design time, that's better.) If it can only be caught at run-time, then so be it. It's still an error.

I'll not argue there!

Reply to
David Brown

I declare an external variable instead, i.e.

  #define static_assert(condition) \
      extern int static_assert[(condition) ? 1 : -1]

So even if the compiler insists on a definition of that variable (none so far did), all I lose is one word.

In C++, I use a typedef instead (i.e. replace 'extern' with 'typedef'). This has the advantage of working even in class definitions. C++ allows redundant typedefs, so you don't even have to mess with __LINE__.
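I.e., a sketch of the C++ flavor:

  #define static_assert(condition) \
      typedef int static_assert[(condition) ? 1 : -1]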

This happens so often, do you really care?


Reply to
Stefan Reuther

[referring to example shown in another post]

For small projects, the contract is formally specified by the ASSERTs and informally by the associated commentary. I often can't afford to prepare formal specifications for every function in a project :>

The operation isn't "clearly erroneous". Rather, it's "unable to provide the desired functionality with the inputs given". Like "divide(4, denominator)" -- where denominator happens to be zero or very close thereto. In many systems, an "interface error", in some, caught *in* the invocation ("attempt to divide by zero") and in others, "successfully" handled as "undefined/infinite".

The contract doesn't need "fixing". The interface will inform the caller of how it interpreted the inputs handed to it. E.g.,

  Result                /* PASS, QUAL, FAIL, ARGS, TIME */
  foo(
      void **location,
      Size   *size,
      ...
  )

where "location" and "size" are in/out's. Result tells the caller if the call succeeded *exactly* as specified (PASS), or by qualifying it's arguments (i.e., check size and location for the actual values used), FAILed because the service couldn't be provided, had problems with ARGumentS (like passing in a NULL for either of the above) or TIMEd out.

In the example given, the caller has complied with the terms of the contract. And, the "return FAIL" result is something he should be prepared to handle. Likewise, the "return QUAL" with the details of the actual "qualifications" that the routine placed on his original parameters. Hence my belief that this is a rational way to handle what is actually a pathological case (the alternative is to push extra restrictions onto every caller).

Reply to
D Yuniskis

Sorry, I understood your point. I am commenting that, in my case(s), the compiler can't tell whether "src is non-NULL" (e.g., imagine walking a tree and chasing pointers to next branches; you can't know at compile time which pointers will be NULL or "valid"). So "probe_subtree(pointer)" must either *itself* check that it doesn't dereference "pointer" *or* the caller does so before invoking probe_subtree(). In either case, the compiler can't help.

Yes. But, I typically do an entire DEBUG *build* -- so that *everything* is checked. (E.g., if a module that I am working on references "foo()", I could conditionally include the prepended checks for all foo() invocations. But, when foo() references "bar()", I would have to rebuild foo() with #define DEBUG, etc.) I.e., you never know (prior to release [1]) the full consequences of a particular invocation.

[1] In some systems, the debug code remains live in the release as it must be sold "as tested" (the alternative is to remove the DEBUG code *prior* to test -- but leaving it in is often seen as providing greater assurance that the system being tested is complying at all levels).

The "problem" is a pathological one. What looks to be valid argument proves to be, um, "inappropriate" -- but, only because of the requirements that the called function "later" imposes on it.

I think I agree that it is "not a problem" but, rather, a case where the function simply returns "FAIL". It may be a surprise to the caller, but not something that he shouldn't be prepared to handle. (E.g., had the "249" in my earlier example been "149", the result would have been "PASS", while "349" would have tripped the ASSERTions.)

I don't see that.

E.g., you are implementing an "add()" routine for an 8-digit calculator. How do you define valid inputs? "Each must be no more than 8 digits." So, add(99999999, 99999999) satisfies the contractual requirements of the interface. Yet, the result add() returns is "FAIL" (overflow). How else would you specify the contract -- "each must be no more than 8 digits and the sum must also fit in 8 digits"? And, how would you enforce that *without* the "add" functionality?
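A sketch of that contract (the names and status values are placeholders):

  #define MAX8  99999999L

  Status add(long a, long b, long *sum)
  {
      ASSERT((-MAX8 <= a) && (a <= MAX8), "a must fit in 8 digits");
      ASSERT((-MAX8 <= b) && (b <= MAX8), "b must fit in 8 digits");
      ASSERT((sum != NULL), "Pointer to result must be non-NULL");

      *sum = a + b;    /* cannot overflow 'long': |a + b| <= 2 * MAX8 */
      if ((*sum < -MAX8) || (MAX8 < *sum)) {
          return FAIL;   /* overflow: run-time error, not a contract violation */
      }
      return PASS;
  }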

Generalizing the example I originally provided: how do you create an "align()" function whose "output" can be unconditionally used by the caller? There are some pathological inputs that simply result in "FAIL" instead of a valid "address".

(sigh) breakfast. Pasta sounds good!

Reply to
D Yuniskis

So, the first set of invariants is augmented by the second set of invariants. Wouldn't the right solution be to reformulate the first set? From your description I can't think of an example, but I imagine it could be something like checking that the 4-byte variable starts below A, and then in the second step constructing an aligned access. If the variable starts at A-1, the aligned address has to fetch bytes A-4..A-1 and then another fetch of A..A+3. I would argue that the right way is to modify the first condition to check that the variable start is below A+3.

Reply to
Przemek Klosowski

Sorry, overtired so my replies have been lacking precision. :<

Most of the checks can't be resolved "at compile time" so static checks just won't do the trick. E.g., in David's above example, the compiler doesn't know "sizeOfBuffer" (as "sizeOfBuffer" -- if such a beast existed -- is computed at run-time)

My "following description" was intended to address his comment re: "The only disadvantage these have is cluttering the code, ..." as that *is* the cost my ASSERTs typically bring with them. Furthermore, they act *as* the "typical comment" (again, his words/example). This makes them a good mechanism for precisely defining what the function can rely upon -- e.g., location[size-1] won't extend past the end of the region in question (which might be "all of memory") so the function can use that knowledge in its implementation. Just like ASSERT(ptr!=NULL) means the function can freely dereference ptr without having to bear the cost of *checking* to see if it is NULL. (contracts benefit the caller as well as the function itself)

Reply to
D Yuniskis

No. I'm arguing that the second set of invariants shouldn't be there. I.e., that the "problem" they are trying to detect is caused by something not exported by the function's interface (or its specification).

Note an example I mentioned "someplace" regarding an 8-digit "add()". It makes sense to define the interface to ensure both arguments fit in no more than 8 digits. But, it *doesn't* make sense (IMO) to define the interface such that "the SUM of these two arguments must fit in 8 digits". Rather, if the sum *happens* to not fit, you signal an *error* (overflow), not a contractual violation.

Otherwise, you start writing "recursive" interface definitions; in effect, saying, "Use this function but only in a way that the function applied to the inputs WILL HAVE caused the *actual* function invocation to work properly..." :-/

I.e., in the example I used (alignments), I would have to export the details of "alignment" to the user in order to give him the information he needs to ensure he never passes a pointer to me that, once aligned, would cause this problem. Doing that forces all of the alignment tests back into the main body of the code instead of encapsulating them here. (the interface already provides mechanisms to tell you *how* the function has interpreted the parameters you provided *if* it wasn't able to directly use them "as is")

Reply to
D Yuniskis

I don't really see what the problem is though. It just depends on whether these cases fall within the overall specification, and whether there's a benefit in recovering gracefully (which includes that the caller knows how to handle the error status).

Basically, you have 3 choices:

  1. ignore and get bad results
  2. return error and handle that
  3. put in an assert and handle that

Any one of these choices could be acceptable, depending on the specific situation.

Reply to
Arlet Ottens
