Making Fatal Hidden Assumptions

We often find hidden, and totally unnecessary, assumptions being
made in code.  The following leans heavily on one particular
example, which happens to be in C.  However similar things can (and
do) occur in any language.

These assumptions are generally made because of familiarity with
the language.  As a non-code example, consider the idea that the
faulty code is written by blackguards bent on fouling the
language.  The term blackguards is not in favor these days, and for
good reason.  However, the older you are, the more likely you are
to have used it since childhood, and to use it again, barring
specific thought on the subject.  The same type of thing applies to
writing code.

I hope, with this little monograph, to encourage people to examine
some hidden assumptions they are making in their code.  As ever, in
dealing with C, the reference standard is the ISO C standard.
Versions can be found in text and pdf format, by searching for N869
and N1124. [1]  The latter does not have a text version, but is
more up-to-date.

We will always have innocent-appearing code with these kinds of
assumptions built in.  However, it would be wise to annotate such
code to make the assumptions explicit, which can avoid a great deal
of agony when the code is reused on other systems.

In the following example, the code is as downloaded from the
referenced URL, and the comments are entirely mine, including the
'every 5' linenumber references.

/* Making fatal hidden assumptions */
/* Paul Hsieh's version of strlen.
   http://www.azillionmonkeys.com/qed/asmexample.html

   Some sneaky hidden assumptions here:
   1.  p = s - 1 is valid.  Not guaranteed.  Careless coding.
   2.  cast (int) p is meaningful.  Not guaranteed.
   3.  Use of 2's complement arithmetic.
   4.  ints have no trap representations or hidden bits.
   5.  4 == sizeof(int) && 8 == CHAR_BIT.
   6.  size_t is actually int.
   7.  sizeof(int) is a power of 2.
   8.  int alignment depends on a zeroed bit field.

   Since strlen is normally supplied by the system, the system
   designer can guarantee all but item 1.  Otherwise this is
   not portable.  Item 1 can probably be beaten by suitable
   code reorganization to avoid the initial p = s - 1.  This
   is a serious bug which, for example, can cause segfaults
   on many systems.  It is most likely to foul when (int)s
   has the value 0, and is meaningful.

   He fails to make the valid assumption: 1 == sizeof(char).
*/

#define hasNulByte(x) ((x - 0x01010101) & ~x & 0x80808080)
#define SW (sizeof (int) / sizeof (char))

int xstrlen (const char *s) {
const char *p;                                /*  5 */
int d;

    p = s - 1;
    do {
        p++;                                  /* 10 */
        if ((((int) p) & (SW - 1)) == 0) {
            do {
                d  = *((int *) p);
                p += SW;
            } while (!hasNulByte (d));        /* 15 */
            p -= SW;
        }
    } while (*p != 0);
    return p - s;
}                                             /* 20 */

Let us start with line 1!  The constants appear to require that
sizeof(int) be 4, and that CHAR_BIT be precisely 8.  I haven't
really looked too closely, and it is possible that the ~x term
allows for larger sizeof(int), but nothing allows for larger
CHAR_BIT.  A further hidden assumption is that there are no trap
values in the representation of an int.  Its functioning is
doubtful when sizeof(int) is less than 4.  At the least it will
force promotion to long, which will seriously affect the speed.

This is an ingenious and speedy way of detecting a zero byte within
an int, provided the preconditions are met.  There is nothing wrong
with it, PROVIDED we know when it is valid.

In line 2 we have the confusing use of sizeof(char), which is 1 by
definition.  This just serves to obscure the fact that SW is
actually sizeof(int) later.  No hidden assumptions have been made
here, but the usage helps to conceal later assumptions.

Line 4.  Since this is intended to replace the system's strlen()
function, it would seem advantageous to use the appropriate
signature for the function.  In particular strlen returns a size_t,
not an int.  size_t is always unsigned.

In line 8 we come to a biggie.  The standard specifically does not
guarantee the action of a pointer below an object.  The only real
purpose of this statement is to compensate for the initial
increment in line 10.  This can be avoided by rearrangement of the
code, which will then let the routine function where the
assumptions are valid.  This is the only real error in the code
that I see.

In line 11 we have several hidden assumptions.  The first is that
the cast of a pointer to an int is valid.  This is never
guaranteed.  A pointer can be much larger than an int, and may have
all sorts of non-integer like information embedded, such as segment
id.  If sizeof(int) is less than 4 the validity of this is even
less likely.

Then we come to the purpose of the statement, which is to discover
if the pointer is suitably aligned for an int.  It does this by
bit-anding with SW-1, which is the concealed sizeof(int)-1.  This
won't be very useful if sizeof(int) is, say, 3 or any other
non-power-of-two value.  In addition, it assumes that an aligned pointer
will have those bits zero.  While this last is very likely in
today's systems, it is still an assumption.  The system designer is
entitled to assume this, but user code is not.

Line 13 again uses the unwarranted cast of a pointer to an int.
This enables the use of the already suspicious macro hasNulByte in
line 15.

If all these assumptions are correct, line 19 finally calculates a
pointer difference (which is valid, and of type ptrdiff_t; since the
result here is non-negative, it will always fit into a size_t).  It
then does a concealed conversion of this into an int, which could
cause implementation-defined behaviour (or raise an
implementation-defined signal) if the value exceeds what will fit
into an int.
This one is also unnecessary, since it is trivial to define the
return type as size_t and guarantee success.

I haven't even mentioned the assumption of 2's complement
arithmetic, which I believe to be embedded in the hasNulByte
macro.  I haven't bothered to think this out.

Would you believe that so many hidden assumptions can be embedded
in such innocent looking code?  The sneaky thing is that the code
appears trivially correct at first glance.  This is the stuff that
Heisenbugs are made of.  Yet use of such code is fairly safe if we
are aware of those hidden assumptions.  

I have cross-posted this without setting follow-ups, because I
believe that discussion will be valid in all the newsgroups posted.

[1] The draft C standards can be found at:
   <http://www.open-std.org/jtc1/sc22/wg14/www/docs/>

--
"If you want to post a followup via groups.google.com, don't use
 the broken "Reply" link at the bottom of the article.  Click on
Re: Making Fatal Hidden Assumptions

CBFalconer wrote (in part):

None of these objections is warranted in the original context,
where the code is given as transliteration of some x86 assembly
language.  In effect, the author is offering a function that
might be used as part of a C implementation, where none of the
usual portability considerations need apply.  Your commentary
really should have included some statement along those lines.


Re: Making Fatal Hidden Assumptions

Mentioned by Paul.

Paul says it would be taken care of.


Much as I dislike some of what Paul posts, on that site Paul explicitly
states in the text that the code is non-portable and mentions at least
a couple of issues.

Personally, I would not expect a function documented as being
non-portable and optimised for a specific processor to specify every
single assumption.
--
Flash Gordon, living in interesting times.
Web site - http://home.flash-gordon.me.uk/
Re: Making Fatal Hidden Assumptions


I haven't read your entire treatise, and I surely wouldn't want to
defend Mr. Hsieh (whose code, he's led us to understand, was written
by his dog), but I do have some comments.

> In line 2 we have the confusing use of sizeof(char), which is 1 by
> definition.  This just serves to obscure the fact that SW is
> actually sizeof(int) later.

The fact that sizeof(char)==1 doesn't make this line bad *if* the
programmer feels that viewing SW as chars per int is important.
I don't defend this particular silliness, but using an expression for
a simple constant is often good if it makes the constant
self-documenting.

> In line 8 we come to a biggie.  The standard specifically does not
> guarantee the action of a pointer below an object.

Mr. Hsieh immediately does p++ and his code will be correct if then
p == s.  I don't question Chuck's argument, or whether the C standard
allows the C compiler to trash the hard disk when it sees p=s-1, but
I'm sincerely curious whether anyone knows of an *actual* environment
where p == s will ever be false after (p = s-1; p++).

Many of us fell in love with C because, in practice, it is so much
simpler and more deterministic than many languages.  Many of the
discussions in comp.lang.c seem like they'd be better in a new
newsgroup:
  comp.lang.i'd_rather_be_a_lawyer

:-)  :-)

James Dow Allen


Re: Making Fatal Hidden Assumptions

If you don't want to know
whether or not (s - 1) causes undefined behavior,
then don't let it bother you.
You can write code any way you want.
But, if you want to discuss whether or not
(s - 1) causes undefined behavior, it does,
and this is the place to find out about it.

--
pete

Re: Making Fatal Hidden Assumptions
James Dow Allen said:

> I don't defend this particular silliness, but using an expression for
> a simple constant is often good if it makes the constant
> self-documenting.

Specifically, there is a good argument for char *p = malloc(n * sizeof *p).

> Mr. Hsieh immediately does p++ and his code will be correct if then
> p == s.

Not so, since forming an invalid pointer invokes undefined behaviour.

> I don't question Chuck's argument, or whether the C standard
> allows the C compiler to trash the hard disk when it sees p=s-1,

It does.


> but I'm sincerely curious whether anyone knows of an *actual*
> environment where p == s will ever be false after (p = s-1; p++).

Since it won't compile, the question is academic. But assuming you meant to
have a test in there somewhere, my personal answer would be: no, I don't
know of such an environment off the top of my head, but there are some
weird environments out there (the comp.lang.c FAQ lists a few), and it
would certainly not surprise me to learn of such an environment. In any
case, such an environment could become mainstream next week, or next year,
or next decade. Carefully-written code that observes the rules will
continue to work in such environments.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
Re: Making Fatal Hidden Assumptions
 > [...] but I'm sincerely curious whether anyone knows of an *actual*
 > environment where p == s will ever be false after (p = s-1; p++).

The problem is that evaluating s-1 might cause an underflow and a
trap, and then you won't even reach the comparison. You don't
necessarily have to dereference an invalid pointer to get a trap.

You might hit this behavior on any segmented architecture (e.g.,
80286, or 80386+ with segments on) and you are virtually guaranteed to
hit it on any architecture with fine-grained segmentation. comp.std.c
periodically reminisces about the old Burroughs architecture, and
it's always possible something like it might come back sometime.

You will also see this behavior in any worthwhile bounds-checking
implementation.

 > Many of the discussions in comp.lang.c seem like they'd be better
 > in a new newsgroup:
 >    comp.lang.i'd_rather_be_a_lawyer
 >
 > :-)  :-)

Yes, well, that's what comp.lang.c is about...

--
   - David A. Holland
     (the above address works if unscrambled but isn't checked often)

Re: Making Fatal Hidden Assumptions

> You might hit this behavior on any segmented architecture (e.g.,
> 80286, or 80386+ with segments on) and you are virtually guaranteed to
> hit it on any architecture with fine-grained segmentation.

I'm certainly no x86 expert.  Can you show or point to the output
of any C compiler which causes an "underflow trap" in this case?

At the risk of repetition, I'm *not* asking whether a past or future
compiler might or may trap (or trash my hard disk); I'd just be
curious to see one (1) actual instance where the computation (without
dereference) p=s-1 causes a trap.

James


Re: Making Fatal Hidden Assumptions

> I'd just be curious to see one (1) actual instance where the
> computation (without dereference) p=s-1 causes a trap.

I don't know of any case where a pet grizzly bear that escaped has
eaten anyone in the Netherlands, but I'm still not short-sighted enough
to use that as an argument to allow grizzly bears as pets.

Richard

Re: Making Fatal Hidden Assumptions
 >>> [...] but I'm sincerely curious whether anyone knows of an *actual*
 >>> environment where p == s will ever be false after (p = s-1; p++).
 >>
 >> The problem is that evaluating s-1 might cause an underflow and a
 >> trap, and then you won't even reach the comparison. You don't
 >> necessarily have to dereference an invalid pointer to get a trap.
 >>
 >> You might hit this behavior on any segmented architecture (e.g.,
 >> 80286, or 80386+ with segments on) ...
 >
 >  I'm certainly no x86 expert.  Can you show or point to the output
 >  of any C compiler which causes an "underflow trap" in this case?

Have you tried bounds-checking gcc?

I don't think I've ever myself seen a compiler that targeted 286
protected mode. Maybe some of the early DOS-extender compilers did,
before everyone switched to 386+. If you can find one and set it to
generate code for some kind of "huge" memory model (supporting
individual objects more than 64K in size) I'd expect it to trap if you
picked a suitable location for `s' to point to.

That assumes you can find a 286 to run it on, too.

Otherwise, I don't know of any, but I'm hardly an expert on strange
platforms.

(Note: Coherent was a 286 protected mode platform, but it only
supported the "small" memory model... and it had a K&R-only compiler,
so it's not a viable example.)

--
   - David A. Holland
     (the above address works if unscrambled but isn't checked often)

Re: Making Fatal Hidden Assumptions
> I'd just be curious to see one (1) actual instance where the
> computation (without dereference) p=s-1 causes a trap.

There are actual environments where 's - 1' alone is enough to cause a
crash. In fact, any non-flat memory model environment (i.e. environment
with 'segment:offset' pointers) would be a good candidate. The modern
x86 will normally crash, unless the implementation takes specific steps
to avoid it.

--
Best regards,
Andrey Tarasevich

Re: Making Fatal Hidden Assumptions

This illustrates the fact that usenet threads are uncontrollable.
I wrote the original to draw attention to hidden assumptions, and
it has immediately degenerated into thrashing about the one real
error in the sample code.  I could have corrected and eliminated
that error by a slight code rework, but then I would have modified
Mr Hsieh's code.  There were at least seven further assumptions,
most of which were necessary for the purposes of the code, but
strictly limited its applicability.

My aim was to get people to recognize and document such hidden
assumptions, rather than leaving them lying there to create sneaky
bugs in apparently portable code.

Re: Making Fatal Hidden Assumptions
On Wed, 08 Mar 2006 11:52:52 -0800, Andrey Tarasevich

> There are actual environments where 's - 1' alone is enough to cause a
> crash. In fact, any non-flat memory model environment (i.e. environment
> with 'segment:offset' pointers) would be a good candidate. The modern
> x86 will normally crash, unless the implementation takes specific steps
> to avoid it.

Exactly which x86 mode are you referring to?

16 bit real mode, virtual86 mode, or some 32-bit mode (which are after
all segmented modes, with all segment registers holding the same
value)?

If s is stored in 16 bit mode in ES:DX with DX=0, then p=s-1 would
need to decrement ES by one and store 000F in DX.  Why would reloading
ES cause any traps, since no actual memory reference is attempted?
Doing p++ would most likely just increment DX by one to 0010, thus
ES:DX would point to s again, which is a legal address, but with a
different internal representation.

IIRC some 32 bit addressing mode would trap if one tried to load the
segment register, but again, how could the caller generate such
constructs as s = ES:0 at least from user mode. In practice s = ES:0
could only be set by a kernel mode routine calling a user mode
routine, so this is really an issue only with main() parameters.

Paul
 

Re: Making Fatal Hidden Assumptions


> If s is stored in 16 bit mode in ES:DX with DX=0, then p=s-1 would
> need to decrement ES by one and store 000F in DX.

I don't think this is what happens. Instead I think DX will wrap to FFFF,
and ES will remain unchanged.

-Michael.



Re: Making Fatal Hidden Assumptions
> I don't think this is what happens. Instead I think DX will wrap to
> FFFF, and ES will remain unchanged.

More easily answered if the original statement had been preserved
in context.  However in this case if the ES segment has range
limits, a segment fault will be triggered.  Due to the happy
looseness of the term "undefined behavior" no specific action is
needed, which is what enables the loosing of nasal demons or
launching of missiles.

Re: Making Fatal Hidden Assumptions

> However in this case if the ES segment has range
> limits, a segment fault will be triggered.

Nah.  Seg faults only happen on access, in that situation.  The value of
DX can't matter: it isn't part of the "pointer" until an access happens.

It looks as though the AS/400 is the only machine that that particular
restriction in the standard is there to support.  Yay.

Cheers,

--
Andrew


Re: Making Fatal Hidden Assumptions
On Thu, 16 Mar 2006 14:45:47 +0100, "Michael Jørgensen"

> I don't think this is what happens. Instead I think DX will wrap to
> FFFF, and ES will remain unchanged.



If p=s-1 would only cause decrementing of DX (without affecting ES),
thus wrapping to FFFF, then what would p++ do?  It would be logical to
assume that only DX would be affected and incremented by 1, causing DX
to wrap back to 0000, and hence ES:DX would have the same
representation as the original s and p == s would be true.

Paul


Re: Making Fatal Hidden Assumptions

Getting people to think about their code is no bad idea.  I've added a
couple of comments below...

> 1.  p = s - 1 is valid.  Not guaranteed.  Careless coding.

Not guaranteed in what way?  You are not guaranteed that p will be a
valid pointer, but you don't require it to be a valid pointer - all that
is required is that "p = s - 1" followed by "p++" leaves p equal to s.
I'm not good enough at the laws of C to tell you if this is valid, but
the laws of real life implementation say that it *is* valid on standard
2's complement cpus, *assuming* you do not have any sort of saturated
arithmetic.

In other words, the assumption is valid on most systems, but may be
risky on some DSPs or on dinosaurs.

> 2.  cast (int) p is meaningful.  Not guaranteed.

In what way could it not be meaningful here?  In particular, the code is
correct even in a 32-bit int, 64-bit pointer environment.

> 3.  Use of 2's complement arithmetic.

Again, this is valid on everything bar a few dinosaurs.

> 4.  ints have no trap representations or hidden bits.

Ditto.


> 5.  4 == sizeof(int) && 8 == CHAR_BIT.

This is clearly an assumption that is not valid in general (although
perfectly acceptable in the context of this webpage, which assumes an
x86 in 32-bit mode).

> 6.  size_t is actually int.

No, the assumption is that the ratio of two size_t items is compatible
with int.  This may or may not be valid according to the laws of C.

> 7.  sizeof(int) is a power of 2.

This is implied by (5), so it's not a new assumption.

> 8.  int alignment depends on a zeroed bit field.

I'm not quite sure what you mean here, but I think this is implied by
(3) and (4).

> This is a serious bug which, for example, can cause segfaults
> on many systems.

That's incorrect.  No machine (that I can imagine) would segfault on p =
s - 1.  It might segfault if p is then dereferenced, but it never is
(until it has been incremented again).

> Since strlen is normally supplied by the system, the system
> designer can guarantee all but item 1.  Otherwise this is
> not portable.

Since strlen is supplied by the system, it is reasonable to make
assumptions about the system when writing it.  It is *impossible* to
write good C code for a low-level routine like this without making
assumptions.  The nearest you can get to a portable solution is a
horrible mess involving pre-processor directives to generate different
code depending on the target architecture.  *All* the assumptions made
here are valid in that context - the code is good.


As a general point, however, it is important to comment such routines
with their assumptions, and possibly use pre-processor directives to
give an error if the assumptions are not valid.  In particular, this
code assumes a 32-bit int, 8-bit char model, and an otherwise standard CPU.

<snip similar analysis>

> I have cross-posted this without setting follow-ups, because I
> believe that discussion will be valid in all the newsgroups posted.

I'm following this in comp.arch.embedded, where we see a lot of
different target architectures, and where efficient code is perhaps more
important than in the world of "big" computers (at least, a higher
proportion of "big" computer programmers seem to forget about
efficiency...).  There are two things about such target-specific
assumptions, in embedded programming - you should make such assumptions,
for the sake of good code, and you must know what they are, for the sake
of correct and re-usable code.

Re: Making Fatal Hidden Assumptions
> I'm not good enough at the laws of C to tell you if this is valid, but
> the laws of real life implementation say that it *is* valid on standard
> 2's complement cpus, *assuming* you do not have any sort of saturated
> arithmetic.


Merely subtracting 1 from s, renders the entire code undefined.
You're "off the map" as far as the laws of C are concerned.
On comp.lang.c,
we're mostly interested in what the laws of C *do* say
is guaranteed to work.

--
pete

Re: Making Fatal Hidden Assumptions

> Merely subtracting 1 from s, renders the entire code undefined.
> You're "off the map" as far as the laws of C are concerned.

It seems to me ironic that, in a discussion about hidden assumptions, the
truth of this remark requires a hidden assumption about how the function
is called.  Unless I am missing something big, p = s - 1 is fine unless s
points to the first element of an array (or worse)[1].  One can imagine
situations such as "string pool" implementations where all strings are
part of the same array where p = s - 1 is well defined. (You'd have to
take care with the first string: char pool[BIG_NUMBER]; char *pool_start =
pool + 1;).

[1] Since this is a language law discussion, we need to take the
definition from the standard where T *s is deemed to point to an array of
size one if it points to an object of type T.  By "worse" I mean that s
does not point into (or just past) an array at all.

--
Ben.
