Making Fatal Hidden Assumptions

It's no more "illegal" than any of the other undefined behaviour that you pointed out in that code snippet. There aren't different classes of undefined behaviour, are there?

I reckon I'll just go with the undefined flow, in the interests of efficient, clean code on the architectures that I target. I'll make sure that I supply a document specifying how the compilers must behave for all of the undefined behaviours that I'm relying on, OK? I have no interest in trying to make my code work on architectures for which they don't hold.

Of course, that list will pretty much just describe the usual flat-memory, 2's complement machine that is actually used in almost all circumstances in the present day, anyway. Anyone using anything else already knows that they're in a world of trouble and that all bets are off.
--
Andrew
Reply to
Andrew Reilly

Now responding to the basic article:

In article snipped-for-privacy@maineline.net writes:
...
 > #define hasNulByte(x) ((x - 0x01010101) & ~x & 0x80808080)

It does not allow for larger sizeof(int) (as it does not allow for other values of CHAR_BIT). When sizeof(int) > 4 it will only show whether there is a zero byte in the low-order four bytes. When sizeof(int) < 4 it will give false positives. Both constants have to be changed when sizeof(int) != 4. Moreover, it will not work on 1's complement or sign-magnitude machines. Using unsigned here is most appropriate.
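
Purely as an illustration (a sketch, untested here, and it still assumes CHAR_BIT == 8), the constants can be derived from whatever width unsigned int actually has, with the arithmetic done in unsigned so that wraparound is at least well defined:

    /* Sketch only: ONES is 0x0101...01 and HIGHS is 0x8080...80 for the
       actual width of unsigned int; doing the test in unsigned keeps the
       wraparound well defined.  Still assumes CHAR_BIT == 8. */
    #include <stdio.h>

    #define ONES  (((unsigned) -1) / 0xFF)
    #define HIGHS (ONES * 0x80u)
    #define hasNulByte(x) ((((unsigned)(x) - ONES) & ~(unsigned)(x) & HIGHS) != 0)

    int main(void)
    {
        printf("%d\n", hasNulByte(0x12345678u));   /* prints 0: no zero byte   */
        printf("%d\n", hasNulByte(0x12005678u));   /* prints 1: one zero byte  */
        return 0;
    }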

It is false on the Cray 1 and its derivatives. See another article by me where I show that it may give wrong answers.

--
dik t. winter, cwi, kruislaan 413, 1098 sj  amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn  amsterdam, nederland; http://www.cwi.nl/~dik/
Reply to
Dik T. Winter

Yeah, my world-view doesn't allow individual objects to occupy half the address space or more. I'm comfortable with that restriction, but I can accept that there may be others that aren't. They're wrong, of course :-)

Yes, very hard indeed. Partition your object or use a machine with bigger addresses. Doesn't seem like a good enough reason to me to break a very useful abstraction.

Posit: you've got N bits to play with, both for addresses and integers. You need to be able to form a ptrdiff_t, which is a signed quantity, to compute d = &anobject.a[i] - &anobject.a[j], for any indices i, j within the range of the array. The range of signed quantities is just less than half that of unsigned. That range must therefore define how large any individual object can be. I.e., half of your address space. Neat, huh?
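
A rough way to see the same range argument on an actual implementation (a sketch; C99's <stdint.h> assumed):

    #include <stdio.h>
    #include <stdint.h>

    /* Sketch: on a typical flat 32-bit machine SIZE_MAX is roughly twice
       PTRDIFF_MAX, so an object bigger than PTRDIFF_MAX bytes can produce
       in-object pointer differences that don't fit in ptrdiff_t. */
    int main(void)
    {
        printf("PTRDIFF_MAX = %jd\n", (intmax_t) PTRDIFF_MAX);
        printf("SIZE_MAX    = %ju\n", (uintmax_t) SIZE_MAX);
        return 0;
    }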

Yeah, yeah, for any complicated problem there's an answer that is simple, neat and wrong.

--
Andrew
Reply to
Andrew Reilly

Right, "illegal" probably isn't the best word to describe undefined behavior. An implementation is required to diagnose syntax errors and constraint violations; it's specifically *not* required to diagnose undefined behavior (though it's allowed to do so).

Ok, you can do that if you like. If you can manage to avoid undefined behavior altogether, your code is likely to work on *any* system with a conforming C implementation; if not, it may break when ported to some exotic system.

For example, code that makes certain seemingly reasonable assumptions about pointer representations will fail on Cray vector systems. I've run into such code myself; the corrected code was actually simpler and cleaner.

If you write code that depends on undefined behavior, *and* there's a real advantage in doing so on some particular set of platforms, *and* you don't mind that your code could fail on other platforms, then that's a perfectly legitimate choice. (If you post such code here in comp.lang.c, you can expect us to point out the undefined behavior; some of us might possibly be overly enthusiastic in pointing it out.)

All bets don't *need* to be off if you're able to stick to what the C standard actually guarantees.

--
Keith Thompson (The_Other_Keith) kst-u@mib.org  
San Diego Supercomputer Center               
We must do something.  This is something.  Therefore, we must do this.
Reply to
Keith Thompson

Probably, since NULL has been given the guarantee that it's unique in some sense. In an embedded environment, or assembly language, the construct could of course produce NULL (for whatever value you pick for NULL), and NULL would not be special. I don't know that insisting on the existence of a unique and special NULL pointer value is one of the standard's crowning achievements, either. It's convenient for lots of things, but it's just not the way simple hardware works, particularly at the limits.

Sure, in the ptrdiff sense that I mentioned before. I.e., (a - 1) - (a + 0) < 0 (indeed, identically -1)

Go nuts. If your address space is larger than your integer range (as is the case for I32LP64 machines), your compiler might have to make sure that it performs the difference calculation to sufficient precision.

I still feel comfortable about this failing to work for objects larger than half the address space, or even for objects larger than the range of an int. That's, IMO, a much less uncomfortable restriction than the one that the standard seems to have stipulated, which is that the simple and obvious pointer arithmetic that you've used in your examples works in some situations and doesn't work in others. (Remember: it's all good if those array references are in a function that was itself passed (&foo[n], for n>=1) as the argument.)
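
To make that parenthetical concrete (a sketch; the names f and foo are made up):

    /* Sketch: p[-1] is perfectly well defined here provided the caller
       passes a pointer with at least one element in front of it, e.g.
       f(&foo[1]).  Pass foo itself and the same line is undefined. */
    static int f(int *p)
    {
        return p[-1];
    }

    int main(void)
    {
        int foo[4] = {10, 20, 30, 40};
        return f(&foo[1]);   /* returns foo[0], i.e. 10 */
    }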

Cheers,

--
Andrew
Reply to
Andrew Reilly
[...]

I can easily imagine a program that needs to manipulate a very large data set (for a scientific simulation, perhaps). For a data set that won't fit into memory all at once, loading as much of it as possible can significantly improve performance.

Your "very useful abstraction" is not something that has *ever* been guaranteed by any C standard or reference manual.

The standard explicitly allows for the possibility that pointer subtraction within a single object might overflow (if so, it invokes undefined behavior). Or, given that C99 requires 64-bit integer types, making ptrdiff_t larger should avoid the problem for any current systems (I don't expect to see full 64-bit address spaces for a long time).

The standard is full of compromises. Not everyone likes all of them.

--
Keith Thompson (The_Other_Keith) kst-u@mib.org  
San Diego Supercomputer Center               
We must do something.  This is something.  Therefore, we must do this.
Reply to
Keith Thompson

Same for gcc4 on MacOS X. However, this slight permutation of your program (only the comparison line has changed):

#include <stdio.h>

#define SIZE (50*1000000L)

typedef struct { char a [SIZE]; } bigstruct;

static bigstruct bigarray [8];

int main(void)
{
    printf("%lx\n", (unsigned long) &bigarray [0]);
    printf("%lx\n", (unsigned long) &bigarray [9]);
    printf("%lx\n", (unsigned long) &bigarray [-1]);
    if (&bigarray [-1] - &bigarray [0] < 0)
        printf ("Everything is fine\n");
    else
        printf ("The C Standard is right: &bigarray [-1] is broken\n");
    return 0;
}

produces:

3080
1ad2a500
fd054000
Everything is fine

So what we see is that (a) pointer comparisons use direct unsigned integer comparison instead of checking the sign of the pointer difference; since pointer comparisons only make sense in the context of an individual object, I'd argue that the compiler is doing the wrong thing here, and the comparison should instead have been done as a pointer difference; and (b) your printf string about "&bigarray[-1] is broken" is wrong, since that's not what the code showed at all. What it showed is that &bigarray[-1] could be formed, that &bigarray[0] was one element to the right of it, and that hell did not freeze over (nor was any trap taken), since you did not attempt to access any memory there.

Cheers,

--
Andrew
Reply to
Andrew Reilly

The Cray blood-line starting at least with "Little Character" (prototype for the 160) was 1's complement, implemented with subtraction as the basis of arithmetic (the so-called 'adder pyramid'). Even the CDC 3000 series, which were mostly others' designs, retained 1's complement arithmetic. The 6000 and 7000 series PPUs were essentially 160s also. I should think it safe to say one could find 1's complement in Cray designs from at least 1957 through the early 1980s.

Nor did he have truck with integrated circuits until absolutely necessary.

Michael Grigoni Cybertheque Museum

Reply to
msg

How exactly do you get from NULL (more precisely, a null pointer value) being "unique in some sense" to a guarantee that &a[-1], which doesn't point to any object, is unequal to NULL?

The standard guarantees that a null pointer "is guaranteed to compare unequal to a pointer to any object or function". &a[-1] is not a pointer to any object or function, so the standard doesn't guarantee that &a[-1] != NULL.

Plausibly, if a null pointer is represented as all-bits-zero, and pointer arithmetic works like integer arithmetic, an object of size N could easily happen to be allocated at address N; then pointer arithmetic could yield a null pointer value. (In standard C, this is one of the infinitely many possible results of undefined behavior.)
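
As a purely hypothetical illustration (not portable C; it just models addresses as bare unsigned integers with null at address 0):

    #include <stdio.h>

    /* Hypothetical model only: if an array of 16-byte elements happens
       to start at address 16, the "address" of the element before the
       first works out to 0, i.e. it would compare equal to a null
       pointer on such a machine.  In standard C, forming &a[-1] is
       simply undefined behavior. */
    int main(void)
    {
        unsigned long base = 16;         /* pretend &a[0] lives here */
        unsigned long elem_size = 16;    /* pretend sizeof *a        */
        unsigned long before_first = base - elem_size;   /* "&a[-1]" */

        if (before_first == 0)
            printf("here, &a[-1] would compare equal to NULL\n");
        return 0;
    }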

What restrictions would you be willing to impose, and/or what code would you be willing to break, in order to make such a guarantee?

--
Keith Thompson (The_Other_Keith) kst-u@mib.org  
San Diego Supercomputer Center               
We must do something.  This is something.  Therefore, we must do this.
Reply to
Keith Thompson

Most jokes contain at least a kernel of truth.

Yeah, me too. Still do, regularly, on processors that will never have a C compiler. C is as close to a universal assembler as we've got at the moment. It doesn't stick its neck out too far, although a more deliberately designed universal assembler would be a really good thing. (It's on my list of things to do...)

If you actually *want* a higher-level language, there are better ones to choose from than C.

--
Andrew
Reply to
Andrew Reilly

While the data are consistent with this conclusion, there are other ways to arrive at the same output. But this is certainly allowed.

It is perhaps worth pointing out that in Ancient C (as in "whatever Dennis' compiler did"), before the "unsigned" keyword even existed, the way you got unsigned arithmetic and comparisons was to use "char *". That is:

    int a, b;
    char *c, *d;
    ...
    if (a < b)   /* signed compare */
        ...
    c = a;       /* no cast needed because this was Ancient C */
    d = b;       /* (we could even do things like 077440->rkcsr!) */
    if (c < d)   /* unsigned compare */
        ...

It sounded to me as though you liked what Dennis' original compilers did, and wished that era still existed. In this respect, it does: and now you argue that this is somehow "wrong".

--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W)  +1 801 277 2603
email: forget about it   http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Reply to
Chris Torek

Let me get this correct.

If I went something like

#include <stdio.h>

int main(void)
{
    int *p;
    int arr[2];

    p = arr + 4;

    return 0;
}

This would be undefined behavior because I'm writing two past the array instead of one. Right?

Chad

Reply to
Chad

Wrong. It would be undefined behavior because you're constructing a pointer that points /three/ elements past the end of the array. ("Writing" has nothing to do with it.) But yes, it's undefined behavior in C (and C++).

-Arthur

Reply to
Arthur J. O'Dwyer

You're not writing past the array, but yes, it's undefined behavior.

Given the above declarations, and adding "int i;":

    p = arr + 1;   /* ok */
    i = *p;        /* ok, accesses 2nd element of 2-element array */

    p = arr + 2;   /* ok, points just past end of array */
    i = *p;        /* undefined behavior */

    p = arr + 3;   /* undefined behavior, points too far past end of array */

--
Keith Thompson (The_Other_Keith) kst-u@mib.org  
San Diego Supercomputer Center               
We must do something.  This is something.  Therefore, we must do this.
Reply to
Keith Thompson

... snip ...

That's just fine with me, and is the attitude I wanted to trigger. As long as you recognize and document those assumptions, all is probably well. In the process you may well find you don't need at least some of the assumptions, and improve your code portability thereby.

In the original sample code, it is necessary to deduce when an integer pointer can be used in order to achieve the goals of the routine. Thus it is necessary to make some of those assumptions. Once documented, people know when the code won't work.

--
"If you want to post a followup via groups.google.com, don't use
 the broken "Reply" link at the bottom of the article.  Click on 
 "show options" at the top of the article, then click on the 
 "Reply" at the bottom of the article headers." - Keith Thompson
Reply to
CBFalconer

I'm certainly no x86 expert. Can you show or point to the output of any C compiler which causes an "underflow trap" in this case?

At the risk of repetition, I'm *not* asking whether a past or future compiler might or may trap (or trash my hard disk); I'd just be curious to see one (1) actual instance where the computation (without dereference) p=s-1 causes a trap.

James

Reply to
James Dow Allen

I don't know of any case where a pet grizzly bear that escaped has eaten anyone in the Netherlands, but I'm still not short-sighted enough to use that as an argument to allow grizzly bears as pets.

Richard

Reply to
Richard Bos

All that statement means is that the person who utters it knows diddly-squat about either C _or_ assembler.

Richard

Reply to
Richard Bos

We didn't see anything. The code involved undefined behavior.

Now try the same with array indices -2, -3, -4, etc., and tell us at what point the program first says your code is broken.

Or try this one on a 32 bit PowerPC or x86 system:

    double* p;
    double* q;

    q = p + 0x2000000;
    if (p == q)
        printf ("It is broken!!!");
    if (q - p == 0)
        printf ("It is broken!!!");

Reply to
Christian Bau

In article snipped-for-privacy@maineline.net writes:
...
 > The reason to use a subtractor is that that guarantees that -0
 > never appears in the results. This allows using that value for
 > such things as traps, uninitialized, etc.

This was however not done on any of the 1's complement machines I have worked with. The +0 preferent machines (CDC) just did not generate -0 in general. The -0 preferent machines I used (Electrologica) in general did not generate +0. But the number not generated was not handled as special in any way.

I have seen only one machine that used some particular bit pattern in integers in a special way: the Gould. It was 2's complement, but what would now be regarded as the most negative bit pattern was a trap representation on the Gould.

--
dik t. winter, cwi, kruislaan 413, 1098 sj  amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn  amsterdam, nederland; http://www.cwi.nl/~dik/
Reply to
Dik T. Winter
