On 2006-03-07, James Dow Allen wrote: > [...] but I'm sincerely curious whether anyone knows of an *actual* > environment where p == s will ever be false after (p = s-1; p++).
The problem is that evaluating s-1 might cause an underflow and a trap, and then you won't even reach the comparison. You don't necessarily have to dereference an invalid pointer to get a trap.
You might hit this behavior on any segmented architecture (e.g.,
80286, or 80386+ with segments on) and you are virtually guaranteed to hit it on any architecture with fine-grained segmentation. comp.std.c periodically reminisces about the old Burroughs architecture, and it's always possible something like it might come back sometime.
You will also see this behavior in any worthwhile bounds-checking implementation.
Yes, well, that's what comp.lang.c is about...
--
- David A. Holland
(the above address works if unscrambled but isn't checked often)
There's (at least) one more property I forgot to mention. Given:
    #define LEN 100
    #define INC 5000   /* modify both of these as you like */

    int arr[LEN];
    int *ptr1 = arr;
    int *ptr2 = ptr1 + INC;   /* D */
would you also require that, at point D, ptr2 > ptr1? (If pointer arithmetic wraps around, this might not be the case even if adding and subtracting as above always gets you back to the original address.)
And you think that having the standard guarantee this behavior is worth the cost of making it much more difficult to implement C on systems where the underlying machine addresses don't meet this property, yes?
If so, that's a consistent point of view, but I disagree with it.
I'll also mention that none of this stuff has changed significantly between C90 (the 1990 ISO C standard, equivalent to the original ANSI standard of 1989) and C99 (the 1999 ISO standard).
In fact, I just checked my copy of K&R1 (published in 1978). I can't copy-and-paste from dead trees, so there may be some typos in the following. This is from Appendix A, the C Reference Manual, section
7.4, Additive operators:
A pointer to an object in an array and a value of any integral type may be added. [...] The result is a pointer of the same type as the original pointer, and which points to another object in the same array, appropriately offset from the original object.
[...]
[... likewise for subtracting an integer from a pointer ...]
If two pointers to objects of the same type are subtracted, the result is converted [...] to an int representing the number of objects separating the pointed-to objects. This conversion will in general give unexpected results unless the pointers point to objects in the same array, since pointers, even to objects of the same type, do not necessarily differ by a multiple of the object-length.
The last quoted paragraph isn't quite as strong as what the current standard says, since it bases the undefinedness of pointer subtraction beyond the bounds of an object on alignment, but it covers the same idea.
The C Reference Manual from May 1975 has the same wording about pointer subtraction, but not about pointer+integer addition.
So if you think that the requirements you advocate are "the very essence of the nature of C", I'm afraid you're at least 28 years too late to do anything about it.
--
Keith Thompson (The_Other_Keith) kst-u@mib.org
San Diego Supercomputer Center
a+9 > a+8 because a + 9 - (a + 8) == 1, which is > 0. Doesn't matter if the signed or unsigned pointer value wrapped around in an intermediate term. On many machines that's how the comparison is done anyway. You're suggesting that having the compiler ensure that a+8 doesn't wrap around wrt a is OK, but a+9 is too hard. I don't buy it.
Only if you put them there. (The real problem is objects larger than half the address space, where a valid pointer difference computation produces a ptrdiff value that is out of range for a signed integer.)
Unsigned ints have the nice property that (a + 1) - 1 == a for all a, even if a + 1 == 0. Overflow is generally no big deal in any case. (Other than the object larger than half the address space issue.)
The compiler can't necessarily avoid overflow, but it *can* arrange for pointer comparisons to work properly.
Seems like it will work at least as well as the usual unit-stride algorithm and idiom.
and if the arithmetic happens to wrap round after s + N, you really are dead too.
It doesn't have to be about weird architectures and traps. No implementation can provide an unlimited range for pointer arithmetic without some kind of overflow behaviour, such as a wrap round. Granted a wrap-round needn't affect addition and subtraction, but it will affect comparisons.
Every allocated object comes with a limited range within which pointer comparisons like p - 1 < p can be relied on.
There are lots of embedded systems with 8- and 16-bit pointers. With the right value of buffer_pos, it wouldn't take a very large value of amount_needed for that addition to wrap and give you an incorrect comparison.
How would you guarantee that a+(i+1) > a+i for all arbitrary values of i? It's easy enough to do this when the addition doesn't go beyond the end of the array (plus the case where it points just past the end of the array), but if you want to support arbitrary arithmetic beyond the array bounds, it's going to take some extra work, all for the sake of guaranteeing properties that have *never* been guaranteed by the C language. (Don't confuse what happens to have always worked for you with what's actually guaranteed by the language itself.)
[...]
But unsigned ints *don't* have the property that a + 1 > a for all a.
--
Keith Thompson (The_Other_Keith) kst-u@mib.org
San Diego Supercomputer Center
Er, didn't I point that fix out in the original article? That was the only error in the original sample code; all other problems can be tied to assumptions, which may or may not be valid on any given piece of machinery. The point is to avoid making such assumptions, which requires recognizing their existence in the first place.
I have encountered situations where free(p); .... if (p == q) leads to the platform's equivalent of the much beloved "segmentation fault". Your theory means that this should have worked. Assigning NULL or a valid address to p after freeing avoids the error.
Incidentally, in gnu.gcc.help there is a discussion about much the same situation in C++ where someone gets in trouble for delete a; .... if (a == b) ... Happens only for multiple inheritance and only for gcc. Thread starts at
No, they don't, but when you're doing operations on pointer derivations that are all in some sense "within the same object", even if hanging outside it (i.e., by dint of being created by adding integers to a single initial pointer), then the loop termination condition is, in a very real sense, a ptrdiff_t, and *should* be computed that way. The difference can be both positive and negative.
The unsigned comparison a + n > a fails for some values of a, but the ptrdiff_t (signed) comparison a + n - a > 0 is indeed true for all a and n > 0, so that's what should be used. And it *is* what is used on most processors that do comparison by subtraction (even if that's wrapped in a non-destructive cmp).
I actually agree completely with the piece of K&R that you posted a few posts ago, where it was pointed out that pointer arithmetic only makes sense for pointers within the same object (array). Since C doesn't tell you that the pointer that your function has been given isn't somewhere in the middle of a real array, my aesthetic sense is that conceptually, arrays (as pointers, within functions) extend infinitely (or at least to the range of int) in *both* directions, as far as pointer arithmetic within a function is concerned. Actually accessing values outside of the bounds of the real array that has been allocated somewhere obviously contravenes the "same object" doctrine, and it's up to the logic of the caller and the callee to avoid that.
Now it has been amply explained that my conception of how pointer arithmetic ought to work is not the way the standard describes, even though it *is* the way I have experienced it in all of the C implementations that it has obviously been my good fortune to encounter. I consider that to be a pity, and obviously some of my code wouldn't survive a translation to a Burroughs or AS/400 machine (or perhaps even to some operating environments on 80286s). Oh, well. I can live with that. It's not really the sort of code that I'd expect to find there, and I don't expect to encounter such constraints in the future, but I *will* be more careful, and will keep my eyes more open.
Well I, for one, commented on the hidden assumption that must be made for what you call "the one real error" to actually be an error -- but it was not recognised! ;-)
[At the top of your original post you did not, in fact, claim this was an error, but you call it a "real error" later on.]
I feel that your points would have been better made using other examples. The context of the code made me read the C as little more than pseudo-code with the added advantage that a C compiler might, with a following wind, produce something like the assembler version (which in turn has its own assumptions but you were not talking about that).
I found Eric Sosman's "if (buffer + space_required > buffer_end) ..." example more convincing, because I have seen that in programs that are intended to be portable -- I am pretty sure I have written such things myself in my younger days. Have you other more general examples of dangerous assumptions that can sneak into code? A list of the "top 10 things you might be assuming" would be very interesting.
This is pure theology. The simple fact is that you can't GUARANTEE that p++, or p--, or for that matter p itself, points to anything in particular, unless you know something about p. And if you know about p, you are OK. What's your problem?
Surely the camel's nose is already through the gate, on that one, with the explicit allowance of "one element after"? How does that fit with all of the conniptions expressed here about things that fall over dead if a pointer even looks at an address that isn't part of the object? One out, all out.
It's an implementation of strlen(). One must expect it to be called with any pointer to a valid string - and those are usually pointers to the first byte of a memory block.
Are you quite sure that you know what the word "theology" means?
What Arthur wrote above is entirely correct. (Remember that undefined behavior includes the possibility, but not the guarantee, of the code doing exactly what you expect it to do, whatever that might be.)
What's your problem?
--
Keith Thompson (The_Other_Keith) kst-u@mib.org
San Diego Supercomputer Center