Making Fatal Hidden Assumptions

You can still write your code to make whatever assumptions you like. You just can't assume that it will work portably. If, for example, you are writing code for a particular embedded architecture with a given compiler, then it may be reasonable to make assumptions beyond those granted by the standard.

In other words, the standard provides minimum guarantees. Your implementation may provide stronger ones.

--
"It would be a much better example of undefined behavior
 if the behavior were undefined."
Ben Pfaff

And are there any? Any in common use? Any where the equivalent (well defined) pointer+offset code would be slower?

I'll need that list to know which suppliers to avoid...

Suppose the computer uses tribits.

Standards are meant to codify common practice. If you want a language that only has object references and array indices, there are plenty of those to choose from.

And that helps who?

Which implementations?

That would be a *good* thing. Checking any earlier than at reference time breaks what it is about C that makes it C.

OK, I've made enough of a fool of myself already. I'll go and have that second cup of coffee for the morning, before I start going on about having the standard support non-2's complement integers, or machines that have no arithmetic right shifts...

--
Andrew Reilly

Arithmetic right shift isn't particularly useful even on some machines that do have it, notably two's-complement ones.

--
Jordan Abel

Well, shucks, I manage to make it work pretty well most every day. Does that mean I'm in ill-repute too?

--
Al Balmer
Sun City, AZ

x86 in non-flat protected mode would be one example. Attempting to load an invalid value into a segment register causes a fault.

--
"Given that computing power increases exponentially with time,
 algorithms with exponential or better O-notations
 [...]
Ben Pfaff

Decades of use? This isn't a new rule.

An implementation might choose, for valid reasons, to prefetch the data that pointer is pointing to. If it's in a segment not allocated ...

--
Al Balmer
Sun City, AZ

I wasn't aware that address arithmetic generally operated on the segment register in that environment, rather than on the "pointer" register used within the segment. I haven't coded in that environment myself, so I have no direct experience to call on. My understanding was that the architecture was intrinsically a segment+offset mechanism, so having the compiler produce the obvious code in the offset value (i.e., -1) would not incur the performance penalty that has been mentioned. (Indeed, it's loading the segment register that causes the performance penalty, I believe.)
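Something like this sketch is what I have in mind (hypothetical names and types, not any particular compiler's actual layout):

struct far_ptr {
    unsigned short seg;   /* segment selector: only loaded into a
                             segment register on dereference */
    unsigned short off;   /* offset within the segment */
};

/* arithmetic touches only the offset, so computing an out-of-range
   value (e.g. one before the segment base) never loads the segment
   register and cannot fault */
struct far_ptr far_add(struct far_ptr p, int delta)
{
    p.off = (unsigned short)(p.off + delta);
    return p;
}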

--
Andrew Reilly

Hypothetical hardware that traps on *speculative* loads isn't broken by design? I'd love to see the initialization sequences, or the task switching code that has to make sure that all pointer values are valid before they're loaded. No, scratch that. I've got better things to do.

Cheers,

--
Andrew Reilly

Address arithmetic might not, but the standard doesn't disallow it. Other uses of invalid pointers, e.g. comparing a pointer into a freed memory block against some other pointer, seem more likely to do so.
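For instance (a minimal sketch; q stands for any other pointer the program happens to hold):

#include <stdlib.h>

void demo(char *q)
{
    char *p = malloc(10);
    if (p == NULL)
        return;
    free(p);
    /* p's value is indeterminate after the free; even reading it
       for the comparison is undefined, so an implementation that
       validates pointers when loading them could trap here */
    if (p == q) {
        /* ... */
    }
}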

--
"Give me a couple of years and a large research grant,
 and I'll give you a receipt." --Richard Heathfield
Ben Pfaff

I really don't know, but the idea of allowing errors to be caught as early as possible seems like a good one.

[...]

Do you mean ternary digits rather than binary digits? The C standard requires a binary representation for integers.

It (potentially) helps implementers to generate the most efficient possible code, and it helps programmers to know what's actually guaranteed to work across all possible platforms with conforming C implementations.

[...]

C99 allows signed integers to be represented in 2's-complement, 1's-complement, or signed-magnitude (I think I mispunctuated at least one of those).
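For what it's worth, the low two bits of -1 distinguish the three; a small test program, relying only on what the standard guarantees about the representations:

#include <stdio.h>

int main(void)
{
    /* -1 is ...1111 in 2's-complement, ...1110 in 1's-complement,
       and 10...01 in signed-magnitude, so its low two bits
       identify which representation the implementation uses */
    switch (-1 & 3) {
    case 3: puts("2's-complement");   break;
    case 2: puts("1's-complement");   break;
    case 1: puts("signed-magnitude"); break;
    }
    return 0;
}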

C has been implemented on machines that don't support floating-point, or even multiplication and division, in hardware. The compiler just has to do whatever is necessary to meet the standard's requirements.

--
Keith Thompson (The_Other_Keith) kst-u@mib.org  
San Diego Supercomputer Center               

Consider the alternative.

#define LEN 100
#define INC 5000
int arr[LEN];
int *ptr = arr;
/* A */
ptr += 2*INC;
ptr -= INC;
ptr -= INC;
/* B */
ptr -= INC;
ptr -= INC;
ptr += 2*INC;
/* C */

What you're suggesting, I think, is that ptr==arr should be true at points A, B, and C, for any(?) values of LEN and INC. It happens to work out that way sometimes (at least in the test program I just tried), but I can easily imagine a system where guaranteeing this would place an undue burden on the compiler.
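For comparison, keeping the excursion in integer arithmetic is well defined for any LEN and INC, since an index may roam freely as long as it doesn't overflow; only forming the pointer is constrained (a sketch):

#define LEN 100
#define INC 5000

int arr[LEN];
long idx = 0;                             /* A: idx == 0 */
idx += 2*INC; idx -= INC; idx -= INC;     /* B: idx == 0 */
idx -= INC; idx -= INC; idx += 2*INC;     /* C: idx == 0 */
int *ptr = arr + idx;  /* formed only now, while 0 <= idx <= LEN */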

--
Keith Thompson (The_Other_Keith) kst-u@mib.org  
San Diego Supercomputer Center               

It doesn't have to make sure. It's free to segfault. You write funny code, you pay the penalty (or your customers do). Modern hardware does a lot of speculation. It can preload or even precompute both branches of a conditional, for example.

--
Al Balmer
Sun City, AZ

I've been told that IBM AS/400 bollixes the bogus arithmetic, at least under some circumstances. A friend told of fixing code that did something like

if (buffer_pos + amount_needed > buffer_limit) {
    ... enlarge the buffer ...
}
memcpy (buffer_pos, from_somewhere, amount_needed);
buffer_pos += amount_needed;

This looks innocuous to devotees of flat address spaces (Flat-Earthers?), but it didn't work on AS/400. If the sum `buffer_pos + amount_needed' went past the end of the buffer, the result was some kind of NaP ("not a pointer") and the comparison didn't kick in. Result: the code never discovered that the buffer needed enlarging, and merrily tried to memcpy() into a too-small area ...

I have no personal experience of the AS/400, and I may have misremembered some of what my friend related. Would anybody with AS/400 knowledge care to comment?
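For what it's worth, a rewrite that compares sizes instead of forming the past-the-end pointer would have dodged the NaP entirely; a sketch, not the actual code my friend fixed:

/* buffer_pos and buffer_limit both point into the buffer, so the
   subtraction is well defined; no out-of-range pointer is formed */
if (amount_needed > (size_t)(buffer_limit - buffer_pos)) {
    ... enlarge the buffer ...
}
memcpy (buffer_pos, from_somewhere, amount_needed);
buffer_pos += amount_needed;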

--
Eric.Sosman@sun.com
Eric Sosman

Andrew Reilly wrote (in article ):

This is a lot of whining about a specific problem that can easily be remedied just by changing the loop construction. The whole debate is pretty pointless in that context, unless you have some religious reason to insist upon the method in the original.

--
Randy Howard (2reply remove FOOBAR)
"The power of accurate observation is called cynicism by those 
 [...]


It's not always equivalent. The trouble starts with

char a[8]; char *p;

for ( p = a+1 ; p < a+8 ; p += 2 ) {}

intending that the loop terminates on p == a+9 (since it skips a+8). But how do we know that a+9 > a+8 ? If the array is right at the top of some kind of segment, the arithmetic might have wrapped round.

To support predictable pointer comparisons out of range, the compiler would have to allocate space with a safe buffer zone. Complications are setting in.

Ints have the nice property that 0 is in the middle and we know how much headroom we've got either side. So it's easy for the compiler to make the int version work (leaving it to the programmer to take responsibility for avoiding overflow, which is no big deal).

Pointers don't have that property. The compiler can't take sole responsibility for avoiding overflow irrespective of what the programmer does. If the programmer wants to go out of range and is at the same time responsible for avoiding overflow, then he has to start worrying about whereabouts his object is and what headroom he's got.

Can't see how assembly programmers avoid the same kind of issue. I can see how they could ignore it. The code will work most of the time.
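The safe rewrite keeps the stride in an index, where we know the headroom (a sketch):

char a[8];
size_t i;

/* i ends up as 9, but an out-of-range *integer* is harmless;
   the pointer a+9 is never formed */
for (i = 1; i < 8; i += 2) {
    /* ... use a[i] ... */
}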

--
Robin Haigh

In article David Brown writes:
 > CBFalconer wrote:
...
 > > Some sneaky hidden assumptions here:
 > > 1. p = s - 1 is valid.  Not guaranteed.  Careless coding.
 >
 > Not guaranteed in what way? You are not guaranteed that p will be a
 > valid pointer, but you don't require it to be a valid pointer - all that
 > is required is that "p = s - 1" followed by "p++" leaves p equal to s.

But the standard allows "p = s - 1" to trap when an invalid pointer is generated. And this can indeed be the case on segmented architectures.
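The trap is avoidable by never stepping outside the array; a sketch of the usual rewrite of the `p = s - 1; while (*++p)' idiom, not anyone's actual code:

#include <stddef.h>

size_t str_len(const char *s)
{
    /* start at s and increment after the test, so every value p
       takes lies in [s, s + length], all valid per the standard */
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}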

--
dik t. winter, cwi, kruislaan 413, 1098 sj  amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn  amsterdam, nederland; http://www.cwi.nl/~dik/

That *is* what I'm suggesting. In fact, I'm suggesting that p += a; p -= a; should leave p as it was originally for any int a and pointer p. To my mind, and I've been using C for more than 20 years, that is the very essence of the nature of C. It's what makes pointer-as-cursor algorithms make sense. Throw it away, and you might as well restrict yourself to coding p[a], and then you've got Fortran, Pascal, or Java.

Just because hardware can be imagined (or even built) that doesn't match the conventional processor model that C most naturally fits *shouldn't* be an argument to dilute or mess around with the C spec. Just use a different language on those processors, or put up with some inefficiency or compiler switches. Pascal has always been a pretty nice fit for many hardware-pointer-checked machines. Such hardware isn't even a good argument in this case though, since the obvious implementation will involve base+offset compound pointers anyway, and mucking around with the offset (as an integer) should neither trap nor cause a performance issue.

I've coded for years on Motorola 56000-series DSPs, and they don't look anything like the conventional processor that C knows about: you've got two separate data memory spaces and a third for program memory, pointers aren't integers, words are 24 bits long and that's the smallest addressable unit, and so on. Nevertheless, there have been at least two C compilers for the thing, and they've both produced *awful* code, and that's OK: they were never used for performance-critical code. That was always done in assembler. There are lots of processors (particularly DSPs) that are worse. I know of one that doesn't have pointers as such at all. That's OK too. There isn't a C compiler for that.

C is useful, though, and there's a lot of code written in it, so it's no surprise that most of the more recent DSP designs actually do fit nicely into the C conventional machine model. And (p + n) - n works in the obvious fashion for those, too.

Cheers,

--
Andrew Reilly

In article Andrew Reilly writes:
...
 > It's precisely this sort of tomfoolery on the part of the C standards
 > committee that has brought the language into such ill-repute in recent
 > years. It's practically unworkable now, compared to how it was in (say)
 > the immediately post-ANSI-fication years.

I do not understand this. The very same restriction on pointer arithmetic was already in the very first ANSI C standard.

--
dik t. winter, cwi, kruislaan 413, 1098 sj  amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn  amsterdam, nederland; http://www.cwi.nl/~dik/

In article Andrew Reilly writes:
...
 > OK, I've made enough of a fool of myself already. I'll go and have
 > that second cup of coffee for the morning, before I start going on
 > about having the standard support non-2's complement integers, or
 > machines that have no arithmetic right shifts...

In the time of the first standard, the Cray-1 was still quite important, and it had no arithmetic right shift. When K&R designed C, there were a large number of machines that did not use 2's-complement integers.
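A compiler for such a machine can still synthesize the operation; one way, a sketch that assumes 2's-complement so that ~x == -1 - x:

/* arithmetic right shift (rounding toward minus infinity) built
   from shifts of nonnegative values only; assumes 2's-complement */
int asr(int x, unsigned k)
{
    if (x < 0)
        return ~(~x >> k);  /* ~x >= 0, so >> is well defined */
    return x >> k;
}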

--
dik t. winter, cwi, kruislaan 413, 1098 sj  amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn  amsterdam, nederland; http://www.cwi.nl/~dik/

I haven't used an AS/400 myself, either, but this is almost certainly the sort of perfectly reasonable code that the standard has arranged to be undefined, precisely so that it can be said that there's a C compiler for that system.

Given the hardware behaviour, it would have been vastly preferable for the compiler to handle pointers as base+offset pairs, so that the specialness of the hardware pointers didn't interfere with the logic of the program.

Since most coding for AS/400s was (is still?) done in COBOL and PL/I, both of which are perfectly suited to the hardware's two-dimensional memory, any performance degradation would hardly have been noticed. (And since AS/400s are actually Power processors with a JIT over the top now, there would likely not be a performance problem from doing it "right" anyway.) But no, your friend had to go and modify good code, and risk introducing bugs in the process.

--
Andrew Reilly
