Making Fatal Hidden Assumptions

- Richard Heathfield

Wed, Mar 8, 2006 12:39 PM

OE blaec, and OFr garter (the latter from OHGer warten; OE weardian)

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)

- Richard Heathfield

Wed, Mar 8, 2006 12:40 PM

Ask Jack to lend you his bottle. You'll soon change your mind.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)

- Al Balmer

Wed, Mar 8, 2006 4:42 PM

? What do you imagine the etymology to be?

--
Al Balmer
Sun City, AZ

- Vladimir S. Oka

Wed, Mar 8, 2006 4:52 PM

FWIW, from :

==== Blackguard

The exact etymology of this term for a villain is a bit uncertain. What is known is that it is literally from black guard; it is English in origin; and it dates to at least 1532.

The two earliest senses (it is impossible to tell which one came first) are:

* the lowest servants in a household (often those in charge of the scullery), or the servants and camp followers of an army.
* attendants or guards, either dressed in black, of low character, or attending a criminal.

The OED2 doesn't dismiss the possibility that there may literally have been a company of soldiers at Westminster called the Black Guard, but no direct evidence of this exists.

The earliest known citation (1532) uses the term blake garde to refer to torch bearers at a funeral. A 1535 cite refers to the Black Guard of the King's kitchen, a scullery reference. The second sense of a guard of attendants appears in 1563 in reference to a retinue of Dominican friars--who would be in black robes.

The sense of the vagabond or criminal class doesn't appear until the 1680s. And the modern sense of a scoundrel dates to the 1730s.

====

Nothing racist there...

--
BR, Vladimir

- Christopher Barber

Wed, Mar 8, 2006 5:25 PM

I guess I will have to keep all this in mind the next time I copy C code off of a web page devoted to x86 assembly hacks and try to get it to run on a machine with 24-bit ones-complement integers.

;-)

- C

- Al Balmer

Wed, Mar 8, 2006 5:40 PM

Actually, I wasn't asking that. I wondered what Jordan was imagining it to be.

--
Al Balmer
Sun City, AZ

- Vladimir S. Oka

Wed, Mar 8, 2006 7:37 PM

Ah, sorry. I didn't read the lot carefully enough.

--
BR, Vladimir
 
There was a young lady named Mandel
Who caused quite a neighborhood scandal
        By coming out bare
        On the main village square
And frigging herself with a candle.

- Andrey Tarasevich

Wed, Mar 8, 2006 7:52 PM

There are actual environments where 's - 1' alone is enough to cause a crash. In fact, any non-flat memory model environment (i.e. environment with 'segment:offset' pointers) would be a good candidate. The modern x86 will normally crash, unless the implementation takes specific steps to avoid it.

--
Best regards,
Andrey Tarasevich

- Andrey Tarasevich

Wed, Mar 8, 2006 7:58 PM

This is not exactly correct. Line 13 uses a cast of a 'char*' pointer to an 'int*' pointer, not to an 'int'. This is relatively OK, especially compared to the "less predictable" pointer->int casts.

After that the char array memory pointed by the resultant 'int*' pointer is reinterpreted as an 'int' object. The validity of this is covered by the previous assumptions.
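For comparison, here is a rough sketch of reading an 'int' out of a char buffer without the pointer cast, by copying the bytes with memcpy. It is not the code under discussion; the name load_int is made up for illustration. It removes the alignment assumption only: it still assumes the bytes were stored as an 'int' by the same implementation.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not from the thread's code): read an int whose
 * bytes start at buf[off]. memcpy works for any offset, so the buffer
 * does not have to be aligned for int. */
int load_int(const unsigned char *buf, size_t off)
{
    int value;

    assert(buf != NULL);
    memcpy(&value, buf + off, sizeof value);
    return value;
}
```

The caller must still make sure that off + sizeof(int) does not run past the end of the buffer.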

--
Best regards,
Andrey Tarasevich

- Andrey Tarasevich

Wed, Mar 8, 2006 8:14 PM

Incorrect. It is not about "lawyers", it is about actual _crashes_. The reason why 's - 1' itself can (and will) crash on certain platforms is the same as the one that will make it crash in exactly the same way in "assembly language" on such platforms.

Trying to implement the same code in assembly language on such a platform would specifically force you to work around the potential crash, sacrificing efficiency for safety. In other words, you'd be forced to use different techniques for doing 's - 1' in contexts where it might underflow and in contexts where it definitely will not underflow.

C language, on the other hand, doesn't offer two different '-' operators for these two specific situations. Instead C language outlaws (in essence) pointer underflows. This is a perfectly reasonable approach for a higher level language.
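As an illustration only, a backwards scan can be written so that 's - 1' is never formed. The function name find_last is hypothetical and is not taken from the code being discussed; it is a sketch of one common idiom, not the only way to do it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return a pointer to the last occurrence of c in
 * s[0..n-1], or NULL if there is none. The scan starts at the
 * one-past-the-end pointer, which is valid to form, and decrements
 * before each access, so the expression s - 1 is never evaluated. */
const char *find_last(const char *s, size_t n, char c)
{
    const char *p;

    assert(s != NULL);
    p = s + n;                  /* one past the end: allowed */
    while (p != s) {
        --p;                    /* p is now within [s, s + n) */
        if (*p == c)
            return p;
    }
    return NULL;
}
```

The loop test compares against 's' itself rather than against a pointer below it, which is what keeps the arithmetic inside the object.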

--
Best regards,
Andrey Tarasevich

- CBFalconer

Wed, Mar 8, 2006 9:06 PM

This illustrates the fact that usenet threads are uncontrollable. I wrote the original to draw attention to hidden assumptions, and it has immediately degenerated into thrashing about the one real error in the sample code. I could have corrected and eliminated that error by a slight code rework, but then I would have modified Mr Hsieh's code. There were at least seven further assumptions, most of which were necessary for the purposes of the code, but strictly limited its applicability.

My aim was to get people to recognize and document such hidden assumptions, rather than leaving them lying there to create sneaky bugs in apparently portable code.
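One possible way to document such assumptions, sketched here with C11 static_assert, is to state them where the compiler can check them. The three assumptions listed and the function count_zero_bytes are illustrative choices and are not the list from the original article.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>

/* Each line names one assumption the code below relies on. On an
 * implementation where it does not hold, compilation fails with the
 * message instead of the program misbehaving at run time. */
static_assert(CHAR_BIT == 8, "assumes 8-bit bytes");
static_assert(sizeof(int) == 4, "assumes 32-bit int");
static_assert((-1 & 3) == 3, "assumes two's complement integers");

/* Hypothetical function that depends on the checks above: it treats an
 * unsigned int as exactly four 8-bit bytes. */
unsigned count_zero_bytes(unsigned int v)
{
    unsigned n = 0;
    int i;

    for (i = 0; i < 4; ++i) {
        if (((v >> (8 * i)) & 0xFFu) == 0)
            ++n;
    }
    return n;
}
```

Assumptions that cannot be expressed as constant expressions (alignment of a particular buffer, for instance) still need a comment.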

--
"If you want to post a followup via groups.google.com, don't use
 the broken "Reply" link at the bottom of the article.  Click on 
 "show options" at the top of the article, then click on the 
 "Reply" at the bottom of the article headers." - Keith Thompson
More details at: 
Also see

- Paul Keinanen

Wed, Mar 8, 2006 9:26 PM

Exactly which x86 mode are you referring to ?

16 bit real mode, virtual86 mode or some 32 bit mode (which are after all segmented modes with all segment registers holding the same value) ?

If s is stored in 16 bit mode in ES:DX with DX=0, then p=s-1 would need to decrement ES by one and store 000F in DX. Why would reloading ES cause any traps, since no actual memory reference is attempted ? Doing p++ would most likely just increment DX by one to 0010, thus ES:DX would point to s again, which is a legal address, but with a different internal representation.

IIRC some 32 bit addressing mode would trap if one tried to load the segment register, but again, how could the caller generate such constructs as s = ES:0 at least from user mode. In practice s = ES:0 could only be set by a kernel mode routine calling a user mode routine, so this is really an issue only with main() parameters.
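The 16 bit real mode arithmetic above can be checked with a small sketch. The segment value 0x2000 is an arbitrary choice for illustration and the function names are made up; the sketch models only the address calculation (segment * 16 + offset) and ignores wrap-around at 1 MByte.

```c
#include <assert.h>
#include <stdint.h>

/* Real mode address translation: linear = segment * 16 + offset. */
uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* The example from the post, with ES = 0x2000 picked arbitrarily. */
void segment_example(void)
{
    uint16_t es = 0x2000, dx = 0x0000;       /* s     = ES:0000     */
    uint16_t es2 = (uint16_t)(es - 1);       /* s - 1 = (ES-1):000F */
    uint16_t dx2 = 0x000F;

    /* s - 1 lies one byte below s ... */
    assert(linear_address(es2, dx2) + 1 == linear_address(es, dx));
    /* ... and incrementing only the offset gives (ES-1):0010, which is
     * the same linear address as ES:0000 in a different representation. */
    assert(linear_address(es2, (uint16_t)(dx2 + 1)) == linear_address(es, dx));
}
```

This says nothing about protected mode, where loading a segment register involves a descriptor lookup.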

Paul

- CBFalconer

Wed, Mar 8, 2006 10:26 PM

It's not deprecated, it's illegal. Once you have involved UB all bets are off. Without the p-1 the p++ statements are fine, as long as they don't advance the pointer more than one past the end of the object.

and a statement --p or p-- would be illegal. However p++ would be legal. But *(++p) would be illegal, because it dereferences past the confines of the object x.
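A short sketch of the legal case: a loop that advances a pointer up to, but never dereferences, the one-past-the-end position. The function name sum_ints is hypothetical. A single object counts as an array of one element for this purpose, so the same rule applies to a pointer to a lone 'int'.

```c
#include <assert.h>
#include <stddef.h>

/* Sum n ints. 'end' is the one-past-the-end pointer: it may be formed
 * and compared against, but it is never dereferenced. */
long sum_ints(const int *a, size_t n)
{
    const int *end;
    const int *p;
    long total = 0;

    assert(a != NULL);
    end = a + n;
    for (p = a; p != end; ++p)
        total += *p;            /* p is always within [a, a + n) here */
    return total;
}
```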

--
"If you want to post a followup via groups.google.com, don't use
 the broken "Reply" link at the bottom of the article.  Click on 
 "show options" at the top of the article, then click on the 
 "Reply" at the bottom of the article headers." - Keith Thompson
More details at: 
Also see

- Christian Bau

Wed, Mar 8, 2006 10:53 PM

Consider a typical implementation with 32 bit pointers and objects that can be close to 2 GByte in size.

typedef struct { char a [2000000000]; } giantobject;

giantobject anobject;

giantobject* p = &anobject;
giantobject* q = &anobject - 1;
giantobject* r = &anobject + 1;
giantobject* s = &anobject + 2;

It would be very hard to implement this in a way that both q and s would be valid; for example, it would be very hard to achieve that q < p, p < r and r < s are all true. If q and s cannot be both valid, and there isn't much reason why one should be valid and the other shouldn't, then neither can be used in a program with any useful guarantees by the standard.

- Christian Bau

Wed, Mar 8, 2006 11:00 PM

Question: If the C Standard guarantees that for any array a, &a [-1] should be valid, should it also guarantee that &a [-1] != NULL and that &a [-1] < &a [0]?

In that case, what happens when I create an array with a single element that is an enormously large struct?

- Christian Bau

Wed, Mar 8, 2006 11:15 PM

I just tried the following program (CodeWarrior 10 on MacOS X):

#include <stdio.h>

#define SIZE (50*1000000L)
typedef struct { char a [SIZE]; } bigstruct;

static bigstruct bigarray [8];

int main(void)
{
    printf("%lx\n", (unsigned long) &bigarray [0]);
    printf("%lx\n", (unsigned long) &bigarray [9]);
    printf("%lx\n", (unsigned long) &bigarray [-1]);
    if (&bigarray [-1] < & bigarray [0])
        printf ("Everything is fine\n");
    else
        printf ("The C Standard is right: &bigarray [-1] is broken\n");
    return 0;
}

The output is:

2008ce0
1cd30160
ff059c60
The C Standard is right: &bigarray [-1] is broken

- Al Balmer

Thu, Mar 9, 2006 12:07 AM

Nice parrot. I think the original author of that phrase meant it as a joke.

I spent 25 years writing assembler. C is a higher-level language.

--
Al Balmer
Sun City, AZ

- Keith Thompson

Thu, Mar 9, 2006 12:58 AM

Andrew Reilly writes: [...]

It's higher-level than some, lower than others. I'd call it a medium-level language.

Not in any meaningful sense of the word "assembler".

--
Keith Thompson (The_Other_Keith) kst-u@mib.org  
San Diego Supercomputer Center               
We must do something.  This is something.  Therefore, we must do this.

- Dik T. Winter

Thu, Mar 9, 2006 1:21 AM

In article Andrey Tarasevich writes: ... The first time I see this code, but:

Note that in here the byte-offset in the pointer is ignored, so d points to the integer that contains the character array: "0\000234567".

Again a hidden assumption I think. (It is exactly this hidden assumption that made porting of a particular program extremely difficult to the Cray 1. The assumption was that in a word pointer the lowest bit was 0, and that bit was used for administrative purposes.)

--
dik t. winter, cwi, kruislaan 413, 1098 sj  amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn  amsterdam, nederland; http://www.cwi.nl/~dik/

- Dik T. Winter

Thu, Mar 9, 2006 1:28 AM

In article msg writes:
> > OK, I've made enough of a fool of myself already. I'll go and have that
> > second cup of coffee for the morning, before I start going on about having
> > the standard support non-2's complement integers, or machines that have no
> > arithmetic right shifts...
>
> I get queasy reading the rants against 1's complement architectures; I
> wish Seymour Cray were still around to address this.

There are quite a few niceties indeed. Negation of a number is really simple, just a logical operation, and there are others. This means simpler hardware for the basic operations on signed objects, except for the carry. It was only when I encountered the PDP that I saw the first 2's complement machine.
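To make the point about negation and the carry concrete, here is a small simulation of 16 bit 1's complement arithmetic on an ordinary machine. The width of 16 bits and the function names are arbitrary choices for illustration; this is a sketch of the arithmetic, not a model of any particular machine.

```c
#include <assert.h>
#include <stdint.h>

/* Negate a 16-bit 1's complement bit pattern: just flip every bit. */
uint16_t oc_negate(uint16_t bits)
{
    return (uint16_t)~bits;
}

/* Decode a 16-bit 1's complement pattern into a host int.
 * 0x0000 and 0xFFFF (positive and negative zero) both decode to 0. */
int oc_to_int(uint16_t bits)
{
    if (bits & 0x8000u)
        return -(int)(uint16_t)~bits;
    return (int)bits;
}

/* Add two patterns. This is where "the carry" comes in: a carry out of
 * the top bit has to be added back in at the bottom (end-around carry). */
uint16_t oc_add(uint16_t a, uint16_t b)
{
    uint32_t sum = (uint32_t)a + b;

    sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)sum;
}
```

For example, oc_negate(5) is the pattern 0xFFFA, and oc_add(oc_negate(5), 7) gives the pattern for 2 after the end-around carry.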

On the other hand, when Seymour Cray started his own company, those machines were 2's complement. And he shifted from 60 to 64 bit words, but still retained octal notation (he did not like hexadecimal at all).

--
dik t. winter, cwi, kruislaan 413, 1098 sj  amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn  amsterdam, nederland; http://www.cwi.nl/~dik/