Re: C is not a low level language

Thanks for mentioning comments. How many of you would be rich if you had a dollar for every time you heard someone say, "My code is so readable it doesn't need comments"?

--
/~\  cgibbs@kltpzyxm.invalid (Charlie Gibbs) 
\ /  I'm really at ac.dekanfrus if you read it the right way. 
Reply to
Charlie Gibbs

Richard,

You're only strengthening my position, you know. :-)

/Especially/ when you do not know what could happen, it is a good idea to be defensive about what you're doing.

But, do me a favour: think about the amount of effort it would take to follow my suggestion, and the amount of fall-out you could be facing when the string doesn't get terminated (whether by mistake or by an attack), and compare the two. What do you say: which of the two would "win" ?

Regards, Rudy Wieser

Reply to
R.Wieser

Sorry to interject this important argument so deep in an unrelated thread, but

In case you didn't notice, the terminating null is not stored or transmitted to external devices. When you display the string on a screen, print it on a printer, transmit it to a remote terminal, etc., the null byte should not be displayed, printed, or transmitted. It is NOT part of the string.

So why does C use a null terminating byte for string literals? The alternatives are:

- Pascal strings: a length prefix (but is it a byte, a word, a long? In "modern" Pascal implementations all those variants are possible).

- no length, but a distinct type and a strongly typed language that lets you copy the literal only into an array of the same size (e.g. Modula-2, IIRC).

- a designated string class used to implement strings (e.g. Objective-C; it's even customizable in Objective-C, since the specific string class can be specified as an argument to the compiler).

You can see that the 1960s C solution of using a terminating null byte looked like the simplest solution, and it also provided a kind of safety: if you lost a null byte, there would usually be another null byte in memory not too far away, so everything would be more or less OK. But that was before paged memory existed! (Well, not chronologically, but on the PDP-7/11.) I.e. the argument was basically: we're OK with buffer overflows because they're benign on our PDP-7 in 1969.

Obviously this lone argument for null-terminated strings should be revised!

This is a place where C could be improved, and I like the Objective-C solution of providing a class (a "plug-in") to implement the representation of string literals.

--
__Pascal J. Bourguignon__ 
http://www.informatimago.com
Reply to
Pascal J. Bourguignon

TNP,

You want to be pedantic and point out the obvious ? No, you do not need to do that. Just making sure that the /last/ character is zero will do fine.

Though I must say that I thought that "malloc" already took care of that. My mistake. You would need to use "calloc" instead.

Exactly. You should /not/, as a default, just work with whatever you were able to copy.

Bingo! You found one of the exceptions-to-the-rule. Well done. :-)

But tell me, how does that counter my suggestion to tell "strncpy" that the buffer is one char smaller than it actually is ?

Worse: it is a situation (string possibly longer than what the target buffer can hold) where you should actually be using it.

And do tell me: You seem to be opposing my suggestion. Why ? What /benefit/ would you gain from /not/ doing it ?

Regards, Rudy Wieser

Reply to
R.Wieser

Except that strn... is not guaranteed to be thread-safe.

Reply to
Wouter Verhelst

Pascal,

Really ? Good luck with "printf"-ing such a non-terminated string then. :-)

Wrong. See below.

While you are right there, seeing it or not is a whole other matter.

Good luck with determining where one string ends and the next one starts, I would say. :-)

And by the way: the "nul" character was used, long ago, as a short delay for mechanical(!) teletypes (no handshaking, no hold-off or buffering facilities) - so that the stream of sent characters could be continuous while the TTY still had a chance to actually return its carriage and advance a line before typing the next actual character.

And yes, once upon a time I had such a device. :-)

Wrong subject I'm afraid.

Regards, Rudy Wieser

Reply to
R.Wieser

How do you make sure that the last character is zero? Do you prove it mathematically? Have you tested it too? Do you ensure it manually? Are you sure you didn't miss one or make a mistake?

Using strn... with strlen is dumb.

The only way to make sure that the last character is zero is to keep the buffer size and the string length along with the string:

typedef struct {
    size_t allocated;
    size_t length;
    char *characters;
} string;

and to let the code perform the check itself!

void string_validate(string *s) {
    if (s == NULL) { error("null string"); }
    if (s->characters == NULL) { error("invalid string"); }
    if (s->allocated < s->length) { error("string length too big for allocated size"); }
    if (s->characters[s->length]) { error("string internal representation is not null-terminated!"); }
}

You didn't ensure it by code.

--
__Pascal J. Bourguignon__ 
http://www.informatimago.com
Reply to
Pascal J. Bourguignon

Wouter,

You mean that the other thread could throw the string away and invalidate the memory, causing "strn..." to try to access memory that's not available anymore ? True. But that's a whole other can of worms.

Besides, the resulting exception/crash would be a whole lot better than the silent memory overrun a non-terminated string would cause. :-)

Regards, Rudy Wieser

Reply to
R.Wieser

That too.

Also that between strn... starting to do its operation and finishing it, another thread might have grown (or shrunk) the string.

Sure.

Reply to
Wouter Verhelst

Pascal,

The same goes for TNP's and your own code. Funnily enough, I do not see you having any problems with either of those ....

I think you are looking for a way out, and to that end demand that I prove any-and-everything. Which makes me think it's better to terminate our conversation before harsher words are spoken.

So, goodbye. Have a good life and do not let the (bed) bugs bite you. :-)

Regards, Rudy Wieser

Reply to
R.Wieser

Indeed, null introduced a delay, due to the time needed to transmit it. But it was entirely ignored (had to be) by the receiving device.

To separate your strings, you would use the US (Unit Separator) ASCII code, or transmit a binary length first.

--
__Pascal J. Bourguignon__ 
http://www.informatimago.com
Reply to
Pascal J. Bourguignon

Wouter,

True. The former would result in an unterminated string (which my suggestion would guard against). The latter would either return a string starting with the old and ending with the new, or just return the full old one (in other words, not a buffer overflow).

Though my take on it was more about trying to copy from strings/string buffers when you're no longer fully aware of what they contain - regardless of whether in the same or another thread.

Regards, Rudy Wieser

Reply to
R.Wieser

That only says that you are better in C than in Turbo Pascal.

--
Reply to
Björn Lundin

Or C, or any language - even BASIC.

Or the beer holder! That applies to more than just pulchritude: one should not code when under the affluence!

--
J. P. Gilliver. UMRA: 1960/
Reply to
J. P. Gilliver (John)

I can only give you the answer Dennis Ritchie gave me. There were a number of factors. Using a length indicator would have meant deciding on a size for the length, which would be a machine-dependent size[1]. Whatever size of length indicator you choose has issues - use a char and on most machines that limits you to 256-character strings; use an int on a 64-bit machine and no string is shorter than nine bytes long (think of all those two-letter commands in Unix) - remember, memory was expensive and wasting it was not sensible. Using a null terminator meant that the only limit was the machine's address range, and it avoided any risk of making dangerous assumptions about the maximum possible string size.

Finally there is the important point that C was designed for efficiency *not* safety - the implementation of strcpy from the first edition of K&R is gloriously efficient, and did not even depend on your handing it pointers to the start of the string.

strcpy(d, s)
char *d;
char *s;
{
    while (*d++ = *s++);
}

For a more dramatic example of efficiency consider strtok(), the routine at the heart of turning a string into argc and argv. It works by injecting nulls into the string where the spaces are and returning, one call at a time, a pointer to the start of each word in the original string - in other words it turns a single string into several strings - and it does so in place, without using more memory than is absolutely essential. Doing string manipulation in place with minimal overhead was very important at the time C was designed.

Yes, this violates several modern principles - all of which are about preferring safety to efficiency, which in these days of pocket supercomputers is laudable but in the days of souped-up adding machines with tiny memories would have been insanely impractical.

[1] Remember, in C, char is no larger than int, which is no larger than long; char is big enough to hold the character set required for C source; and sizes are in terms of char. Also think how much that would impact the parts of the compiler code that you would really like to be machine-independent.
--
Steve O'Hara-Smith                          |   Directable Mirror Arrays 
C:\>WIN                                     | A better way to focus the sun 
Reply to
Ahem A Rivet's Shot

Hence the next sentence in my reply, which you've snipped:

"In the more general case, however, your point is taken."

But I do know what could happen. Someone could use my code in a multi-threaded environment. Equally, someone could use my code and replace all instances of sin() with cos(), cos() with tan(), and tan() with sin(). Or they could change all the #defines. Or they could use the code in an environment where CHAR_BIT is not 8 (a restriction I regret, but not as much as I'd regret static arrays of, say, 16 giga-octets). Or they could flip a few random bits in the object code. Or they could use a compiler that assumes one calling convention and a linker that assumes a different one.

If they do any of those things, that really is their problem.

It would have to be a mistake - that of using, in a multi-threaded environment, code designed for use in a single-threaded environment.

A *deliberate* single-threaded attempt to stop my code from null-terminating a copy of a string would fail. (If an attack is likely, the place to guard against it is after the length caching but before the malloc.)

And a deliberate *multi*-threaded attempt to attack the code would entail the mistake of using my code in a multi-threaded environment.

--
Richard Heathfield 
Email: rjh at cpax dot org dot uk 
Reply to
Richard Heathfield

*Any* language?

I think I would draw the line at BrainF**k, or Ook!, or Whitespace, or Piet. These languages were *designed* not to be readable.

Which is the saving grace of Piet. Piet programs can be rather attractive.

--
Richard Heathfield 
Email: rjh at cpax dot org dot uk 
Reply to
Richard Heathfield

This program is pretty ...

#define _ -F

Reply to
Ahem A Rivet's Shot

Richard,

/You/ brought multi-threading up, I didn't. Solve your own problems.

Furthermore, do you think that those problems will be any different when you do /not/ use my suggestion ?

If not, what the f*ck are you babbling at /me/ about them for ? Dishonest much ?

tl;dr: Try to concentrate on the difference between applying my suggestion or not. If you can that is. Can you ?

Regards, Rudy Wieser

P.s. Yes, you pissed me off. How did you guess ?

Reply to
R.Wieser

Indeed. I was trying to imagine a situation in which the code I'd posted could break, and that was the only one I could come up with.

I do. That isn't one of my problems.

The rest of your reply suggests that we're not going to have a pleasant discussion, so I'll stop there.

--
Richard Heathfield 
Email: rjh at cpax dot org dot uk 
Reply to
Richard Heathfield
