embedded questions!!! - Page 2

Re: embedded questions!!!


Understood.


See Harbison and Steele, 5th edition, page 141.  I think it states
what I wrote.

Jon

Re: embedded questions!!!
On Fri, 13 Jan 2006 20:25:33 GMT in comp.arch.embedded, Jonathan


Well, maybe I'm just getting nitpicky.  Which is why I qualified my
statement with "arguably."

My point was that it's not a positional thing, e.g., that i[a] is not
equivalent to *((&i[0])+(a)).  As I stated in my earlier post, the
unadorned name of an array devolves into a pointer to the first
element of that array.  In the reference, Harbison names the mechanism
by which this occurs (the usual unary conversions).

Note that in the same paragraph he says what I said above, namely i[a]
is equivalent to a[i].

Regards,
                                        -=Dave

--
Change is inevitable, progress is not.

Re: embedded questions!!!


That's fine.  Now, have you looked at my question regarding the
contents of unnamed string initializers?  I've done a brief look at
the C99 standard and haven't found a specific answer (which may mean
it is buried in a chain of logic.)  I also have looked through three
Harbison and Steele editions without luck.  I may jump into my
compiler books (like the two editions of the 'dragon' book, and
others) to see if I can find it referenced there.  But I don't have a
good model for this in my head and looking at these examples exposed
my own ignorance on that specific point.

Jon

Re: embedded questions!!!
On Fri, 13 Jan 2006 20:51:04 GMT in comp.arch.embedded, Jonathan


I don't think the Dragon book will help.  It's really a language issue
rather than a compiler issue.

My copy of the standard is in a box somewhere, but H&S5, p33, about
3/4 way down the page, says (asterisks indicate *bold* text):

   "*Storage for string constants.*  You should never attempt to
    modify the memory that holds the characters of a string constant
    since that memory may be read-only -- that is, physically
    protected from modification."

IIRC, the type of string literals is "array of (non-const, plain)
char,"  but writing to them results in undefined (rather than
implementation-defined) behavior.  You might check with the denizens
of comp.std.c to be sure...
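
A minimal sketch of the practical consequence, assuming a hosted C99 compiler. The helper names (capitalize_first, demo_modifiable_copy) are made up for illustration and appear nowhere in the thread. The idea is to treat the literal as read-only and modify a copy held in an ordinary array instead:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: upper-cases the first character of a writable buffer. */
static void capitalize_first(char *s)
{
    assert(s != NULL);
    if (s[0] != '\0')
        s[0] = (char)toupper((unsigned char)s[0]);
}

/* Returns 1 if the copy was modified and the literal was left alone. */
static int demo_modifiable_copy(void)
{
    const char *literal = "hello";  /* points at the literal: treat as read-only */
    char buffer[] = "hello";        /* array initialized from the literal: writable */

    capitalize_first(buffer);       /* fine: buffer is an ordinary array */
    /* capitalize_first((char *)literal);  would be undefined behavior */

    return strcmp(buffer, "Hello") == 0 && strcmp(literal, "hello") == 0;
}
```

Declaring the pointer as const char * is a choice made in this sketch, not something the language requires; it lets the compiler reject accidental writes through the pointer.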

Regards,
                                        -=Dave

--
Change is inevitable, progress is not.

Re: embedded questions!!!


I think that settles it.  Thanks, Dave.  And it makes sense, too.
Which is a good thing.

Jon

Re: embedded questions!!!
On Fri, 13 Jan 2006 21:12:05 GMT, the renowned Jonathan Kirwan


In the standard, these excerpts and examples might be relevant:

6.5.2.5 Compound literals

9 String literals, and compound literals with const-qualified types,
need not designate distinct objects.82)

82) This allows implementations to share storage for string literals
and constant compound literals with the same or overlapping
representations.

...

6.5.2.5 Compound literals

13 EXAMPLE 5 The following three expressions have different meanings:
"/tmp/fileXXXXXX"
(char []){"/tmp/fileXXXXXX"}
(const char []){"/tmp/fileXXXXXX"}

The first always has static storage duration and has type array of
char, but need not be modifiable; the last two have automatic storage
duration when they occur within the body of a function, and the first
of these two is modifiable.


14 EXAMPLE 6 Like string literals, const-qualified compound literals
can be placed into read-only memory and can even be shared.

For example,

(const char []){"abc"} == "abc"

might yield 1 if the literals' storage is shared.



6.7.3 Type Qualifiers

If an attempt is made to modify an object defined with a
const-qualified type through use of an lvalue with non-const-qualified
type, the behavior is undefined.
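
To make EXAMPLE 5 concrete, here is a small sketch, assuming a C99 compiler. The function fill_template is a hypothetical stand-in for a routine that writes into its argument; it is not a library call. It shows that the non-const compound literal can be modified, which the plain string literal cannot safely be:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: overwrites the last six characters of a writable
   template with the decimal digits of n. */
static void fill_template(char *tmpl, unsigned n)
{
    size_t len = strlen(tmpl);
    assert(len >= 6);
    for (size_t k = 0; k < 6; k++) {
        tmpl[len - 1 - k] = (char)('0' + n % 10);
        n /= 10;
    }
}

/* Returns 1 if the compound literal was modified as expected. */
static int demo_compound_literal(void)
{
    /* Non-const compound literal: automatic storage here, and modifiable. */
    char *name = (char []){"/tmp/fileXXXXXX"};

    fill_template(name, 123456u);
    /* fill_template("/tmp/fileXXXXXX", 123456u);  would be undefined behavior */

    return strcmp(name, "/tmp/file123456") == 0;
}
```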



Best regards,
Spehro Pefhany
--
"it's the network..."                          "The Journey is the reward"
snipped-for-privacy@interlog.com             Info for manufacturers: http://www.trexon.com
Re: embedded questions!!!
On Mon, 16 Jan 2006 14:41:56 -0500, Spehro Pefhany


Thanks much.  You just saved me some time, this evening!

Jon

Re: embedded questions!!!


That's still incorrect.  The actual rule is that any expression of
array type (not just "the name of an array variable"!) used in a way
that requires a pointer, automagically turns into a pointer to its
first element.  The canonical counter-example is that

    sizeof(array) == sizeof(&(array[0]))

will be true only rather rarely.
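
A short sketch of that rule, assuming C99. The function names are invented for illustration. It checks that sizeof sees the whole array, that a function parameter sees only a pointer, and that a[i] and i[a] name the same element:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: receives the array after it has been converted to a
   pointer, so sizeof(p) is the size of a pointer, not of the caller's array. */
static size_t size_seen_by_callee(const int *p)
{
    return sizeof(p);
}

/* Returns 1 if all the identities discussed hold for a sample array. */
static int check_array_decay(void)
{
    int a[10] = {0};
    size_t i = 3;

    a[i] = 42;
    assert(sizeof(a) / sizeof(a[0]) == 10);

    /* sizeof is one of the contexts where no conversion takes place. */
    int whole_array = (sizeof(a) == 10 * sizeof(int));

    /* Used as a function argument, 'a' converts to &a[0]. */
    int decayed = (size_seen_by_callee(a) == sizeof(int *));

    /* a[i] is defined as *(a + i); addition commutes, so i[a] is the same. */
    int commutes = (a[i] == i[a]) && (a[i] == *(a + i)) && (&a[0] == a);

    return whole_array && decayed && commutes;
}
```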

--
Hans-Bernhard Broeker ( snipped-for-privacy@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.

Re: embedded questions!!!


I'll assume that's supposed to be 'char *str1 = "JHONSON";'...




Nope, it's an array.  If it helps to make things clearer, sizeof(str1)
will be different than sizeof(str2).  In particular:

     sizeof(str1) == sizeof(char *)
     and
     sizeof(str2) == strlen(str2)+1

That should help make it clearer that str1 and str2 are different types.
If you were to give them names, str1's type would be "char *" and str2's
type might be "char[8]".  One is a pointer and the other is an array.

That may not make much sense, but think for a moment about how an int
can be automatically converted to a float:

    int i = 1;
    float f = 3.14159;

    printf ("%g", i + f);

What's going on here is that the compiler sees an expression where the
operator (in this case "+") has two arguments which are different types.
It then implicitly converts the integer into a floating point value
when generating the code that will correspond to that expression.

A similar thing happens when arrays are used where a pointer value is
needed.  The compiler thinks "this isn't a pointer, but since it's an
array, I can create a pointer value from it".

Another way they differ, if I understand correctly, is that this would
be legal:

    str1 = (char *) "another string";

but this would not be:

    str2 = (char *) "another string";

The reason is this:  str1 is just a pointer, so it can take on any value.
But str2 is the array that contains { 'J', 'H', 'O', 'N', 'S', 'O', 'N', 0 },
and how are you supposed to assign a pointer to that?
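
The difference can be sketched as follows, assuming C99. The function name is hypothetical, and str1 is declared const char * here (the thread uses plain char *) so that the compiler guards the literal. The pointer is simply re-aimed, while the array has to have new characters copied into it:

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the pointer and the array behave as described. */
static int demo_pointer_vs_array(void)
{
    const char *str1 = "JHONSON";  /* pointer to the literal's first char */
    char str2[] = "JHONSON";       /* array of 8 chars, copied from the literal */

    /* Different types, hence (usually) different sizes. */
    int sizes_ok = (sizeof(str1) == sizeof(const char *)) &&
                   (sizeof(str2) == strlen(str2) + 1);

    str1 = "another";              /* legal: the pointer now points elsewhere */
    /* str2 = "another";   would not compile: an array is not assignable */

    assert(strlen("another") + 1 <= sizeof(str2));
    strcpy(str2, "another");       /* copy the characters instead */

    return sizes_ok && strcmp(str1, str2) == 0;
}
```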

   - Logan

Re: embedded questions!!!



There are good arguments that that's what it _should_ be like, but it
isn't.






Completely wrong, for two reasons.

1) A typo: the first one has to read

    char *str1 = "JHONSON"

2) Incorrect ideas about C.  The string allocated by this (fixed)
declaration is *not* modifiable.  Any program trying to write to
str1[i] would be broken.

Some C platforms will indeed keep string literals in modifiable
storage.  But the language definition explicitly allows them to be in
read-only storage, and that means all programs assuming otherwise are
broken.  They cause undefined behaviour.

This is all for hysterical raisins.  From an abstract point of view,
the type of string literals _should_ obviously be "array of const
char", which would mean

    char *str1 = "JHONSON"

would trigger a warning about mixing const and non-const.  But it
isn't.  That's because there was no "const" in the original language,
and by the time it was added in a reliable way, it was way past too
late to change that.

--
Hans-Bernhard Broeker ( snipped-for-privacy@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.

Re: embedded questions!!!

you mean "char *str1 = ..."

Not so.  I just wrote an explanation of the difference in this
thread.  The char * declaration _MAY_ create a modifiable string,
but need not.  This allows multiple such declarations to point to
the same actual string, if the compiler so chooses.
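
A sketch of what that sharing looks like from inside a program, assuming C99; both function names are invented for illustration. Whether the two pointers compare equal depends on the compiler and its options, so the code reports the result rather than relying on it:

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if this implementation happened to merge the two literals,
   0 if it kept them distinct. Either answer is permitted. */
static int literals_share_storage(void)
{
    const char *p = "Hi there.";
    const char *q = "Hi there.";
    return p == q;
}

/* Contents compare equal whether or not the storage is shared. */
static int literals_have_equal_contents(void)
{
    const char *p = "Hi there.";
    const char *q = "Hi there.";
    assert(p != NULL && q != NULL);
    return strcmp(p, q) == 0;
}
```

Code that needs two distinct, writable strings should use two arrays rather than two pointers to literals.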

--
"If you want to post a followup via groups.google.com, don't use
  the broken "Reply" link at the bottom of the article.  Click on
Re: embedded questions!!!
On 13 Jan 2006 14:27:32 GMT, "John B"


To expand on this:

  const char *str1 = "some string";

It produces something akin to:

  char unnamed[] = "some string";
  const char *str1= &unnamed[0];

where 'unnamed' isn't accessible by name, at the language level.

...

However, there is an interesting question this called to mind that I'm
not precisely clear on, though I can explain my own mental model about
it.  This is the issue of the string initializer itself.

I believe the array contents of 'unnamed' need not be (but may be)
modifiable.  In other words, the compiler may or may not use read-only
memory such as flash for the location of 'unnamed's array of chars.

I'm a little unclear on this point, though.  So perhaps someone can
cite the chapter in the standard that clarifies this.

In other words, do these two statements set up different type
qualifiers for the literal?

  char *str1= "Hi there.";
  char const *str2= "Hi there.";

Would a c compiler be allowed to "fold" these two constant arrays so
that they occupy the exact same memory?

I'm unclear on that, though my own mental model says that a compiler
may fold these together, despite the fact that the pointers to the
then-identical array of characters do NOT have the same type
qualifiers.  And, further, I think whether or not the compiler places
such unnamed strings in read-only memory or read-write is not
specified by the language and the compiler is allowed to make either
choice.

But I don't know.

Jon

Re: embedded questions!!!


Actually, I think folding two strings together (which is an option on
some compilers) may be against the standard.  So that part of the
question may not be relevant.  But the subtler question remains as to
whether or not the string initializers themselves should be considered
unmodifiable by a programmer.  I'd argue that they must be considered
unmodifiable, as some operating systems/compilers may place these
constants in "program text areas" or otherwise in read-only protected
memory.  So that when saying:

  char *s1= "hello";

you are doing something like,

  char *s1= (char *) ((const char []) { "hello" });

Jon

Re: embedded questions!!!
On 13/01/2006 the venerable Jonathan Kirwan etched in runes:


We must also remember that C was devised for use on a machine with a von Neumann
architecture. Many modern microcontrollers use a Harvard architecture, and this
presents the compiler writers with the problem of where to put unmodifiable data.

--
John B

Re: embedded questions!!!
On 13 Jan 2006 21:38:47 GMT, "John B"


The PDP-11, for one.


I'm intimately aware of that.


Some thoughts on that:

The problem presented with any compiler used for dedicated embedded
situations is all the work it takes to meet the spec before arriving
at main().  At the time c was being developed, the current thinking
about running program environments included the following functional
classifications:

  Segment Name    Segment Description
  -------------------------------------------------
  CODE            Code section
  CONST           Constant data section
  INIT            Initialized data section
  BSS             Uninitialized data section
  HEAP            Heap section
  STACK           Stack section

[Actually, the very concept of 'stack' as a general purpose workhorse
was still in its childhood -- many of the existing and commercially
successful machines did NOT support, via hardware, the idea of a stack
and a great deal of code had been written completely without them
except as a specialized concept for certain problems.  (I worked on
such operating systems and languages.)  Heap was kind of new, too. The
PDP-11 was only just out around 1970, or so, to light the way out of
the darkness.  :) ]

In Von Neumann, all of these are in the same memory addressing system.
Modern concepts weren't completely worked out and the PDP-11 included
support for several equally good conventions.  But the gist of the
above is that stack would grow down, heap grow up, and that the other
four sections were each of fixed size at start.  Only the CODE, CONST
and INIT sections needed to be kept "on disk" or in some form of
non-volatile storage (which could, of course, include cards, tape, or
whatever.)  Neatly, the non-volatile portions are all of fixed size.

In other words, like this:

  Section Description     Access          NV?     Size
  =================================================================
  Code                    Execute         Yes     Fixed/static
  Constants               Read            Yes     Fixed/static
  Initialized Data        Read/Write      Yes     Fixed/static
  Uninitialized Data      Read/Write      No      Fixed/static
  Heap                    Read/Write      No      Variable, up
  Stack                   Read/Write      No      Variable, down

If you look at the above list and think about Harvard architectures
and c programming generally, you find that the code must be placed in
code memory while the others must all be placed in data memory.  On
Von Neumann, this is the same memory system.  On Harvard, two
different ones, at least.

But this is NOT a problem, really.  Even in the Harvard case.  In
fact, it's not too far from how an operating system would do it under
the Intel 80386 and above, if it wanted to implement an execute-only
region for the code (you can't read it as data.)  And the 80386 is NOT
Harvard.

Also, keep in mind that it is still the case that the first three must
be stored somewhere in non-volatile memory.  In the case of Von
Neumann systems with flash on-chip, this is fairly easy -- just place
it there.  Both code and data can be accessed without having to move
any of it around.

A question for c in embedded use for Harvard comes in the use of
pointers.  A pointer to code memory may NOT occupy the same memory
footprint (in other words, the sizeof() the two pointer types may be
different) and the actual instructions used to access these different
types of memory may be different.  The different size can be fixed, by
requiring the larger of the two sizes for all (in other words, making
a union.)  And code generation can simply depend on the declaration of
the pointer.  I believe casting can also be handled.  So, frankly,
neither of these is insurmountable and it is quite possible for a c
compiler to accept straight c code and generate functioning programs
on Harvard machines without special decorations/declarations.

For Harvard, a re-definition of the Von Neumann layout is in order, if
you want to be able to port code as easily as to another Von Neumann
system. Something like these functional areas:

  Segment Name    Segment Description
  -------------------------------------------------
  CODE            Code section
  CONST_copy      Data for constant section
  INIT_copy       Data for initialized data section
  CONST           Constant data section
  INIT            Initialized data section
  BSS             Uninitialized data section
  HEAP            Heap section
  STACK           Stack section

In this case, the first three must be placed in non-volatile memory --
flash, for example.  And the remaining can be placed in volatile.  At
start, pre-main() code copies CONST_copy into CONST and INIT_copy into
INIT before starting main().  If this is done, then once again all
data memory is accessible as data.  And Harvard works consistently
with c's model, I think, in this case.
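
The pre-main() copy step can be sketched like this, assuming a host compiler. Plain arrays stand in for the section images, and all names are hypothetical. On a real target the linker script would supply the section addresses and sizes. Zeroing BSS is an addition of this sketch, based on the C rule that static objects without initializers start at zero:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical section images held in non-volatile memory. */
static const unsigned char const_copy[] = { 'a', 'b', 'c', 0 };  /* CONST_copy */
static const unsigned char init_copy[]  = { 1, 2, 3, 4 };        /* INIT_copy */

/* Their run-time homes in data memory. */
static unsigned char const_ram[sizeof const_copy];  /* CONST */
static unsigned char init_ram[sizeof init_copy];    /* INIT */
static unsigned char bss_ram[8];                    /* BSS */

/* What the startup code would do before calling main(). */
static void startup_copy_sections(void)
{
    assert(sizeof const_ram == sizeof const_copy);
    memcpy(const_ram, const_copy, sizeof const_copy);
    memcpy(init_ram, init_copy, sizeof init_copy);
    memset(bss_ram, 0, sizeof bss_ram);  /* statics must start out zeroed */
}
```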

For embedded Harvard processors -- the only difficult problem in the
above is if there are _no_ instructions which can read from code space
and if the code space is the only non-volatile memory present.  In
such cases, I believe, space will have to be reserved in data memory
and code instructions must be able to use immediate-mode constants
they can load into registers and then place into data memory to
initialize them to specific values.  That would be painful, but
doable, if instructions support some form of immediate mode and there
is enough code space, of course.

So the bottom line, I think, is that Harvard really isn't exactly an
insurmountable problem for c compilers accepting unvarnished c code,
granting an instruction type or two on the target.

That is, until you worry about practical things like scarce resources
-- such as RAM.  It is one thing that INIT_copy needs to be copied
into INIT.  There is no avoiding the need to use RAM for initialized
data that can be later modified.  It has to be in RAM.  But CONST_copy
must also then be copied into RAM where it can be accessed as data and
it would be nice if, instead, those constants could just remain in the
code space and not take up RAM resources at run-time.

So that can be a bad thing.  There may not be very much RAM to go
around.  So suddenly you have a strong desire to access data that sits
in code space (if data space doesn't include non-volatile memory.) But
if you cave in to that desire then you may have another problem,
passing around pointers to data which may be either in code space or
data space.  In that case, you either need to fashion data pointers
which support both (and that will likely balloon the code as well as
slow execution time) or else have the compiler emit code for one kind
of access where that routine then cannot accept pointers to the other
space.  I was faced with this problem, for example, using the PIC
chips -- in a routine that was basically a "printf()" accepting
strings which could _either_ be in a RAM buffer _or_ constant literals
located in code space.
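
One way to picture the "pointer that supports both spaces" option is a tagged pointer, sketched here under loud assumptions. The enum, struct and function names are all invented, and both read routines are ordinary loads so the code runs on a host. On a real Harvard part the code-space read would be a special instruction or a vendor intrinsic:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical address-space tag. */
enum space { SPACE_DATA, SPACE_CODE };

/* A "generic" pointer that records which memory it refers to. */
struct gptr {
    enum space space;
    const char *addr;
};

/* Stand-ins for the two different access instructions. */
static char read_data_space(const char *p) { return *p; }
static char read_code_space(const char *p) { return *p; }

/* Every access pays for a run-time dispatch on the tag. */
static char gptr_read(struct gptr p, size_t offset)
{
    assert(p.addr != NULL);
    return (p.space == SPACE_CODE) ? read_code_space(p.addr + offset)
                                   : read_data_space(p.addr + offset);
}

/* A strlen that accepts strings from either space. */
static size_t gptr_strlen(struct gptr s)
{
    size_t n = 0;
    while (gptr_read(s, n) != '\0')
        n++;
    return n;
}
```

The extra tag and the branch on every access are the code growth and slowdown mentioned above; the alternative is a separate routine per address space.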

So vendors do expand things by adding type qualifiers or #pragma
statements.  But mostly to be competitive and sell their product --
not so much because they absolutely have to.  Harvard can be painful
for vanilla c and it can be uncompetitive with decorations added, but
I don't think it is impossible in principle.

Jon

Re: embedded questions!!!


Harvard vs. von-Neumann architecture has little or nothing to do with
that.  The only question is whether the target platform has some kind
of read-only memory accessible as data (rather than only as code) or
not.  If data ROM is available, string literals should usually go
there.  If there isn't, it doesn't matter where you put them.

--
Hans-Bernhard Broeker ( snipped-for-privacy@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.

Re: embedded questions!!!


Incorrect.  Constant folding was even among the rationales for making
string literals non-modifiable.


They *must* be considered immutable.  Any program assuming differently
will cause undefined behaviour.  Which means that such a program would
be about as fundamentally buggy as a program can possibly be and still
make it through some compilers.

As I've written here before, the fact that the type of "hello" is
formally "array of char" instead of "array of const char" is a
historical accident.

--
Hans-Bernhard Broeker ( snipped-for-privacy@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.

Re: embedded questions!!!
On 16 Jan 2006 14:21:35 GMT, Hans-Bernhard Broeker


Constant folding would argue exactly that, of course.  But the
rationale is not the standard.  I would be interested in the section
of the standard where this is addressed.  Anyway, I was guessing in
the back of my mind about the potential distinctness of pointers to
two different instances of the same literal text and whether or not it
might be allowed to have the same address.  I haven't thought deeply
about it and I don't recall reading a specific point in the standard,
so that was my guess.  Now, I suppose, I should have to go more deeply
into it.  Unless you have a ready citation.

And thanks for the point.


Makes a lot of sense to me.


I didn't read what you wrote before.  Thanks.

Jon

Re: embedded questions!!!


Well, I don't have a copy of C89, but see K&R2 (ANSI C edition),
appendix section A.2.6.  Also see the C FAQ, entry 1.32, and
references therein.


They're explicitly neither forbidden, nor guaranteed, to be the same.
This falls under the general heading of implementation-defined
behaviour.

Some people may think that the need for a distinction between two
objects being the same vs. them being equal is a novelty of OO
programming.  Well, it's not.

--
Hans-Bernhard Broeker ( snipped-for-privacy@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.

Re: embedded questions!!!
On 16 Jan 2006 18:51:46 GMT, Hans-Bernhard Broeker


I'd like to see it in the standard, but I looked at your reference in
K&R 2nd ed.  And it says what you say it says.


That's what K&R 2nd ed. appears to say.  I'm with you, there.


I wasn't thinking of OO here.  However, I'll see if I can find the
point referenced in C99 or C89.  I like the points you've made, but
I'm still curious.

Jon
