Just what makes an architecture "C Friendly"?

- kyle york

Thu, Jun 1, 2006 8:40 PM

These are annoying, but far from fatal (see the x86 which was reasonably C friendly)

same as above

You missed:

o constants are accessed differently than regular data (they live in code space)

o no software stack (and only a single pointer), making local variables difficult. All C compilers I've seen have the restriction that all local variables are considered `static'
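To see why that restriction hurts, here is a minimal sketch in portable C. It is not output from any real compiler for such a part; `fact_static` simply imitates one by declaring its locals `static`, and both function names are made up for illustration.

```c
#include <assert.h>

/* Ordinary C: every activation gets its own copy of the locals,
   normally in a stack frame. */
unsigned long fact_auto(unsigned long n)
{
    unsigned long n_saved = n;      /* automatic: one copy per call */
    unsigned long rest;

    assert(n <= 12);                /* 12! is the largest factorial that fits in 32 bits */
    if (n <= 1)
        return 1;
    rest = fact_auto(n - 1);
    return n_saved * rest;
}

/* Imitation of a compiler that gives every local a fixed address. */
unsigned long fact_static(unsigned long n)
{
    static unsigned long n_saved;   /* one copy shared by all calls */
    static unsigned long rest;

    assert(n <= 12);
    n_saved = n;
    if (n <= 1)
        return 1;
    rest = fact_static(n - 1);      /* the nested call overwrites n_saved */
    return n_saved * rest;
}
```

With these definitions `fact_auto(4)` gives 24, while `fact_static(4)` gives 1, because each nested call overwrites the single shared `n_saved`. Recursion and re-entrancy are what is lost when locals cannot live on a software stack.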

--
Kyle A. York
Sr. Subordinate Grunt

- Tim Wescott

Fri, Jun 2, 2006 12:03 AM

Yes, but much less so, particularly in small systems.

Because C only recognizes one kind of data space. In C a pointer is a pointer is a pointer, but in a Harvard machine a pointer to code space could have the same value as a pointer to data, yet not point to the same thing at all.

--

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Posting from Google?  See http://cfaj.freeshell.org/google/

"Applied Control Theory for Embedded Systems" came out in April.
See details at http://www.wescottdesign.com/actfes/actfes.html

- fox

Fri, Jun 2, 2006 6:41 AM

Yes, but I would go much further.

o byte addressing

o indexed addressing mode

Technically a stack is only accessed from one end: last-in, first-out.

I don't know if it is 'by nature', but perhaps it is. C in practice often uses a single stack to hold stack frames, and access to stack frames is not the last-in, first-out access that defines a 'stack'.

Using a stack for stack frames requires "easy stack manipulation" as you put it. And it requires an indexed addressing mode and the ability to index into a stack frame to get at arrays in memory.

And putting arrays in memory inside a stack means that you want, as you put it, "no hard limits" on stack size.

Forth does not use the strict computer science definition of a stack either, because many common Forth words access the top two cells on the stack, not just the top one. And many Forths provide a way to index deep into a stack and build stack frames like in C. But parameters are passed between words on one stack, while return addresses, control flow parameters, and locals may go on a different stack. So two LIFO stacks might be all that is needed there. And it might be easily demonstrated that complex apps may need very few data stack and return stack cells, and could be very happy with what would be seen as severely size-limited stacks from a C programmer's perspective.

But C will usually want byte addressing and indexed addressing and the ability to put arrays of arbitrary size into stack frames and manipulate stack pointers into memory easily. So C would not like a hardware LIFO stack very much, especially if it was small.
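A small sketch of the kind of code that creates this demand. The function name and `TABLE_LEN` are hypothetical, and whether `table` really ends up in a stack frame depends on the compiler and its optimiser; on a typical stack-based implementation it does.

```c
#include <assert.h>

#define TABLE_LEN 8u

/* Sum of the squares 0*0 .. (n-1)*(n-1), computed through a local array. */
unsigned sum_squares(unsigned n)
{
    unsigned table[TABLE_LEN];  /* local array: normally part of this call's frame */
    unsigned total = 0;
    unsigned i;

    assert(n <= TABLE_LEN);
    for (i = 0; i < n; i++)
        table[i] = i * i;       /* store at frame base + i * sizeof(unsigned) */
    for (i = 0; i < n; i++)
        total += table[i];      /* indexed load from the middle of the frame */
    return total;
}
```

Neither loop pushes or pops anything. Both reach into the frame at a computed offset, which is exactly what a pure hardware LIFO stack cannot do.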

There are many people who love Unix and Linux and GNU and GCC and all that great stuff. They are likely to feel that what makes an architecture "C Friendly" is that it runs what C was designed to write.

And GCC is clear that it wants a machine with

o byte addressing

o indexed addressing mode

o lots of general purpose registers

32 bits wide or more, preferably a power of 2, and with enough memory to run GCC.

And I think that there are other things needed to support traditional multi-user, multi-tasking, OS protected systems running potentially hostile or destructive programs in a modern way. You are going to need to trap many errors and have a way out.

Stack overflow, stack underflow, watchdog timer recovery from certain deadlock conditions, interrupts, memory protection faults, divide by zero, etc.

C strictly speaking doesn't need any of this stuff. But if architectures don't have it, people will definitely tell you that it is not C friendly.

Take my Pic. Please!

- Paul Black

Fri, Jun 2, 2006 7:10 AM

My point is that this is not specific to C; it applies equally to any language, including assembler.

--
Paul

- David Brown

Fri, Jun 2, 2006 7:13 AM

Yes - most things that are good for C friendliness are also good for assembly friendliness. The reverse is not necessarily true, however - for example, rotate instructions are used much more in assembly programming than C programming (although they are useful to the C compiler when dealing with data types wider than the ALU).
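As an illustration of why rotates are rarer in C: the language has no rotate operator, so a rotate has to be spelled with two shifts. The helper below is a sketch with a made-up name (`rotl32`); some compilers recognise this pattern and emit a single rotate instruction, but that is an optimisation, not something the language promises.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bit positions.
   Masking keeps both shift counts in the range 0..31, so neither
   shift is ever by the full width (which would be undefined). */
uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}
```

For example, `rotl32(0x12345678, 8)` moves the top byte round to the bottom and gives `0x34567812`.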

- David Brown

Fri, Jun 2, 2006 7:27 AM

I know it's slightly off-topic, but since you've added that web address to your signature... How does your RS08 compiler compare to Metrowerks? I note that Freescale don't mention them on the RS08 webpages.

- Paul Black

Fri, Jun 2, 2006 7:32 AM

Not true. In C, pointers to objects and pointers to functions are different.

Correct. Where would this ever confuse a C program?
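A short sketch of that distinction, with hypothetical names (`counter`, `bump`, `demo`). The two pointer types are unrelated, so the compiler knows at every use which address space is meant, even if the two addresses happen to have the same numeric value on a Harvard machine.

```c
static int counter = 41;

static int bump(int x)
{
    return x + 1;
}

int demo(void)
{
    int *data_ptr = &counter;       /* pointer to object: data space */
    int (*code_ptr)(int) = bump;    /* pointer to function: code space */

    /* int *bad = bump;  would be a constraint violation: the types are incompatible */

    *data_ptr = code_ptr(*data_ptr);   /* call through one, store through the other */
    return counter;
}
```

The first call to `demo()` returns 42. Nothing in the code needs to compare or convert the two kinds of pointer.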

--
Paul

- Paul Keinanen

Fri, Jun 2, 2006 8:46 AM

Unless you cast the code pointer to a void pointer and try to use it in some unspecified way, what exactly is the problem ?

A typical use for code pointers is to take the address of a function, store it in a data structure, and use it as a callback routine. When you invoke the callback routine, you know that you should be calling something in the code space, so there is no ambiguity about which address space should be used for the call, since by definition you cannot execute from the data space anyway.
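A minimal sketch of that callback pattern; `struct event`, `double_id` and `dispatch` are names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* The record itself lives in data space; the handler field holds
   an address in code space. */
struct event {
    int id;
    int (*handler)(int id);
};

int double_id(int id)
{
    return 2 * id;
}

int dispatch(const struct event *ev)
{
    assert(ev != NULL && ev->handler != NULL);
    return ev->handler(ev->id);   /* a call, so the target can only be code */
}
```

Given `struct event ev = { 21, double_id };`, the call `dispatch(&ev)` returns 42. The only operation ever applied to `handler` is a call, so the address space is never in doubt.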

There are of course problems if the code space address is longer than the data space address (e.g. 64 KiB code, 256 B data), but with e.g. 16 KiB code and 1 KiB data space, you still would have to use two-byte pointers.
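The arithmetic behind those sizes can be checked with a small helper. `pointer_bytes` is a hypothetical function that assumes pointers are rounded up to whole bytes.

```c
#include <assert.h>

/* Smallest number of whole bytes that can address every location
   in a space of space_size bytes. */
unsigned pointer_bytes(unsigned long space_size)
{
    unsigned bits = 0;

    assert(space_size > 0);
    while (bits < 32 && (1UL << bits) < space_size)
        bits++;                   /* bits = ceil(log2(space_size)) */
    return (bits + 7) / 8;        /* round up to whole bytes */
}
```

64 KiB needs 16 bits (2 bytes) and 256 B needs 8 bits (1 byte), so the two pointer sizes differ. 16 KiB needs 14 bits and 1 KiB needs 10 bits, and both round up to 2 bytes.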

When I/D space (64 KiB + 64 KiB) support was added to some larger PDP-11 models, using this feature did not usually cause many problems, provided that some precautions had been taken when the applications were originally coded.

Of course you could not use self-modifying code. In-line function parameters could not be used, which caused problems for code generated by some older Fortran compilers, which used in-line parameter passing and had to be executed in a single address space. Some system call macros could not be used for the same reason and had to be replaced with the other form of the same macro.

While this separate Instruction/Data (I/D) space was not the same as a true Harvard architecture, I don't think that there would be many extra other problems.

Paul

- Andreas Schwarz

Fri, Jun 2, 2006 8:51 AM

Paul Keinanen wrote:

A more typical use is accessing constants in ROM ("code memory"), and there it is a problem. You can't just pass a pointer to a function; the function needs to know whether it's pointing at RAM or ROM. For this reason some compilers have a separate set of library functions like puts_p that work with ROM pointers. Others try to hide the difference, but once in a while it blows up and you have to take care of it manually.
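The problem can be imitated in portable C. This is a simulation, not code for any real target: the two arrays stand in for ROM and RAM, an "address" is just an index, and every name here is invented for the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define SPACE_SIZE 16u

/* Two simulated address spaces. Both start at address 0. */
static const unsigned char rom_space[SPACE_SIZE] = "HELLO";
static unsigned char ram_space[SPACE_SIZE] = "hi";

/* Stand-ins for the two different load instructions. */
unsigned char read_rom(size_t addr)
{
    assert(addr < SPACE_SIZE);
    return rom_space[addr];
}

unsigned char read_ram(size_t addr)
{
    assert(addr < SPACE_SIZE);
    return ram_space[addr];
}

/* The same routine has to exist twice, once per space. */
size_t strlen_ram(size_t addr)
{
    size_t n = 0;
    while (read_ram(addr + n) != 0)
        n++;
    return n;
}

size_t strlen_rom(size_t addr)
{
    size_t n = 0;
    while (read_rom(addr + n) != 0)
        n++;
    return n;
}
```

Address 0 is valid in both spaces but names different bytes: `strlen_ram(0)` is 2 and `strlen_rom(0)` is 5. A function that receives only the number 0 cannot tell which answer is wanted, which is why the caller has to pick the right variant.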

--
http://www.mikrocontroller.net

- David Brown

Fri, Jun 2, 2006 9:34 AM

The real issue is for constant data in code space. There are three ways to deal with this - proper C compatibility, which means copying the data from flash to ram at startup (which is impractical for many embedded programs), adding some non-standard extension to distinguish code-space data from data-space data (such as a "flash" keyword or "prog_mem" attribute), and cheating (such as misusing the "const" keyword). Each method has its disadvantages - this is why AVRs are not as C-friendly as the Atmel marketing department would claim.
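A sketch of the first option, again simulated in portable C rather than taken from any real startup file. The array names, `TABLE_LEN` and `copy_initialised_data` are all hypothetical.

```c
#include <string.h>

#define TABLE_LEN 4u

/* Stand-in for the load image that would sit in flash. */
static const unsigned char table_in_flash[TABLE_LEN] = { 10, 20, 30, 40 };

/* The copy that ordinary C code reads. It occupies ram for the whole run. */
unsigned char table_in_ram[TABLE_LEN];

/* What startup code would do before main() on a real target. */
void copy_initialised_data(void)
{
    memcpy(table_in_ram, table_in_flash, sizeof table_in_ram);
}

/* Plain C access: no knowledge of flash is needed after the copy. */
unsigned char table_lookup(unsigned i)
{
    return table_in_ram[i % TABLE_LEN];
}
```

After `copy_initialised_data()`, `table_lookup(2)` returns 30 using nothing but standard C. The cost is that every byte of the table now exists twice, once in flash and once in ram, which is what makes this approach impractical for large tables on a small part.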

There may also be some issues if code pointers are larger than data pointers (for example, targets with 16-bit data addressing and 24-bit code addressing such as the bigger AVRs).

- David R Brooks

Fri, Jun 2, 2006 9:42 AM

I wonder about the numerous registers. Historically, access to registers was much faster than to memory, but with efficient L1 caches, that may no longer be so. Why then, have all those loads & stores? Maybe it's time to dust off those old stack-based architectures (eg English Electric KDF9, Burroughs B5000).

- David Brown

Fri, Jun 2, 2006 9:53 AM

Even the most efficient L1 caches are not as efficient as registers. In particular, caches on multi-GHz chips have quite significant latency. An additional effect is that registers can be accessed with much shorter instructions - a typical 32-bit RISC cpu will encode "r1 = r2 + r3" in a single instruction word, while direct memory access would require at least 4 32-bit instruction words (if any cpu had such an addressing mode combination).

Almost all of the speed-up from using 64-bit mode on an Athlon is due to the extra registers, rather than the 64-bit ALU.

- Paul Keinanen

Fri, Jun 2, 2006 10:56 AM

You seem to think that code space and read only storage are equivalent.

The only constants in the code space of a Harvard computer that I can think of are the constant values immediately after the opcode, as used by any immediate addressing mode like

    ;; bytevar = 123
    mov #123,bytevar

The issue of initialising C "static" variables in data space is a different issue.

In the initialisation code you can have several assignment statements as above to initialise all the static variables.

Alternatively, you can have a ROM in the data space (with the same word length as the data space RAM) and copy the constant values to RAM or you can load the initial values from disk/flash/network with the "program" (actually data) loader.

Paul

- Walter Banks

Fri, Jun 2, 2006 12:05 PM

Look at the Metrowerks listing in detail. They have assembler support for the RS08 only. May have C support in the future.

w..

David Brown wrote:

- David Brown

Fri, Jun 2, 2006 12:29 PM

To C, the code space is separate from data space, and read-only storage is a subset of data space. To a small microcontroller, read-only storage is in the same physical memory as code, and for Harvard ISA microcontrollers, it is accessed with different instructions from ram storage.

Your points below are all strictly speaking correct, but fail to consider the practical issues for a small flash-based Harvard ISA micro such as the AVR. Ideas such as adding ROM in the data space are mostly relevant only for a few dinosaur ISAs like the PDP-11 you mentioned. Modern cpus are virtually all von Neumann (from the programmers' viewpoint - implementation may be more complex), except for small single-chip microcontrollers which have built-in flash (or rom) and ram, although there are a few devices that have Harvard ISAs and connections for external databuses on either the data space or the code space.

Going back to the AVR as an example, you have separate flash memory and ram, which are accessed with different instructions. This separation does not map cleanly onto C's memory model, since read-only data is in data space in C, but ideally should be in flash on the AVR. Copying from flash to ram on startup is fine if you have small amounts of read-only data, but is out of the question if you have large tables. Thus, here in c.a.e., we can say that a von Neumann ISA is essential to being C-friendly, since that is the case for practical small microcontrollers.

mvh.,

David

- David Brown

Fri, Jun 2, 2006 12:35 PM

Good point - I hadn't noticed that "minor" detail.

mvh.,

David

- CBFalconer

Fri, Jun 2, 2006 1:58 PM

... snip ...

The HP3000, circa 1975, was a stack machine with an invisible complement of registers. The hardware mapped the top few stack items onto the registers. Procedure calls required ensuring that the registers were all written out to the real stack.

--
 Some informative links:
   news:news.announce.newusers
   http://www.geocities.com/nnqweb/
   http://www.catb.org/~esr/faqs/smart-questions.html
   http://www.caliburn.nl/topposting.html
   http://www.netmeister.org/news/learn2quote.html

- Grant Edwards

Fri, Jun 2, 2006 2:52 PM

I think it affects C more so than assembler, but you're right: Harvard is a PITA for assembler too.

--
Grant Edwards                   grante             Yow!  UH-OH!! We're out
                                  at               of AUTOMOBILE PARTS and
                               visi.com            RUBBER GOODS!

- Grant Edwards

Fri, Jun 2, 2006 2:55 PM

It's not uncommon for von Neumann machines to have that issue as well: it's often convenient to limit data to a 16-bit address space and have 20 or 24 bits for code.

--
Grant Edwards                   grante             Yow!  I just remembered
                                  at               something about a TOAD!
                               visi.com

- Grant Edwards

Fri, Jun 2, 2006 3:01 PM

For many of us that's effectively true. There are a lot of processors where the only non-volatile storage is the ROM that contains the executable.

What about read-only data declared as "const"? It's quite common to put that in the code space instead of the data space.

The initializer values are often stored in code space and copied to data space at startup.

That's massively inefficient.

On a lot of processors you can't.

If you've got ROM in data space, you don't _need_ to copy the constant values to RAM -- just use them where they are.

disk? network?

--
Grant Edwards                   grante             Yow!  Gibble, Gobble, we
                                  at               ACCEPT YOU...
                               visi.com