reducing flash size in embedded processors?

Dear All, Two years ago I attended a Hitachi embedded seminar where they presented their low-power flash microcontrollers. The presenter said: "If you look at the die you'll see that the CPU is only a small fraction of it; the flash occupies most of the area, especially if the flash size is >32k." If this is true, why don't they change the architecture to save some flash? Any comments? I am sure somebody has already worked on this.

Reply to
booth multiplier

Hi, the modern trend is to write single-chip applications in C, which is much easier than assembler for micros like the Hitachi range. The consequence is that you need all that extra memory, both ROM and RAM, for almost any practical program.

Reply to
CBarn24050

You seem to be implying that writing a program in C requires significantly more memory than writing in assembly language.

In my experience that simply isn't true.

--
Grant Edwards                   grante             Yow!  Gibble, Gobble, we
                                  at               ACCEPT YOU...
                               visi.com
Reply to
Grant Edwards

Hi, do you have first-hand experience of that? Perhaps you could provide some real-world examples.

Reply to
CBarn24050

CBarn24050 wrote: (*** and neglected to preserve attributions ***)

Try it for yourself. Write some moderately complex function of 5 or 10 lines, compile it to an object module (gcc -gstabs+ -Wa,-ahldn -c source.c works for me) with various optimization settings, and examine the assembly code. Then see if you can significantly improve that code.

In general, a large portion of the bulk of simple C programs arises from the library and startup code.
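One way to see that for yourself, assuming a GCC toolchain (the file name and macro here are purely illustrative):

/* size_probe.c -- a minimal sketch for measuring library/startup bulk.
   Build it twice and compare with the binutils `size` tool:
     gcc -Os -o empty size_probe.c
     gcc -Os -DUSE_PRINTF -o withio size_probe.c
   The difference is almost entirely library and startup code, not the
   code generated for main() itself. */
#include <stdio.h>

int main(void)
{
#ifdef USE_PRINTF
    printf("hello\n");   /* drags in the stdio machinery */
#endif
    return 0;
}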

--
Chuck F (cbfalconer@yahoo.com) (cbfalconer@worldnet.att.net)
   Available for consulting/temporary embedded and systems.
     USE worldnet address!
Reply to
CBFalconer

Tight loops blitting data to video hardware: assembler wins hands down every time.
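For concreteness, a sketch of the kind of loop being talked about (the framebuffer address and pixel width are made up for illustration):

#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory-mapped framebuffer base; a real target would
   define this in its own headers. */
#define FRAMEBUFFER ((volatile uint16_t *)0xA0000000u)

/* Copy one scanline to the display. The compiler must preserve the
   order of the volatile stores, which limits its freedom; hand-written
   assembler can exploit target-specific block-move instructions. */
void blit_line(const uint16_t *src, size_t npixels)
{
    volatile uint16_t *dst = FRAMEBUFFER;
    while (npixels--)
        *dst++ = *src++;
}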

Ian

--
Ian Bell
Reply to
Ian Bell

Well, if you're saying that compiling by hand is not much better than letting a compiler do it, then I would agree with you. That is not the same as saying C produces code of a similar size to assembler. Try taking a small job, writing it in C and then in assembler, and see the difference.

Libraries can be quite big, but the startup code should be minimal, maybe 100 instructions.

Reply to
CBarn24050

Hi Thad, maybe you could provide some real-world example to prove your point.

Reply to
CBarn24050

I'm sure it's much faster, but the discussion is on memory space. While the assembler loop is probably tighter, too, it is usually a very small part of a complete program.

My experience is that C code is usually 20% to 70% larger, depending on several factors. If you are familiar with the compiler's output, you can reduce your program's size while coding in C. One simple technique is recoding to eliminate library functions that are only called once or twice. Know the tradeoff between using compile-time constants and run-time parameters. Know the tradeoff between function-like macros (or inline functions) and callable functions. Know the size and precision requirements of the data; know which data must be signed.
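A small sketch of two of those tradeoffs (all names here are illustrative):

#include <stdint.h>

/* 1. Compile-time constant vs. run-time parameter: with the constant,
      the multiply can be strength-reduced to shifts and adds. */
#define ROWS 64u                          /* compile-time constant */
uint16_t index_const(uint16_t row, uint16_t col)
{
    return (uint16_t)(row * ROWS + col);  /* compiler can fold */
}

uint16_t index_param(uint16_t row, uint16_t col, uint16_t rows)
{
    return (uint16_t)(row * rows + col);  /* forces a general multiply */
}

/* 2. Function-like macro vs. callable function: the macro costs code
      at every use site; the function costs a call per use. */
#define SCALE_M(x) ((uint16_t)(((x) * 3u) >> 1))
uint16_t scale_f(uint16_t x) { return (uint16_t)((x * 3u) >> 1); }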

Thad

Reply to
Thad Smith

I would agree with 20% or so larger... however, don't forget the tradeoff for portability, maintenance, etc.

Reply to
TheDoc

TheDoc wrote:

Why can't you try it for yourself?

Here is a simple experiment, no optimization, creating 34 bytes of object code outside of the alignment nops. The lines with **** are the source lines. stdio is not used.
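For reference, here is junk.c as it can be reconstructed from the **** source lines in the listings below (the header name in the include was lost in transit; stdio.h is inferred from the remark above, and the missing semicolon on source line 8 is restored):

#include <stdio.h>   /* included but unused, per the note above */

int thing(int a, int b, int c)
{
    int result;

    result = a + b * c;
    if (result < 0) result = -result;
    return result;
}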

  1                    .file  "junk.c"
  4                    .section .text
  5 Ltext0:
 48                    .globl _thing
 49 _thing:
  1:junk.c      **** #include
  2:junk.c      ****
  3:junk.c      **** int thing(int a, int b, int c)
  4:junk.c      **** {
 51 LM1:
 52 LBB2:
 53 0000 55            pushl  %ebp
 54 0001 89E5          movl   %esp, %ebp
 55 0003 83EC04        subl   $4, %esp
  5:junk.c      ****    int result;
  6:junk.c      ****
  7:junk.c      ****    result = a + b * c;
 57 LM2:
 58 0006 8B450C        movl   12(%ebp), %eax
 59 0009 0FAF4510      imull  16(%ebp), %eax
 60 000d 034508        addl   8(%ebp), %eax
 61 0010 8945FC        movl   %eax, -4(%ebp)
  8:junk.c      ****    if (result < 0) result = -result
 63 LM3:
 64 0013 837DFC00      cmpl   $0, -4(%ebp)
 65 0017 7905          jns    L2
 66 0019 8D45FC        leal   -4(%ebp), %eax
 67 001c F718          negl   (%eax)
 68 L2:
  9:junk.c      ****    return result;
 70 LM4:
 71 001e 8B45FC        movl   -4(%ebp), %eax
 72 LBE2:
 10:junk.c      **** }
 74 LM5:
 75 0021 C9            leave
 76 0022 C3            ret
 80 Lscope0:
 82                    .text
 84 Letext:
 85 0023 90909090      .ident "GCC: (GNU) 3.2.1"
 85      90909090
 85      90909090
 85      90

Here is the same source compiled with -O3 optimization. Now it only takes 23 bytes.

  1                    .file  "junk.c"
  4                    .section .text
  5 Ltext0:
 44                    .p2align 4,,15
 49                    .globl _thing
 50 _thing:
  1:junk.c      **** #include
  2:junk.c      ****
  3:junk.c      **** int thing(int a, int b, int c)
  4:junk.c      **** {
 52 LM1:
 53 0000 55            pushl  %ebp
 54 0001 89E5          movl   %esp, %ebp
  5:junk.c      ****    int result;
  6:junk.c      ****
  7:junk.c      ****    result = a + b * c;
 56 LM2:
 57 LBB2:
 58 0003 8B4510        movl   16(%ebp), %eax
 59 0006 8B4D0C        movl   12(%ebp), %ecx
 60 0009 0FAFC1        imull  %ecx, %eax
  8:junk.c      ****    if (result < 0) result = -result
 62 LM3:
 63 000c 034508        addl   8(%ebp), %eax
 64 000f 7802          js     L3
 65 L2:
  9:junk.c      ****    return result;
 10:junk.c      **** }
 67 LM4:
 68 LBE2:
 69 0011 5D            popl   %ebp
 70 0012 C3            ret
 71                    .p2align 4,,7
 72 L3:
 73 LBB3:
 74 0013 F7D8          negl   %eax
 75 0015 EBFA          jmp    L2
 76 LBE3:
 84 Lscope0:
 86                    .text
 88 Letext:
 89 0017 90909090      .ident "GCC: (GNU) 3.2.1"
 89      90909090
 89      90
--
Chuck F (cbfalconer@yahoo.com) (cbfalconer@worldnet.att.net)
   Available for consulting/temporary embedded and systems.
     USE worldnet address!
Reply to
CBFalconer

CBarn24050 wrote: (*** and again removed attributions ***)

... snip ...

Not so. It has to do such things as setting up the malloc arena, opening stdin and stdout and assigning their buffers, and possibly parsing and globbing the command line. It also has to set up the traps to catch program aborts, any mechanism for functions to be called on exit, etc. This can easily add up to quite a pile.
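For contrast, a bare-metal startup can be nearly nothing; here is a minimal sketch (the _sdata/_ebss/_etext symbols are conventional linker-script names used illustratively, not any particular toolchain's):

#include <stdint.h>
#include <string.h>

extern uint8_t _sdata[], _edata[], _sbss[], _ebss[], _etext[];
extern int main(void);

void _start(void)
{
    /* Copy initialized data from flash (after .text) into RAM. */
    memcpy(_sdata, _etext, (size_t)(_edata - _sdata));
    /* Zero the BSS. */
    memset(_sbss, 0, (size_t)(_ebss - _sbss));
    main();
    for (;;) ;   /* no exit() machinery on bare metal */
}

Everything a hosted runtime adds on top of this -- stdio buffers, the malloc arena, atexit handling -- is the pile in question.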

Please stop stripping attributions for material you quote.

--
Chuck F (cbfalconer@yahoo.com) (cbfalconer@worldnet.att.net)
   Available for consulting/temporary embedded and systems.
     USE worldnet address!
Reply to
CBFalconer

Code samples... show the assembler alongside the C code with its generated output. Then it's easy to compare.

Ken


Reply to
Kenneth Lemieux

What bay would that be ???...

the "bay" I live in has only 4 houses, no parking lots and one small street..

Reply to
TheDoc

Are we talking about a specific target here, or some sort of generalisation? As we all know, all generalisations are false, but in practice C compilers generally lose out to hand assembly on small micros (and for some micros and some compilers, they lose very badly indeed), while on powerful processors and complex code they produce far smaller (and often faster) code than any sane assembly programmer could produce in a sensible amount of time.

Reply to
David Brown

I totally agree - modern compilers on modern architectures completely blow away all but the very best assembly coders. As compilers improve over time there will be fewer and fewer who are able to compete. Compilers "know" the same tricks that assembly programmers use - usually more than the average assembly programmer does - and they apply them more often and more consistently.

In my experience good compilers produce optimal or near-optimal code for small functions. It's obviously possible to beat a compiler by saving an instruction here or there. However, compilers are really in their element on large amounts of code, where inter-procedural optimizations can make a difference that is simply impossible to match by hand. Even basic optimizations like inlining are hardly used by assembly programmers - and that despite inlining giving significant code-size savings, as the sketch below illustrates.
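A tiny illustration of how inlining can shrink code rather than grow it (names are made up for the example):

/* Once the call below is inlined, the constant argument folds away,
   the comparison and branch disappear, and no call/return sequence is
   emitted. Applying this consistently across a whole program is easy
   for a compiler and rare in hand-written assembly. */
static inline int clamp_pos(int x) { return x < 0 ? 0 : x; }

int always_seven(void)
{
    /* After inlining and constant folding: "return 7". */
    return clamp_pos(7);
}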

One large example I know of is a hard disc manufacturer who switched from 100% hand-coded assembler on a 16-bit architecture to 100% C++(!) on ARM and got a significant improvement in both code size and performance. I don't remember the exact number but guess it was around 20%. The gap would be about 10-15% wider today, as compilers improve over time - assembly code does not.

Another case involved over 100K lines of assembler code written because programmers were not happy with the code quality of the (old) GCC compiler they used. When they upgraded to a modern commercial compiler it became clear that it generated such good code that most of the assembly was redundant. Consider the cost of writing and maintaining such a large amount of assembly code vs the cost of upgrading to a commercial compiler...

Wilco

Reply to
Wilco Dijkstra

I've found this to be especially true of architectures with a good-sized set of registers (ARM, H8, 68K, SPARC, etc.). Human assembly language programmers just can't keep track of what's in 8-16 different registers. They resort to using RAM-based variables so that they can use labels to make the program readable and maintainable. A compiler, however, _can_ keep track of which variables and intermediate values are in all those registers, and will use RAM far less often. This can result in significant speed and size improvements.
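A sketch of the difference (names are illustrative): the first version keeps its state in RAM the way labeled assembly variables do; the second leaves the register allocator free.

#include <stdint.h>

uint32_t sum_g, i_g;                 /* RAM-based, assembler-style state */
uint32_t checksum_ram(const uint32_t *p, uint32_t n)
{
    for (sum_g = 0, i_g = 0; i_g < n; i_g++)
        sum_g += p[i_g];             /* every access is a load/store */
    return sum_g;
}

uint32_t checksum_reg(const uint32_t *p, uint32_t n)
{
    uint32_t sum = 0;                /* lives in a register throughout */
    for (uint32_t i = 0; i < n; i++)
        sum += p[i];
    return sum;
}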

--
Grant Edwards                   grante             Yow!  YOU'D cry too if it
                                  at               happened to YOU!!
                               visi.com
Reply to
Grant Edwards

Tiny examples don't really show how good compilers are - you need something more substantial than that.

I've seen people claim this sort of thing about memcpy and write their own version, only to discover that it is an order of magnitude slower than the built-in memcpy... High-level optimizations like checking the alignment and dispatching into special code for each possibility mean it is hard to beat.

It's true that memcpy is traditionally written in assembler - it is one of those cases where saving a single cycle from the inner loop could make lots of applications run a little bit faster, so it is worth it. But it isn't too difficult to get 90% of the optimum in C.

Even the smallest loop is not going to be tighter than a call to memcpy :-)
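For the curious, a sketch of the alignment-dispatch idea in plain C (a real library memcpy does considerably more, and a production version would also have to deal with strict aliasing):

#include <stddef.h>
#include <stdint.h>

void *memcpy_sketch(void *dst, const void *src, size_t n)
{
    uint8_t *d = dst;
    const uint8_t *s = src;

    /* Check alignment once, then dispatch to a word-at-a-time loop. */
    if ((((uintptr_t)d | (uintptr_t)s) & (sizeof(uint32_t) - 1)) == 0) {
        for (; n >= sizeof(uint32_t); n -= sizeof(uint32_t)) {
            *(uint32_t *)d = *(const uint32_t *)s;
            d += sizeof(uint32_t);
            s += sizeof(uint32_t);
        }
    }
    while (n--)                      /* byte tail, or unaligned case */
        *d++ = *s++;
    return dst;
}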

You may want to consider another compiler or perhaps another architecture (or both!). Compilers for 8-bit and many 16-bit CPUs are not that good; this is usually a combination of a complex CISC architecture and few professional compilers being produced. Things are quite different in the 32-bit RISC space!

Yes. Knowing both high- and low-level languages, as well as the basics of how compilers map one into the other, is necessary in order to write good code. For the last few percent you need to look at the compiler output and change the source until you get the code you want.

Wilco

Reply to
Wilco Dijkstra

Some 8051 examples (Keil), with a sketch below:
- Use unsigned byte wherever possible; int requires 2 bytes and may cause a library call for signed math.
- Use > rather than >= if possible; the 51 does not have a
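A Keil-flavored sketch of the first tip (shown here as plain, portable C; the buffer and names are illustrative):

/* On the 8051 an unsigned char counter fits in a single register and
   loops can compile down to a DJNZ; an int counter costs two bytes and
   extra code on every operation. */
unsigned char buf[32];

void clear_buf(void)
{
    unsigned char i;                 /* 8-bit loop counter */
    for (i = 0; i < sizeof buf; i++)
        buf[i] = 0;
}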

Reply to
Neil Kurzman

... snip ...

That depends highly on the assembly programmer. While modern compilers do very well, they will never beat an experienced and good assembly programmer. Your figures above tell me the programmers were not expert.

This doesn't count the extra specialized knowledge and time needed to make a good assembly program.

--
Chuck F (cbfalconer@yahoo.com) (cbfalconer@worldnet.att.net)
   Available for consulting/temporary embedded and systems.
     USE worldnet address!
Reply to
CBFalconer
