Which GCC version to use with ARM7?

Hi,

For a new product we are going to use an ARM7 and now have to make an important choice: should we stay with IAR or should we move to GCC? We are currently leaning a bit towards GCC but we need some extra input here:

What version should we use?

3.x.x is not the newest, but I believe many of its bugs are known. 4.1.0 is the newest and generally considered stable.

What are your experiences regarding code density, bugs and snags on the different versions of GCC?

Cheers Rasmus

Reply to
Rasmus Fink

I get asked this question a lot as I use several ARM7 compilers. What people normally neglect to say, as you do, is what your criteria for selection are. Cost, usability, vendor support?

I have had no problem with either IAR or GCC V4.x.x (see

formatting link
if you are using a Windoze host). IAR *could* save you a lot of time if you want to do some complex debugging, or have many download/debug cycles, and therefore pay for itself many times over.

To confuse you more, if this is a commercial venture and you are considering GCC then you might also consider

formatting link
as an alternative.

Regards, Richard.

formatting link

*Now for ARM CORTEX M3!*
Reply to
Richard

I don't think I have ever found a bug with gcc-arm. No problems with the compiler; it is excellent. I have been using the GNUARM distribution, which comes with "Newlib". What device (RAM and ROM size) are you targeting? This may affect your library choice. Newlib is very powerful but needs care, and avoidance/replacement of some areas, if used with smaller devices.

Rowley Associates package gcc with their own debugger and libraries; this may be an option for you. I have not used them, but they look good. They sell low-cost debugging hardware too, it would appear.

--

John Devereux
Reply to
John Devereux

Some of the earlier 3.x.x versions had big problems in ISR code generation and ARM/THUMB interworking. I think this has been fixed for some time now.

Regards, Richard.

formatting link

*Now for ARM CORTEX M3!*
Reply to
Richard

Pre-3.4.x had very poor Thumb code generation (not much optimization, lots of unnecessary moves etc.), but it has improved dramatically. I use 3.4.3 and 4.1.0 with little to choose between them - 99% Thumb. When targeting an ARM7 with no cache, Thumb is almost always a performance win as well as a space win.

Peter

Reply to
Peter Dickerson

Thanks for the input, guys.

It's nice to hear that 4.x.x is considered stable not only by its authors. Have any of you compared the code size generated by V3.x.x vs 4.x.x?

The device is an AT91SAM7S256, so code space is not really an issue right now, _YET_ - but the expected product lifetime is ~7 years, so much can still happen...

/Rasmus

Richard wrote:

Reply to
Rasmus Fink

I found 4.1.0 to be a tiny bit bigger than 3.4.3 but also to feel a bit faster. I'm guessing a few things produce inline code rather than calling support routines, resulting in a little bloat in return for speed.

Peter

Reply to
Peter Dickerson

For GNU tools we have used both Rowley and Microcross. Of the two, we find Microcross to be generally better. The IDE for debugging seems rather finicky on the Rowley side, and the linker configuration is also better on the Microcross.

--
Scott
Validated Software Corp.
Reply to
Not Really Me

For smaller programs, gcc 4.1 has the potential to produce smaller and faster code by compiling the entire program at once, letting it do inter-procedural optimisations even across modules.

Reply to
David Brown

Something related to this that I found makes a big difference for me is the compiler switches:

-ffunction-sections -fdata-sections -Wl,--gc-sections

This puts every function and every data object into its own section. The --gc-sections link option then strips out sections that are not used. This happens even if they are global and appear in the same module (source file) as items that *are* used.

You also need to modify the linker script, changing

*(.data) to *(.data.*) and *(.text) to *(.text.*)

to pick up the modified section names.
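As a concrete illustration, a minimal GNU ld SECTIONS fragment with those wildcards might look like this (the FLASH and RAM region names are assumed; adjust to your own linker script):

```
SECTIONS
{
    .text :
    {
        *(.text .text.*)      /* collect per-function sections */
    } > FLASH

    .data :
    {
        *(.data .data.*)      /* collect per-object data sections */
    } > RAM
}
```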

This then allows you e.g. to write libraries with lots of extra functions in them, many of which functions might not get used in every application.

This seems to work fine in 3.4 (as well as 4.1, presumably).

--

John Devereux
Reply to
John Devereux

Yes, this works (on most gcc targets) for modern gcc versions. The fun with gcc 4.1 comes when you use the "--combine" and "-fwhole-program" options (along with -O2 or -O3 optimisation). The --combine option tells the compiler to take all the C files on the command line together and compile them at once, including doing inter-procedural optimisations. The -fwhole-program flag can be thought of as creating a new scope level between global and file static, with ordinary global or extern data falling in this level. Only "main" and explicitly declared "externally_visible" items are now at the true global level. Thus the compiler knows all uses of ordinary global data and code, and can optimise appropriately.

For example, supposing you have a function in a file "uart.c" such as:

void setBaud(unsigned int newBaud)
{
    unsigned int divisor = (osc / 16) / newBaud;
    divLoReg = (divisor & 0xffff);
    divHiReg = (divisor >> 16);
}

with "osc" being defined as a constant in a different module. Another module, say "protocol.c" calls this function as "setBaud(19200)".

In many cases, the setBaud function is only ever called from one place in the program, and with a constant value. Yet the compiler must generate the full function, and use an expensive division operation even though all the values are known at compile time. The traditional way to improve this is by making setBaud a macro or, better, a static inline function.

Using the "--combine" option, if uart.c and protocol.c are compiled at the same time, the compiler can inline the definition of setBaud into the implementation in protocol.c, and reduce the whole thing down to a couple of memory operations. The code for the setBaud function is still generated, of course, which is a waste of space. It can be removed using the "-ffunction-sections" method described by John above, or by using the "-fwhole-program" flag, which lets the compiler figure out that it doesn't have to generate code for setBaud at all.

Obviously a function like this one, which is called once, is not time-critical - but the principle applies.

That's the theory, anyway - I don't know how well it works in practice other than for a simple test case on the Coldfire.

mvh.,

David

Reply to
David Brown

Interesting, thanks for the information. How stable/reliable is gcc 4.x when asked to perform this type of optimization?

Also, does anybody know of a tool that would perform these analyses and conversions at the C/C++ source code level, so it could be used with other compilers?

Reply to
Roberto Waltman

I haven't used gcc 4.1 much as yet, and only on the ColdFire, but I've found no problems with it so far. I haven't made use of --combine or -fwhole-program for anything other than small test cases. However, I've not heard of any issues with 4.1 from anyone else - the gcc team have classified it as stable and are already onto 4.2. In the case of the ColdFire, the code generator is pretty stable and hasn't changed much in years, so the changes are all in the front-end and middle-end, which benefit from being shared with the common PC ports and are thus extensively tested.

What you would need is a tool to collect all your source files into one source file. Every "static" name should be changed to have the original module's name as a prefix, and every global name (except "main") should be made static. Code that abuses the preprocessor by using different #defines for the same macro name depending on the module is going to have problems.

Reply to
David Brown

One of the big things Rowley supplies is flash programming software for most of the new flash ARM MCUs. This is usable even with cheap home-built JTAG interfaces. What sort of flash programming support does Microcross provide?

Regards Anton Erasmus

Reply to
Anton Erasmus

Hi again,

Thanks for the inputs everyone. Just keep them tips'n tricks comin' :-)

I'll post some test results when I dig further into the project.

Cheers Rasmus

David Brown wrote:

Reply to
Rasmus Fink

I would look at GCC V4 because of its *much* better THUMB implementation. With that particular part, in most cases, THUMB is the better choice for both size and performance.

Reply to
James Dabbs
