Trimming fat in g++ and newlib

Tim Wescott 9 years ago

All you anti-C++ guys -- just hold it in. You won't convince me and I won't convince you.

I made a comment in Pozz's thread on avoiding malloc that I had tried C++ on a processor with 8k of memory and couldn't make it fit.

The experiment, as I remember it, was to write a basic hollow shell, build it in C and in C++, and see that it used something like 6k more flash with C++ than with C (it may not be 6k -- it was a lot with respect to 8k, and not much with respect to 64k).

It was the gnu toolchain with newlib (it's the arm-none-eabi toolchain).

In theory, if you're thrifty with your C++, you shouldn't pull in any more junk than you would in C. Obviously it's different in practice.

Has anyone managed to really trim this down? It's kind of immaterial anyway, because if you're doing something that will reasonably fit into 8kB of flash then you don't need all of the features of C++ that make it easier to author big programs. But I'm curious.

Tim Wescott Wescott Design Services http://www.wescottdesign.com I'm looking for work -- see my website!


Wouter van Ooijen 9 years ago

On 21-Dec-16 at 8:13 PM, Tim Wescott wrote:

IMO that is a nonsense argument, but that's not the point of your question.

But (as a strong C++ proponent) I am very interested in the example. I have no idea what a hollow shell is, but can I find the source code of both versions somewhere? I'd be very interested to repeat the experiment.

wouter "Objects? No Thanks!" van Ooijen


Tim Wescott 9 years ago

It doesn't exist anymore -- I wrote it as throwaway code to verify the tool chain and didn't keep it. Sorry about that.

But, you should be able to come pretty close if you just write a main.c that blinks an LED, with a linker file and whatnot to make it load onto a processor and work. Then, just change the name to main.cpp and build again.
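A minimal sketch of the kind of file that experiment needs, written so the same text builds as main.c or main.cpp. The register here is a plain variable standing in for a memory-mapped GPIO output register, and the pin number is an assumption; on real hardware both would come from the vendor header.

```cpp
#include <stdint.h>

/* Hypothetical stand-in for a GPIO output data register. A real build
   would use the fixed address from the vendor's device header instead. */
static volatile uint32_t fake_gpio_odr = 0;

/* Assumed LED position: bit 5 of the port. */
#define LED_MASK (1u << 5)

/* Toggle the LED bit `count` times and return the final register value.
   On a target, main() would call this in an endless loop with a delay. */
uint32_t blink(unsigned count)
{
    while (count--) {
        fake_gpio_odr = fake_gpio_odr ^ LED_MASK;
    }
    return fake_gpio_odr;
}
```

Nothing in this file uses a C++-only feature, so any size difference between the two builds would come from the startup code and libraries rather than from the source.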

Here are the compiler flags that I use:

-mthumb -g -mcpu=cortex-m0 -nostdlib -ffunction-sections -fdata-sections -Wl,--gc-sections -ffast-math -I. -I.. -I../st -fno-exceptions -fno-rtti

If things haven't changed, the footprint on both RAM and flash will be significantly greater when you use C++.

Tim Wescott Wescott Design Services http://www.wescottdesign.com I'm looking for work -- see my website!


John Speth 9 years ago

I can believe you'll see such a problem with gnu. I suspect gnu is a mish-mash of things we really have no control over. I call it the hidden cost of free.

I ran your GPIO toggler experiment with IAR Embedded Workbench for ARM. I found very small and believable output footprint sizes when I wrote and compared output from C and C++ code. The C++ code used a class that toggles a bit, very simple. The code growth was tiny. The max footprint was 190 code bytes from C++ and 140 bytes from C.

Then I used new() to instantiate my toggle class in the C++ project. The footprint rose to 740 bytes, which I believe is a plausible growth that comes with the addition of new() and not much more. The map files showed new and its subordinates included and nothing else.

My conclusion is beware of gnu.

JJS


Paul Rubin 9 years ago

Is that not expected? new() basically pulls in the system malloc/free machinery.


John Devereux 9 years ago

Have you looked at the GNU ARM Eclipse "framework"? The minimal c++ "blinky" project is 1.2k flash on an M0.

Adding some c++ strings made it ~2k.

This was using the "GNU ARM Embedded" toolchain (which perhaps you are already using).

formatting link

John Devereux


David Brown 9 years ago

Have you made sure you are using newlib nano? The standard newlib is big and complete, while the newlib nano version cuts a lot of features that are rarely used in embedded systems (such as wide character support), has a much smaller malloc implementation, more limited printf, etc.


Stefan Reuther 9 years ago

On 21.12.2016 at 20:13, Tim Wescott wrote:

This totally depends on what your C++ program is doing.

If you're using 'new', you're indirectly using 'malloc'. If you're using destructors, you're indirectly using exception handling (unless you find all knobs required to turn it off).

If you start by compiling your C code with a C++ compiler, you should receive comparable object code. Then you can start using some of C++'s easier functionality, such as classes-with-member-functions, specific casts, templates, namespaces for grouping stuff, etc.

I have written device drivers and (parts of) boot loaders in C++. In the case of device drivers, I even got better code than the C version: C++ makes it easy to put all attributes of a driver into a class, giving the ability of multiple instances for free as well as more efficient object code than with global variables, and allows generic re-usable data structures ("queue of interrupt events"). Sure, this can also be done in C, but it is more cumbersome, and the drivers I had been given to start with didn't do any of that.
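As a rough illustration of that pattern (not the drivers described above), here is a sketch of a driver class whose state is entirely per-instance, built on a small reusable queue template. The names `EventQueue` and `UartDriver` and the base addresses in the usage note are hypothetical, and interrupt masking around the queue is left out for brevity.

```cpp
#include <cstddef>
#include <cstdint>

// Fixed-capacity ring buffer. Storage lives inside the object, so no heap.
template <typename T, std::size_t N>
class EventQueue {
public:
    // Returns false when the queue is full; the item is dropped.
    bool push(const T& item) {
        if (count_ == N) return false;
        buf_[(head_ + count_) % N] = item;
        ++count_;
        return true;
    }
    // Returns false when the queue is empty; `out` is left untouched.
    bool pop(T& out) {
        if (count_ == 0) return false;
        out = buf_[head_];
        head_ = (head_ + 1) % N;
        --count_;
        return true;
    }
private:
    T buf_[N] = {};
    std::size_t head_ = 0;
    std::size_t count_ = 0;
};

// Hypothetical UART driver: every attribute is a member, so a second
// instance costs only its own data, not a second copy of the code.
class UartDriver {
public:
    explicit UartDriver(std::uint32_t base) : base_(base) {}
    // On a target this would be called from the receive interrupt handler.
    void on_rx_interrupt(std::uint8_t byte) { rx_.push(byte); }
    bool read(std::uint8_t& out) { return rx_.pop(out); }
    std::uint32_t base() const { return base_; }
private:
    std::uint32_t base_;
    EventQueue<std::uint8_t, 8> rx_;
};
```

Two objects such as `UartDriver a(0x40004400u), b(0x40004800u);` (made-up addresses) then keep separate receive queues without any global variables.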

Of course I had to avoid C++ features. In addition to destructors and 'new', I also had to avoid constructors because constructors for static objects also require runtime support which isn't available in a boot loader. That alone could be a few k of code.
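One way to keep simple static objects without run-time construction, assuming a C++11 or later compiler, is a `constexpr` constructor: an object declared `constexpr` is initialized at compile time and needs no startup code to call its constructor. This is a sketch of that idea and not what was done in the boot loader described above; `UartConfig` and its values are hypothetical.

```cpp
#include <cstdint>

// Hypothetical configuration record; the address and baud rate are
// illustrative values only.
struct UartConfig {
    std::uint32_t base;
    std::uint32_t baud;

    constexpr UartConfig(std::uint32_t b, std::uint32_t r) : base(b), baud(r) {}

    // Integer clock divisor for a given peripheral clock, rounded down.
    constexpr std::uint32_t divisor(std::uint32_t clock_hz) const {
        return clock_hz / baud;
    }
};

// Constant-initialized: the values are fixed by the compiler, so no
// static-constructor call is emitted for this object.
constexpr UartConfig console_cfg(0x40004400u, 115200u);

// Checked at compile time; a failure here is a build error.
static_assert(console_cfg.divisor(48000000u) == 416u,
              "unexpected divisor for 48 MHz at 115200 baud");
```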

Stefan


Wouter van Ooijen 9 years ago

Why that conclusion? If you want to use the heap, you pay a price (in code and probably also in data-overhead). But why use the heap in a small embedded system?
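For comparison, a sketch of the same kind of toggle class created without the heap: the storage is reserved statically and the object is constructed into it with placement new. The `Toggler` class here is a hypothetical reconstruction, not the code from the IAR experiment.

```cpp
#include <new>  // declares placement new

// Hypothetical toggle class in the spirit of the experiment.
class Toggler {
public:
    explicit Toggler(unsigned mask) : mask_(mask), state_(0) {}
    // Flip the masked bits and return the new state.
    unsigned toggle() { state_ ^= mask_; return state_; }
private:
    unsigned mask_;
    unsigned state_;
};

// Statically reserved, correctly aligned storage for one Toggler.
alignas(Toggler) static unsigned char toggler_storage[sizeof(Toggler)];

// Construct the object in the static buffer. Placement new only runs the
// constructor at the given address; it does not call the allocator.
Toggler& make_toggler(unsigned mask) {
    return *new (toggler_storage) Toggler(mask);
}
```

In the simplest case a plain static or stack object (`Toggler led(0x20u);`) does the same job; placement new is mainly useful when construction has to be deferred until the hardware is ready.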

Wouter "Objects? No Thanks!" van Ooijen


Wouter van Ooijen 9 years ago

Just as important as the compiler flags: which linker script did you use? The ARM-GCC I use

formatting link

doesn't provide any.

And which startup script? (For small programs the size of the vector table at ORG 0 can be larger than the code itself...)

Wouter "Objects? No Thanks!" van Ooijen


antispam 9 years ago

What is your definition of "footprint"? On smallest M0 I have interrupt vectors take 192 bytes, which means that smallest "general" program will have more than 200 bytes.

Concerning GNU tools, look at the contents of the f0/lcd_i2c directory in the listing at:

formatting link

This is a simple demo program built using gcc, g++ and libopencm3 (_no_ C library). It uses character LCD library extracted from libmaple, which is C++ code. 'main' is in C and libopencm3 routines are C too. Sizes are:

   text    data     bss     dec     hex filename
    500       0       0     500     1f4 LiquidCrystalBase.o
    208       0       0     208      d0 LiquidCrystal_I2C.o
    312       0       4     316     13c lcd_i2c.o
    198       4      16     218      da liq_hello_i2c.o
   2288      12      20    2320     910 lcd_i2c.elf

Explicit C++ code together gives 906 bytes, explicit C code 312 bytes, 192 bytes go to the vectors, and the remaining 878 code bytes are mostly libopencm3 and arithmetic routines. AFAICS most of the overhead is the same for C and C++. The program above uses:

-fno-rtti -fno-exceptions -Wl,--gc-sections \
-nostartfiles -nostdlib

options and a custom linker script. I also provide dummy '__cxa_pure_virtual' and '__aeabi_atexit' (otherwise it would not link without the compiler-provided startup files).
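A sketch of what such dummy definitions might look like; this is not the code from the demo above. It assumes a bare-metal build where the C++ runtime library is not linked, and the `trapped` flag is a hypothetical name used only to keep the trap loop well defined.

```cpp
// Hypothetical flag: reading a volatile keeps the trap loop below from
// being treated as an empty loop by the optimizer.
static volatile int trapped = 1;

extern "C" {

// The compiler fills vtable slots of pure virtual functions with a
// reference to this symbol. Reaching it means a pure virtual function
// was called; here it simply halts.
void __cxa_pure_virtual() {
    while (trapped) { }
}

// On ARM EABI targets the compiler calls this to register destructors of
// static objects. Firmware that never exits can ignore the request.
// Returning 0 reports success to the caller.
int __aeabi_atexit(void* /*object*/, void (* /*destructor*/)(void*),
                   void* /*dso_handle*/) {
    return 0;
}

}  // extern "C"
```

Because neither stub calls into a library, they add only a few bytes, whereas the default handlers supplied with a toolchain may drag in much larger support code.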

IMO the example shows that it is possible with rather small effort to get reasonably tight code using g++. But one has to avoid bloat due to libraries.

Waldek Hebisch


Tim Wescott 9 years ago


I wonder if the __cxa_pure_virtual was what was pulling in all the junk when I tried this.

I figured there was a way to sweet-talk the tool chain into trimming things; I'm just not enough of an expert to know how.

Tim Wescott Control systems, embedded software and circuit design I'm looking for work! See my website if you're interested http://www.wescottdesign.com


Tim Wescott 9 years ago

Hey David:

Probably not. Between you and Waldek I think I'm going to remember that this can be done, but not how, and maybe be back here in a few years asking for more detail.

Tim Wescott Control systems, embedded software and circuit design I'm looking for work! See my website if you're interested http://www.wescottdesign.com


Les Cargill 9 years ago

For inspiration, I'd look at the dialect of C++ used on Arduinos.

Les Cargill


Paul 9 years ago

Which one? Most of the cores are different and have different levels of maintenance. The Arduino/Genuino line covers many processors:

Uno/Mega - AVR
Zero/Due - Arm
several - x86

and others.

Most are g++ and particular processor support libraries.

Paul Carpenter | paul@pcserviceselectronics.co.uk PC Services Raspberry Pi Add-ons Timing Diagram Font For those web sites you hate


Wouter van Ooijen 9 years ago

AFAIK the Arduinos don't use a dialect, but a framework. Within the init and update functions that you as a user provide, you can use the full C++ language.

I use C++ for teaching programming small microcontrollers, and for programming them myself. I don't use the 'full' C++ language (but who does?):

- no heap (library/linker enforced)

- no RTTI (compiler switch)

- no exceptions (compiler switch)

- no floats (library/linker enforced)

- no run-time initialized globals (linker enforced)

For the micro-controller classes I still use traditional (inheritance/virtuals-based) abstractions, because that mechanism is still dominant today, especially for programming larger systems.

For my own work I use template-based abstractions that are far more efficient, and allow the stack size to be calculated from the code.
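A small sketch of what a template-based abstraction can look like, as opposed to a virtual-function one. It is not taken from the library mentioned above; `FakePin`, `Blinker`, and `fake_port` are hypothetical names, and the port is a plain variable so the sketch also runs on a host.

```cpp
#include <cstdint>

// Hypothetical stand-in for a GPIO port register.
static std::uint32_t fake_port = 0;

// A pin is a type, not an object: the bit number is a template argument
// and all functions are static, so there is no vtable and no per-pin data.
template <unsigned Bit>
struct FakePin {
    static void set(bool on) {
        if (on) fake_port |= (1u << Bit);
        else    fake_port &= ~(1u << Bit);
    }
    static bool get() { return ((fake_port >> Bit) & 1u) != 0; }
};

// Works with any type that offers static set() and get(). The calls are
// resolved at compile time and can be inlined completely.
template <typename Pin>
struct Blinker {
    static void step() { Pin::set(!Pin::get()); }
};
```

With `using Led = Blinker<FakePin<5>>;`, each `Led::step()` flips bit 5. Since no call goes through a function pointer, the call graph is fully known at build time, which is what makes a static stack-size calculation feasible.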

I target mostly Cortex-M0, but occasionally I try MCP (16-bit) or AVR (8-bit) to see how my techniques work on those platforms.

I think the "you don't need C++ in an 8k chip" argument is wrong because using good libraries will always save you time, and good libraries require good abstraction mechanisms. C++ has templates, which are *very* useful for writing efficient libraries, especially for avoiding code bloat.

You can check my talk for the basic principles I use:

formatting link

Wouter "Objects? No Thanks!" van Ooijen


John Devereux 9 years ago

I would say it is IAR that you have no control over. Do they even provide source code for the libraries and run time system yet?

I downloaded the GNU ARM Eclipse distribution and GCC ARM toolchain. A minimal example class-based "led toggler" c++ program for Cortex-M0 is 1244 code bytes in total. Most of that is ISRs and hardware init code - which there is full control over.

There is essentially zero c++ "bloat" and zero "hidden cost of free".

gcc has full support for the current c++ standard. I.e. it has full support for c++11 and c++14 out of the box (with experimental c++17 support). Judging by the website the IAR compiler only supports the c++ from 2003. This is important, a lot of the changes are very useful for embedded - it does seem like a new language. It is what has prompted me to start using it for new embedded projects even on smaller systems.

I would like to know this too. From the lack of response I am starting to suspect he means the *additional* memory only; the base "run time system" could be huge for all we know.

John Devereux


Tim Wescott 9 years ago

Huh. Things have gotten better then.

Tim Wescott Wescott Design Services http://www.wescottdesign.com I'm looking for work -- see my website!
