- Mogens Dybæk Christensen
October 5, 2005, 7:59 am

Short status: Target is 68332, compiler and linker are Microtec C++.
The application is written mainly in C++.
We are running out of FLASH memory, and a check in the linker map
revealed that 800 Kbyte out of almost 2 Mbyte is used for the strings
segment. Quite a lot for an embedded system with no GUI.
Further checks with the cygwin command
> strings prom.bin |sort|uniq -c
reveal that most of the strings are RTTI information for C++, and
many are repeated 50 or 100 times!
(strings finds printable strings in the binary; sort and uniq are used
to sort the strings and count duplicates.)
The raw output of strings is approx. 800K as expected, and if the
duplicates are removed it is squeezed to 120K!
Is there a way to eliminate the duplicate strings? Logically the
linker should be able to analyze what is entered into the strings
segment, and eliminate identical strings that are already there.
Since the object format is said to be IEEE, it may be possible to use
another linker, e.g. GNU ld, without replacing the compiler (which has
its "specialities").
Has anyone tried that?
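
For anyone without cygwin at hand, the same measurement can be sketched in a few lines of Python. This is only an approximation of what strings does: it assumes "printable" means ASCII 0x20-0x7E and a minimum run length of 4, and the function names are made up for this illustration.

```python
import re
from collections import Counter


def printable_runs(blob, min_len=4):
    """Return all runs of at least min_len printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return pattern.findall(blob)


def duplicate_report(blob, min_len=4):
    """Count each string and estimate bytes used with and without duplicates.

    One extra byte per string is added as an assumed terminating NUL.
    """
    counts = Counter(printable_runs(blob, min_len))
    total = sum((len(s) + 1) * n for s, n in counts.items())
    unique = sum(len(s) + 1 for s in counts)
    return counts, total, unique
```

With a real image one would pass the file contents, e.g. duplicate_report(open("prom.bin", "rb").read()), and compare the two byte totals against the linker map.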
--
mdc at manbw dk - MAN B&W Diesel A/S, Copenhagen
www.manbw.com - Electronics & software dept.

Re: How to eliminate duplicate strings?

As stated earlier in this thread, we do need RTTI. Although this is an
embedded system, we use templates and dynamic casts. Nobody really
wants to give an estimate of the redesign needed to take it out.
And yes, you can always make another program than the one you
have. But you don't get it for free. ;-)
Ever heard of super tankers breaking apart due to engine failure
during a hurricane? ;-) We don't want that to happen to our system.
In such a mission-critical system, the cost of test, verification and
approval can be prohibitive.
--
mdc at manbw dk - MAN B&W Diesel A/S, Copenhagen
www.manbw.com - Electronics & software dept.

Re: How to eliminate duplicate strings?

You mean we rely on information stored in RAM, or what? So does the
underlying RTOS.
The basic decision is to use C++, which some people argue is not
"safe". I think the compiler is far better at throwing around pointers to
objects and structures than a human programmer. And the application
_is_ that complex. And it works. That is why we don't want to just
cook up another solution. This is not the toy business. ;-)
--
mdc at manbw dk - MAN B&W Diesel A/S, Copenhagen
www.manbw.com - Electronics & software dept.

Re: How to eliminate duplicate strings?

In some cases, the compiler has an option to merge duplicate
strings; however, this usually happens only within a single module.
I had a similar problem one time; in this case it was a point of sale
terminal. I was asked to make several enhancements to an existing
application that was already completely filling the available
code space in the terminal.
I noted that there was a fair number of duplicated strings, and
that they were spread through several modules.
I wrote a program to scan all of the source files, and identify all
strings and the number of occurrences of each. On a second pass,
it replaces all literal strings (i.e. not already variables) occurring more
than once with character array references, and also generates
XSTRINGS.H and XSTRINGS.C, which contain definitions for the
string arrays. It also accepts a file listing strings NOT to change, in
case you happen to be unlucky enough to be working on a system
allowing writable strings and someone actually did that.
You could try something like that - it worked very well for me.
Regards,
Dave
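
The two-pass approach described above could look roughly like the following Python sketch. It is not the original tool: the regular expression is a simplification that does not understand comments or character literals, preprocessor lines are skipped so that #include "..." is left alone, and the xstr_N names are invented here. Replacing a literal by an array name also changes the meaning of things like sizeof("...") or char buf[] = "...", so the result would need review.

```python
import re
from collections import Counter

# Simplified C string literal: a quote, then escaped chars or ordinary chars.
STRING_LITERAL = re.compile(r'"(?:\\.|[^"\\\n])*"')


def _is_directive(line):
    """True for preprocessor lines such as #include "file.h"."""
    return line.lstrip().startswith("#")


def count_literals(sources):
    """First pass: sources maps file name -> text; count every literal."""
    counts = Counter()
    for text in sources.values():
        for line in text.splitlines():
            if not _is_directive(line):
                counts.update(STRING_LITERAL.findall(line))
    return counts


def build_shared_table(counts, keep=frozenset()):
    """Name every literal seen more than once, except those listed in keep.

    Returns (literal -> name, header text, source text).
    """
    shared = sorted(lit for lit, n in counts.items() if n > 1 and lit not in keep)
    names = {lit: "xstr_%d" % i for i, lit in enumerate(shared)}
    header = "".join("extern const char %s[];\n" % n for n in names.values())
    source = "".join("const char %s[] = %s;\n" % (n, lit) for lit, n in names.items())
    return names, header, source


def rewrite(text, names):
    """Second pass: replace shared literals by their array names."""
    out = []
    for line in text.splitlines(keepends=True):
        if not _is_directive(line):
            line = STRING_LITERAL.sub(
                lambda m: names.get(m.group(0), m.group(0)), line)
        out.append(line)
    return "".join(out)
```

The header and source texts returned by build_shared_table would be written to XSTRINGS.H and XSTRINGS.C, and keep plays the role of the "strings NOT to change" list.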
--
Dunfield Development Systems http://www.dunfield.com
Low cost software development tools for embedded systems

Re: How to eliminate duplicate strings?
snipped-for-privacy@use.techsupport.link.on.my.website (Dave Dunfield) writes:

Our problem is similar, also close to the maximum of the hardware platform.
But unless you run the "string fixer" on some intermediate file
produced by the compiler's C++ pass, it won't do the job here. Most
strings are created in that step, not in the source.
I still think the right place is in the linker, which has all relevant
information.
The vendor, Microtec/Mentor Graphics, gave some suggestions on linker options,
but they did not change anything. Haven't heard from them for some days, but
the problem has got a number. ;-)
A hack to make GNU ld link Microtec's object files, and optimize the strings,
may also be a solution. The formats are close, but not identical.

--
mdc at manbw dk - MAN B&W Diesel A/S, Copenhagen
www.manbw.com - Electronics & software dept.

Re: How to eliminate duplicate strings?
mdc@_manbw.dk (Mogens Dybæk Christensen) writes:

Just for your info, Microtec support came up with the same "solution":
Edit the intermediate assembler files in 325 compilations to add
MERGE_START/MERGE_END where appropriate. No definition of appropriate.
:-(

--
mdc at manbw dk - MAN B&W Diesel A/S, Copenhagen
www.manbw.com - Electronics & software dept.

Re: How to eliminate duplicate strings?

[...]

I find it quite surprising that most compilers for embedded
programming don't seem to have an automatic optimization mode for this.
My in-house developed Pascal/Modula2 compiler does it as one of the
first steps in its optimization routines. The final assembler file can
look like this snippet:
;
;; String references
;
STR2:
STR4:
STR18:
STR22:
STR0: .DB "Saving... ",0
STR3:
STR5:
STR19:
STR23:
STR1: .DB "OK",0
STR6: .DB "PIN=",0
STR7: .DB "ID=",0
STR8: .DB "I=",0
STR9: .DB " sec",0
STR11:
STR10: .DB "DL",0
STR13:
STR12: .DB "OL",0
STR15:
STR21:
STR14: .DB ", ",0
STR16: .DB "Calibrating",0
STR17: .DB " ",0
STR26:
STR20: .DB "A/D not calibrated!",0
STR24: .DB "No program saved!",0
STR25: .DB "Terminal module",0
STR27: .DB " <- Illegal input!",0
STR28: .DB "AT",0
STR29: .DB "AT+CPIN=",0
STR30: .DB "AT+CMGS=",0
STR31: .DB "*** Alarm condition restored ***",0
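
The merging step that produces a listing like the one above can be sketched in Python as follows. This is only an illustration of the idea, not the compiler's actual code: it assumes the literals arrive as (label, text) pairs, that the text contains no double quotes, and the function name is invented.

```python
def merge_strings(literals):
    """Emit one .DB per distinct string, with all other labels stacked on it.

    literals is a list of (label, text) pairs in order of appearance.
    """
    groups = {}  # text -> labels that refer to it, in order of appearance
    for label, text in literals:
        groups.setdefault(text, []).append(label)

    lines = []
    for text, labels in groups.items():
        first, *aliases = labels
        # Alias labels carry no data; they fall through to the defining label.
        lines.extend("%s:" % alias for alias in aliases)
        lines.append('%s: .DB "%s",0' % (first, text))
    return lines
```

Because several labels end up at the same address, every reference keeps its own symbol and nothing else in the generated code has to change.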
--
http://www.flexusergroup.com/

Re: How to eliminate duplicate strings?

Oh but they do! The one at hand just failed to use it on the RTTI
string tables --- and the workaround they proposed was to turn it on
for those, too, by massaging the intermediate asm source a bit.
--
Hans-Bernhard Broeker ( snipped-for-privacy@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.

Re: How to eliminate duplicate strings?

Well, doesn't that almost force the solution: turn off RTTI --- you
almost certainly won't be needing that in an embedded system, anyway.
--
Hans-Bernhard Broeker ( snipped-for-privacy@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.

Re: How to eliminate duplicate strings?

Unfortunately, that would require some redesign. The exact amount is not
known just now, but there are reasons why it was not turned off.
If we could eliminate the duplicates, we would be up and running without
touching the source code!
--
mdc at manbw dk - MAN B&W Diesel A/S, Copenhagen
www.manbw.com - Electronics & software dept.

Re: How to eliminate duplicate strings?

Careful with the assessment that everything found by 'strings' is
actually a string. Code can look like text, to the 'strings' utility,
especially if you feed it a flat binary core image instead of a
structured object file format.
Looking at 'size -A' of individual .obj files or the debuggable object
file might be a better test, here.

And for actual strings, it's probably doing that already. But I'm far
from certain that such compression can be done on RTTI tables without
breaking them. If they could, wouldn't the compiler/linker vendor
have done it already?
--
Hans-Bernhard Broeker ( snipped-for-privacy@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.

Re: How to eliminate duplicate strings?

Hi Hans-Bernhard
Thanks for your interest in the problem.
I am aware of the false strings in the output. They may account for
some %, but inspection of the output from strings reveals lots of real
strings, which are duplicated.
We actually reverse-converted the S19 file that was produced by the
build process, and ran strings on that file. This should eliminate all
debug information etc. The size of that output is very close to what
the linker map says about the strings segment, so I think we are
looking at the real thing.
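
For reference, reverse-converting an S-record file into a flat image can be sketched in Python as below. This is a minimal reader under some assumptions: only S1/S2/S3 data records are used, gaps between records are filled with 0xFF, and the function name is invented. A widely scattered address map would produce a correspondingly large image.

```python
def srec_to_image(lines, fill=0xFF):
    """Convert Motorola S-record lines into (base_address, image_bytes)."""
    addr_len = {"S1": 2, "S2": 3, "S3": 4}  # address bytes per data record type
    chunks = {}
    for raw in lines:
        line = raw.strip()
        kind = line[:2]
        if kind not in addr_len:
            continue  # header, count and termination records carry no image data
        body = bytes.fromhex(line[2:])
        count, payload, checksum = body[0], body[1:-1], body[-1]
        if count != len(body) - 1:
            raise ValueError("bad byte count: " + line)
        # Checksum is the ones' complement of the low byte of the sum of
        # the count, address and data bytes.
        if (sum(body[:-1]) & 0xFF) ^ 0xFF != checksum:
            raise ValueError("bad checksum: " + line)
        n = addr_len[kind]
        chunks[int.from_bytes(payload[:n], "big")] = payload[n:]

    if not chunks:
        return 0, b""
    base = min(chunks)
    end = max(addr + len(data) for addr, data in chunks.items())
    image = bytearray([fill]) * (end - base)
    for addr, data in chunks.items():
        image[addr - base:addr - base + len(data)] = data
    return base, bytes(image)
```

The resulting bytes can then be scanned for printable strings in the same way as a raw binary.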
Microtec claims to use IEEE format, and GNU m68k-elf-objdump can read
their .obj files. It shows that there is a binary RTTI segment (which
I cannot interpret), but the type strings are in the string
segment. So probably the RTTI segment is a set of pointers into the
strings segment.
Thus it should not change anything in the running code, if the address
of one string is replaced by the address of another identical string
(and the first string removed from the binary image). But the linker
does _not_ do that at the moment.
Unfortunately, the Microtec dialect of IEEE seems incompatible with
GNU m68k-elf-ld, which gave an assert when I tried.
- We are now in contact with Microtec support, but no solution so
far.
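
The idea above, replacing the address of one string by the address of an identical one and dropping the duplicate, can be illustrated with a small Python model. It is only a sketch under assumptions: the strings segment is taken to be a sequence of NUL-terminated strings, the RTTI segment is modelled as a plain list of offsets into it, every reference is known, and nothing depends on two equal strings having distinct addresses. The function name is invented.

```python
def dedupe_string_table(table, pointers):
    """Rebuild a string table without duplicates and remap the pointers.

    table is the original strings segment as bytes; pointers is a list of
    offsets into it. Returns (new_table, remapped_pointers).
    """
    new_table = bytearray()
    new_offset = {}  # string content -> offset in the new table
    remapped = []
    for p in pointers:
        end = table.index(0, p)  # position of the terminating NUL
        s = table[p:end]
        if s not in new_offset:
            new_offset[s] = len(new_table)
            new_table += s + b"\x00"
        remapped.append(new_offset[s])
    return bytes(new_table), remapped
```

In a real linker the remapping would be applied through the relocation entries rather than a simple list, but the bookkeeping is the same.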
--
mdc at manbw dk - MAN B&W Diesel A/S, Copenhagen
www.manbw.com - Electronics & software dept.
mdc at manbw dk - MAN B&W Diesel A/S, Copenhagen
www.manbw.com - Electronics & software dept.
We've slightly trimmed the long signature. Click to see the full one.