Performance hit when using 'ar' instead of 'ld -r' to put static libraries together

Hello,

I am working on a uClinux-based embedded system on an ARM processor with no MMU. Because of the lack of MMU, we can only use static libraries.

Until very recently, all the "libraries" in our code were put together with 'ld -r' (and as such, were merely large objects). In order to clarify things and lighten the weight of our applications, I tried to switch to putting together real static libraries using 'ar'.

Now, it seems (and I emphasize the word 'seems') that we get a performance hit with the libraries put together with 'ar', although it makes no sense to me why it should be so.

I want to test the waters and see whether anyone has heard of or experienced similar issues (but who in their right mind would claim to put together static libraries with 'ld -r'?) and could shed some light on this.

Thanks in advance for your comments.

--
Bertrand
Bertrand Mollinier Toublet

What sort of performance hit?

Geronimo W. Christ Esq

I don't have definite figures (in terms of CPU usage, etc.), but the application in question plays audio and requires close to 100% CPU. When it requires strictly less than 100%, it is able to play the whole audio file without stuttering or any other issue, while when it requires more than 100% of CPU, stuttering occurs.

With this said, we observe that when the libraries are put together with 'ld -r', the audio plays smoothly, while when they are put together with 'ar', the audio stutters.

That's the performance hit I am talking about: apparently, putting static libraries together with 'ar' instead of 'ld -r' results in the final executable, linked against those libraries, requiring more CPU to perform the same task.

I hope this clarifies things.

--
Bertrand
Bertrand Mollinier Toublet

Are you sure the linker links in the same routines with the libraries and with the object files? I cannot remember the exact rules of what ld does when a symbol occurs in more than one module, but link order definitely can make a difference.

Regards Anton Erasmus

Anton Erasmus

Well, that's a very valid question. I checked that the symbol list is definitely the same. I have to admit I have not made sure that the code taken in was the same. The engineer in charge of the library in question assures me he does not have two versions (optimized and non-optimized) of the same code...

--
Bertrand
Bertrand Mollinier Toublet
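
One way to go beyond comparing symbol names is to compare the size of each code symbol in the two executables. The sketch below is only an illustration: it assumes symbol listings in the "value size type name" layout that GNU nm prints with -S, and the sample listings and function names (decode_frame, mix_samples, unused_helper) are made up, not taken from the actual project. Equal sizes do not prove identical code, but a size change is a quick hint that a different implementation or different compiler options were picked up.

```python
def parse_nm_sizes(nm_text):
    """Parse lines of the form '<value> <size> <type> <name>' into {name: size}.

    Only text (code) symbols, type 'T' or 't', are kept.
    """
    sizes = {}
    for line in nm_text.splitlines():
        fields = line.split()
        # Symbols without a size column (e.g. undefined ones) have fewer fields.
        if len(fields) != 4:
            continue
        _value, size, sym_type, name = fields
        if sym_type in ("T", "t"):
            sizes[name] = int(size, 16)
    return sizes


def diff_code_sizes(old_text, new_text):
    """Return {name: (old_size, new_size)} for symbols whose size differs.

    A size of None means the symbol is absent from that listing.
    """
    old, new = parse_nm_sizes(old_text), parse_nm_sizes(new_text)
    report = {}
    for name in sorted(set(old) | set(new)):
        if old.get(name) != new.get(name):
            report[name] = (old.get(name), new.get(name))
    return report


# Hypothetical listings, invented for this example.
LD_R_BUILD = """\
00008000 00000040 T main
00008040 00000120 T decode_frame
00008160 00000030 T mix_samples
00008190 00000010 T unused_helper
"""

AR_BUILD = """\
00008000 00000040 T main
00008040 00000200 T decode_frame
00008240 00000030 T mix_samples
"""

if __name__ == "__main__":
    for name, (old, new) in diff_code_sizes(LD_R_BUILD, AR_BUILD).items():
        print(f"{name}: ld -r build = {old}, ar build = {new}")
```

In this made-up data, decode_frame changes size and unused_helper disappears from the 'ar' build; either kind of difference would be worth chasing in the real executables.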

The only reason why the two versions should execute differently is that they are different. As far as I can see, there are two possibilities. Either the code sits at different positions, i.e. one version might cache better than the other, although this is quite unlikely, or the code itself is different. I think the latter is the most likely. I think a routine in a .o file will override a routine in a normal system library, but not necessarily when it is in one's own library. Everything in a .o file is linked in, while from a library only the code whose symbols are referenced is linked in. It should be possible to get verbose output indicating exactly which portions of the code are linked in with each method.

Regards Anton Erasmus

Anton Erasmus
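
To make the difference between the two rules concrete, here is a toy model of the selection step, written as a sketch under simplifying assumptions: it ignores weak symbols, duplicate-definition errors and linker options that change archive handling, and every file and symbol name in it (app.o, mylib, codec.a, the optimized memcpy) is hypothetical. Plain objects are always linked; an archive member is pulled in only if it defines a symbol that is still undefined when the archive is scanned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obj:
    name: str
    defines: frozenset
    needs: frozenset = frozenset()


def merge_relocatable(name, members):
    """Model of 'ld -r': fuse several objects into one big object."""
    defines = frozenset().union(*(m.defines for m in members))
    needs = frozenset().union(*(m.needs for m in members)) - defines
    return Obj(name, defines, needs)


def link(inputs):
    """inputs: an Obj (always linked) or a list of Obj (an archive).

    Returns (names of linked objects in order, {symbol: providing object}).
    """
    linked, defined, undefined = [], {}, set()

    def add(obj):
        linked.append(obj.name)
        for sym in obj.defines:
            defined.setdefault(sym, obj.name)  # first definition wins in this model
            undefined.discard(sym)
        undefined.update(s for s in obj.needs if s not in defined)

    for item in inputs:
        if isinstance(item, Obj):
            add(item)
            continue
        changed = True
        while changed:  # rescan the archive until nothing new is pulled in
            changed = False
            for member in item:
                if member.name not in linked and member.defines & undefined:
                    add(member)
                    changed = True
    return linked, defined


app = Obj("app.o", frozenset({"main"}), frozenset({"play"}))
mylib = [
    Obj("play.o", frozenset({"play"}), frozenset({"codec_decode"})),
    Obj("fast_memcpy.o", frozenset({"memcpy"})),  # hypothetical optimized routine
]
codec = [Obj("codec.o", frozenset({"codec_decode"}), frozenset({"memcpy"}))]
libc = [Obj("libc_memcpy.o", frozenset({"memcpy"}))]

ar_linked, ar_defined = link([app, mylib, codec, libc])
ldr_linked, ldr_defined = link([app, merge_relocatable("mylib.o", mylib), codec, libc])

if __name__ == "__main__":
    print("ar:   ", ar_linked, "memcpy from", ar_defined["memcpy"])
    print("ld -r:", ldr_linked, "memcpy from", ldr_defined["memcpy"])
```

In this model nothing needs memcpy yet when the 'ar' version of mylib is scanned, so fast_memcpy.o is skipped and the system library's memcpy is used later on. With the 'ld -r' version, the whole of mylib.o is linked, so its memcpy is already defined by the time codec.o asks for it.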

I have seen the same performance issues when using both methods, and I think the reason has to do with the fact that the two commands generate two different types of 'libraries'.

ar will simply archive all the specified object files and put them into a file. It does absolutely no symbol resolution between the files, and it is up to the application which is using this file to resolve each symbol. This is done by searching through each .o file within the archive. If it finds the symbol it needs, it loads the .o file, and then starts all over again to resolve the next symbol. Also, after loading the .o file from the archive, it adds all the other unresolved symbols from the .o file to its list. Only after all the symbols are resolved can the system continue.

Using 'ld -r' will actually link all the specified object files into one big .o file. The .o file generated will have a unified symbol table, so the application using it will only need to load one .o file, as opposed to the above scenario. Using ld will also report any symbol conflicts (duplicates) which you will not get using ar.

Chuck

Chuck Gales

It seems to me that you are addressing performance issues regarding the linking process itself. My impression was that the OP had issues with the performance of the executable generated by the different linking processes.

Regards Anton Erasmus

Anton Erasmus

That's right. Compilation/linking time is irrelevant as we can throw a sufficiently powerful cross-compiling server at it :-)

Anton, I am still investigating whether some different/unoptimized code might make its way into the executable when switching from 'ld' to 'ar'.

--
Bertrand
Bertrand Mollinier Toublet

I apologize for the misunderstanding. I guess I misinterpreted what the issue was.

As for actual execution performance, the only thing I can think of that might cause an issue, and this would have to be verified, is the order of linking and the ordering of the functions within the executable. Is one method causing longer 'jumps' requiring more processor time?

Chuck Gales
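
A rough way to check the layout question is to take the function addresses from both executables and compare how far apart each caller and callee end up. The sketch below assumes the addresses have already been extracted into dictionaries; the addresses, function names and call pairs are all invented for illustration. It only reports that a distance changed, it says nothing about whether that change costs any CPU time on a given core.

```python
def call_distances(addresses, calls):
    """Return {(caller, callee): distance in bytes between their start addresses}."""
    return {
        (caller, callee): abs(addresses[callee] - addresses[caller])
        for caller, callee in calls
    }


def changed_distances(layout_a, layout_b, calls):
    """Return {(caller, callee): (distance_a, distance_b)} where the two layouts differ."""
    dist_a = call_distances(layout_a, calls)
    dist_b = call_distances(layout_b, calls)
    return {pair: (dist_a[pair], dist_b[pair]) for pair in calls if dist_a[pair] != dist_b[pair]}


# Hypothetical function start addresses in the two executables.
LAYOUT_LD_R = {"main": 0x8000, "decode_frame": 0x8040, "mix_samples": 0x8160}
LAYOUT_AR = {"main": 0x8000, "mix_samples": 0x8040, "decode_frame": 0x9000}

# Hypothetical call graph edges of interest.
CALLS = [("main", "decode_frame"), ("decode_frame", "mix_samples")]

if __name__ == "__main__":
    for (caller, callee), (a, b) in changed_distances(LAYOUT_LD_R, LAYOUT_AR, CALLS).items():
        print(f"{caller} -> {callee}: {a} bytes (ld -r) vs {b} bytes (ar)")
```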
