Code optimization

Dear all,

Are there any hard and fast rules for code optimization in C targeting a processor?

Thanks and regards

Reply to
aamer

In message , aamer writes

No.

Which compiler? Which target?

--
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
\/\/\/\/\ Chris Hills  Staffs  England     /\/\/\/\/
Reply to
Chris H


Thanks Chris for your reply. I am working on an ARM7TDMI simulator, and the basics I know about code optimisation are

  1. converting floating point code to fixed point.
  2. writing assembly for critical modules.

Apart from these, are there any other methods?

Do you know of any textbooks on advanced C programming covering code optimization methods?

regards

Reply to
aamer

The only hard and fast rule is the "as if" rule in the C99 standard: optimized code must behave as if it were implemented as the standard describes.

There is surprisingly little good information available on optimization of C. Compiler texts in general devote a lot of space to parsing, an activity that takes only a small fraction of the effort of implementing a compiler.

The best that most texts do is describe individual optimization techniques. As far as I know, none of them deals with the much tougher problem of managing optimization in compiled code. Determining where to apply optimizations (and, more importantly, where not to), and compiling with an application-level optimization strategy, makes a big difference in the code ultimately generated.

Regards

--
Walter Banks
Byte Craft Limited
Tel. (519) 888-6911

snipped-for-privacy@bytecraft.com

Reply to
Walter Banks

You may want to look carefully at both of these.

1) Fixed point is about precision; floating point is about dynamic range. A few months ago we did some detailed metrics comparing fixed- and floating-point code. The biggest surprise, in well-implemented comparisons, was that although the fixed-point code was slightly smaller and faster than the floating-point code, the conclusion was that the choice between them would depend on other things in the application.

2) In well-implemented compilers, asm is not an advantage. In most compilers, algorithm choice in critical modules is more important than the asm-vs-C implementation choice.

What is your application area?

Regards


Reply to
Walter Banks

In message , aamer writes

Is this for yourself or something for general use? It's a big job.

What ARM compilers are you thinking of working with (and why not use their simulators)?

Writing ASM is not always optimisation (anyone who wants to argue that, can we do it in a separate thread please :-)

Lots. Some depend on the compiler you are using.

Are you trying to write optimised C code, or a simulator that handles optimised C code, where you need to match the source to the object code?

Reply to
Chris H

Yes.

You're welcome.

--
Grant Edwards                   grante             Yow!  Gibble, Gobble, we
                                  at               ACCEPT YOU...
Reply to
Grant Edwards

[...]

That's quite unclear for a problem statement.

Are you working on producing yet another ARM simulator, or are you working on a C program for some ARM chip, with a simulator as your current platform?

If the former: it's unclear a) why you're doing that, and b) why you think that's an embedded system, and thus on-topic in this newsgroup.

If the latter, it's unclear why you think mentioning the simulator is of any relevance.

Either way you're missing the most important rules of "code optimization":

1) Don't do it.
2) _Still_ don't do it.
3) If you're really sure you have to: _measure_ before you do it.

And consider algorithm changes before you invest your time into code changes that likely as not will have no effect at all, or even make things worse.

Reply to
Hans-Bernhard Bröker

I'd advocate using types like "uint_fast8_t" instead of "unsigned int"; that way you'll get good performance out of all kinds of machines, whether they be 8-bit, 16-bit or 5-billion-bit. For instance, if you use "unsigned int" on an 8-bit microcontroller where an 8-bit integer would suffice, then your code will be at least twice as slow because multiple instructions are used every time you do simple arithmetic.

Also I'd advocate using "built-in" parts of the language where possible, e.g.:

unsigned arr[12] = {0};

instead of:

unsigned arr[12]; memset(arr, 0, sizeof arr);

(Also, the former is fully portable for dealing with types like pointers and floating-point types whose "zero value" might not be all-bits-zero.)

Another thing would be about the use of the post-increment and post- decrement operators in a conditional. For instance:

void strcpy(char *dst, char const *src) { while (*dst++ = *src++); }

The idiom of using *p++ is widespread, but unfortunately its use is no longer advisable because hardware has moved on. I think it was the PDP-11 that had a single instruction for dereferencing a pointer and also incrementing it at the same time, thus it was beneficial to use *p++ wherever possible -- however, modern machines don't have such an instruction, so the assembler produced for *p++ when used as the conditional in an if statement, for instance, might be sub-optimal. So I'd say opt for:

for ( ; *dst = *src; ++dst, ++src) ;

Moving on...

On most machines, I would use pointers instead of element indices for iterating thru an array. For example:

char *p = arr;
char const *const pend = arr + LENGTH;

do if ('a' == *p) return 1; while (pend != ++p);

instead of:

unsigned i = 0;

do if ('a' == arr[i]) return 1; while (LENGTH != ++i);

The latter, on most architectures, is a hell of a lot slower. But then again there are some PCs that have a single instruction for "pointer + offset", so I can't discredit that technique altogether.

On all architectures, I advocate the use of look-up tables instead of switch statements where applicable, especially when it's possible to have a look-up table containing function pointers.

If you're ever dealing with a struct that has a lot of information in it which is common to a "type", then it might be advisable to follow C++'s idiom of removing that stuff from the struct and replacing it with a pointer to a single object which contains all the relevant information for that type (a V-Table, that is).

Emmm they're the main ones that come to mind right now.

Reply to
Tomás Ó hÉilidhe

Many compilers allow two kinds of optimization: Optimization for speed and optimization for space. The latter is more common on smaller processors with limited code storage.

Here are a few tricks I've played around with to make C code run faster:

When you have arrays of structures, pad the structures to end up 8, 16, 32 or 64 bytes long. That allows the compiler to index into the array by a left shift of the index. This probably isn't worth the trouble on a 32-bit processor with a hardware multiply instruction.

When traversing arrays of elements longer than one byte, the compiler will sometimes generate faster code if you use a local pointer (which ends up in a register) and increment the pointer rather than incrementing an array index, then multiplying the index by the element size.

Some things like this may still be worthwhile on 8- and 16-bit processors. With 32-bit processors like the ARM, which can combine shifts with other operations in a single instruction, I generally trust the compiler writer, then verify by looking at the generated assembly code.

Since I'm still solving the same kind of data logging problems that I was working with a decade ago, while the processors now have about 8 times the speed and memory with about 1/2 the power consumption, I've mostly quit worrying about optimizations. That's allowed me to concentrate on clean, maintainable code that can be delivered on schedule. Since I'm working in a niche market where unit cost is not the primary constraint, I can afford to use good tools and good materials.

As others will no doubt tell you: It's better to think about algorithms than to worry about optimizations.

Mark Borgerson

Reply to
Mark Borgerson

My main points (for speed and size) are:

  1. Benchmark what's worth optimizing.

  2. Do all algorithmic optimizations first.

  3. Get a good compiler. If you're using GCC, consider compiling a newer one.

  4. Learn what the restrict keyword from C99 does. Most compilers support it these days. Use restrict whenever possible, but never if you're not sure it can be applied.

  5. Don't use unsigned integers for loop variables unless you need the wrap-around feature.

  6. Let the compiler decide what to inline and what not. Don't inline functions just because you think the code will benefit from it.

  7. Embedded CPUs often have small caches and slow external memory. Try to keep your working set small. Packing multiple booleans or enums into a single integer may look dirty (less so if you hide the dirty details with macros), but it can increase cache efficiency a lot.

And last: it's not worth trying to outsmart the compiler. Changing loops from indexing to pointer-increment style is not worth it anymore; the compiler will do this job for you.

Reply to
Nils
  1. Starting with clear, well-structured code will help if you need to optimize it later.

Also use the benchmark to determine whether there is a performance issue in the first place. Though aiming for efficient code is a lofty goal, other goals like correctness, robustness, maintainability, clarity, etc. are often at least as (and usually more) important. No one will care how fast your code can produce incorrect results.

Algorithmic optimizations can improve performance by orders of magnitude; code optimizations rarely improve performance by more than 30%, and usually much less than that.

Or more generally: never assume some 'clever trick' will generate faster or smaller code - instead prove that the 'clever trick' will yield the desired effect. By 'prove' I mean measure (before and after) and/or check the compiler output (which also helps you develop a feel for what is expensive and what is not). Also remember that not all compilers are alike; some compilers optimize certain code sequences better than others.

I have seen too many examples of people obfuscating the source code assuming they are helping the compiler to generate more efficient code, while in reality they made things no better performance-wise and sometimes even worse.

Reply to
Dombo

Not necessarily true:

...................
while (my_unsigned8var < 5) {}

1BF0:  MOVF   x85,W
1BF2:  SUBLW  04
1BF4:  BNC    1BF8
1BF6:  BRA    1BF0
...................
while (my_signed8var < 5) {}

1BF8:  BTFSC  x86.7
1BFA:  BRA    1C02
1BFC:  MOVF   x86,W
1BFE:  SUBLW  04
1C00:  BNC    1C04
1C02:  BRA    1BF8
...................

On this particular combination of target and compiler (PIC18 with CCS C), unsigned is always faster than signed.

Reply to
Tom

I've seen various other compilers/targets where use of unsigned loop indexes is faster. For example, one of the tips/tricks listed when using GCC for the MSP430 target:

Tips and tricks for efficient programming

[...] 10. Use unsigned int for indices - the compiler will snip _lots_ of code.

On second thought, that might be referring to array indexes instead of loop indexes. Hmm...

--
Grant Edwards                   grante             Yow! I want to read my new
                                  at               poem about pork brains and
Reply to
Grant Edwards

I write fully-portable code all the time and I find it to be a simple task a lot of the time. The C Standard provides you with plenty of information to write fully-portable algorithms and programs.

Reply to
Tomás Ó hÉilidhe

I myself only use signed integer types when I need to store negative numbers. Other reasons for going with unsigned are:

1) With signed integer types, you get undefined behaviour upon overflow.
2) On machines other than two's complement, arithmetic can be less efficient with signed.
3) You can be left with a trap representation if you play around with the bits of a signed type, depending on the system.

I see signed integer types as nasty and so I only use them when I really have to.

Reply to
Tomás Ó hÉilidhe

"I'm only 21 years of age".

Chances are good that you're pontificating to someone who's been earning money at this game since before you were an orgasm.

So is your experience vast, or your statement half-vast?

--
Tim Wescott
Control systems and communications consulting
Reply to
Tim Wescott

Most compilers don't need optimization, since most compilers are for things other than general-purpose programming languages. Therefore far more compiler writers need to know about parsing than need to know about optimization.

Try books like _Advanced Compiler Design and Implementation_ by Steven Muchnick.

Eric

Reply to
Eric Smith

This is a good point.

It was one of the books I was referring to that has good descriptions of individual optimization techniques but doesn't deal with optimization management and application-level optimization strategy very well.

Regards


Reply to
Walter Banks

In message , Nils writes

Most have no cache.

Not always true.

Absolutely. It usually hampers the compiler.

Reply to
Chris H
