whose 8051 cc overlays static inline stack frames

- Pat LaVarre

Sat, Jan 10, 2004 1:59 PM

I wonder if any of the 8051 cc folk I see here are working to solve such troubles as:

Once upon a time, a C compiler targeting the 8051 blew more space and time than I could afford in its translation of:

extern void x(char chx);
static /* inline */ void c(char chc) { ...; x(chc); ... }
static /* inline */ void b(char chb) { ...; c(chb); ... }
void a(char cha) { ...; b(cha); ... }

1) The compiler did substitute static ram for stack, by trusting my promise that none of this code ever reentered itself.

2) But this compiler did not support the inline keyword, so I lost two bytes of stack per call/return, pointlessly.

3) And this compiler blew three bytes of static ram for the cha chb chc args, even though C's pass-by-value means they all could coexist in the same space.

I did engage the compiler vendor at the time in an e-mail conversation. I failed to persuade my contact that my suffering was worth logging, much less resolving. Eventually the code we shipped manually expanded the work inline, using ASCII graphics to mark the inline subroutine boundaries, and stored everything in globals rather than locals, a safe practice only if our manual analysis of control flow vs. data flow holds true over time.

Pat LaVarre


- Hans-Bernhard Broeker

Sat, Jan 10, 2004 3:53 PM

Of course it did. On a '51, that's about the only chance you have. Putting arguments or automatic variables on the stack is a non-option if there's no more than 128 bytes of stack to begin with. Without static call-tree analysis and static allocation/overlaying of variables, C would be close to impossible to implement on a '51.

Supporting the inline keyword is not the real issue --- anyway, "inline" is an official C keyword only as of the C99 revision of the standard, which I'm quite certain your C compiler never claimed to be compliant with. The real issue is actually inlining code where possible, which a compiler is allowed to do regardless of whether you used the inline keyword or not.

Whatever made you believe this? It's wrong. Memory re-use can only be decided by analysis of the actual code, and how it uses the value. Pass-by-value has quite exactly nothing to do with that.

--
Hans-Bernhard Broeker (broeker@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.

- Pat LaVarre

Sat, Jan 10, 2004 9:09 PM

Thanks for helping me learn I should have said more. Sorry we disagree over how elastic the meaning of jargon can be.

Again my C89 fragment was:

extern void x(char chx);
static /* inline */ void c(char chc) { ...; x(chc); ... }
static /* inline */ void b(char chb) { ...; c(chb); ... }
void a(char cha) { ...; b(cha); ... }

Ouch now I see I neglected to mention: I also know that in the actual code here shown as "..." ellipses, there were no mentions of cha chb chc. That's the observation that tells me we can store cha chb chc all in the same static byte.

Is there no 8051 cc compiler available that can make that same observation and act on it?

Yes the compiler in question was a 1989 C not a 1999 C compiler.

I don't mind having to tell the compiler I do want code inlined whenever it saves me the time/space of call/return. I can even survive having to tell the compiler with explicit options or a pragma or a C99 keyword.

What killed me was having bought a compiler supposedly targeted at 64 KiB rom images that didn't have any way to inline, and technical support that didn't even understand what I wanted.

Any 8051 cc customers out there have a better experience to share?

Pat LaVarre

- Chris Hills

Sat, Jan 10, 2004 10:58 PM

Pat LaVarre writes:

AFAIK there is only one C99 compiler and that is the Tasking Tricore compiler.

Most are C90 or C95. C95 is C90 with A1 (Amendment 1) and the TCs (Technical Corrigenda).

AFAIK none of them do inline. The good ones optimise very well anyway and will effectively inline where they can.

Which compiler was it?

/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
\/\/\/\/\ Chris Hills  Staffs  England /\/\/\/\/\
/\/\/ snipped-for-privacy@phaedsys.org \/\/
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/

- Hans-Bernhard Broeker

Sun, Jan 11, 2004 1:38 AM

Pat LaVarre wrote: [...]

IMHO, sloppy use of jargon has no place in what really was a rather thinly veiled public accusation of the entire community of '51 C compiler makers of being lazy. If you want to voice a complaint, you should give a complete and accurate record of the facts.

You may be overlooking some of the pickier details of C, most notably the "aliasing problem". If there's even a single pointer being used to access any char object hidden in those "..."s, the compiler has to assume it no longer knows whether, e.g., cha is still needed even after the return of function b(). What appears obvious to you isn't necessarily obvious to the compiler, too. It may generally be impossible for it to find out.

I see no reason why they shouldn't --- figuring this out should be no insurmountable obstacle for the kind of static analysis these compilers have to run. But, as they say, the proof of the pudding is in the eating. Post a complete, compilable example, and people might even feed it to their compilers of choice and report on results.

Ah, heck, I'll give it a shot myself...

--
Hans-Bernhard Broeker (broeker@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.

- Pat LaVarre

Sun, Jan 11, 2004 12:04 PM

I'll answer that offline.

Here I consciously named no names to leave me free to point out the technical support folk appeared clueless, seemingly knowing even less of C language law than I do.

Pat LaVarre

- Pat LaVarre

Sun, Jan 11, 2004 12:20 PM

Thank you. I suggest we try:

--

#define tbd /* ... */
extern void x(char chx);
static /* inline */ void c(char chc) { tbd; x(chc); tbd; }
static /* inline */ void b(char chb) { tbd; c(chb); tbd; }
void a(char cha) { tbd; b(cha); tbd; }

- CBFalconer

Sun, Jan 11, 2004 1:51 PM

Pat LaVarre wrote:

Surprisingly, it is valid code, and gcc can be made to restrict its actions to the normal integer promotions (I think):

c:\c\junk>gcc -c -fomit-frame-pointer -O3 junk.c
c:\c\junk>objdump -dS junk.o

junk.o: file format coff-go32

Disassembly of section .text:

00000000 :
#define tbd /* ... */
extern void x(char chx);
static /* inline */ void c(char chc) { tbd; x(chc); tbd; }
   0:   0f be 54 24 04          movsbl 0x4(%esp,1),%edx
   5:   89 54 24 04             mov    %edx,0x4(%esp,1)
   9:   e9 f2 ff ff ff          jmp    0
   e:   90                      nop
   f:   90                      nop

It also passes splint. However splint -strict is outraged :-)

c:\c\junk>splint junk.c
Splint 3.0.1.6 --- 11 Feb 2002

Finished checking --- no warnings

c:\c\junk>splint -strict junk.c
Splint 3.0.1.6 --- 11 Feb 2002

junk.c(2,20): Declaration parameter has name: chx
  A parameter in a function prototype has a name. This is dangerous, since a macro definition could be visible here. (Use either -protoparamname or -namechecks to inhibit warning)
junk.c: (in function c)
junk.c(3,43): Undetected modification possible from call to unconstrained function x: x
  An unconstrained function is called in a function body where modifications are checked. Since the unconstrained function may modify anything, there may be undetected modifications in the checked function. (Use -modunconnomods to inhibit warning)
junk.c(3,43): Statement has no effect (possible undected modification through call to unconstrained function x): x(chc)
  Statement has no visible effect --- no values are modified. It may modify something through a call to an unconstrained function. (Use -noeffectuncon to inhibit warning)
junk.c: (in function b)
junk.c(4,43): Undetected modification possible from call to unconstrained function c: c
junk.c(4,43): Statement has no effect (possible undected modification through call to unconstrained function c): c(chb)
junk.c: (in function a)
junk.c(5,23): Undetected modification possible from call to unconstrained function b: b
junk.c(5,23): Statement has no effect (possible undected modification through call to unconstrained function b): b(cha)
junk.c(2,13): Function x declared but not defined
  A function or variable is declared, but not defined in any source code file. (Use -declundef to inhibit warning)
junk.c(5,6): Function a declared but not used
  A function is declared but not used. Use /*@unused@*/ in front of function header to suppress message. (Use -fcnuse to inhibit warning)
   junk.c(5,34): Definition of a
junk.c(2,13): Function x exported but not declared in header file
  A declaration is exported, but does not appear in a header file. (Use -exportheader to inhibit warning)
junk.c(5,6): Function a exported but not declared in header file
   junk.c(5,34): Definition of a

Finished checking --- 11 code warnings

--
Chuck F (cbfalconer@yahoo.com) (cbfalconer@worldnet.att.net)
   Available for consulting/temporary embedded and systems.
     USE worldnet address!

- Pat LaVarre

Sun, Jan 11, 2004 1:53 PM

Ouch.

Good to hear. I think I remember:

I wrote a dis/assembler pair with some flow analysis to review the machine code actually produced by the C development environment. The image produced contained dozens of addresses reached by only one call instruction, whoops. As for causes:

1) I remember the specific example I gave here. People had written some subroutines merely to structure code and to make variables local, not to be called twice or more.

2) Also I remember the C compiler working to allow future separate compilation that never actually did occur. The C compiler would write machine code as if anything extern (i.e. called from another source file) could be called twice. And the linker didn't know how to distinguish a subroutine called once.

Also I remember my own quick-and-dirty assembler reassembled the image dramatically faster than the compiler could compile & link it. I think I remember I could even change the assembly code and reassemble faster than the C environment could `make`.

Pat LaVarre

- Hans-Bernhard Broeker

Sun, Jan 11, 2004 2:21 PM

OK. I compiled that. And the compiler I chose *did* find your requested optimization. It allocated all of cha, chb and chc to register R7, and the calls to b, c and x all became plain JMP operations. I.e. the code looks like this:

; FUNCTION _c:
        JMP _x
; FUNCTION _b:
        JMP _c
; FUNCTION _a:
        JMP _b

I didn't test that, but I guess the "linker code packing" feature will reduce that even further, to make _a consist of just JMP _x. It does have an optimization that is supposed to "follow through" on chained jumps like these and retarget directly to the final one.

Get your own Keil eval copy and see for yourself.

Well, as the saying goes, if it's asm code you want, I trust you know where to find it.

C compilers for '51 can do some rather impressive tricks these days, but there still *is* a limit to what they can do. C99, if anyone actually decides to do it, will help a bit. Not so much through the 'inline' keyword as through the new 'restrict', which lets you help the compiler get across the performance limitations caused by aliasing.

--
Hans-Bernhard Broeker (broeker@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.

- Jan Homuth

Sun, Jan 11, 2004 2:21 PM

Hans-Bernhard, I agree.

Please allow me to add my 2 cents worth to the discussion, using the TASKING CC51 v7.0r8, large memory model.

The code snippet:

#define tbd /* ... */
#define STATIC_INLINE _inline
extern void x(char chx);

STATIC_INLINE void c(char chc);
STATIC_INLINE void b(char chb);

STATIC_INLINE void c(char chc) { tbd; x(chc); tbd; }

STATIC_INLINE void b(char chb) { tbd; c(chb); tbd; }

void a(char cha) { tbd; b(cha); tbd; }

compiled with the following compiler options:

memory model large (using XDATA as default), static data overlay allowed with size 20 bytes (for non-reentrant functions)

the following optimization features enabled:

- CSE (common subexpression elimination)

- constant and copy propagation

- peephole optimizer

- invariant code relocation

- optimization into compound assignments

- code order rearranging

- extra flow optimization pass

- register parameter passing

results in the following assembly output:

; TASKING 8051 C compiler v7.0r8 Build 148
; options: -ne -It:\tk008024\rel7_0r8\include -Ms -rl -ivo=0x0000 -Ci8051
;          -OAcdFhikLmpsVrtw -c20 -b0 -a20 -A1 -wstrict -s -mid=128
$CASE
        NAME    TEST_OVERLAY
; test_overlay.c 1      #define tbd // _nop(); /* ... */
; test_overlay.c 2      #define STATIC_INLINE _inline
; test_overlay.c 3      extern void x(char chx);
; test_overlay.c 4
; test_overlay.c 5      STATIC_INLINE void c(char chc);
; test_overlay.c 6      STATIC_INLINE void b(char chb);
; test_overlay.c 7
; test_overlay.c 8      STATIC_INLINE void c(char chc)
; test_overlay.c 9      {
; test_overlay.c 10     tbd;
; test_overlay.c 11     x(chc);
; test_overlay.c 12     tbd;
; test_overlay.c 13     }
; test_overlay.c 14
; test_overlay.c 15     STATIC_INLINE void b(char chb)
; test_overlay.c 16     {
; test_overlay.c 17     tbd;
; test_overlay.c 18     c(chb);
; test_overlay.c 19     tbd;
; test_overlay.c 20     }
; test_overlay.c 21
; test_overlay.c 22
; test_overlay.c 23     void a(char cha)
; test_overlay.c 24

        PUBLIC  _?a
TEST_OVERLAY_A_DA       SEGMENT DATA OVERLAY( 0 )
        RSEG    TEST_OVERLAY_A_DA
        PUBLIC  _a_BYTE
_a_BYTE:        DS      1
; cha = _a_BYTE (register parameter)
TEST_OVERLAY_A_PR       SEGMENT CODE
        RSEG    TEST_OVERLAY_A_PR
_?a:
        USING   0
        MOV     _a_BYTE,R7
; test_overlay.c 25     tbd;
; test_overlay.c 26     b(cha);
        LCALL   _?x
; test_overlay.c 27     tbd;
; test_overlay.c 28     }
        RET

; test_overlay.c 29

        EXTRN   CODE(_?x)
        EXTRN   CODE(SMALL)
        END

Please note that the _inline extended keyword does exactly what is expected: it places the _inline function's code in the instruction sequence instead of making a call. Very useful to save some microseconds in time-critical modules. However, it does increase the code size. But that is a traditional trade-off.

To cite the manual: "With the _inline keyword, a C function can be defined to be inlined by the compiler. An inline function must be defined in the same source file before it is 'called'. When an inline function has to be called in several source files, each file must include the definition of the inline function. This is typically solved by defining the inline function in a header file.

Not using a function which is defined as an _inline function does not produce any code. Also during a debug session, the inlined function is not known.

The pragmas asm and endasm are allowed in inline functions. This makes it possible to define inline assembly functions. ..."

Maybe this helps to resolve the issue.

regards /jan

Hans-Bernhard Broeker wrote: [...]

- Pat LaVarre

Sun, Jan 11, 2004 3:29 PM

Thanks for saying, but surprising why?

From the c: prompt and 0x90 = nop I gather you compiled for x86.

Are nop past a jmp spurious in x86? If spurious, why emit them despite -O3?

Perhaps this English merely says we gave splint only the same separately compiled source file view of code that cc gets without the benefit of an integrated linker, else:

Help, lost me?

// Fun to see 'Splint 3.0.1.6 --- 11 Feb 2002' results
// appear as a gratis web service, thank you.
//
// Now I wonder if we prefer the less incisive example:

#define FIXME() do { ((void) 0); } while (0) /* ... */

extern volatile int i;
extern void x(char chx);
extern void a(char cha);

volatile int i = 0;
void x(char chx) { i = chx; }

static /* inline */ void c(char chc) { FIXME(); x(chc); FIXME(); }
static /* inline */ void b(char chb) { FIXME(); c(chb); FIXME(); }
int main(int argc, char * argv[])
{ argv = argv; FIXME(); b(argc); FIXME(); return 0; }

Also passes the 3.3 gcc -c -Wall -W of Mac OS X 10.3 Developer. (No splint delivered there. Monday I hope to try an x86 Linux.)

Does this make the example "incomplete"? If yes, what fix do we prefer?

How should we express the idea of a side-effect not to be omitted from the machine code, since in 8051 we have no standard libraries?

'Does this make the example "incomplete"? If yes, what fix do we prefer?'

How should we express the idea of a root entry point? Surely not the int main(int argc, char * argv[]) standard of C89 and Unix?

Aye, the tbd statements have no effect on purpose.

To express this idea of a consciously-empty-statement, I think I remember gcc folk advocate an explicit ((void) 0).

I hesitate because I remember gcc -Wall -W rejecting cast-to-void as a way of saying arg-intentionally-not-used in Linux sg utils. But I see now the 3.3 gcc -c -Wall -W of Mac OS X 10.3 Developer does accept cast-to-void as a way of saying zero-intentionally-not-used.

Yes. All the same, naming parameters in a usenet post helps us refer to them.

Yes.

Pat LaVarre

- Pat LaVarre

Sun, Jan 11, 2004 4:06 PM

Only when applied to subroutines actually called more than once. For subroutines called once, inline wins on code size, run time, locality, etc.

Does this actually work? Back when I was paid to work 8051, I couldn't talk here. Now that I'm not paid to work 8051, (a) I have little money for 8051 tools and (b) I won't experience a concentrated interest. The model of time-limited eval in anticipation of much money exchanged doesn't fit me now.

Thanks for the demo. I think I see:

a) LCALL followed by RET. I wonder why that's not an LJMP.

b) One byte allocated for the reused parm, not three or four.

c) Max stack depth of return-from-a and return-from-x, rather than the return-from-a return-from-b return-from-c return-from-x stack I saw before.

Google suggests we're here talking of

formatting link

Not available online? The cited caveats sound normal to me.

Thanks for the demo, results sound good.

Good. I remember seeing unreasonable JMP to JMP in machine code.

Sorry I'm not sure which "it" we mean.

Possibly I miss the point of leaving the JMP to JMP in the object.

Perhaps the overall development experience improves if only in the linker do we invest time into looking for such silliness, even though in separate compilation by definition we spend link time again for each make, not just for each compile.

Here we may have lost me.

I mean to be saying I know of people who are paying for extra chips they don't need, merely because the C compiler they chose wastes space unnecessarily.

Human compilers work better, aye, but each version is different and none are reliably available over time.

Sounds like the C99 inline keyword is gaining a following, so in time at least that trouble will go away.

gcc 3.4 indirect sibcalls are the first ray of hope I've caught for "whose 8051 cc omits the insignificant bytes of call instructions".

I find teachers most willing to help me when we both know the teacher has impressed me.

I haven't yet met an impressive compiler, not when I openly review its work with the help of a paired dis/assembler and flow analysis.

Pat LaVarre


- CBFalconer

Sun, Jan 11, 2004 6:33 PM

To me, on first glance, it lacked a main, #includes, etc. On second glance, it doesn't need them. However it really should have a #include of the access header, specifying the one function externally visible. Either that or a main.

... snip ...

Has to do with controlling data alignment.

The "splint -strict" run was primarily for amusement. It is only useful when you have annotated the source very thoroughly as to intention and usage etc.

Why was your reply not posted as a reply to my article? The references are fouled up.

--
Chuck F (cbfalconer@yahoo.com) (cbfalconer@worldnet.att.net)
   Available for consulting/temporary embedded and systems.
     USE worldnet address!

- Ian Bell

Sun, Jan 11, 2004 7:21 PM

Forgive my ignorance but isn't 'called more than once' intrinsic in the definition of subroutine?

Ian

- CBFalconer

Sun, Jan 11, 2004 9:54 PM

No, not from the point of view of the writer. Breaking something up into logical units that perform simple understandable actions correctly facilitates writing accurate code. It prevents creating long monolithic obtuse routines.

There is usually a tradeoff point in numbers of calls where net code becomes smaller. Which is why the inlining decision is better left to the compiler, in many cases.

--
Chuck F (cbfalconer@yahoo.com) (cbfalconer@worldnet.att.net)
   Available for consulting/temporary embedded and systems.
     USE worldnet address!

- Jan Homuth

Mon, Jan 12, 2004 12:51 AM

Pat,

Aaaw.. c'mon. Did you also read the excerpt from the help manual ?

That's an obvious one. I did not say that in general _inline causes bigger code size. Only if applied more than once does it. You are right there.

options: -ne -It:\tk008024\rel7_0r8\include -Ms -rl -ivo=0x0000 -Ci8051

If it were an LJMP, how would a return be possible? LCALL stores the return address on the 8051's stack. If you use LJMP there is no way to "know" where to return to.

The compiler translates a call to a global object (which function x() is) to an LCALL instruction since the course of execution will have to return to this point one time or another. Yes yes yes .... I know: I am not talking about the use of function pointers or RTOS environments. That is an entirely different theater.

void a(char cha) { tbd; b(cha); tbd; }

a calls b, which calls c, which calls x(). b and c are _inline functions.

a() is a regular function being visible throughout the application as x() is.

On the "C" side this object can be made visible to other translation units by "extern" declaration.

Whatever happens in these functions -- it has to follow a method agreed upon : this method is to return from a call and implement a call as a call and not as a goto (as you imply by asking for an LJMP instruction). Thus: LCALL not LJMP.

Yes. All there is.

Available with the demo...

regards /jan

- Pat LaVarre

Mon, Jan 12, 2004 5:02 PM

Somehow we're speaking past each other out of context.

I'm saying:

        LCALL p
        ...
p:      LCALL q
        RET
q:      ...
        RET

often may equivalently be written:

        LCALL p
        ...
p:      LJMP q
        ...
q:      ...
        RET

Am I yet more clear than mud? The second expression of this same idea requires only enough stack to fit one return address. The first, more naive, expression, wastefully requires enough stack to fit two return addresses.

For the example of:

#define tbd /* ... */
extern void x(char chx);
static inline void c(char chc) { tbd; x(chc); tbd; }
static inline void b(char chb) { tbd; c(chb); tbd; }
void a(char cha) { tbd; b(cha); tbd; }

what I call reasonable is:

a: ljmp x

I think we saw the tasking.com/ compile instead produce:

a: lcall x
   ret

Situations where an 8051 processor behaves better when asked to lcall, ret rather than ljmp are rare.

I use that if a subroutine called only once has some good reason to be stored elsewhere, rather than inline.

Sorry I misunderstood, not on purpose, honestly, I did and I do customarily review all the text of this thread, and my own drafts of my own text, repeatedly before posting.

Pat LaVarre

- Pat LaVarre

Mon, Jan 12, 2004 5:10 PM

Help.

1) Is there no portable way for 8051 .c to express the idea of an unspecified side effect that should not be omitted? My attempt was:

extern void x(char chx);

2) What kind of main do we like, if we need an arg? Surely not the Unix:

int main(int argc, char * argv[]) { ...

All clear now thank you.

Sorry this happened, more sorry to hear it bothered you. As yet my news clients cannot simultaneously achieve all of:

1) Available gratis cross-platform (Mac/Linux/Windows).
2) Unbroken lines.
3) Correct references.
4) Instant replies.

Usually I give up (4), this time I chose perhaps wrongly to give up (3).

Pat LaVarre

- Jan Homuth

Tue, Jan 13, 2004 2:12 PM

Pat, A simple matter:

The compiler cannot optimize a call to an external routine in a different translation unit (C source module).

This would mean having a feature like 'global call optimization'. That is a good idea. Thanks for the inspiration.

Since the compiler does not have such a feature, x() must be CALLed. (Please do not forget that x() has a parameter that is to be passed. The compiler has calling conventions that must be used consistently.)

I am aware that there is potential for improvement.

Let me ask you a question. For the code snippet presented, what is the result of the tools available to you ?

grtnx /jan

Pat LaVarre wrote: [...]