1) The compiler did substitute static ram for stack, by trusting my promise that none of this code ever reentered itself.
2) But this compiler did not support the inline keyword, so I lost two bytes of stack per call/return, pointlessly.
3) And this compiler blew three bytes of static ram for the ch1 ch2 ch3 args, even though C's pass-by-value means they all could coexist in the same space.
I did engage the compiler vendor at the time in an e-mail conversation. I failed to persuade my contact that my suffering was worth logging, much less resolving. Eventually the code we shipped manually expanded the work inline, using ASCII graphics to mark the inline subroutine boundaries, and stored everything in globals rather than locals, a safe practice only if our manual analysis of control flow vs. data flow holds true over time.
Of course it did. On a '51, that's about the only chance you have. Putting arguments or automatic variables on the stack is a non-option if there's no more than 128 bytes of stack to begin with. Without static call-tree analysis and static allocation/overlaying of variables, C would be close to impossible to implement on a '51.
Supporting the inline keyword is not the real issue --- anyway, "inline" is an official C keyword only as of the C99 revision of the standard, which I'm quite certain your C compiler never claimed to be compliant with. The real issue is actually inlining code where possible, which a compiler is allowed to do regardless of whether you used the inline keyword or not.
Whatever made you believe this? It's wrong. Memory re-use can only be decided by analysis of the actual code, and how it uses the value. Pass-by-value has quite exactly nothing to do with that.
--
Hans-Bernhard Broeker (broeker@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.
Ouch now I see I neglected to mention: I also know that in the actual code here shown as "..." ellipses, there were no mentions of cha chb chc. That's the observation that tells me we can store cha chb chc all in the same static byte.
Is there no 8051 cc compiler available that can make that same observation and act on it?
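A minimal hand-written sketch of the overlay being asked for, assuming the elided "..." bodies never read their own parameter after making the nested call. The names shared_ch and last_value_seen_by_x, and the stub body of x, are hypothetical; the thread only declares x as external and never shows it.

```c
/* Hypothetical stand-in for the external x(); its real body was never posted. */
static char last_seen;
static void x(char chx) { last_seen = chx; }

/* One static byte serves as cha, chb and chc in turn. This is only safe
 * because no function reads its parameter after calling the next one,
 * which is an assumption about the elided code. */
static char shared_ch;

static void c(void) { x(shared_ch); }   /* chc lives in shared_ch */
static void b(void) { c(); }            /* chb lives in shared_ch */
void a(char cha) { shared_ch = cha; b(); }

/* Accessor so a caller can check what reached x(). */
char last_value_seen_by_x(void) { return last_seen; }
```

Calling a('Q') leaves 'Q' as the value seen by x, using one byte of static storage for the whole chain instead of three.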
Yes the compiler in question was a 1989 C not a 1999 C compiler.
I don't mind having to tell the compiler I do want code inlined whenever it saves me the time/space of call/return. I can even survive having to tell the compiler with explicit options or a pragma or a C99 keyword.
What killed me was having bought a compiler supposedly targeted at 64 KiB rom images that didn't have any way to inline, and technical support that didn't even understand what I wanted.
Any 8051 cc customers out there have a better experience to share?
IMHO, sloppy use of jargon has no place in what really was a rather thinly veiled public accusation of the entire community of '51 C compiler makers of being lazy. If you want to voice a complaint, you should give a complete and accurate record of the facts.
You may be overlooking some of the pickier details of C, most notably the "aliasing problem". If there's even a single pointer being used to access any char object hidden in those "..."s, the compiler has to assume it no longer knows whether, e.g., cha is still needed even after the return of function b(). What appears obvious to you isn't necessarily obvious to the compiler, too. It may generally be impossible for it to find out.
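A sketch of one way a pointer hidden in the "..." could defeat the overlay, assuming a hypothetical file-scope pointer named saved and a stub x; none of this is code from the thread. Here chb's address escapes, and x later reads chb through it while chc holds a different value, so chb and chc cannot share one byte.

```c
/* Hypothetical illustration of the aliasing hazard. */
static const char *saved;            /* pointer through which chb escapes */
static char seen_by_x;

/* x() reads both its own argument and, via the alias, b()'s parameter. */
static void x(char chx) { seen_by_x = (char)(chx + *saved); }

static void c(char chc) { x(chc); }

static void b(char chb)
{
    saved = &chb;                    /* chb is now reachable by address */
    c((char)(chb + 1));              /* chc differs from chb */
    saved = 0;                       /* do not leave a dangling pointer */
}

void a(char cha) { b(cha); }

char result_of_x(void) { return seen_by_x; }
```

With a(1), x computes 2 + 1. Had chb and chc been overlaid in one byte, the read through saved would have returned 2 instead of 1, which is the kind of case the compiler has to rule out before merging storage.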
I see no reason why they shouldn't --- figuring this out should be no unsurmountable obstacle for the kind of static analysis these compilers have to run. But, as they say, the proof of the pudding is in the eating. Post a complete, compilable example, and people might even feed it to their compilers of choice and report on results.
Ah, heck, I'll give it a shot myself...
--
Hans-Bernhard Broeker (broeker@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.
Here I consciously named no names to leave me free to point out the technical support folk appeared clueless, seemingly knowing even less of C language law than I do.
It also passes splint. However splint -strict is outraged :-)
c:\c\junk>splint junk.c
Splint 3.0.1.6 --- 11 Feb 2002

Finished checking --- no warnings

c:\c\junk>splint -strict junk.c
Splint 3.0.1.6 --- 11 Feb 2002

junk.c(2,20): Declaration parameter has name: chx
  A parameter in a function prototype has a name. This is dangerous, since a macro definition could be visible here. (Use either -protoparamname or -namechecks to inhibit warning)
junk.c: (in function c)
junk.c(3,43): Undetected modification possible from call to unconstrained function x: x
  An unconstrained function is called in a function body where modifications are checked. Since the unconstrained function may modify anything, there may be undetected modifications in the checked function. (Use -modunconnomods to inhibit warning)
junk.c(3,43): Statement has no effect (possible undected modification through call to unconstrained function x): x(chc)
  Statement has no visible effect --- no values are modified. It may modify something through a call to an unconstrained function. (Use -noeffectuncon to inhibit warning)
junk.c: (in function b)
junk.c(4,43): Undetected modification possible from call to unconstrained function c: c
junk.c(4,43): Statement has no effect (possible undected modification through call to unconstrained function c): c(chb)
junk.c: (in function a)
junk.c(5,23): Undetected modification possible from call to unconstrained function b: b
junk.c(5,23): Statement has no effect (possible undected modification through call to unconstrained function b): b(cha)
junk.c(2,13): Function x declared but not defined
  A function or variable is declared, but not defined in any source code file. (Use -declundef to inhibit warning)
junk.c(5,6): Function a declared but not used
  A function is declared but not used. Use /*@unused@*/ in front of function header to suppress message. (Use -fcnuse to inhibit warning)
  junk.c(5,34): Definition of a
junk.c(2,13): Function x exported but not declared in header file
  A declaration is exported, but does not appear in a header file. (Use -exportheader to inhibit warning)
junk.c(5,6): Function a exported but not declared in header file
  junk.c(5,34): Definition of a

Finished checking --- 11 code warnings
--
Chuck F (cbfalconer@yahoo.com) (cbfalconer@worldnet.att.net)
Available for consulting/temporary embedded and systems.
USE worldnet address!
I wrote a dis/assembler pair with some flow analysis to review the machine code actually produced by the C development environment. The image produced contained dozens of addresses reached by only one call instruction, whoops. As for causes:
1) I remember the specific example I gave here. People had written some subroutines merely to structure code and to make variables local, not to be called twice or more.
2) Also I remember the C compiler working to allow future separate compilation that never actually did occur. The C compiler would write machine code as if anything extern (i.e. called from another source file) could be called twice. And the linker didn't know how to distinguish a subroutine called once.
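A small sketch of how source-level linkage bears on this, assuming hypothetical function names. A function declared static cannot be called from another source file, so the compiler can see every call site itself; a function with external linkage has to be emitted as a callable subroutine in case some other file calls it.

```c
/* File-local helper: every caller is visible in this translation unit,
 * so a compiler is free to expand it in place when it is called once. */
static unsigned char scale(unsigned char v)
{
    return (unsigned char)(v * 2u);
}

/* External linkage: other source files may call this, so it must exist
 * as a real subroutine with the normal calling convention. */
unsigned char read_scaled(unsigned char raw)
{
    return scale(raw);
}
```

Whether a given '51 compiler actually takes advantage of static this way is not something the thread establishes; the sketch only shows what information the source can give it.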
Also I remember my own quick-and-dirty assembler reassembled the image dramatically faster than the compiler could compile & link it. I think I remember I could even change the assembly code and reassemble faster than the C environment could `make`.
OK. I compiled that. And the compiler I chose *did* find your requested optimization. It allocated all of cha, chb and chc to register R7, and the calls to b, c and x all became plain JMP operations. I.e. the code looks like this:
; FUNCTION _c:
        JMP _x
; FUNCTION _b:
        JMP _c
; FUNCTION _a:
        JMP _b
I didn't test that, but I guess the "linker code packing" feature will reduce that even further, to make _a consist of just JMP _x. It does have an optimization that is supposed to "follow through" on chained jumps like these and retarget directly to the final one.
Get your own Keil eval copy and see for yourself.
Well, as the saying goes, if it's asm code you want, I trust you know where to find it.
C compilers for '51 can do some rather impressive tricks these days, but there still *is* a limit to what they can do. C99, if anyone actually decides to do it, will help a bit. Not so much through the 'inline' keyword as through the new 'restrict', which lets you help the compiler get across the performance limitations caused by aliasing.
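A minimal C99 sketch of what 'restrict' expresses, assuming a hypothetical function named add_offset. The qualifier is a promise from the caller that the two pointers never refer to overlapping storage, which frees the compiler from re-reading src after every store to dst.

```c
#include <stddef.h>

/* C99: dst and src are promised not to overlap. Breaking that promise
 * is undefined behavior, so the burden is on the caller. */
void add_offset(unsigned char *restrict dst,
                const unsigned char *restrict src,
                size_t n, unsigned char offset)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = (unsigned char)(src[i] + offset);
}
```

The function behaves the same with or without the qualifier when the promise holds; the difference is only in what the optimizer is permitted to assume.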
--
Hans-Bernhard Broeker (broeker@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.
        PUBLIC  _?a
TEST_OVERLAY_A_DA SEGMENT DATA OVERLAY( 0 )
        RSEG    TEST_OVERLAY_A_DA
        PUBLIC  _a_BYTE
_a_BYTE:
        DS      1
; cha = _a_BYTE (register parameter)
TEST_OVERLAY_A_PR SEGMENT CODE
        RSEG    TEST_OVERLAY_A_PR
_?a:
        USING   0
        MOV     _a_BYTE,R7
; test_overlay.c 25 tbd;
; test_overlay.c 26 b(cha);
        LCALL   _?x
; test_overlay.c 27 tbd;
; test_overlay.c 28 }
        RET
; test_overlay.c 29
        EXTRN   CODE(_?x)
        EXTRN   CODE(SMALL)
        END
Please note that the _inline extended keyword does exactly what is expected: It places the _inline function's code in the instruction sequence instead of making a call. Very useful to save some microseconds in time-critical modules. However, it does increase the code size. But that is a traditional trade-off.
To cite the manual: "With the _inline keyword, a C function can be defined to be inlined by the compiler. An inline function must be defined in the same source file before it is 'called'. When an inline function has to be called in several source files, each file must include the definition of the inline function. This is typically solved by defining the inline function in a header file.
Not using a function which is defined as an _inline function does not produce any code. Also during a debug session, the inlined function is not known.
The pragmas asm and endasm are allowed in inline functions. This makes it possible to define inline assembly functions. ..."
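For readers without that compiler, here is a sketch of the same header-file pattern using the C99 spelling 'static inline' as a stand-in for the vendor's _inline keyword; the two are not claimed to be identical. The function names and the idea of a header called nibble.h are hypothetical.

```c
/* Would live in a header (say, a hypothetical nibble.h) so that every
 * source file including it sees the full definition before any use. */
static inline unsigned char high_nibble(unsigned char b)
{
    return (unsigned char)(b >> 4);
}

/* An ordinary function that uses the inline helper; the compiler may
 * expand high_nibble() here instead of emitting a call. */
unsigned char sum_nibbles(unsigned char b)
{
    return (unsigned char)(high_nibble(b) + (b & 0x0Fu));
}
```

For example, sum_nibbles(0xA5) adds the high nibble 10 to the low nibble 5.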
From the c: prompt and x90 = nop I gather you compiled for x86.
Are nop past a jmp spurious in x86? If spurious, why emit them despite -O3?
Perhaps this English merely says we gave splint only the same separately compiled source file view of code that cc gets without the benefit of an integrated linker, else:
Help, lost me?
// Fun to see 'Splint 3.0.1.6 --- 11 Feb 2002' results // appear as a gratis web service, thank you.
// // Now I wonder if we prefer the less incisive example:
#define FIXME() do { ((void) 0); } while (0) /* ... */
Also passes the 3.3 gcc -c -Wall -W of Mac OS X 10.3 Developer. (No splint delivered there. Monday I hope to try an x86 Linux.)
Does this make the example "incomplete"? If yes, what fix do we prefer?
How should we express the idea of a side-effect not to be omitted from the machine code, since in 8051 we have no standard libraries.
'Does this make the example "incomplete"? If yes, what fix do we prefer?'
How should we express the idea of a root entry point. Surely not the int main(int argc, char * argv[]) standard of C89 and Unix?
Aye, the tbd statements have no effect on purpose.
To express this idea of a consciously-empty-statement, I think I remember gcc folk advocate an explicit ((void) 0).
I hesitate because I remember gcc -Wall -W rejecting cast-to-void as a way of saying arg-intentionally-not-used in Linux sg utils. But I see now the 3.3 gcc -c -Wall -W of Mac OS X 10.3 Developer does accept cast-to-void as a way of saying zero-intentionally-not-used.
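A short sketch of both idioms, assuming tbd is defined as ((void) 0); the thread uses tbd as a statement but never shows its definition. The function keep_first is hypothetical.

```c
/* Assumed definition: tbd as a deliberately empty statement. */
#define tbd ((void) 0)

/* The cast to void documents that 'unused' is ignored on purpose.
 * Whether a particular compiler version stays quiet about it is
 * something to check against that compiler. */
int keep_first(int kept, int unused)
{
    (void) unused;
    tbd;
    return kept;
}
```

Both statements evaluate an expression and discard the result, so they generate no code of their own; their only job is to state intent.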
Yes. All the same, naming parameters in a usenet post helps us refer to them.
Only when applied to subroutines actually called more than once. For subroutines called once, inline wins on code size, run time, locality, etc.
Does this actually work? Back when I was paid to work 8051, I couldn't talk here. Now that I'm not paid to work 8051, (a) I have little money for 8051 tools and (b) I won't experience a concentrated interest. The model of time-limited eval in anticipation of much money exchanged doesn't fit me now.
Thanks for the demo. I think I see:
a) LCALL followed by RET. I wonder why that's not an LJMP.
b) One byte allocated for the reused parm, not three or four.
c) Max stack depth of return-from-a and return-from-x, rather than the return-from-a return-from-b return-from-c return-from-x stack I saw before.
Google suggests we're here talking of
formatting link
Not available online? The cited caveats sound normal to me.
Thanks for the demo, results sound good.
Good. I remember seeing unreasonable JMP to JMP in machine code.
Sorry I'm not sure which "it" we mean.
Possibly I miss the point of leaving the JMP to JMP in the object.
Perhaps the overall development experience improves if only in the linker do we invest time into looking for such silliness, even though in separate compilation by definition we spend link time again for each make, not just for each compile.
Here we may have lost me.
I mean to be saying I know of people who are paying for extra chips they don't need, merely because the C compiler they chose wastes space unnecessarily.
Human compilers work better, aye, but each version is different and none are reliably available over time.
Sounds like the C99 inline keyword is gaining a following, so in time at least that trouble will go away.
gcc 3.4 indirect sibcalls are the first ray of hope I've caught for "whose 8051 cc omits the insignificant bytes of call instructions".
I find teachers most willing to help me when we both know the teacher has impressed me.
I haven't yet met an impressive compiler, not when I openly review its work with the help of a paired dis/assembler and flow analysis.
To me, on first glance, it lacked a main, #includes, etc. On second glance, it doesn't need them. However it really should have a #include of the access header, specifying the one function externally visible. Either that or a main.
... snip ...
Has to do with controlling data alignment.
The "splint -strict" run was primarily for amusement. It is only useful when you have annotated the source very thoroughly as to intention and usage etc.
Why was your reply not posted as a reply to my article? The references are fouled up.
--
Chuck F (cbfalconer@yahoo.com) (cbfalconer@worldnet.att.net)
Available for consulting/temporary embedded and systems.
USE worldnet address!
No, not from the point of view of the writer. Breaking something up into logical units that perform simple understandable actions correctly facilitates writing accurate code. It prevents creating long monolithic obtuse routines.
There is usually a tradeoff point in numbers of calls where net code becomes smaller. Which is why the inlining decision is better left to the compiler, in many cases.
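The trade-off point can be put into rough numbers. The sketch below assumes the 8051 encodings of LCALL (3 bytes) and RET (1 byte) and deliberately ignores argument setup, register pressure and any optimization across the call, so it is an estimate rather than a model of any real compiler. The function names are hypothetical.

```c
/* Code bytes when the body stays a subroutine: one copy of the body,
 * one RET (1 byte), and one LCALL (3 bytes) per call site. */
unsigned size_as_subroutine(unsigned body_bytes, unsigned call_sites)
{
    return body_bytes + 1u + 3u * call_sites;
}

/* Code bytes when the body is expanded at every call site. */
unsigned size_inlined(unsigned body_bytes, unsigned call_sites)
{
    return body_bytes * call_sites;
}

/* Nonzero when inlining yields the smaller image under this estimate. */
int inlining_is_smaller(unsigned body_bytes, unsigned call_sites)
{
    return size_inlined(body_bytes, call_sites)
         < size_as_subroutine(body_bytes, call_sites);
}
```

Under these assumptions a 20-byte body called once costs 20 bytes inlined against 24 as a subroutine, while the same body called twice costs 40 inlined against 27, so the answer flips between one and two call sites. A 3-byte body still comes out smaller inlined at five call sites.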
--
Chuck F (cbfalconer@yahoo.com) (cbfalconer@worldnet.att.net)
Available for consulting/temporary embedded and systems.
USE worldnet address!
If it were an LJMP, how would a return be possible? LCALL stores the return address on the 8051's stack. If you use LJMP there is no way to "know" where to return to.
The compiler translates a call to a global object (which function x() is) to an LCALL instruction since the course of execution will have to return to this point one time or another. Yes yes yes .... I know: I am not talking about the use of function pointers or RTOS environments. That is an entirely different theater.
void a(char cha) { tbd; b(cha); tbd; }
a calls b, which calls c, which calls x(). b and c are _inline functions.
a() is a regular function being visible throughout the application as x() is.
On the "C" side this object can be made visible to other translation units by "extern" declaration.
Whatever happens in these functions -- it has to follow a method agreed upon : this method is to return from a call and implement a call as a call and not as a goto (as you imply by asking for an LJMP instruction). Thus: LCALL not LJMP.
Somehow we're speaking past each other out of context.
I'm saying:
        LCALL p
        ...
p:      LCALL q
        RET
q:      ...
        RET

often may equivalently be written:

        LCALL p
        ...
p:      LJMP q
        ...
q:      ...
        RET
Am I yet more clear than mud? The second expression of this same idea requires only enough stack to fit one return address. The first, more naive, expression, wastefully requires enough stack to fit two return addresses.
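Seen from the C side, the two listings differ only in whether anything remains to be done after the call to q. A sketch, assuming hypothetical functions and counters:

```c
int q_calls;        /* how many times q() ran */
int work_after_q;   /* work done by a caller after q() returned */

void q(void) { q_calls++; }

/* The call to q() is the last action of p_tail(): q's RET can return
 * straight to p_tail's caller, so LCALL q / RET may become LJMP q. */
void p_tail(void)
{
    q();
}

/* Here control must come back after q(), so a real call with its own
 * return address on the stack is required. */
void p_not_tail(void)
{
    q();
    work_after_q++;
}
```

Only p_tail qualifies for the rewrite. Whether the jump form is also legal across separately compiled files depends on the compiler's calling convention, which is the point the thread goes on to argue about.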
I think we saw the tasking.com/ compile instead produce:
a:      lcall x
        ret
Situations where an 8051 processor behaves better when asked to lcall, ret rather than ljmp are rare.
I use that if a subroutine called only once has some good reason to be stored elsewhere, rather than inline.
Sorry I misunderstood, not on purpose, honestly, I did and I do customarily review all the text of this thread, and my own drafts of my own text, repeatedly before posting.
The compiler cannot execute optimization on a call to an external routine of a different translation unit. (C source module)
This would mean having a feature like 'global call optimization'. That is a good idea. Thanks for the inspiration.
Since the compiler does not have such a feature, x() must be CALL'ed. (Please do not forget that x() has a parameter that is to be passed. The compiler has calling conventions that must be used consistently.)
I am aware that there is potential for improvement.
Let me ask you a question. For the code snippet presented, what is the result of the tools available to you ?
grtnx /jan
Pat LaVarre wrote in news article: snipped-for-privacy@posting.google.com...