It sounds as if you are comparing experienced, knowledgeable embedded assembler programmers with C programmers who don't have a clue about using C in embedded systems. A better comparison, surely, would be to compare to an experienced, knowledgeable embedded C programmer -- the kind who has a very good idea of the machine code that will result from every line of his C source.
You are arguing... The generality of the example means that any of the meaningless instructions can be replaced with a sequence of useful instructions.
What human factor? We are comparing two languages, not two humans.
Is your GIF code an actual example where you got a factor of 10? If so, compared to which C code? If not, how did you derive the factor of 10?
It's probably easy to locate some badly written GIF code; a good version might be more difficult... In any case, both versions would need to adhere to the same spec.
cannot compete with asm. There is no mechanism in the language to specify that the elapsed time between two points in code should always be a specific amount of time. It is also not an
I don't like that method. As Walter points out, assembly makes sense when you need predictable timing of sections of code, as the semantic is simply missing from C.
For example (and I can supply a nicely prepared white paper on the subject to those interested), in some cases the speed at which an arbitration scheme can operate depends largely on the predictability of the open-collector/open-drain arbitration line operation by each processor. In such cases (and note here we are talking about open-drain outputs, where a '1' bit is often handled slightly differently than a '0' bit when driving), you want the time required to output a '1' and the time required to output a '0' in some series of bit values to be _exactly_ the same. Not even one cycle different, if possible. You cannot specify this in C. It doesn't have a way for one side of an if branch to be specified as taking the same cycle count as the other. It's possible to add this to C. But so far, no compilers offer it.
This example is one of several where the semantic scope of C fails, since it simply has no way to direct the compiler towards a goal.
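To make the gap concrete, here is a minimal sketch in C of the hand-balancing programmers attempt in this situation. All the names (shift_out, port, trace) are illustrative, not from any real device header, and the port is modeled as a plain variable so the sketch is self-contained. The point is that while the logic is expressible, the cycle-for-cycle equality of the two branches is not: the compiler is free to schedule them differently.

```c
#include <stdint.h>

/* Hypothetical memory-mapped open-drain data latch, modeled as a
 * plain variable so this sketch is self-contained. */
static volatile uint8_t port;

/* Shift out 8 bits MSB-first.  Both branches of the if perform the
 * same operations in the same order (the usual hand-balancing
 * trick), but nothing in the C standard obliges the compiler to
 * keep the two paths cycle-identical after optimization.  That is
 * exactly the missing semantic. */
void shift_out(uint8_t v, uint8_t trace[8])
{
    for (int i = 7; i >= 0; --i) {
        if (v & (1u << i))
            port = 1;          /* release the line; pull-up takes it high */
        else
            port = 0;          /* actively drive the line low */
        trace[7 - i] = port;   /* record what went on the wire */
    }
}
```

The functional behavior can be tested; the timing property the arbitration scheme actually depends on cannot even be stated in the source.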
However, your suggestion above also isn't good enough. I need the fewest possible cycles on this processor -- the fewest, that is, for which both branches have exactly the same timing.
I can't afford to have two function calls, associated overhead, etc. And frankly, I don't believe that the implementors of such things would truly get this down to exact cycle count precisions. They'd screw up, almost for sure. Finally, I don't want to try 320, then when that works, try 310, then when that works, try 280, etc. This kind of playing around is painful and I wouldn't want any part of it.
What makes sense is blocking out some code between two #pragma's or specially interpreted comment lines, as is used by lint. Which to use, I don't know. But the Bulldog compiler illustrates some means by which it was actually done, I think.
Code between the #pragma's would be required to have a fixed execution time, regardless of any code edge transitions within it. Simple. It would be the compiler optimizations which would attempt to find the minimum possible timing. But it doesn't have to be perfect in that sense, just as no C compiler is now required to always produce the fastest possible implementation. Just fixed.
I've only had exactly one case in my entire professional life where I was permitted to implement the exact same application, from top to bottom, in both assembly and C. It was on a PIC processor using Microchip's C compiler tools and their assembly under MPLAB. I was the only programmer on the project and the first implementation was in assembly and had to fit into a processor with 4k of code space. The second implementation was to be placed into the then newer PIC18 with
32k of code space, with an eye to adding features once it had first been exactly ported into C.
This kind of real-life application test doesn't happen that often.
We'd first decided upon the assembly route in the earlier incarnation because of some tests we did with the earlier C compiler and the very limited code space we had available to us in the available PICs that were appropriate to the design. At that time, the PIC18 line of parts was 'very new' and we still hadn't even been able to get samples. So there was no way to consider it, seriously. We kept our eye on them, though.
In this case, also, I was the only programmer. I've been using C since 1978 and I've actually written a toy C compiler on my own, so I hope you can accept the fact that I do know a little about how to use it. I knew the application well from having written it already in the first place. So going into the C incarnation, it was probably the best of possible circumstances for the C side -- I knew all the details of various functions needed. Similarly, the assembly had fit into just 2k. After writing a nicely designed equivalent in C, the footprint was just over 12k, by comparison. Data footprint wasn't that much different, to be honest.
The time I had available for writing both the assembly version and the C version was also similar, by the way -- four months or so. One of the time-wasters in writing the C had to do with the compiler's use of static compiler temporaries. This was bad news for interrupt events calling C, because the compiler's live variable analysis was not able to cover that circumstance. And it cost me time to track that down and design a work-around for the case. Others just had to do with learning all the #pragma's needed to deal with variable placement, etc.
Now, I'm a very experienced assembly code writer, too. I would put myself against any C compiler, on small or large programs, without any worry that I couldn't beat its output -- on any measure, by enough that anyone looking at it from the outside would agree it was enough better to be worth having, and on a similar schedule.
One of the advantages in writing assembly is that your semantic options are wider. All tools are two-edged swords; C's advantages are also its disadvantages, and the same goes for assembly. But assembly does unarguably have wider semantic options. These range from extremely minuscule details, such as applying status bits in ways that C cannot directly support (and if you imagine I mean merely clearing or setting some status bit, you have no clue what I'm talking about here), to mid-level semantic choices such as exact timing regardless of the code edge taken, to large-scale semantic choices such as the fixed and varying assignment of registers and value-passing modes, mixing various styles of function prologues/epilogues, or coroutine semantics (simply unavailable in C). Whether or not these mean as much to your application as some of the benefits of C is another matter.
And none of any of this means one must use either C or use assembly. Most of my applications use both, to be honest. So for me, it's an amalgam that works more often. But one of the really big reasons for using C, is that other programmers for C are easier to find. And that can be very important.
But there are times and places for assembly code. And some applications are so competitive that the nickels and dimes are important, or the power consumption is important, or the die size is important, or... and assembly can at times make the difference.
My hope is that everyone be proficient at both.
I don't know if anyone else here can say that they were actually put in a position of doing the exact same project twice -- once in assembly and once in C -- that they had excellent experience in using both and were competent at both, and can then make a real comparison of a real-world case. But I have had that experience once. And there was a remarkable difference in code size (not data size).
I've also challenged myself in writing code snippets in both. I suppose a lot of us may have done that. I'd be happy to provide one such example here and let anyone try their compilers on it and see what I did in assembly, by comparison. But others will legitimately argue this proves nothing and I'd agree with them. Still, it may open an eye or two.
Meanwhile, C is a very good choice for many if not most applications, where a good C tool is available.
That's not what I was referring to. I was referring to constructs like __SP__ to refer to the stack pointer (a different meaning of intrinsic; perhaps you use a different term). I see that as non-C and certainly non-portable. Similarly, intrinsics that access status flags.
Not the same thing at all as forcing the result to be returned in the carry flag. C pretty much requires wider return values, although maybe you could come up with something using C99's new boolean type.
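As a sketch of that workaround (add8_carry is an invented name, not any standard API): the add is done in a wider type and the carry peeled off the top, handed back as a C99 bool. What the hardware gives you for free in a flag, C makes you recompute.

```c
#include <stdint.h>
#include <stdbool.h>

/* What C can express instead of "return the result in the carry
 * flag": do the add in a wider type and extract bit 8 as the
 * carry out.  add8_carry is an illustrative name only. */
bool add8_carry(uint8_t a, uint8_t b, uint8_t *sum)
{
    uint16_t wide = (uint16_t)a + b;   /* 9 significant bits */
    *sum = (uint8_t)wide;
    return (wide >> 8) != 0;           /* bit 8 is the carry out */
}
```

Whether a given compiler recognizes this idiom and collapses it back to an add-with-carry is, of course, entirely up to the compiler.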
I sometimes drop to asm (your 1% is not a bad estimate) for performance but more usually for interrupt epilogue/prologue and task switching. I know those items are non-portable and I prefer to have precise control over what is happening in those cases.
Actually, most of them aren't. There are some "intrinsics", like the offsetof() macro, that are standardized, but stuff like __nop(), __interrupt(), or __rotate_with_carry() aren't. Not even close.
The majority of them aren't.
And most programmers would never know, because they would have written
xa = yb + zc;
, i.e. used the next-highest data type, and let the compiler take care of carry.
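A minimal sketch of that next-highest-type idiom (sum16 is an illustrative name): accumulate 16-bit samples into a 32-bit total and never mention the carry at all -- the compiler lowers the wide add to whatever add/add-with-carry sequence the target needs.

```c
#include <stdint.h>
#include <stddef.h>

/* Sum 16-bit samples into a 32-bit accumulator.  No explicit carry
 * handling anywhere in the source; the widening before the add
 * makes overflow of the narrow type a non-issue. */
uint32_t sum16(const uint16_t *v, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; ++i)
        acc += v[i];           /* operand widened before the add */
    return acc;
}
```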
--
Hans-Bernhard Broeker (broeker@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.
They seem to be non-portable even between versions for different processors. Ulf mentioned __delay_cycles on the IAR compiler for the AVR; it doesn't show up in the IAR version for the MSP430. But that one does offer __set_SP_register.
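Which is why such intrinsics usually end up behind a project-local shim. A hedged sketch, assuming IAR's predefined macros (__ICCAVR__, __ICC430__); the DELAY_CYCLES name and the loop calibration divisor are my own inventions, and the fallback is emphatically not cycle-exact:

```c
/* Hypothetical portability shim: the cycle-delay intrinsic differs
 * per tool and target, so wrap it once and use the wrapper.
 * DELAY_CYCLES and the /3 calibration below are assumptions, not
 * taken from any vendor header. */
#if defined(__ICCAVR__)
  #define DELAY_CYCLES(n) __delay_cycles(n)
#elif defined(__ICC430__)
  /* fall back to a rough busy loop; NOT cycle-exact */
  #define DELAY_CYCLES(n) do { \
      volatile unsigned _i = (n) / 3u; \
      while (_i--) { } \
  } while (0)
#else
  #error "no DELAY_CYCLES mapping for this toolchain"
#endif
```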
Who ever said such a thing? C is no magic bullet that guarantees the practitioner will understand embedded hardware, software and systems, and the ways to squeeze the most performance out of the whole package. Forget assembler; you could just as well claim that a good embedded C programmer will produce code that is 10x smaller and/or faster than a C programmer who doesn't have a clue about embedded programming. You might even get a lot more takers on that claim, BTW.
I am inclined to agree. On the other hand I've been lucky enough not to need to control timing via instruction timing. If my timing needs are that dire I usually look to hardware.
Not my suggestion, really. Just commenting that if you needed a construct to synchronize your timing, that would be a useful syntax. I can think of cases where someone might find it useful. Communications protocols come to mind, where you want to sample exactly n cycles after a certain point. If you just want to be consistent on multiple paths, that's a different problem. I understand some crypto functions might benefit from the latter.
Why would there be function calls? For instruction-level timing I would expect the compiler to implement it internally.
That way lies madness, no question. I don't even trust interrupt keywords, so it's not likely I'll use the construct myself anytime soon.
I am rather curious as to how well it might be done by a compiler though, so I'm happy to have set up a straw man.
I think the Bulldog compiler is a rich source of ideas on this score. It had to deal with different timings on adjacent DRAM banks, which means it had to know about this when compiling. It had to deal with pushing up code blocks across code edges and conditionally tossing the results. It needed programmer information about which branches were more likely, and could aid in generating that information. Etc. It did quite a bit. It was designed for VLIW, but some of the ideas may be appropriate for today's embedded use -- or at least provide one possible model. A lot of new(ish) compiler idea details were exposed there.
My preference is something akin to this pidgin code:

    failure = 0;
    for (i = 0; i < 7; ++i) {
        #pragma begin_fixed_execution_time
        if (v & mask) {
            port.bit = 1;
            tris.bit = 1;
            nop;        /* simulated single-cycle delay */
            nop;
            nop;
            if (port.bit == 0) {
                failure = 1;
                break;
            }
        } else {
            port.bit = 0;
            tris.bit = 0;
            /* no sampling required */
        }
        #pragma end_fixed_execution_time
        v >>= 1;
    }
Where I don't have to care about anything except that the per-bit timing is fixed.
"Wilco Dijkstra" schreef in bericht news:S4bXg.2759$ snipped-for-privacy@newsfe3-gui.ntli.net...
What would it look like... a circular buffer in which the squared readings are stored; update the sum of all values stored in the buffer, divide by the buffer length, and calculate a square root of that average as the final output result. Repeat for each new value stored in the circular buffer.
Of course you would use a buffer with 2^n length, to keep it all as easy as possible. What is left is a series of adds, shifts and subs.
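A minimal sketch of that scheme in C, under the stated assumptions (power-of-two buffer length so the divide is a shift, running sum maintained incrementally). All names are illustrative, and the bit-by-bit integer square root stands in for whatever routine the target already has:

```c
#include <stdint.h>

#define BUF_LEN 8                  /* 2^n, so the divide is a shift */

static uint32_t buf[BUF_LEN];      /* circular buffer of squared readings */
static uint32_t sum;               /* running sum of the buffer contents */
static unsigned head;

/* Integer square root, simple bit-by-bit method. */
static uint16_t isqrt32(uint32_t x)
{
    uint32_t r = 0, bit = 1ul << 30;
    while (bit > x) bit >>= 2;
    while (bit) {
        if (x >= r + bit) { x -= r + bit; r = (r >> 1) + bit; }
        else              { r >>= 1; }
        bit >>= 2;
    }
    return (uint16_t)r;
}

/* Store one new reading; return the updated RMS over the window.
 * Only an add, a sub, a shift and the root per sample -- the full
 * sum is never recomputed. */
uint16_t rms_update(uint16_t sample)
{
    uint32_t sq = (uint32_t)sample * sample;
    sum -= buf[head];              /* retire the oldest squared value */
    sum += sq;
    buf[head] = sq;
    head = (head + 1) & (BUF_LEN - 1);
    return isqrt32(sum >> 3);      /* /8 == >>3 for BUF_LEN of 8 */
}
```

Once the window has filled with a constant input, the output settles to that input, which makes the routine easy to sanity-check.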
It's not even an interesting example. It's just one of those tasks that scream for optimisation, because such tasks often run at a high repetition rate, and when done sloppily they use too much processing power. So there is real work to be done, no matter whether you do it in C or ASM.
The big advantage (imo) of using C is with the rest of the application. Writing code for that is relatively easy, and since it is easy you can allow yourself to make it flexible, add nice features, add clever features, and make it behave a bit intelligently, resulting in a much better product.
I'm not interested in his 60 byte mean and lean code. Everybody can do that. I want to see the remaining part of his application.
--
Thanks, Frank.
(remove 'q' and '.invalid' when replying by email)