Getting the size of a C function

Hmmmm. I hadn't thought of that watchdog idea, since TI recommends shutting off the watchdog and disabling interrupts while programming flash.

I also agree about the return not being normal---there's probably not much chance returning to the address on the stack is going to work out, so a reset is probably the best idea after a firmware update.

I should have said that I wouldn't use this idea on a function designed to run forever---or at least not one that the compiler might think runs forever. I would also examine the resulting code to make sure the compiler was doing what I intended.

I think that the ideas I have described will work on some processors and compilers for some functions, but not on all compilers for all processors and functions. If you do a lot of embedded systems programming, restrictions like that are nothing new.

Mark Borgerson

Reply to
Mark Borgerson

More relevant to comp.arch.embedded: &func may not be the memory address of the function on smaller micros with more than 64KB (or sometimes 64K words) of flash. gcc for the AVR, for example, uses trampolines for function pointers on devices with more than 64K words of flash - &func gives the address of a jump instruction in the lower 64K of memory, which jumps to the real function. That way you can still use 16-bit function pointers with larger memories.
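To make that concrete, here is a minimal sketch (device behaviour varies and the names are illustrative only):

void target(void);

/* On avr-gcc with a large flash, fp may end up holding the address of a
   trampoline stub in the low 64K words rather than the address of target's
   body, so pointer arithmetic on it says nothing about where the real code
   lives or how big it is. */
void (*fp)(void) = target;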

Reply to
David Brown

Admit it: you're doing something that can't be done in C. By far the simplest approach is to generate assembler code and add a small amount of instrumentation to that. Start by accessing the function through a pointer to the subroutine. Then you can store an SRAM address there when needed.

--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
Reply to
Albert van der Horst

Anything that relies on the compiler being stupid, or deliberately crippled ("disable all optimisations") or other such nonsense is a bad solution. It is conceivable that it might happen to work - /if/ you can get the compiler in question to generate bad enough code. But it is highly dependent on the tools in question, and needs to be carefully checked at the disassembly level after any changes.

In this particular example of a highly risky solution, what happens when the compiler generates proper code? The compiler is likely to generate the equivalent of:

int MoveMe(..., bool findend)
{
    if (findend)
        "jump" Markend();
    // do all the stuff
}

Or perhaps it will inline Markend, MoveMe, or both. Or maybe it will figure out that MoveMe is never called with "findend" set, and thus optimise away that branch. All you can be sure of is that there is no way you can demand that a compiler directly produces the code you apparently want it to produce - C is not assembly.

Reply to
David Brown

You get good and bad compilers for all sorts of processors, and even a half-decent one will be able to move code around if it improves the speed or size of the target - something that can apply on any size of processor.

I don't know about typical "comp.lang.c" programmers, but typical "comp.arch.embedded" programmers use compilers that generate tight code, and they let the compiler do its job without trying to force the tools into their way of thinking. At least, that's the case for good embedded programmers - small and fast code means cheap and reliable microcontrollers in this line of work. And code that has to be disassembled and manually checked at every change is not reliable or quality code.

Reply to
David Brown

On 24 Jan, 21:44, David Brown wrote: ...

I *think* Mark is aware of the limitations of his suggestion but there seems to be no C way to solve the OP's problem. It does sound like the problem only needs to be solved as a one-off in a particular environment.

That said, what about taking function pointers for all functions and sorting their values? It still wouldn't help with the size of the last function. Can we assume the data area would follow the code? I guess not.
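For what it's worth, a rough sketch of that idea might look like the following. It is purely illustrative: it assumes the linker laid the functions out contiguously (nothing guarantees that), the function-pointer-to-integer casts are only implementation-defined, and the last function's size is still unknowable.

#include <stdint.h>
#include <stdlib.h>

typedef void (*fn_t)(void);

static int cmp_addr(const void *a, const void *b)
{
    uintptr_t x = (uintptr_t)*(const fn_t *)a;   /* implementation-defined cast */
    uintptr_t y = (uintptr_t)*(const fn_t *)b;
    return (x > y) - (x < y);
}

/* Estimate the size of f as the gap to the next-higher function address. */
size_t estimate_size(fn_t f, fn_t *all, size_t n)
{
    qsort(all, n, sizeof *all, cmp_addr);
    for (size_t i = 0; i + 1 < n; i++)
        if (all[i] == f)
            return (size_t)((uintptr_t)all[i + 1] - (uintptr_t)all[i]);
    return 0;   /* f was last (or unknown): no upper bound available */
}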

James

Reply to
James Harris

You give me a great way to segue into something. There are cases where you simply have no other option than to do exactly that. I'll provide one example. There are others.

I was working on a project using the PIC18F252 processor and, at the time, the Microchip c compiler was in its roughly-v1.1 incarnation. We'd spent about 4 months of development time and the project was nearing completion when we discovered an intermittent (very rarely occurring) problem in testing. Once in a while, the program would emit strange outputs that we simply couldn't explain, even after closely examining and walking through the code that was supposed to generate that output. It just wasn't possible: specific ASCII characters were being generated that were not present in the code's constants.

In digging through the problem, by closely examining the generated assembly output, I discovered one remarkable fact that led me to imagine a possibility that might explain things. The Microchip c compiler was using static variables for compiler temporaries. And it would _spill_ live variables that might be destroyed across a function call into them. They would be labelled something like __temp0 and the like.

There was _no_ problem when the c compiler was doing that for calls made to functions within the same module, because they had anticipated that there might be more than one compiler temporary needed in nested calls, and they added the extra code in the c compiler to observe whether a descendant function, called by a parent, would also need to spill live variables, and would then construct more __temp1... variables to cover that case. Not unlike what good 8051 compilers might do when generating static variable slots for nested call parameters for efficiency (counting spills all the way down, so to speak.)

However, when calling functions in _other_ modules, where the c compiler had _no_ visibility into what it had already done over there in a separate compilation, it had no means to do that and, of course, a problem arose. What was spilled into __temp0 in module-A was also spilled into __temp0 in module-B and, naturally, I just happened to have a case where that became a problem under the influence of interrupt processing. I had, of course, completely saved _all_ registers at the start of the interrupt code before attempting to call any c functions. That goes without saying. But I'd had _no_ idea that I might have to save some statics which may, or may not, at the time be "live."

Worse, besides the fact that there was no way I could know in advance which names the c compiler would use in any circumstance, the c compiler chose these names in such a way that they were NOT global or accessible either to c code or to assembly. I had to actually _observe_ in the linker file the memory location where they resided and make sure that the interrupt routine protected them as well.

This required me to document a procedure whereby, every time we made a modification to the code that might _move_ the location of these compiler-generated statics, we had to update a #define constant to reflect it and then recompile.
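A minimal sketch of what that procedure amounted to, with a hypothetical address and size standing in for the values read out of the linker map after each build:

#define COMPILER_TEMP_ADDR  0x0060u   /* hypothetical - taken from the linker map */
#define COMPILER_TEMP_BYTES 4u        /* hypothetical - number of __tempN bytes */

static unsigned char temp_save[COMPILER_TEMP_BYTES];

/* Called at the top of the interrupt handler, after the registers are saved. */
static void save_compiler_temps(void)
{
    volatile unsigned char *t = (volatile unsigned char *)COMPILER_TEMP_ADDR;
    for (unsigned char i = 0; i < COMPILER_TEMP_BYTES; i++)
        temp_save[i] = t[i];
}

/* Called just before returning from the interrupt handler. */
static void restore_compiler_temps(void)
{
    volatile unsigned char *t = (volatile unsigned char *)COMPILER_TEMP_ADDR;
    for (unsigned char i = 0; i < COMPILER_TEMP_BYTES; i++)
        t[i] = temp_save[i];
}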

Got us by.

Whether it is _reliable_ or not would be another debate. The resulting code was very reliable -- no problems at all. However, the process/procedures we had to apply were not reliable, of course, because we might forget to apply the documented procedure before release. So on that score, sure.

Life happens. Oh, well.

Jon

Reply to
Jon Kirwan

You'd need to sort *all* the functions of an application (including non-global functions), and there would still be the possibility that some function or other stuff you don't know about resides between 'consecutive' functions f() and g().

Reading f() might be alright but overwriting it would be tricky.

--
Bartc
Reply to
bartc

In general, no universally "good" assumptions exist. Partly also because the very idea itself of "moving a function" in memory at run-time is itself not yet well-defined by those talking about it here.

Any given function may have the following:

code --> Code is essentially strings of constants. It may reside in a von-Neumann memory system or a Harvard one. It therefore may be readable by other code, or not. Many of the Harvard implementations include a special instruction or a special pointer register, perhaps, to allow access to the code space memory. But not all do. In general, it may not even be possible to read and move code. Even in von-Neumann memory systems where, in theory, there is no problem, the code may have been "distributed" in pieces. An example here would be an implementation I saw with Metaware's c compiler, where they had extended it to support a type of co-routine called an 'iterator.' In this case, the body-block of a for-loop would be moved outside the function's code region into a separate function, so that their implementation could call the for-loop body through their well-considered support mechanism for iterators. You'd need to know where that part was, as well, to meaningfully move things.

constants --> A function may include instanced constants (which a smart compiler may "understand" from something like 'const int aa = 5;', if it also finds that some other code takes the address of 'aa'.) These may also need to be moved, especially if one is trying to download an updated function into RAM before flashing it for permanence as a "code update" procedure. These constants may be placed in a von-Neumann memory system and accessed via PC-relative or absolute memory locations -- itself a potential bag of worms -- or in Harvard code space, if the processor supports accessing it, or otherwise in Harvard data space, especially if there is some of that which is non-volatile.

static initialized data --> A function may include instanced locations that must be initialized prior to main(), but where the actual values of these instances are located in some general collection place used by who-knows-what code in the crt0 library routine that does this job of pre-initing. Once again, more issues to deal with and wonder about.
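A trivial function showing all three kinds of storage in one place (the names are made up purely for illustration):

int scale(int x)
{
    static const int factor = 5;   /* instanced constant - flash, data space, or folded in */
    static int calls = 1;          /* static initialized data - set up by crt0 before main() */
    calls++;                       /* the code itself may be split up or sit in Harvard space */
    return x * factor;
}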

And that's just what trips off my tongue to start.

It's a tough problem to solve generally. To do it right, the language semantics (and syntax, most likely, as well) itself would need to be expanded to support it. That could be done, I suppose. But I imagine a lot of gnashing of teeth along the way.

Jon

Reply to
Jon Kirwan

...

...

Since you've commented, Bart, do you have any thoughts on making metadata about functions available in a programming language? Maybe you already do this in one of your languages.

The thread got me thinking that if a function is a first-class object perhaps some of its attributes should be transparent. Certainly its code size and maybe its data size too; possibly its location, maybe a signature for its input and output types. Then there are other attributes such as whether it is in byte code or native code, whether it is relocatable or not, what privilege it needs etc.

If portability is not needed a function object could also be decomposed to individual instruction or subordinate function objects. I'm not saying I like this idea - portability is a key goal for me - but I'm just offering some ideas for comment.

Any thoughts on what's hot and what's not?

Followups set to only comp.lang.misc.

James

Reply to
James Harris

That's true. But it is also true that you can verify that a particular compiler DOES produce the desired code and use that code effectively. For embedded programming, it doesn't particularly matter if 50 other compilers don't produce what you want, as long as the compiler you are using does.

Mark Borgerson

Reply to
Mark Borgerson

None of that is at odds with writing a flash update routine once, verifying that the end of the code is properly marked, and using the code. If you are worried about changes in optimization levels for future compiles, you can generate a binary library for the flash update function and link that into future applications. AFAIK, linking a library does not generally result in any change to the binary code of the library if the library was generated from position-independent code. (And, if you're going to copy a function to a different location for execution, it had better be position-independent.)

That said, I will have to look very carefully at some of the MSP430 code that I have generated---the compiler may access I/O locations using PC-relative addressing. That would totally mess up code that got simply copied to RAM. However, that's an altogether different problem than simply finding the length of the function.

Mark Borgerson

Reply to
Mark Borgerson

In embedded development, /every/ rule has an exception, except this one :-).

There are definitely times when you have to manually check your outputs, or write code that only works with specific compiler options, or add assembly code hacks that rely on details of the compiler working. But you don't do it unless you have no better way - you certainly don't design in your hacks at the first step.

Another rule for embedded development is always know your tools, and preferably pick /good/ tools. Microchip are known to be good for many things - the quality of their 16-bit PIC C compilers is definitely not one of them.

Reply to
David Brown

You are correct that there is no standard C way to solve the problem. But for the majority of compilers used in embedded development, there are ways that will reliably solve this problem when working /with/ the compiler, rather than /against/ the compiler. We are not trying to get a highly portable solution here, but it is always better to find a design that could be reused if possible. And it is always better to work with the features of your toolset, especially when there is no standard C solution, rather than trying to find ways to limit your tools.

For this problem, the best solution is generally to use a specific section for the functions in question. This can often be done using the gcc "__attribute__" syntax (even for non-gcc compilers), or by using compiler-specific pragmas. Any tools suitable for embedded development will support something to this effect, and give you control over the linking and placement of the function (this is assuming, of course, you are working with a microcontroller that supports execution from ram).

The details of how you do this depend on the situation. For example, you may be happy to dedicate the required ram space to the function, or you may want to copy it into ram only when needed. The former case is the easiest, as you can arrange for the linker to put the code in flash, but linked as though it were in ram. There is no need for any position-independent code, and you can happily debug and step through the code in ram. You can often "cheat" and put the code in the ".data" section, then you don't even have to think about the linker file or copying over the function - the C startup code handles that (since it treats the function like initialised data). With gcc on the msp430, you have your function defined something like this:

static void critical __attribute__ ((section(".data"))) progflash(...)

Of course, you still have to ensure that the function doesn't call other functions - or that these are also in ram. And it is worth checking the disassembly here if you are not sure - it is easy to accidentally include library functions calls. But the difference is that you have a reliable and safe way to achieve the effect you want, that is independent of details such as the compiler flags or precise compiler version, and will continue to work even if the source is changed. Because you are working /with/ the tools, you can take full advantage of debugging and optimisation. And though the details may vary for different processors or toolchains, the principle can be re-used. As with all code that cannot be implemented in standard C, there is always the possibility of this solution failing with future compilers or different devices, and you must check the results carefully - but this is the best you can get.
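A minimal sketch of that arrangement, assuming an msp430-gcc-style toolchain; the function name, the noinline attribute and the stand-in loop are illustrative only, and a real routine would perform the device-specific flash-write sequence and nothing else:

#include <stdint.h>

/* Placed in .data, so the C startup code copies it into RAM just as it
   copies initialised data; it is then called at its RAM address. */
static void __attribute__((section(".data"), noinline))
prog_flash(volatile uint16_t *dst, const uint16_t *src, unsigned words)
{
    /* No calls to flash-resident functions (including library calls) in here. */
    while (words--)
        *dst++ = *src++;   /* stand-in for the real flash-write sequence */
}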

You can't make any assumptions about the ordering of code or data. You cannot, practically speaking, make function pointers for all functions without a great deal of effort, and making an unnecessary pointer to a function cripples the compiler's optimisations of that function and of the functions that call it.

Reply to
David Brown

True enough - but it /does/ matter that the compiler you are using produces the code you want each of the 50 times you change and compile the program, or when you change the compiler flags and recompile, or when you update the compiler and recompile (I recommend keeping exactly the same compiler version for any given project, but sometimes that is not practical). If you have code that relies on working around the compiler, you need to check it /every/ time, and you are never able to take advantage of your tools to generate the best code.

Reply to
David Brown

:)

Just to be argumentative (no other good reason, really), one of my applications requires equal execution times across two code edges. In other words, the execution time must be constant regardless which branch is taken. c doesn't provide for that, quite simply. So the very first thing I do porting this application to a new processor is to ensure that I can achieve this well, or if not, exactly what the variability will be (because I must then relax the clocking rate to account for it.) It's one of those unknowns that must be locked down, immediately.

So yes, I hack at the very first step in this case. But I'm just toying. In general, I take your point here.

.....

As an aside, one of the first things I may do with a new c compiler and target is to explore methods to support process semantics. There are quite a number of very useful semantics the c language doesn't provide, and this is one of them.

(Another I enjoy the use of is named, link-time constants. They are not variable instances, in case you are confused about my wording here. Instead, they are much like #define in c except that these constants are link-time, not compile-time, and if you change them there is no need to recompile all the c code that uses them. You just change one file that creates those constants and re-link. The linker patches in the values directly. Saves recompile time. Probably every assembler supports them, and every linker _must_ support them. But c does not provide syntax to access the semantic that is available in its own linker.)
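For what it's worth, one common (if inelegant) way to approximate this from c is to declare an external symbol, give it an absolute value in the linker script, and use its address as the constant. A hedged sketch with made-up names follows; note that the result behaves as a constant at run time but is not an integer constant expression, so it can't be used for case labels or array sizes:

/* In the linker script (GNU ld syntax):   QUANTUM_SYM = 47;   */
#include <stdint.h>

extern const char QUANTUM_SYM;                        /* never dereferenced */
#define QUANTUM ((unsigned)(uintptr_t)&QUANTUM_SYM)   /* value filled in by the linker */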

With cooperative switching (and I use that where possible, because it is much easier to implement and support) I may be able to write fairly simple routines in assembly to support it (a dozen lines, or two.) But there is no escaping the idea that whatever I do there relies on details about the compiler. Different compilers on the MSP430, for example, make different choices about register assignments, which must be preserved across calls, which are scratchable, and which are used to optionally pass parameters (and the conditions under which registers may be chosen to pass them.)

With pre-emptive switching, it opens up a Pandora's box. Library routines that may use static memory, for example. But if pre-emptive switching is a part of the product, then I face the problems squarely and usually up front in the development. It's crucial to know exactly what works and how well it works, right away.

I also enjoy the use of coroutine thunking, from time to time. This, and process semantics, make for clear, very readable code that works well and is able to be maintained by a broader range of programmers (so long as they don't try and rewrite the core o/s code, of course.)

I still take your point. But I hope you don't mind a small moment of banter just to add to your suggestion that every rule has exceptions, including the rule of not hacking things at the outset. ;)

Well, there is that. I cannot defend their use of _static_ memory for compiler temporaries, as they chose to do. It's unconscionable. Their argument to me (one or two of those who actually _wrote_ its code) was that it led to faster emitted code -- in short, it appeared to show off their parts better. And they felt they "had it covered."

Well, they were wrong and a false bargain was made.

I'm sure they aren't the only ones guilty of choosing to sell the smell of sizzle over the quality of meat, though. Not by a long shot.

Jon

Reply to
Jon Kirwan

Being argumentative /is/ a good reason if it makes us think.

That's an example of when you need special consideration. My point is that you only do that sort of thing if you have no better way to implement the required functionality.

Are you talking about using constants in your code which are evaluated at link time, much in the way that static addresses are handled? Maybe I've misunderstood you, but that strikes me as a poor way to handle what are really compile-time constants - it's bad modularisation and structure (sometimes a single file is the best place to put these constants - but it should be because that's the best place, not because you want to fit some weird way of compiling). It is highly non-standard, potentially leading to confusion and maintenance issues. It also limits the compiler's options for optimising the code. And if re-compilation time is a serious issue these days, you need to consider getting better tools (PC and/or compiler), or making better use of them (better makefile setup, or use ccache).

Of course, it is always fun getting your tools to do interesting things in unusual ways - but it's not always a good idea for real work.

Yes, these are more examples of where you need to work with the compiler details.

It is certainly perfectly possible to use static memory for compiler temporaries, and it will certainly be faster than the normal alternative (temporaries on a stack) for many small processors. But it has to be implemented correctly!

Reply to
David Brown

First off, I think you might be confusing me with the OP, and he did cross-post to comp.arch.embedded. Anyway, I agree that this is a lot trickier than just using the sizeof operator or doing some pointer math. That's why it is a head scratcher. 'I' think that this is a very target/compiler-specific issue/problem, but I have been wrong before. That's why I left it open for someone who may have already cracked this nut. I don't have any experience doing what the OP wants to do, and haven't experimented in any way, shape, or form. I think I would write the function in assembly so I knew EXACTLY what was going on and knew there were no external dependencies on library code or jumps to other functions, and go from there.

Reply to
WangoTango

Unless you put the function into a separately-compiled library to be linked in when you build the program the next 50 times. If you change compilers, you may have to rebuild and verify the library.

Mark Borgerson

Reply to
Mark Borgerson

Understood, and agreed.

Well, you are of course correct in the sense that a specific constant value shouldn't be scattered throughout a series of modules like dust cast to the winds. It's not a good idea. Your point is wisely made. However, you are also wrong in suggesting, once again, some absolute rule that _always_ applies. In this case, my point remains because there is _some_ need for the semantic. It doesn't matter that there are better ways for most things if there is sometimes a need for this semantic.

I think you understood me, correctly. Just in case there is any question at all, I'm talking about this semantic, if you are familiar with the Microsoft assembler:

QUANTUM EQU 47
PUBLIC QUANTUM

You can't do that in c. There is no syntax for it.

In the above example, this constant might be the default number of timer ticks used per process quantum in a round robin arrangement. But as you say, you are correct to suggest that this kind of value usually only needs placement in a single module, so the advantage may arguably be reduced to a theoretical one, not a practical one. (Though I suppose I could always posit a specific case where this QUANTUM might be used in several reasonable places.)

However, there are times where there are values which may be required in several modules. These may be field masks and init values, for example, of hardware registers or software control flags. It's not always the case that writing a specific subroutine to compose them for you is the better solution. Sometimes, it's better to expose the constants, broadly speaking, and use them in a simple, constant-folding, c language way. Libraries in c are riddled with these.

In addition, these public link-time constants can be used to conditionally include or exclude code sections. In fact, almost every compiler uses this fact in one way or the other. CRT0, in particular, may take advantage of such features to conditionally include or exclude initialization code for libraries which may, or may not, have been linked in. And most linkers support the concept in some fashion -- because it is needed.

And yes, I'd sometimes like c-level access to it.

Yes. No question.

Well, _if_ one is going to use statics _then_ of course it has to be implemented correctly! Who could argue otherwise?

The problem is in the _doing_ of that. It requires (or at least I imagine so right now, being ignorant of a better way) looking at the entire program block to achieve. And that is a bit of a step-change away from the usual c compiler mode of operation. It _might_ be implemented in the linker stage, I suppose, though I'm struggling to imagine anything much short of a Rube Goldberg contraption to get there on the linker side.

As an aside, I have a lot of other things I'd like in c or c++ which just aren't there. For example, I dearly miss having access to thunking semantics in c or c++ (which does NOT break the c/c++ program model in any way, shape, or form, and could easily be implemented as part of either language with no dire impacts at all). I might use this for efficient iterators (don't imagine that I'm talking about std library iterators here, which are somewhat similar in use but in no way similar in their implementation details -- they are much less efficient). There is no good reason I can think of not to have it, and its utility is wonderful. (I'd be so happy to talk about it at some point, as the examples are excellent and easily shown.)

Jon

Reply to
Jon Kirwan
