Getting the size of a C function

- john
posted
14 years ago

Fri, Jan 22, 2010 10:53 PM

Hi,

I need to know the size of a function or module because I need to temporarily relocate the function or module from flash into sram to do firmware updates.

How can I determine that at runtime? The sizeof( myfunction) generates an error: "size of function unknown".

Thanks.

- David Empson
posted
14 years ago

Fri, Jan 22, 2010 11:03 PM

In general, C does not provide a mechanism to find the size of a function. Some compilers might implement sizeof(function) but it is not standard C.

If your compiler always outputs functions to the object code in the same order as they appear in the source code, you could take the address of the next function and the address of the function in question, convert them to (char *) and get the difference between them. This assumes you never rearrange your source code - comment well!

If your compiler outputs functions in a somewhat unpredictable order then this won't work.
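The address-difference idea can be sketched as follows. This is only an illustration under the assumptions stated above: the names `flash_write_routine`, `flash_write_routine_end` and `function_span` are hypothetical, the two functions must end up adjacent in the output, and converting a function pointer to an integer is implementation-defined rather than portable C.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical routine we would like to relocate. */
int flash_write_routine(int x)
{
    return x * 2 + 1;
}

/* Hypothetical neighbour used as an end marker. It only helps if the
 * toolchain keeps both functions adjacent in the object code. */
int flash_write_routine_end(int x)
{
    return x - 1;
}

/* Distance in bytes between two function entry points.
 * The function-pointer-to-integer conversion is implementation-defined.
 * Taking the absolute difference also covers compilers that emit
 * functions in reverse source order. */
size_t function_span(int (*a)(int), int (*b)(int))
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;

    return (size_t)(pa > pb ? pa - pb : pb - pa);
}
```

The result of `function_span(flash_write_routine, flash_write_routine_end)` is an upper bound on the first function's size only when nothing else was placed between the two, so it is worth checking against the map file.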

The technique I used for a similar problem was to examine the object code to determine the size of the function manually, add a safety margin to allow for potential code growth, and embed that as a constant in the source code. It then needs to be re-checked after source changes (or a revised compiler) to confirm that the size hasn't grown too much.

--
David Empson
dempson@actrix.gen.nz

- Bob
posted
14 years ago

Fri, Jan 22, 2010 11:03 PM

If you give us the details on your target and tools, someone here will surely be able to help you do the thing you *actually* want to do.

Bob

- WangoTango
posted
14 years ago

Fri, Jan 22, 2010 11:04 PM

Good question, and I would like to know if there is an easy way to do it during runtime, and a portable way would be nice too. I would probably look at the map file and use the size I calculated from there, but that's surely not runtime.
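As a rough sketch of the "look it up at build time" route: map file layouts differ between toolchains, so this example assumes the four-column lines that GNU `nm -S` prints (hex address, hex size, type letter, symbol name). The function name `symbol_size_from_line` and the sample symbol are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Parse one line in the layout printed by GNU `nm -S`:
 *   <hex address> <hex size> <type letter> <symbol name>
 * Returns 1 and stores the size if the line describes `wanted`,
 * otherwise returns 0 and leaves *size_out untouched. */
int symbol_size_from_line(const char *line, const char *wanted,
                          unsigned long *size_out)
{
    unsigned long addr, size;
    char type;
    char name[128];

    /* Lines without a size column (e.g. undefined symbols) fail here. */
    if (sscanf(line, "%lx %lx %c %127s", &addr, &size, &type, name) != 4)
        return 0;
    if (strcmp(name, wanted) != 0)
        return 0;
    *size_out = size;
    return 1;
}
```

A build script could feed each line of the symbol listing through this and emit the result as a constant for the firmware, which keeps the number in step with the binary without any runtime trickery.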

You can get the starting address of the function pretty easy, but how about the end? Hmmm, gotta' think about that.

Jim

- jacob navia
posted
14 years ago

Fri, Jan 22, 2010 11:50 PM

john wrote:

(1) There is the method already mentioned that subtracts two function addresses. If your compiler is "well behaved" that could work except for the last function in the module...

(2) Another method is to generate an assembly listing and insert a "marker" at the end of each function, using (the equivalent of) .byte 0,1,2,3,4,5,6,7,8,9,8,7,6,5,4,3,2,1. Then, at runtime, you load the code and search for the terminator marker. Obviously the terminator should contain at least one illegal instruction, to be sure that it doesn't appear in the code itself.

(3) Yet another method is to generate a linker map table and read the size of each function from the table, which amounts to method (1) but done at build time.

(4) Another method is to locate all function prologues and epilogues of the functions in the code you generate. Locating the prologue means searching for the sequence of instructions that the compiler generates for each function start, probably the saving of some registers and the allocating of stack space for the local variables. Caveat: for certain functions the compiler may not generate any prologue, especially if the function doesn't call any other functions and receives no arguments.

Locating the epilogue means searching for the return instruction. Caveat: the compiler may generate several. You should find the last one before the next prologue.

Of all those possibilities, option (3) looks the most promising to me. Method (1) isn't very precise, and there is the problem of the last function in a compilation unit.

Method 2 is a PITA since you have to generate the assembly, insert the markers, re-assemble...

Method (4) needs a disassembler, and a LOT of parsing work, and it is very sensitive to compilation options.
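The runtime half of method (2), searching for the terminator, might look like the sketch below. It assumes the code is readable as ordinary data bytes (not true on every Harvard-architecture part) and that the marker sits directly after the function's last instruction. The name `offset_of_marker` is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Return the offset of the first occurrence of `marker` within the first
 * `limit` bytes starting at `code`, or (size_t)-1 if it is not found.
 * If the marker directly follows the function, the offset is its size. */
size_t offset_of_marker(const unsigned char *code, size_t limit,
                        const unsigned char *marker, size_t marker_len)
{
    size_t i;

    if (marker_len == 0 || marker_len > limit)
        return (size_t)-1;
    for (i = 0; i + marker_len <= limit; i++) {
        if (memcmp(code + i, marker, marker_len) == 0)
            return i;
    }
    return (size_t)-1;
}
```

The `limit` argument matters: without an upper bound, a missing marker would send the search off the end of the code region.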

- Vladimir Vassilevsky
posted
14 years ago

Fri, Jan 22, 2010 11:54 PM

Several times I encountered the following construction:

//--------------------------
void your_function(void) { }

void next_function(void) { }

void (*fu)(void) = next_function;
void (*bar)(void) = your_function;

size_of_your_function = ((int)fu) - ((int)bar);
//-------------------------

Of course, this is not guaranteed to work as it depends on many things, however I've seen that solution used in bootloaders.

Vladimir Vassilevsky DSP and Mixed Signal Design Consultant


- BGB / cr88192
posted
14 years ago

Sat, Jan 23, 2010 12:40 AM

my recommendation: in this case, it might actually be better advised to generate the function as a chunk of arch-specific ASM or machine code (ASM is preferable IMO, but requires an assembler...), which could then be located wherever (such as the heap).

the reason for suggesting this is that, for many archs, relocating compiled code (for example, via memcpy) may very well cause it to break. at least with custom ASM, one can be more certain that the code will survive the relocation.

another possibility would be to compile some code itself as a relocatable module (such as an ELF or COFF object or image or whatever is common on the arch), which can then be stored as a glob of binary data (this can be done fairly easily by writing a tool to convert the module into an array of bytes in C syntax which can be linked into the image). when needed, this module is itself relocated to the target address, and jumped to.

this would allow more complex modules to be used (and is less-effort in the non-trivial case than would be writing it in ASM or raw machine code).
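The "array of bytes in C syntax" step could be sketched like this. It formats into a caller-supplied buffer to keep the example small; a real conversion tool would more likely read the module from a file and write the array to another. The function name `format_c_array` and the output layout are assumptions for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Format `len` bytes as a C array definition named `name` into `dst`.
 * Returns 0 on success, -1 if the input is empty or `dst` is too small. */
int format_c_array(char *dst, size_t cap, const char *name,
                   const unsigned char *data, size_t len)
{
    size_t used, i;
    int n;

    if (len == 0)
        return -1;              /* a zero-length array is not valid C */
    n = snprintf(dst, cap, "const unsigned char %s[%lu] = {",
                 name, (unsigned long)len);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    used = (size_t)n;
    for (i = 0; i < len; i++) {
        n = snprintf(dst + used, cap - used, "%s0x%02x",
                     i ? ", " : "", (unsigned)data[i]);
        if (n < 0 || (size_t)n >= cap - used)
            return -1;
        used += (size_t)n;
    }
    n = snprintf(dst + used, cap - used, "};\n");
    if (n < 0 || (size_t)n >= cap - used)
        return -1;
    return 0;
}
```

Because the array length is part of the generated declaration, `sizeof` on the resulting array gives the module size directly, which sidesteps the original question for code shipped this way.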

keep in mind that there are no really "good" or "general purpose" ways to do these sorts of tasks.

- Grant Edwards
posted
14 years ago

Sat, Jan 23, 2010 1:32 AM

IMO, the "right" thing to do is to tell the compiler to put the function into a separate section and then have it linked so that it's "located" to run in RAM at the proper address but stored in ROM.

That way you know the code will work correctly when it's run from RAM. Defining appropriate symbols in the linker command file will allow the program to refer to the start and end of the section's address in ROM.

The OP needs to spend some time studying the manuals for his compiler and linker.
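A minimal sketch of the section-plus-linker-symbols approach, under loud assumptions: it uses the GCC/Clang `section` attribute and the `__start_<name>` / `__stop_<name>` symbols that GNU-compatible ELF linkers generate automatically for sections whose names are valid C identifiers. On a real bare-metal target the symbols would more typically be defined in the linker command file, which would also give the section a run address in RAM and a load address in ROM; none of that is shown here. The section name `ramcode` and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: GCC or Clang, ELF output, GNU-compatible linker.
 * Place the routine in its own section so the linker knows its extent. */
__attribute__((section("ramcode"), noinline, used))
int flash_write_routine(int x)
{
    return x + 1;               /* stand-in for the real work */
}

/* Generated by the linker because "ramcode" is a valid C identifier.
 * With a custom linker script, use whatever names that script defines. */
extern const unsigned char __start_ramcode[];
extern const unsigned char __stop_ramcode[];

/* Size in bytes of everything placed in the ramcode section. */
size_t ramcode_size(void)
{
    return (size_t)((uintptr_t)__stop_ramcode - (uintptr_t)__start_ramcode);
}
```

The size covers the whole section, so any helper the routine calls can be placed in the same section and will be counted and copied along with it.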

- Keith Thompson
posted
14 years ago

Sat, Jan 23, 2010 1:40 AM

You can't even portably assume that &func is the memory address of the beginning of the function. I think there are systems (AS/400) where function pointers are not just machine addresses.

Given whatever it is you're doing, you're probably not too concerned with portability, so that is likely not to be an issue. But there's no portable way in C to determine the size of a function, so you're more likely to get help somewhere other than comp.lang.c.

--
Keith Thompson (The_Other_Keith) kst-u@mib.org  
Nokia
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"

- Tim Wescott
posted
14 years ago

Sat, Jan 23, 2010 4:23 AM

I've seen it done like this:

whatever my_eeprom_burning_code()
{
    // insert deathless prose here
}

void end_my_eeprom_burning_code(void)
{
}

As long as the second function doesn't get optimized away or moved, you're home free.

--
www.wescottdesign.com

- Tim Wescott
posted
14 years ago

Sat, Jan 23, 2010 4:26 AM

Check your tools -- newer ones will often let you set the segment of a function (usually with something like "#pragma ramcode"), and many of those will automatically load a function into ram at startup.

If you can't do it on a function-by-function basis, you may be able to do it file-by-file, or coerce the linker to relocate the text segment from one whole object file into a segment of your choosing.

Then you either put that segment (with just your magic function(s)) into RAM, or you find out that your linker will.

--
www.wescottdesign.com

- David Empson
posted
14 years ago

Sat, Jan 23, 2010 5:07 AM

Except if the compiler outputs the functions in reverse order, as one I've used does (which means you need a "begin_my_eeprom_burning_code" dummy function instead). You need to know the pattern generated by your particular compiler, which might depend on factors other than the order the functions appear in the source code.

--
David Empson
dempson@actrix.gen.nz

- Nobody
posted
14 years ago

Sat, Jan 23, 2010 5:10 AM

Do you need to be able to run it from RAM? If so, simply memcpy()ing it may not work. And you would also need to copy anything which the function calls (just because there aren't any explicit function calls in the source code, that doesn't mean that there aren't any in the resulting object code).

- Mark Borgerson
posted
14 years ago

Sat, Jan 23, 2010 5:28 AM

At the expense of a few words of code and a parameter, you could do

int MoveMe(...., bool findend)
{
    if (!findend) {
        // do all the stuff the function is supposed to do
    } else
        Markend();
}

Where Markend is a function that pulls the return address off the stack and stashes it somewhere convenient. Markend may have to have some assembly code. External code can then subtract the function address from the address stashed by Markend(), add a safety margin, and know how many bytes to move to RAM.
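On compilers that provide it, the stashing step may not need assembly. The sketch below assumes GCC or Clang and uses their `__builtin_return_address(0)` builtin; all function and variable names are hypothetical. Note what the stashed value actually is: the address inside `move_me` just after the call to `mark_end`. That is close to the end of `move_me` only if the compiler happens to place that call last, hence the safety margin.

```c
#include <stdint.h>

/* Address stashed by mark_end(); 0 until move_me() runs in findend mode. */
static uintptr_t stashed_end;

/* Record the address this call will return to, i.e. the instruction
 * after the call site inside the calling function. */
__attribute__((noinline))
static void mark_end(void)
{
    stashed_end = (uintptr_t)__builtin_return_address(0);
}

/* Hypothetical relocatable routine with the extra findend parameter. */
__attribute__((noinline))
int move_me(int value, int findend)
{
    if (!findend) {
        return value * 3;       /* stand-in for the real work */
    }
    mark_end();
    return 0;                   /* keeps the call from becoming a tail jump */
}

/* Accessor for the stashed address. */
uintptr_t move_me_end_address(void)
{
    return stashed_end;
}
```

External code would call `move_me` once with `findend` set, then subtract the function's entry address from `move_me_end_address()` and add the margin. The `noinline` attributes matter: if either function were inlined, the stashed address would land in a different function.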

Mark Borgerson

- Paul Keinanen
posted
14 years ago

Sat, Jan 23, 2010 5:57 AM

Do you actually want to execute that function while in RAM or just store some bytes into safety during the update ?

If you intend to execute it, the code must be position independent, e.g. using PC-relative branches. However, accesses to fixed addresses, such as memory-mapped peripherals, must use an absolute addressing mode; PC-relative addressing modes cannot be used for those.

- jacob navia
posted
14 years ago

Sat, Jan 23, 2010 8:03 AM

Mark Borgerson wrote:

Sorry Mark but this is totally WRONG!

The return address contains the address where the CPU RETURNS TO when the current function is finished, not the end of the current function!!!

The return address will be in the middle of another function, that CALLED this one.

- Gordon Burditt
posted
14 years ago

Sat, Jan 23, 2010 8:44 AM

There might not even be a clearly-defined *definition* for "the size of a function". One obvious problem is if it's inlined several times in several places. Where, exactly, does the function "start"? Where does it "end"?

It's also possible that code is shared between functions. If several functions have many places where the code does something like:

errno = EINVAL; unlock_critical_section(); return -1;

the compiler might generate code for that once and branch to it from all of the functions. This is more likely to happen if "return" isn't a one-instruction method of restoring registers, a stack frame pointer if applicable, and adjusting the stack, if there is one. It's the embedded processors that are more likely to need multiple instructions to return from a function. They are also more likely to need aggressive optimization for code space.

You could end up with the strange math that a compilation unit containing 4 functions has 16k of code, but any one of the functions needs 10k of code.
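One way those numbers could come about, purely as an illustration (the 8k/2k split is an assumption, not something stated above): suppose the four functions branch into one shared 8k block of code and each has 2k of code of its own.

```latex
\begin{align*}
\text{total for the compilation unit} &= 8\text{k} + 4 \times 2\text{k} = 16\text{k} \\
\text{needed to run any one function} &= 8\text{k} + 2\text{k} = 10\text{k}
\end{align*}
```

Copying "one function" in that situation means copying its own 2k plus the shared 8k, which need not be adjacent to it.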

That's a rather dubious assumption in the presence of inlining (and C90 doesn't have a way to stop inlining). Also, the assumption that a function consists of a contiguous block of code (worry about the *data* later) dedicated to that function only is not guaranteed to hold. In practice, it will probably work OK if you don't turn on aggressive optimization.

- Mike Harrison
posted
14 years ago

Sat, Jan 23, 2010 11:02 AM

Another way to do it would be to specify the function to go in a specific user-defined segment of fixed size at a fixed address. The linker will then at least tell you if it has exceeded the allocated size. Most microcontroller-oriented compilers will have a way to do this, as it is a common requirement for bootloaders etc.

- Stefan Reuther
posted
14 years ago

Sat, Jan 23, 2010 12:05 PM

You can't in standard C, because functions are not contiguous objects.

Most environments have some way of placing a function in a special section (using pragmas or things like __attribute__), and a possibility to acquire position and size of that section (using linker magic).

In general, you cannot assume a function generates just a single blob of assembly code in the ".text" sections. For example, functions containing string or floating-point literals, or large switches, often generate some data in ".rodata", static variables end up in ".data" or ".bss", and if you're doing C++, you'll get some exception handling tables as well.

Stefan

- Flash Gordon
posted
14 years ago

Sat, Jan 23, 2010 12:15 PM

You forgot to mention the method which, in my experience, is by far the best, most reliable, and easiest method.

Read the manual!

This is NOT a glib suggestion. On the one occasion where I needed to do something similar, but for different reasons, I read the manual and, lo and behold, the implementation documented a nice and relatively easy way to achieve what I wanted. In fact, using any other method was almost guaranteed to produce a function that did not work correctly. After all, there could be references to absolute addresses which will be wrong after the code is moved!

In my case, it was compile the function and get it in to a specific section (I can't remember how now) and then tell the linker to locate the section at one location for programming in to the ROM but set up all the addresses as if it would be placed in another section. Then use some link-time constants (can't remember the details) for moving it. As I say, it was all fully documented in the manuals!

There is every chance that someone on comp.arch.embedded might know how to do it on the target platform, if the OP specifies the target platform.

--
Flash Gordon