newlib, FreeRTOS, reentrancy, heap and related questions

- P
- pozz
  
posted
4 years ago

Thu, Feb 6, 2020 12:26 PM

The ARM GCC toolchain usually uses newlib (or newlib-nano) as its standard C library (memset, malloc, printf, time and so on).

I sometimes replace newlib functions because I don't like them. First of all, I replace snprintf, because the newlib implementation uses malloc and I prefer not to use malloc, especially when it can be avoided. For printf-like functions, there are a few implementations that don't use malloc.

When newlib is used with FreeRTOS, there are two heaps: one used by FreeRTOS and one used by newlib. Dave[1] suggests replacing the FreeRTOS heap with the newlib heap. Why not do the opposite? newlib's malloc is not reentrant unless you implement malloc_lock/unlock. If we could force newlib to use the FreeRTOS heap, I think it would be better, because the FreeRTOS malloc is natively safe for multitasking.

Many times I need to have a date and time. time() and gettimeofday() from newlib are good, but I need to implement a _gettimeofday() function for my platform. So the value added by newlib is very little (I could implement a time() or gettimeofday() myself).
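For illustration, a minimal sketch of the platform hook mentioned above, assuming the newlib syscall-stub signature `int _gettimeofday(struct timeval *, void *)`. The time source `rtc_read_seconds()` and the variable behind it are hypothetical stand-ins for whatever RTC or tick counter the board provides.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical time source: seconds since the Unix epoch. Backed by a plain
   variable here so the sketch builds on a host; on target this would read
   an RTC peripheral or a tick counter. */
static uint32_t fake_rtc_seconds = 1580991960u;   /* arbitrary demo value */

static uint32_t rtc_read_seconds(void)
{
    return fake_rtc_seconds;
}

/* Stub that newlib's gettimeofday() and time() end up calling. */
int _gettimeofday(struct timeval *tv, void *tz)
{
    (void)tz;                       /* time zone argument is not supported */
    if (tv == NULL) {
        return -1;
    }
    tv->tv_sec  = (time_t)rtc_read_seconds();
    tv->tv_usec = 0;                /* this sketch has no sub-second source */
    return 0;
}
```

With a hook like this in place, the library's time() and gettimeofday() need nothing else from the platform.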

So the final question: if I remove printf/malloc/time/... from newlib, is newlib(-nano) still needed? Is it possible to avoid using newlib at all? I know I might need memset, strchr, strtok, but those functions can be implemented if needed (I think there are many implementations available, even in the newlib project).

Of course, I don't need stdio features (open, close, read, write, exit, ...).

I know newlib is meant for resource-constrained embedded devices, so it is already small, but I have the feeling that it's bigger than I need, especially with FreeRTOS.

[1]

formatting link

- R
- Richard Damon
  
posted
4 years ago

Thu, Feb 6, 2020 12:37 PM

The FreeRTOS heap functions implement a smaller set of functionality than the standard C library as implemented in newlib. In particular they lack realloc, which is used (I am pretty sure) in some of the library.

It is fairly trivial to write a 'heapc.c' memory allocator file that has FreeRTOS use the native malloc/free and also provides the needed malloc_lock/unlock to make them thread safe.
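A minimal sketch of the two hooks, assuming newlib's documented names `__malloc_lock()` and `__malloc_unlock()` and using scheduler suspension as the lock (it nests, which matters because newlib may take the lock recursively). The two FreeRTOS calls are replaced by host-side stand-ins so the fragment builds on its own; `suspend_depth` is hypothetical and exists only for the demonstration. Calls to malloc from interrupt handlers are not covered.

```c
#include <stddef.h>

struct _reent;                    /* opaque here; newlib defines it in <reent.h> */

/* Host-side stand-ins for the FreeRTOS scheduler calls. On target, delete
   these three definitions and include FreeRTOS.h and task.h instead. */
static int suspend_depth;         /* hypothetical: nesting counter for the demo */
static void vTaskSuspendAll(void) { suspend_depth++; }
static int  xTaskResumeAll(void)  { suspend_depth--; return 0; }

/* newlib calls this before it touches its heap data structures. */
void __malloc_lock(struct _reent *r)
{
    (void)r;
    vTaskSuspendAll();            /* no task switch until the matching unlock */
}

/* newlib calls this when it is done with the heap. */
void __malloc_unlock(struct _reent *r)
{
    (void)r;
    (void)xTaskResumeAll();
}
```

Suspending the scheduler is the simplest choice; a recursive mutex would be an alternative if long heap operations must not delay higher-priority tasks.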

Yes, it is quite possible to implement your own 'standard library' yourself and totally replace newlib-nano. The question is whether doing so buys you enough to be worth the time.

- P
- pozz
  
posted
4 years ago

Thu, Feb 6, 2020 12:57 PM

Do you mean that realloc(), which is used by some newlib functions, isn't implemented by the FreeRTOS heap? I have the feeling that I don't use the newlib functions that call realloc(), because I avoid functions that use the heap at all.

In this case, I think using only the FreeRTOS heap instead of the newlib heap makes some sense.

Why not the opposite? Why not use FreeRTOS heap management only? Is it impossible for some reason to completely avoid the newlib heap functions?

I'm wondering how many standard functions I would need to implement without newlib. 1, 10 or 100? How complex would it be to implement them? Of course, only I can answer that.

- D
- David Brown
  
posted
4 years ago

Thu, Feb 6, 2020 1:58 PM

It depends a lot on what you need to do. A good many of the standard C functions are rarely useful in small embedded systems (hands up those who need the locale functions or wide character handling). Most systems can also do without the maths functions - perhaps also implementing their own for a more appropriate balance between speed and accuracy.

There are a few oddities to watch out for. Technically, you can't actually implement malloc() in standard C, nor can you implement memcpy or memmove, due to the type aliasing and effective type rules. If you want to be sure of problem-free code that is safe regardless of optimisation, link-time optimisation, new generations of compilers, etc., then you'll be quite careful and make good use of gcc attributes.

Another possibility here is to move to C++. It has better support for memory allocation functions, memory pools, etc. And you can avoid one of the silliest aspects of malloc/free - the fact that the memory allocator implementation has to store the size of the allocated block somewhere. Usually when you are freeing memory, you know the size already - storing it in the block just makes things less efficient. C++ also has functions for "washing" pointers so that you can keep the compiler informed about aliasing and re-use of memory, for better efficiency and safety. (I haven't tried much of this myself as yet, but I expect to do so before long.)

- P
- pozz
  
posted
4 years ago

Thu, Feb 6, 2020 3:06 PM

Yes, indeed those are my assumptions.

If I really need a heap, I would use the FreeRTOS implementation, which is thread-safe and already available.

What about copying byte by byte? Here[1] you can see newlib memcpy implementation. If PREFER_SIZE_OVER_SPEED or __OPTIMIZE_SIZE__ is defined, the implementation is really copying byte by byte.

I don't really know how the newlib shipped with my toolchain (CubeIDE from ST) was compiled; maybe I'm already using a dumb version of memcpy.

This is an extract from a listing:

08025850 :
 8025850: b510       push {r4, lr}
 8025852: 1e43       subs r3, r0, #1
 8025854: 440a       add r2, r1
 8025856: 4291       cmp r1, r2
 8025858: d100       bne.n 802585c
 802585a: bd10       pop {r4, pc}
 802585c: f811 4b01  ldrb.w r4, [r1], #1
 8025860: f803 4f01  strb.w r4, [r3, #1]!
 8025864: e7f7       b.n 8025856

I'm not an assembly expert, but it seems to me it is implemented in the simple, unoptimized way.
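For comparison, a C sketch of what the listing appears to do: one byte loaded and one byte stored per iteration until the source pointer reaches its end, with the destination returned unchanged. The name `byte_memcpy` is hypothetical, and this is not claimed to be the exact newlib source.

```c
#include <stddef.h>

/* Byte-by-byte copy, the shape a size-optimized memcpy typically takes. */
void *byte_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--) {
        *d++ = *s++;        /* ldrb / strb pair in the listing */
    }
    return dst;             /* r0 is left untouched, as in the listing */
}
```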

[1]

formatting link

- D
- Dave Nadler
  
posted
4 years ago

Thu, Feb 6, 2020 3:21 PM

realloc() use within newlib is documented on my web page you referenced.

Whatever you do, check the map and make sure you aren't accidentally using memory management functions you haven't covered.

Hope that helps, Best Regards, Dave

- P
- pozz
  
posted
4 years ago

Thu, Feb 6, 2020 4:14 PM

You're right, I didn't notice:

So my feeling was correct: replacing printf-like functions with others that don't use malloc is sufficient to avoid using the newlib heap at all. Most probably there are other newlib functions that need realloc(), but I think they can be replaced just like printf.

With those assumptions, are there any other drawbacks in using FreeRTOS heap only?

Yes, good suggestion.

- D
- David Brown
  
posted
4 years ago

Thu, Feb 6, 2020 8:41 PM

It is not the actual copying that is the problem - copying by char is simple and safe (though often inefficient). The issue is that the C standards say memcpy also copies the effective type in certain circumstances - there is no way to specify that in C, and it is therefore a special feature of the library memcpy. A homemade memcpy does not have that same feature. (In a similar vein, there is no way to get memory in standard C that has "no declared type" except via the library malloc and friends - a homemade malloc won't do.) I am not sure what the best solution is here.

Anyway, for memcpy make sure the compiler can use the builtin versions where possible (avoid -ffreestanding, or use -fbuiltin) as this will give far better code.

- P
- pozz
  
posted
4 years ago

Thu, Feb 6, 2020 10:23 PM

Could you give an example? I didn't understand.

Anyway, as you can see, newlib just implements memcpy in plain C when it is built for size rather than speed. Are you saying it's buggy?

I don't use -ffreestanding, but I don't know if I'm using -fbuiltin. Anyway, you are suggesting the use of builtin functions, that is, functions built *in* the compiler and not in newlib.

Another reason to consider newlib unnecessary.

- P
- pozz
  
posted
4 years ago

Thu, Feb 6, 2020 10:26 PM

Dave, on your page you suggest using the linker's --wrap option (-Wl,--wrap=... when passed through gcc) to replace newlib malloc with FreeRTOS malloc.

I think it's not necessary. If you define a malloc() function, the linker should use it instead of newlib malloc, without emitting any warning or errors about duplicate definition.

I use this to replace printf-like functions.

The linker searches for a function in the object files that you're linking, and only if it can't find it there does it look in the libraries.

- R
- Richard Damon
  
posted
4 years ago

Fri, Feb 7, 2020 4:23 AM

The FreeRTOS memory management provides a malloc and a free equivalent, but no realloc (or calloc, but you can implement calloc with malloc).
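A minimal sketch of calloc built on the FreeRTOS allocator, as suggested above. The name `port_calloc` is hypothetical, and `pvPortMalloc()` is replaced by a host stand-in so the fragment builds without FreeRTOS.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Host stand-in so the sketch builds without FreeRTOS; on target, drop this
   and include FreeRTOS.h, which declares the real pvPortMalloc(). */
static void *pvPortMalloc(size_t n) { return malloc(n); }

/* calloc equivalent on top of pvPortMalloc(). */
void *port_calloc(size_t count, size_t size)
{
    /* Reject count * size overflow instead of allocating a short block. */
    if (size != 0 && count > SIZE_MAX / size) {
        return NULL;
    }
    size_t total = count * size;
    void *p = pvPortMalloc(total);
    if (p != NULL) {
        memset(p, 0, total);    /* calloc must return zeroed memory */
    }
    return p;
}
```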

Since the C standard library includes realloc, it is quite possible that some library function uses it (it could be handy, for instance, to implement strings). I don't know for sure if it is called, just pointing out that it might be.

If your code needs realloc, then implementing it with just malloc and free functionality is inefficient: basically every realloc needs to malloc a new block and copy. You could modify the FreeRTOS functions to add realloc, but that is more work than adding the malloc_lock functions to make the newlib ones safe.
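To make the cost concrete, here is a sketch of realloc layered on malloc/free only. It assumes the allocator's public API does not report a block's size, so the wrapper stores the size in its own small header. All `sized_*` names are hypothetical, and the two FreeRTOS calls are host stand-ins.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Host stand-ins; on target use the real functions from FreeRTOS.h. */
static void *pvPortMalloc(size_t n) { return malloc(n); }
static void  vPortFree(void *p)     { free(p); }

/* Header placed in front of every block; the union keeps the payload
   aligned for any type. */
typedef union {
    size_t      size;
    max_align_t align;
} block_header_t;

void *sized_malloc(size_t n)
{
    if (n > SIZE_MAX - sizeof(block_header_t)) {
        return NULL;
    }
    block_header_t *h = pvPortMalloc(sizeof *h + n);
    if (h == NULL) {
        return NULL;
    }
    h->size = n;
    return h + 1;                       /* payload starts after the header */
}

void sized_free(void *p)
{
    if (p != NULL) {
        vPortFree((block_header_t *)p - 1);
    }
}

void *sized_realloc(void *p, size_t n)
{
    if (p == NULL) {
        return sized_malloc(n);
    }
    if (n == 0) {
        sized_free(p);
        return NULL;
    }
    size_t old = ((block_header_t *)p - 1)->size;
    void *q = sized_malloc(n);          /* always a fresh block ...          */
    if (q == NULL) {
        return NULL;                    /* old block stays valid on failure  */
    }
    memcpy(q, p, old < n ? old : n);    /* ... and always a copy             */
    sized_free(p);
    return q;
}
```

Every resize pays for an allocation, a copy and a free, and briefly needs both blocks at once, which is the inefficiency described above.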

One other disadvantage of using the FreeRTOS heap is that it is compile time fixed in size, while the newlib heap will automatically use all of the free memory.

- R
- Richard Damon
  
posted
4 years ago

Fri, Feb 7, 2020 4:29 AM

One big problem with that is that much of newlib internally doesn't call malloc, but another function that malloc also calls, so replacing malloc doesn't work. This is one reason the Standard doesn't tell you that you can override library functions like this, you really need to understand the implementation, and put on your 'implementer' hat to make that sort of change.

- R
- Richard Damon
  
posted
4 years ago

Fri, Feb 7, 2020 4:44 AM

The issue isn't that the code needs to do something special, but that the code being implementation code is allowed to do something special.

The rules on effective types provide limits on what 'standard' code is able to do and stay within the rules. Code that doesn't follow those rules might not work right, or it might work.

memcpy as pure C breaks the rules, so the language doesn't promise that it will work. The writers of newlib understand the implementations, and know what they need to do so that, for that implementation, it will work. Sometimes this means adding a special implementation-provided construct to make the code work right; sometimes it is just making sure the compiler is in the right mode when compiling that code so as to avoid creating the problem, in particular doing what is needed to keep the optimizer from looking into the function and seeing the effective type rule violation.

- P
- pozz
  
posted
4 years ago

Fri, Feb 7, 2020 8:19 AM

Yes, we discussed this point. I tend to avoid using heap at all, including standard functions that need heap.

The newlib printf implementation uses the heap, and one of the first things I do is replace printf with another one.

Yes, but I started with the assumption I don't need heap at all.

Can't you optimize the static heap size *before* compilation to cover all the free memory available? This isn't automatic, but it's still possible.
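As a sketch of what fixing the size up front looks like, assuming a FreeRTOS heap scheme such as heap_4 that honours `configTOTAL_HEAP_SIZE` and `configAPPLICATION_ALLOCATED_HEAP`. The 48 KiB figure is an assumption for illustration; in a real project it would be derived from the map file.

```c
#include <stddef.h>
#include <stdint.h>

/* These two defines would normally live in FreeRTOSConfig.h. */
#define configTOTAL_HEAP_SIZE            ((size_t)(48u * 1024u))  /* assumed value */
#define configAPPLICATION_ALLOCATED_HEAP 1

/* With configAPPLICATION_ALLOCATED_HEAP set to 1 the application, not the
   kernel, defines the heap array, so it can be placed and sized by hand. */
uint8_t ucHeap[configTOTAL_HEAP_SIZE];
```

The size still has to be revisited whenever the amount of statically allocated RAM changes, which is the manual step mentioned above.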

- D
- David Brown
  
posted
4 years ago

Fri, Feb 7, 2020 8:51 AM

The compiler can also know that the memcpy in newlib is the library memcpy, and give it special privileges without it being marked in any special way or being compiled in particular modes. Many library functions are in some way special or privileged in C, simply because they are C standard library functions, and in theory you can't duplicate that with your own plain C code. Basically, the compiler knows a great deal about what a library function will do, even if it does not have direct access to the source code. This can be used for optimisation and static analysis. If you write your own versions of these functions and don't follow these rules, all sorts of bad things can happen. In practice, of course, your own alternative C code will often work exactly the same - and sometimes there are compiler features like attributes or pragmas that can give the compiler the extra information.

An example of this, consider this code:

#include <stdlib.h>

int sum(const int * p, int n) {
    int s = 0;
    while (n--) s += *p++;
    return s;
}

int test(void) {
    const int N = 4;

    int * p = malloc(N);
    for (int i = 0; i < N; i++) {
        p[i] = i;
    }
    int s = sum(p, N);
    free(p);
    return s;
}

We allocate space for 4 ints with "malloc", fill the array, pass it on to a calculation function, free the resources with "free", and return the result.

gcc compiles test to:

test:
    mov eax, 6
    ret

It knows what malloc and free do, and can eliminate them entirely. It could not do that with hand-made memory allocation functions.

It also knows that malloc returns either 0, or a pointer that cannot alias any existing memory, and it can use that for optimisation. But if the source of memory that a hand-made malloc uses is a C-defined array, then this knowledge will be wrong as the malloc-returned pointer will alias an existing array. Not good.

- P
- Paul Rubin
  
posted
4 years ago

Fri, Feb 7, 2020 9:04 AM

(cough) that allocates N bytes, not N ints.

Wow! I think it saw the consts and basically ran the code at compile time.
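For reference, a sketch of the same example with the allocation sized in elements rather than bytes and with a check on the result. The return value of -1 on allocation failure is an assumption added here, not part of the original example.

```c
#include <stdlib.h>

int sum(const int *p, int n)
{
    int s = 0;
    while (n--) {
        s += *p++;
    }
    return s;
}

int test(void)
{
    enum { N = 4 };

    int *p = malloc(N * sizeof *p);     /* N ints, not N bytes */
    if (p == NULL) {
        return -1;                      /* assumed error convention */
    }
    for (int i = 0; i < N; i++) {
        p[i] = i;
    }
    int s = sum(p, N);
    free(p);
    return s;
}
```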

- D
- David Brown
  
posted
4 years ago

Fri, Feb 7, 2020 9:35 AM

Suppose you have a block "b" of memory allocated on the heap with malloc - it has no "declared type" because it was not part of a C-defined object. You are free to store data of any kind in "b", and it takes on a type based on the access you used to store to "b" (unless you use character type access, which leaves it untyped). Let's say you treat "b" as an array of floats and fill it up - now its effective type is float[].

Suppose you have another C object or array "s" with a specific type from somewhere - such as an array of char* pointers.

You want to copy the contents of "s" into "b".

You could do this in several ways:

1. Read from "s" as char* pointers, converting to a float using a union, and write it to "b". Then "b" is still an array of floats and the compiler knows that any access to it as an array of char* pointers is undefined behaviour - it can assume it can't happen. Beware the nasal demons!
2. Make a pointer "char* * p = (char* *) b" and use that as the destination when copying from "s" to "b". The compiler knows that "b" is now an array of char* pointers, and can be accessed as such. Everything works, but you need a specific copying function each time.
3. Make a generic function that copies using unsigned char, and call that to copy from s to b. Then b takes on the effective type of s, and so b is now an array of char* pointers. Everything works, but copying is inefficient.
4. Make a generic function that copies using uint32_t for speed, and call that to copy from s to b. Then b becomes an array of uint32_t, and accessing it as an array of char* pointers is undefined behaviour. Nasal demons again.
5. Call the standard library memcpy. Then b gets the effective type of s, and everything works. This is true whether the compiler generates a local loop, or calls the library function, and it is true whether the copying is done by byte or in larger lumps. The library memcpy is special here - you cannot duplicate that behaviour in standard C.
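A small sketch of the last option, following the scenario above: a malloc'd block is first used as floats, then overwritten with memcpy from an array of char* pointers and read back as such. The function name `copy_pointers` is hypothetical, and overflow of `n * sizeof *s` is not checked.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap block holding a copy of the n pointers in s, or NULL on
   failure. The caller frees it. */
char **copy_pointers(char *const *s, size_t n)
{
    size_t bytes = n * sizeof *s;
    float *b = malloc(bytes);           /* no declared type yet */
    if (b == NULL) {
        return NULL;
    }

    /* Storing floats gives this memory the effective type float. */
    for (size_t i = 0; i < bytes / sizeof *b; i++) {
        b[i] = 0.0f;
    }

    /* The library memcpy replaces the contents and carries the effective
       type of the source (char *) along with it. */
    memcpy(b, s, bytes);

    return (char **)(void *)b;          /* reads as char * are now fine */
}
```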

This kind of thing - type based alias analysis and the effective type rules in C - is difficult to get right. And it is not often that the compiler can use this extra information for optimisation. But sometimes it can. And sometimes it uses it for an optimisation that is correct according to the C code you wrote, but not according to what you wanted.

Understanding the rules is hard, and sometimes playing by the rules is even harder, so one solution is to change the rules. The "-fno-strict-aliasing" flag in gcc changes the semantics of C to say that the effective type of an object is always the type used to access it - this simplifies things a lot here, at the cost of occasionally missed optimisation opportunities. For example, the Linux kernel is always compiled with "-fno-strict-aliasing".

(Note that this flag does not help with the aliasing issue with home-made malloc, as that's a different thing entirely.)

No - it can be treated as special because it is the standard library for your implementation.

When you use one of the common "small" functions in the C standard library, like memcpy, memset, strcat, etc., the compiler knows what they do without knowing the source. If it can make smaller or faster code inline with the same effect as specified in the standards, then it may do so. Typically for memcpy that means the compiler knows the size of the copy and the alignments at compile time. For example:

#include <stdint.h>
#include <string.h>

uint32_t rawfloat(float f) {
    uint32_t u;

    memcpy(&u, &f, sizeof(u));
    return u;
}

This will be turned into a register move (if needed, depending on the cpu), with nothing stored in memory and no library calls made. And unlike faffing around with pointer casts, it is correct C code. And unlike using a type-punning union, it is correct C++ code as well as correct C code.

But more general calls to memcpy will be passed on to the library function.
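The same idiom works in both directions. A sketch, assuming float is a 32-bit IEEE-754 binary32 value (true on the usual ARM and x86 targets, but an assumption nonetheless); the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Bit pattern of a float, without pointer casts or unions. */
uint32_t float_to_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Float with the given bit pattern. */
float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}
```

Because the size is a compile-time constant, an optimizing build would normally not emit a call to the library memcpy for either function.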

- U
- upsidedown
  
posted
4 years ago

Fri, Feb 7, 2020 9:36 AM

Dynamic memory fragmentation can be a serious issue in systems that need to run for a long time (years or decades) without reboots. For this reason it is a good idea to avoid using malloc and free (or at least avoid using free :-). Fragmentation occurs when variable-size allocations with different lifetimes are used.

However, functions like printf may allocate some resources at entry and release them at exit, so the heap state after printf returns is the same as it was before the call. In this case dynamic memory is used in the same way as a stack, and much of the functionality could have been implemented using stack allocation. For historical reasons (very small stacks on some early processors), C code uses malloc/free for this much more often than other languages, which use stack work space.

In a single-task system, or in a multitasking environment with private heaps, this kind of stack-like usage should not cause fragmentation. However, in a multitasking environment with a single shared heap, memory fragmentation can occur if some other task makes long-lasting allocations while printf is being executed. So in reality, the whole printf function should be protected against task switching.

- D
- David Brown
  
posted
4 years ago

Fri, Feb 7, 2020 9:48 AM

Just checking that you were paying attention :-)

Yes, exactly.

The point is that because the compiler knows what malloc and free do - they are specified in the standards - it can use that knowledge for optimisation.

(The exact point at which it will change from run-time calculation to compile-time calculation is dependent on the compiler, target, options, etc.)

- P
- pozz
  
posted
4 years ago

Fri, Feb 7, 2020 1:31 PM

Given your considerations, why use a printf that uses the heap? There are other good implementations that don't use the heap at all and so are intrinsically thread-safe.
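To illustrate the kind of building block such implementations rely on, here is a sketch of a decimal formatter that works only on its arguments and a small stack buffer, so it needs no heap and no locking. The name `fmt_u32` and its return convention are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Writes v in decimal into buf, NUL-terminated. Returns the number of
   digits written, or 0 if buf is NULL or too small. */
size_t fmt_u32(char *buf, size_t len, uint32_t v)
{
    char tmp[10];                   /* UINT32_MAX has 10 decimal digits */
    size_t n = 0;

    /* Produce the digits least significant first. */
    do {
        tmp[n++] = (char)('0' + v % 10u);
        v /= 10u;
    } while (v != 0u);

    if (buf == NULL || len < n + 1u) {
        return 0;
    }

    /* Reverse into the caller's buffer. */
    for (size_t i = 0; i < n; i++) {
        buf[i] = tmp[n - 1u - i];
    }
    buf[n] = '\0';
    return n;
}
```

All state lives in the caller's buffer and on the stack, so two tasks can call it at the same time without interfering.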