Threads vs Forks in an embedded environment: Some Conclusions

Abhishek 19 years ago

Forks:

A forked process does not necessarily get its own copy of all the segments of the update engine. Linux uses "copy-on-write" at "page" granularity, i.e. a process gets its own copy of a page only if it modifies it. So the RAM requirement is not high, and the main overhead is the creation of the kernel structures.

If we have an MMU, the memory consumption of a process may be lower than we think because of the "copy-on-write" semantics.

Context switching and interprocess communication take longer between processes than between threads. But this is not a real overhead if switching and communication between processes are infrequent, as in our case, where each process will execute its own copy of the update engine and work on independent patch parts.

We do not require any additional libraries to support fork, as we do in the case of threads. The problem of concurrency and synchronization complexity is much less pronounced among processes created with fork.
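As a rough illustration of the "one process per independent patch part" idea, here is a minimal sketch in C. The names `apply_patch_part` and `run_workers` are made up for this example and are not part of any real update engine; the worker body is just a placeholder.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical placeholder for applying one independent patch part.
 * Returns 0 on success. */
static int apply_patch_part(int part)
{
    return part >= 0 ? 0 : -1;
}

/* Fork one child per patch part and wait for all of them.
 * Returns the number of failed workers (0 = all succeeded),
 * or -1 if fork() itself failed. */
int run_workers(int nparts)
{
    int started = 0;
    int failed = 0;
    int fork_error = 0;

    for (int part = 0; part < nparts; part++) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            fork_error = 1;
            break;
        }
        if (pid == 0) {
            /* Child: its pages are shared copy-on-write with the parent
             * and are only duplicated when one side writes to them. */
            _exit(apply_patch_part(part) == 0 ? 0 : 1);
        }
        started++;
    }

    /* Parent: reap every child that was actually started. */
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed++;
    }
    return fork_error ? -1 : failed;
}
```

Calling `run_workers(4)` from `main` would start four children that never talk to each other, which is the situation where the higher switching and IPC cost of processes hardly matters.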

Threads:

Linux has a unique implementation of threads. To the Linux kernel, there is no concept of a thread. Linux implements all threads as standard processes. The Linux kernel does not provide any special scheduling semantics or data structures to represent threads. Instead, a thread is merely a process that shares certain resources with other processes. Each thread has a unique task_struct and appears to the kernel as a normal process which just happens to share resources, such as an address space, with other processes.

Threads are created like normal tasks, with the exception that the clone() system call is passed flags corresponding to the specific resources to be shared. This leads to behavior identical to a normal fork(), except that the address space, file system resources, file descriptors, and signal handlers are shared. In other words, the new task and its parent are what are popularly called threads.

This approach to threads contrasts greatly with operating systems such as Microsoft Windows or Sun Solaris, which have explicit kernel support for threads (and sometimes call threads lightweight processes). The name "lightweight process" sums up the difference in philosophies between Linux and other systems. To these other operating systems, threads are an abstraction to provide a lighter, quicker execution unit than the heavy process. To Linux, threads are simply a manner of sharing resources between processes (which are already quite lightweight).
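To make the clone() description concrete, here is a minimal sketch using the glibc `clone()` wrapper with the four sharing flags mentioned above. The function names, the 64 KiB stack size and the `SIGCHLD` termination signal are choices made for this example only; a real threading library passes more flags and does much more setup.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024) /* arbitrary size chosen for this sketch */

static int shared_value; /* lives in the address space shared via CLONE_VM */

/* Entry point of the new task: writes into memory shared with the parent. */
static int task_fn(void *arg)
{
    shared_value = *(int *)arg;
    return 0;
}

/* Start a task that shares address space, filesystem info, file
 * descriptors and signal handlers with the caller, wait for it, and
 * return the value it stored. Returns -1 on error. */
int run_cloned_task(int value)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    int flags = CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | SIGCHLD;

    /* The stack grows downwards on most architectures, so pass its top. */
    pid_t pid = clone(task_fn, stack + STACK_SIZE, flags, &value);
    if (pid < 0) {
        perror("clone");
        free(stack);
        return -1;
    }

    if (waitpid(pid, NULL, 0) < 0) {
        perror("waitpid");
        free(stack);
        return -1;
    }
    free(stack);
    return shared_value; /* the write made by the other task is visible here */
}
```

With a plain fork() the parent would not see the child's write to `shared_value`; with `CLONE_VM` it does, which is the whole difference the paragraph above describes.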

Threads require support libraries, so extra space is required in flash memory. If we have to ship just one program that requires the threading library (as in our case the update engine), then we have to ship the threading library. Minimizing the threading library cost is only possible if we can identify all multithreaded programs in the base Linux distribution. Once we have the library in the flash image to support just one such program, it costs "nothing" for additional programs to also link to it. Libraries may also need to be updated, which may increase the installation time.

Threads have a moderate RAM requirement, but it depends on the number of threads. The advantage of threads is their lower resource consumption: multiple threads typically share the state information of a single process and share memory and other resources directly. Though threads share resources, in our case the sharing is not substantial.

Threads take much less CPU time to switch among themselves than processes do, because there is no need to switch address spaces. In addition, because they share an address space, threads in a process can communicate more easily with one another, for example through shared memory objects, but additional care must be taken to use thread-safe functions wherever necessary.

Another problem is concurrency and synchronization complexity. Sharing, locking, deadlocks, and race conditions come vividly alive in threads. Processes don't usually have to deal with this, since most shared data is passed through pipes. Threads can share file handles, variables, signals, etc.; this may lead to error conditions if not handled properly.

Applications executed in a thread environment must be thread-safe. This means that functions (or the methods in object-oriented applications) must be safe to call concurrently, for example by being reentrant: a function must produce the correct result for its input even if other threads are executing the same function at the same time, which in practice means it must not depend on unprotected static or global state. Accordingly, functions must be programmed in such a way that they can be executed simultaneously by several threads.
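A small sketch of the kind of locking this implies, using POSIX threads. The thread count, the iteration count and the function names are arbitrary choices for this example. Without the mutex, the increments from different threads could interleave and the final count would usually come out too low.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4      /* arbitrary values for this sketch */
#define INCREMENTS 100000

static long counter; /* shared by all threads */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each thread bumps the shared counter under the lock. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++; /* read-modify-write, not atomic on its own */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Run the workers and return the final counter value. */
long run_counter_threads(void)
{
    pthread_t threads[NUM_THREADS];
    int created = 0;

    counter = 0;
    for (int i = 0; i < NUM_THREADS; i++) {
        if (pthread_create(&threads[i], NULL, worker, NULL) != 0)
            break;
        created++;
    }
    for (int i = 0; i < created; i++)
        pthread_join(threads[i], NULL);
    return counter;
}
```

Processes that only exchange data through pipes never need this kind of lock, which is the trade-off described above.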
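One common example of the difference is `strtok()` versus `strtok_r()`: the former keeps its parsing position in hidden static storage, the latter keeps it in a pointer supplied by the caller. The sketch below uses `strtok_r()`; the helper name `count_tokens` and the 128-byte buffer are assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>

/* Count space-separated tokens in text (truncated to 127 characters).
 * All parsing state lives in local variables, so several threads can
 * call this function at the same time without interfering. */
int count_tokens(const char *text)
{
    char buf[128];
    char *saveptr = NULL;
    int count = 0;

    /* strtok_r modifies its input, so work on a private copy. */
    strncpy(buf, text, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';

    for (char *tok = strtok_r(buf, " ", &saveptr);
         tok != NULL;
         tok = strtok_r(NULL, " ", &saveptr))
        count++;

    return count;
}
```

The same loop written with plain `strtok()` would share one hidden position between all callers, so two threads tokenizing at once could corrupt each other's results.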

Abhishek 19 years ago

Please give your comments.

Michael Schnell 19 years ago

As far as I understand, that is wrong for kernel 2.6. See:

formatting link

With kernel 2.6, threads are a kernel concept and they are different from processes: they have a common PID, and the time slice is common for all threads that belong to a common process.

-Michael
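The "common PID" part can be checked with a few lines of C on a system with NPTL. This is only a sketch: the struct and function names are made up for the example, and the thread ID is read through the raw `gettid` system call. It says nothing about the time-slice claim.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* IDs as seen from inside the new thread. */
struct ids {
    pid_t pid; /* process ID reported by getpid() */
    pid_t tid; /* kernel thread ID reported by gettid */
};

static void *record_ids(void *arg)
{
    struct ids *out = arg;
    out->pid = getpid();
    out->tid = (pid_t)syscall(SYS_gettid);
    return NULL;
}

/* Returns 1 if a new thread reports the same PID as the caller but a
 * different thread ID, 0 if not, and -1 if the thread could not be created. */
int thread_shares_pid(void)
{
    struct ids child = {0, 0};
    pthread_t thread;

    if (pthread_create(&thread, NULL, record_ids, &child) != 0)
        return -1;
    pthread_join(thread, NULL);

    return child.pid == getpid() &&
           child.tid != (pid_t)syscall(SYS_gettid);
}
```

With the older LinuxThreads implementation on kernel 2.4, each thread would instead report its own PID from `getpid()`.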

Michael Schnell 19 years ago

Why do you think so? There are libraries that do a complete userland implementation of threads (i.e. a multitasking scheduler inside a single Linux process). This is much more POSIX-conformant than the "LinuxThreads" implementation for kernel 2.4 (each thread a Linux process). This "sub-OS" of course needs a lot of code memory. But with the kernel 2.6 NPTL feature this is no longer necessary, as the kernel supplies a POSIX-conformant thread model.

-Michael

lkml 19 years ago

Michael Schnell 19 years ago

I seem to remember having read this somewhere, but maybe I'm wrong.

But nonetheless, the threads of an application are part of a common concept in the kernel and are not completely independent processes. So they are more compliant with POSIX and carry less overhead. So IMHO this is what the OP should use instead of independent processes or library-based user-space threads. Of course he would need to use kernel 2.6 to make it work.
