Scheduling between threads of the same process

Hi,

Consider 3 processes (p1, p2 and p3). The process p1 has 2 threads (t1, t2); let's assume that t1 takes 15 ms and t2 takes 10 ms. The timeslice allocated to every process by RR (round-robin) is 5 ms. In this scenario, how will the threads get scheduled?

That is, will it follow any of the sequences of execution below?

1) t1,t2,p2,p3,t1,t2,p2,p3 (t1 and t2 are executed first, in place of p1, followed by the execution of p2 and p3), sharing the 5ms timeslice allocated to p1 among the threads t1 and t2 (say, 2.5ms for t1 and 2.5ms for t2). Looks good, but is it possible to run t2 ahead of t1 here?

2) t1,p2,p3,t2,p2,p3,t1,p2,p3 (t1 is executed during the first cycle of RR, t2 in the next cycle, and so on). I think this delays the execution of t2.

3) t1,p2,p3,t1,p2,p3,t1,p2,p3,t2,p2,p3,t2,p2,p3 (t1 is executed in the first cycle of RR and in the consecutive cycles until t1 is finished; only after t1 finishes does execution move to t2). I think this method will not be followed, as it would lead to starvation of thread t2. So, normally, this should not be used.

Further, with reference to point no. 1 above, is it possible to split up the slice as 3ms for t1 and 2ms for t2? Any links that discuss this scheme/technique in detail?

I use Red Hat 9.0 (Linux). Is there any link that explains scheduling between the threads in a process, with clear illustrations for various scenarios?

Thanks in advance, Karthik Balaguru

Reply to
karthikbalaguru

There are 3 processes (p1, p2 and p3). Here, p2 takes 25ms and p3 takes 30ms. The process p1 also takes 25ms, but p1 has 2 threads (t1, t2): t1 takes 15 ms and t2 takes 10 ms. The timeslice allocated to every process by SCHED_RR (round-robin) is 5 ms.

Considering the above scenario, I am interested to know whether the following happens (assuming p2 takes 25ms and p3 takes 30ms) - t1,t2,p2,p3,t1,t2,p2,p3,t1,p2,p3,p2,p3,p2,p3,p3

I am trying to understand 'scheduling between the threads in a process'. Is there any link that talks about this in detail, with some nice illustrations? Any ideas?

Thanks in advance, Karthik Balaguru

Reply to
karthikbalaguru

[snip] That depends. You should never assume anything about thread execution sequence in a multithreaded design; you might have multiple cores. You posted to multiple groups unrelated to this subject! Ask questions in comp.programming.threads instead.
Reply to
edwin ng

No reasonably sane scheduler would do this. You are suggesting that the scheduler *punish* threads t1 and t2 for sharing a vm rather than rewarding them for reducing the cost of context switches. Why should a program get larger timeslices because it uses two processes to do its job rather than two threads?

DS

Reply to
David Schwartz

1) One way of thinking is ->

But threads are treated just like processes in Linux, and hence they will be handled similarly to processes.

Here the total time for p1 equals the sum of the time taken by t1 and t2. p1 takes 25ms, so it can be represented as t1 taking 15ms and t2 taking 10ms.

Since threads are treated like processes in Linux, each would also be contending for CPU time just like any other process.

So, I think, the p1-p2-p3 sequence becomes t1-t2-p2-p3.

In this scenario, t1 takes 5ms, t2 takes 5ms, p2 takes 5ms and p3 takes 5ms.

So, the timeslice of process p1 (5ms) is replaced by threads t1 (5ms) and t2 (5ms), which amounts to 10ms. But since the activity of the process is split up among the tasks, this should quicken the completion of p1; after 45ms, p1 will be completed and no longer in the list to be switched to. So, there will be fewer context switches (that is, in the remaining time, switching will only be between p2 and p3). Also, there is an advantage during a context switch between t1 and t2 (less overhead, faster).

But the start of execution of p2 is delayed by 10ms. So, is this scheme being followed?

Threading reduces the time of operation, as the file descriptors or data manipulated by one thread are easily accessible to another thread of the same process, so no extra work is needed to manage this. But adding threads will still delay the start of execution of the next process if threads are treated just like any other process while scheduling. How is this being handled?

2) While thinking about this, I thought of another scheme for t1 and t2 of p1. Here, the scheduler may allocate 2.5ms to t1 and 2.5ms to t2, with p2 taking 5ms and p3 taking 5ms. Does the scheduler split p1's time among the threads t1 and t2? In that case, the threads belonging to a specific process would have to be tracked continuously, and scheduling would have to happen among them within that particular process; that is, the timeslice of process p1 would have to be split among its threads based on some scheme. Is this being used? If so, what scheduling mechanism is followed for scheduling the threads of a particular process?

In general, I need to understand the mechanism of scheduling between threads belonging to the same process. Any ideas?

Thanks in advance, Karthik Balaguru

Reply to
karthikbalaguru


It seems you are not aware of the fact that, in the thread model of process management, it is only threads that are ever scheduled; as far as the kernel is concerned, there is nothing like a process (the "process" is again just a thread, known as the thread group leader). So when you say that process p1 has two threads, and processes p2 and p3 have one each, it means that the real execution entities are 5:

2 threads for process p1 and 1 thread each for p2 and p3. When you say p2 is executing, it is the single thread of p2 which is executing and nothing else. Please read some good books on kernel internals; it will clear all your doubts! In Linux, the term "process" remains only for the existence of the process descriptor (known by the task_struct structure). There are no processes which really execute; it is only threads of processes which really execute, whether or not you explicitly create a thread (a single thread of execution, by default!).
Reply to
root thief

On a UNIX(*) system, a 'process' is a collection of system resources shared by n 'threads of execution'.

Reply to
Rainer Weikusat

Can you please tell me the method to arrive at the value of 5? I think it should be 4.

Can you please let me know whether the threads created for a process share the timeslice of the process, OR whether each thread created gets the same timeslice as the process.

That is, which of the following scenarios is possible in Linux (consider 2 threads created for process p1)?

1) P1 = 5ms will become t1(5ms) + t2(5ms) = 10ms.
2) P1 = 5ms will become t1(2.5ms) + t2(2.5ms) = 5ms.

Thanks in advance, Karthik Balaguru

Reply to
karthikbalaguru

No sane scheduler would ever do this. As I said, this punishes t1 and t2 for cooperating by halving their timeslices. A sane scheduler rewards cooperation.

Why should an implementation that uses two processes get twice as much CPU time as an implementation that uses two threads? That would unfairly bias one type of implementation over another.

DS

Reply to
David Schwartz

AFAIK, the scheduler does not know about processes and threads unless NPTL ("Native POSIX Threads") is activated in the kernel (available as of 2.6.??). Thus it seems logical that, given equal priority, without NPTL all threads of all processes get the same count of time slices. With NPTL the threads are grouped and scheduled appropriately (all processes get the same count of time slices).

OTOH this again is not regarded as "fair", as the scheduling mechanism is influenced by regarding threads as either "I/O-bound" or "CPU-bound". That is why a new scheduler was recently created, called the "Completely Fair Scheduler".

For more information, please google for "NPTL" and "Completely Fair Scheduler". There should be lots of details in many documents.

-Michael

Reply to
Michael Schnell

Well, that is the PROCESS DESCRIPTOR of the thread group leader, known by the task_struct structure, and NOT a process, which is a collection of system resources shared by its threads, if any.

And yes, the number of threads is 4, not 5 (typo!).

Reply to
root thief

No. In fact, with or without NPTL, all threads get the same count of time slices assuming the same priority.

What you are suggesting would be a disaster, and no sane scheduler would work that way. Consider two web servers running on the machine, one using a process per client and one using a thread per client, each with 100 clients. Why should the ppc server get 100 times the CPU of the tpc server?

All things being equal, the tpc server is more efficient, since its context switches are cheaper. If anything, the scheduler should reward the more efficient server, not punish it.

DS

Reply to
David Schwartz

What the scheduler _should_ do depends on the project.

Thus the newest Linux kernel can be equipped with multiple schedulers that can be "plugged in" (e.g. the O(1) scheduler and the "completely fair" scheduler).

-Michael

Reply to
Michael Schnell

So, if process P1 has 5 runnable threads and process P2 has 3 runnable threads, assuming all 8 threads are at the same priority, then each thread would receive one-eighth of the CPU time.

And this will get adjusted whenever a new thread is created.

And if process P1 has 7 runnable threads and process P2 has 3 runnable threads, assuming all 10 threads are at the same priority, then each thread would receive one-tenth of the CPU time.

Good example :):).

I find that there are some situations in which a high-priority thread can get scheduled during every round robin. For that case, Linux has come up with a dynamic priority scheme that avoids starvation of the other threads by increasing their priority if they are delayed for long, and this dynamic priority is compared before a thread gets executed. But can anyone tell me the scheme used by Linux for the calculation of dynamic priorities? What kinds of conditions are taken into consideration before raising a priority?

Thanks in advance, Karthik Balaguru

Reply to
karthikbalaguru

There are different scheduler plugins available. The newest and most sophisticated are the O(1) scheduler and the "completely fair scheduler". It should be possible to find information about them on the Net.

-Michael

Reply to
Michael Schnell

I was under the impression that switching threads and switching processes _under Linux_ were equivalent. Perhaps you could point me at something that says otherwise?

Reply to
Jim Jackson

There is no way to 'switch a process' on UNIX(*) because processes ARE NOT SCHEDULED (anymore). Threads may be (or some abstraction the threading implementation is based on). Context-switching between threads belonging to the same process is cheaper than context-switching between threads belonging to different processes, because they run in the same address space: cache contents, VM mapping registers and the TLB are all unaffected.

Reply to
Rainer Weikusat

Maybe:

formatting link

-Michael

Reply to
Michael Schnell

Okay, let's phrase things precisely. All the kernel ever does is switch between kernel scheduling entities. The two cases we are comparing are one where the KSEs share a vm (which would happen if they were threads from the same process) and one where the KSEs do not share a vm (which would happen if they were threads from different processes).

If the threads share a vm, the OS does not have to change the memory map. This means no TLB flushes. It also means the existing TLB entries will still be valid and usable, and new entries will not have to be faulted in.

To maximize this benefit, Linux tries to schedule threads that share a VM "in a row". That is, if it has threads A1 and A2 from process A and threads B1 and B2 from process B, it will try to schedule "A1 A2 B1 B2" over "A1 B1 A2 B2".

Unfortunately, it is very hard to follow discussions on this subject because people are often not precise in their terminology. I try to be very precise and use "thread" to mean a POSIX thread, "process" to mean all the threads that share a particular virtual memory view, and "kse" to mean what the kernel schedules. Unfortunately, kernel programmers often use the term "process" to mean a KSE, which typically corresponds to only a single thread in a multi-threaded process.

DS

Reply to
David Schwartz
