
Re: factors affecting context switch time

Hello,
Keyword: Jiffie
Study http://www.tldp.org/HOWTO/Benchmarking-HOWTO.html
+
Google for "Linux kernel Jiffie" or read
http://en.wikipedia.org/wiki/Jiffie or http://kerneltrap.org/node/464
// moma
http://www.futuredesktop.org/how2burn.html#Ubuntu <--Ubuntu resources

Re: factors affecting context switch time

CPU arch is probably the biggest factor. Many CPUs are _much_
faster at this than x86, mostly as a result of the horrible slowness
of interrupt handling through the traditional 8259 PIC.
It can be much faster in a kernel compiled to use the APIC.
If you're talking about co-operative task switching, x86 may be
reasonably fast.
Actual times will vary depending on cache state and how long
the scheduler has to run.
-- Robert

Re: factors affecting context switch time
<Snip>
cpu arch?
Yes

Yes

memory being used currently?
Yes

You're welcome...
There are also a number of other factors:
Clock rate of the processor.
How much needs to be saved/loaded to switch contexts.
Phase of the moon :-)
Day of the week :-)
I'm sorta hoping that this is a homework question and not a real project.
GS

Re: factors affecting context switch time

Funny though it may sound, it is a real project :) .. I need to calculate
context switch time for a comparative performance analysis and am going to
use lmbench. It is an X vs Y situation where X and Y are both overheads:
X in this case is the time spent performing 2 context switches, while
Y is another piece of code which I hope to measure using the RDTSC
instruction.
If I know what factors affect the context switch time, then I can say at
what times X is the greater overhead and at what times Y is more
expensive.
vivekian

Re: factors affecting context switch time

If you only look at context switch time (i.e. measure the time
it takes from the last instruction in one process to the
first instruction in another), you get only half of the picture.
There is also considerable overhead resulting from having
to flush the TLB and caches during a context switch, but this sort
of overhead doesn't manifest itself as a discrete amount of
time where the CPU does not execute user code. Instead it
slows down execution of user code, because the CPU has to do
page table walks and cache refills along the way.
The x86, for example, needs to flush the TLB on every switch
of address space (i.e. process), while other archs (such as
PowerPC & MIPS) have a "tagged TLB" which makes this unnecessary.
The ARM architecture (at least up to the SA1100, don't know about
the newer ones) needs to flush caches during an address space switch.
Here is an IMHO interesting paper on (part of) the subject:
http://i30www.ira.uka.de/research/publications/papers/index.php?lid=de&docid64%1
HTH
Rob
--
Robert Kaiser email: rkaiser AT sysgo DOT com
SYSGO AG http://www.pikeos.com

Re: factors affecting context switch time

http://i30www.ira.uka.de/research/publications/papers/index.php?lid=de&docid64%1

Thanks Robert. That is an interesting insight and lends a lot of weight
to the argument against context switches. Still wondering though: is this
ARM/x86 specific? My report needs to take real-time
systems into consideration, and I wonder if this would affect architectures
meant for real-time processing, since they are mostly MMU-less.
vivekian

Re: factors affecting context switch time
Does this mean that there is a TLB flush when I switch from user
mode to kernel mode? For example, when a system call is made?
Somu
On Mon, 23 Jan 2006, vivekian wrote:

http://i30www.ira.uka.de/research/publications/papers/index.php?lid=de&docid64%1

--
cheers
Somu

Re: factors affecting context switch time

http://i30www.ira.uka.de/research/publications/papers/index.php?lid=de&docid64%1

Another paper on fast address space switching on ARM which you may
find interesting:
http://i30www.ira.uka.de/research/documents/l4ka/2003/fass-and-tlb-sharing-on-strongarm.pdf
--
Catalin

Re: factors affecting context switch time
That is interesting, because I see an increase in L1 and L2 cache
misses as the number of system calls I make increases, on the x86
architecture. The system is running the bare minimum of kernel
processes necessary. The read call uses the same buffer in
the tight loop, and the buffer is page aligned.
I am taking these measurements with a sys_read call in a tight
loop. How does one explain this?
S
On Mon, 23 Jan 2006, Catalin Marinas wrote:

http://i30www.ira.uka.de/research/publications/papers/index.php?lid=de&docid64%1

http://i30www.ira.uka.de/research/documents/l4ka/2003/fass-and-tlb-sharing-on-strongarm.pdf


--
cheers
Somu

Re: factors affecting context switch time

Yes. The cache/TLB structure is very architecture specific. For
example, the problem with the ARM cache, IIRC, is that it uses virtual
addresses to identify cache lines, so, if the mappings are changed, all
entries must be flushed. The x86 uses physical addresses instead, which
are independent of the current mappings. Likewise, the x86 MMU does
not support any tags on TLB entries, so, when switching between address
spaces, the TLB must be flushed, whereas with PPC or MIPS you just
need to load the tag that identifies the current address space into
some register.

OK, I was assuming that a context switch includes an address space
switch. With no MMU, everything runs in one common address space,
of course, so a TLB or Cache flush would not be necessary for a
context switch with any of the mentioned architectures.
That said, there are quite a few real time systems that do
use the MMU, mainly to provide protection between programs.
Cheers
Rob
--
Robert Kaiser email: rkaiser AT sysgo DOT com
SYSGO AG http://www.pikeos.com

Re: factors affecting context switch time
Robert Kaiser wrote:

Just to add some extra information: ARM processors starting with the
v6 architecture (ARM1136 and later cores) have tagged TLBs and also
have VIPT caches (Virtually Indexed, Physically Tagged), so that no
flushing is required at a context switch. The drawback of the
physically tagged caches is that they require a TLB look-up to get the
physical address before looking into the cache. This might not make
any difference with modern pipelined processors, though.

There is another thing to consider for OSes like Linux (not that Linux
can be used for hard real-time): the application code and read-only
data pages are loaded from the filesystem on demand. I.e. the
application initially starts with only a few pages loaded/mapped, and
when a branch goes to a location in a different page, the kernel
traps the prefetch abort and loads the new page into memory, mapping
it into the task's address space (on some architectures, this requires
flushing the whole TLB). This could cause significant
delays. Another case is malloc'ed memory, which Linux doesn't
really allocate until it is accessed (you can use calloc instead, which
forces the write). Even if you don't have swap enabled, Linux on
MMU systems can evict read-only pages from RAM if it runs short of
available memory.

For some ARM cores (pre v6 architecture), the MMU or MPU (Memory
Protection Unit) needs to be enabled to be able to use the caches.

--
Catalin

Re: factors affecting context switch time
Catalin wrote:

Though this does explain a lot about how pages are loaded into RAM, I
can't correlate it with the effect it has on context switch time --
does it make it slower?
Also, another question -- maybe I should post it as a different thread
.. There are some lines of C++ code for which the number of clock
cycles has to be measured. At present I am using the RDTSC instruction.
Is there some way to make sure that the code runs uninterrupted so that
a true picture is available? Or is there some tool available to measure
how much time the code takes to run?
thanks,
vivekian
