semop and SCHED_RR under higher CPU loads

Hello group!

When running our application at full performance we encountered occasional
delays of 100 ms. At >80% CPU load they appear very often, about once every
few seconds. At lower load the delays still appear, but less often.

Investigation showed that the delays occur in semop() when acquiring a
semaphore (this semaphore protects a critical section that is executed very
intensively, ~4200 times/s). Further investigation led us to conclude that
the delays are caused by the round-robin scheduler (sched_rr_get_interval()
returns a period of exactly 100 ms; the application processes run at RT
priorities with the SCHED_RR policy). When a process exhausts its time
slice, the scheduler preempts it for one RR interval. If this happens while
the process holds the semaphore, no one can release it for 100 ms. In the
meantime, many processes try to enter the same critical section and are
blocked. As a consequence all other processes, including those with higher
priority, are blocked while running their own transitions (they should be
processing messages from the message queue). This results in full queues
(>1000 messages) for the blocked processes, which makes them run for longer
(to empty their queues) and gives them a good chance of being preempted by
the RR scheduler again. The circle is closed.

When the scheduling policy is changed to SCHED_FIFO none of this happens any
more, but the application runs more in bursts, which is not desirable. Do
you agree with our findings above? Do you have any suggestions for how to
prevent the described problem while retaining the SCHED_RR policy?

Best regards,

Sani



Re: semop and SCHED_RR under higher CPU loads
Linux is not intended to be a hard realtime OS.

You can never rely on any time frame to be met.

Kernel 2.6 improved the soft-realtime behavior (meaning that the delay
you see might occur less often than with Kernel 2.4).

You can enable the "preemptive Kernel" feature in the Kernel
configuration to activate this.
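
As a rough illustration, in mainline-style 2.6 kernel trees this option is normally CONFIG_PREEMPT ("Preemptible Kernel" in menuconfig). Whether a vendor kernel such as MontaVista's uses the same name and menu location is an assumption and should be checked against its own configuration.

```
# .config fragment (assumes a mainline-style 2.6 kernel tree)
CONFIG_PREEMPT=y
```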

There are some "soft realtime patches" that might reduce the problem a
bit more (but you still need to be aware that delays of several hundred
milliseconds can occur now and then).

If you need hard realtime you could try RTAI or commercial offers by
TimeSys, MontaVista or SysGo.

-Michael

Re: semop and SCHED_RR under higher CPU loads
Hello,

thank you for the answer. In fact we are using the 2.6.10 kernel provided by
MontaVista. I agree with you on the real-time characteristics, but for this
platform our main goal is performance - improving the real-time
characteristics would have a negative impact on overall performance because
of the additional overhead. We want to keep SCHED_RR in order to keep the
overall throughput unaffected.

Best regards,

Sani




Re: semop and SCHED_RR under higher CPU loads

This might prevent you from using the preemptive Kernel feature, or even
suggest using Kernel 2.4 (better overall performance, worse soft
realtime behavior).


OK. But SCHED_RR processes preempt other processes only as long as those
are not being serviced by the Kernel. And a Kernel service (e.g. file I/O)
might prevent preemption for a long period.

The "preemptive Kernel" feature would help in many (but not all) cases
but of course it reduces the overall performance.

-Michael

Re: semop and SCHED_RR under higher CPU loads
I just read that MontaVista launched "Pro 5.0", which they say offers
a latency as low as 5 µs.

-Michael

Re: semop and SCHED_RR under higher CPU loads
In reply to Michael Schnell:

What this latency stands for?


Re: semop and SCHED_RR under higher CPU loads

It doesn't "stands" for anything.



