Disabling interrupts to protect data

KIRAN · 2009-10-26T17:11:45+00:00

Hi Guys,I am working on some RTOS, in which I see lot of interrupts "enabling"and "disabling" code in most of the RTOS API's to protect kernel data.For example, Semaphore API's, Message Queue APIs, Runtime memorymanagement API's. Is enabling / disabling interrupts only to protectthe kernel data? Why I am asking this is whenever interrupts aredisabled, I am scared of losing timer interrupts. Any input isappreciated?Regards,Kiran

4

42Bastian Schick 16 years ago

And some explain it with the longest un-interruptible instruction such as saving or storing a bunch of registers.

42Bastian Do not email to bastian42@yahoo.com, it's a spam-only account :-) Use @monlynx.de instead !

Vote

V

Vladimir Vassilevsky 16 years ago

Yes. Also test-set is not suitable when manipulating stack and frame pointers or dealing with nested interrupts.

Vladimir Vassilevsky DSP and Mixed Signal Design Consultant

formatting link

Vote

D

David Brown 16 years ago

Reservations are hardware intensive to implement, unless you are willing to pay extra memory latency. But they scale well for multi-processor systems, making them a good solution for "standard" PPC processors. The PPC core in the MPC55xx is more like a microcontroller core than a "big" processor core, and Freescale probably saved a fair amount of silicon and complexity by dropping reservations so that you must use traditional interrupt disabling. I think it is the right choice for these devices, but then I haven't been trying to write optimal RTOS code that is portable across different PPC devices!

Vote

S

Stefan Reuther 16 years ago

Yep.

The problem here is how you "schedule" the deferred activity. You can make an array of sig_atomic_t variables, one for each possible activity, the interrupt sets them, the kernel exit checks and clears them. But this obviously does not scale.

When you make something generic, you'll start needing linked lists of stuff where you put the deferred actions in. So you now need some kind of atomic link-creation or swap routine. Okay, this can also be implemented without disabling interrupts, by having the interrupt routine detect whether it interrupted the atomic routine. It's possible, but it's easy to get wrong.

It's much simpler to just wrap the three statements into a CLI/STI. In particular when your processor has multi-cycle instructions which implicitly block interrupts for a dozen cycles while they execute - so why should these be permitted and a three-cycle CLI/STI should not?

Stefan

Vote

D

D Yuniskis 16 years ago

Append a pointer to the ASR that needs to be deffered to the end of a list of "deffered procedure calls". When the ISR returns, the OS schedules this "list" of DPC's (in sequential order).

Of course, you still have to guarantee that you have enough "overall" time to handle the tasks at hand, lest this list grow indefinitely. (i.e., your design has to "work"! :> )

This last point is why folks seem to resort to the more heavy-handed approach of unilaterally disabling interrupts for *everything* that is "hard" or "easy to get wrong".

There is nothing wrong with disabling interrupts! But, like everything else, you have to understand *why* you are doing it and when it is *appropriate* to do so.

In my experience, it seems that people turn interrupts off "too early" in a code fragment and turn them back on "too late". Often, if you look at the code, you can see things that could have been moved outside of the critical region (sometimes with some added complexity) to reduce the size of that region. This usually results in "needing more CPU" than you *really* do.

*If* your system can handle high interrupt latencies, then you can "afford" to disable them for longer periods of time. But, you have to make sure you know just how long those periods can *become* -- since a deferred IRQ is more likely to encounter some *other* IRQ during its execution (i.e., you may return from one ISR just to find yourself in *another*, etc.)

You also have to be careful about how you apply the DI/EI idiom. If, for example, the code you are executing can take place *within* a region where interrupts are off, then you want to be sure you don't blindly re-enable them at the end of "this" critical region only to (belatedly) discover that you have corrupted some other *encasing* critical region. I.e., you may have to examine the state of the interrupt mask/level before disabling so that you can correctl restore it afterwards.

The same is true of *all* locking mechanisms. E.g., taking a lock on a file before you *really* need it, etc. (especially if that "file" is a *special* file!)

Vote

S

Stefan Reuther 16 years ago

Exactly this "append a pointer" thing is where an atomic operation can come in very handy. You shouldn't append to the list while it's being processed, nor should you interrupt another instance in another interrupt (if you have nested interrupts).

All this is totally simple when you can use CLI/STI: void atomic_swap(node** a, node** b, node* c) { cli(); *a = *b; *b = c; sti(); }

// prepend new node to a list (easier than appending...) node* list; node* newNode; atomic_swap(&newNode->next, &list, newNode);

// get list (for processing) and make it empty node* oldList; atomic_swap(&oldList, &list, 0); (some strategically-placed 'volatile' can be appropriate.)

Actually I see much more often that people do not use proper locking and ponder about the occasional crashes that happen once in a fortnight :-)

By "easy to get wrong" I mean, for example, that you have to tell all your interrupt routines about your atomic_swap routine, so that they can see that they've interrupted it and can complete or restart it. This is nasty asm trickery. It enlarges all your interrupts by a dozen instructions (at least some compares and jumps). And all that just for a sticker "never disables interrupts" in the glossy brochures? Sorry, I'd rather take the additional three-cycle interrupt latency for running atomic_link under CLI.

I'm quite optimistic that I understand :-)

Stefan

Vote

D

D Yuniskis 16 years ago

You can wrap all of your interrupts with a common dispatch routine. But, it boils down to the same thing: there is more to get done.

The one thing in your favor is that these ISR's are, in theory, relegated to the operating system's responsibility. It is a fair bit harder when you let "user code" migrate into this realm.

[which is another aspect of this: folks who do things *in* ISRs that don't need to be handled there -- resulting in bloated ISRs which impacts the responsiveness of the whole system... which causes folks to tend to put more in *other* ISRs... which...]

Vote

C

ChrisQ 16 years ago

A related problem is where a single shared variable at driver level that needs to be accessed from different functions, all of which disable the interrupt within the critical section. Disabling interrupts is fine, but it is necessary to know at the time of reenabling any instance, what the current interrupt state was for the device, prior to the call. The solution was to write a couple of functions to stack the current device interrupt state before disabling, then pop on exit, rather than just a hard disable / reenable.

I suppose this will turn out to be a standard technique, but new to me at least...

Regards,

Chris

Vote

F

FreeRTOS info 16 years ago

...and also a technique that won't work on most compiler/architecture combinations as modifying the stack in this way will prevent the code in the critical region from executing correctly (any stack pointer relative addressing just 'aint going to work).

Regards, Richard. + http://www.FreeRTOS.org Designed for Microcontrollers. More than 7000 downloads per month. + http://www.SafeRTOS.com Certified by TÜV as meeting the requirements for safety related systems.

Vote

C

ChrisQ 16 years ago

No, not using the system stack. By writing functions, I mean functions using an short byte array to simulate a stack...

Regards,

Chris

Vote

4

42Bastian Schick 16 years ago

I would not recommend that any function (despite some deeply internal OS routines) should "enable" interrupts if it had not "disabled" them.

Of course, if the function may run in either context (i.e. interrupts disabled or not) it should save the current state and restore it.

So no need for some kind of global variable to stack the interrupt status.

BTW, the same is true for higher level locking using semaphores and mutex.

42Bastian Do not email to bastian42@yahoo.com, it's a spam-only account :-) Use @monlynx.de instead !

Vote

4

42Bastian Schick 16 years ago

If FSL only would have placed a big sticker on the manual. :-) Seriously, the manual states reservation works, and some place later in the EBI (external bus interface) description they write it does not work for EBI. => I learned, I have to read every single word of a manual.

42Bastian Do not email to bastian42@yahoo.com, it's a spam-only account :-) Use @monlynx.de instead !

Vote

D

David Brown 16 years ago

Do you mean something like this :

void foo(void) { doSomethingUnlocked(); memoryBarrier(); oldSREG = SREG; // Preserve old interrupt state cli(); // Ensure interrupts are disabled now doSomethingLocked(); memoryBarrier(); SREG = oldSREG; // Restore old interrupt flag doMoreUnlocked(); }

The details vary from processor to processor, but the principle is the same.

Some compilers have extensions or extra headers making it easier. For example, avr-gcc standard library has ATOMIC_BLOCK macros for this purpose, and msp430-gcc has an extra "critical" function __attribute__ that wraps a function in an interrupt disable/restore pair.

Vote

D

David Brown 16 years ago

Before I comment on your code below, let me first say I agree entirely that it is almost always better to have a small but clear and predictable interrupt latency using CLI/STI that you can easily see works correctly, than to have a complex scheme that whose correctness is difficult to evaluate. If your system cannot tolerate the small extra interrupt latency caused by such interrupt disable blocks, make a faster system or move the critical parts to better hardware.

(Note that this is in the context of microcontroller programming - if you are working with large systems, especially SMP systems, there are many more factors to consider.)

Here you have missed one important point that many people fail to understand about interrupt disabling (and the related topic of volatile accesses) - memory barriers. Unless your cli() and sti() macros / assembly functions specifically include memory barriers, then your atomic_swap function is not safe.

The problem is that C does not have any requirements about ordering of operations except for volatile accesses - all it requires is that the program has the same effect as though the operations were done in order, running on a C abstract machine. But the abstract machine has no concepts of interrupts, multi-threading, or external input and output and peripherals. The compiler can freely make re-arrangements to improve the quality of the generated code, which are correct for the abstract machine but may not be correct for your actual code.

In the atomic_swap function above, for example, the compiler will probably see the cli() and sti() as volatile accesses (depending on the compiler and the definition of these macros). But the code in the middle does not use volatile accesses - the compiler can therefore freely move these operations before or after the cli() and sti() operations. To avoid this, you need a memory barrier before and after these operations, to tell the compiler that memory reads before the first barrier are now invalid, and that any pending memory writes must be completed before the second barrier.

A typical memory barrier on gcc is a memory clobber: "asm volatile ("" : : : "memory");

On some systems, you may need to issue processor instructions too (such as a "msync" or "mbar" on a PPC) to deal with pipelined write buffers, but that's not typically the case for small microcontrollers.

Some compilers will always consider inline assembly as a memory barrier

- that makes it easier to write safe code in this case, but harder to use inline assembly optimally in other cases. Either way, you have to know what your compiler is doing.

Vote

C

ChrisQ 16 years ago

Yes, but using an array, rather than a single variable. The code looked like:

static U8 u8PwmInterruptStack [SIZE]; static U8 u8PwmStackPointer = 0;

/* Push Pwm Interrupt State and Mask Interrupts */ /* -------------------------------------------- */ void vPwmPushMaskInterruptState (void) {

if (u8PwmStackPointer < PWM_STACKDEPTH) { u8PwmInterruptStack [u8PwmStackPointer++] = u8PwmReadIntState (); vPwmMaskInterrupts (); } }

/* Pop Pwm Interrupt State */ /* ----------------------- */ void vPwmPopInterruptState (void) {

if (u8PwmStackPointer > 0) { u8PwmStackPointer--; if ((u8PwmInterruptStack [u8PwmStackPointer] & PWM_INTMASK) != 0) { vPwmUnmaskInterrupts (); } } }

The range check is just there for safety and in practice, the array is sized to suit the application. The first function is called before twiddling the shared data and second after.

Turned out to be a simple solution, but the first time i've had to use anything like it...

Regards,

Chris

Vote

D

David Brown 16 years ago

This may sound like a silly question, but why don't you just use a local variable to hold the old interrupt status? It is rare that you would want to disable interrupts in one function, and restore the state in a different function - since you want to keep the disable time to a minimum, and be very clear about when they are disabled or not, your disable and enable are typically done at the same level. Thus you have

{ uint8_t oldState = u8PwmReadIntState (); vPwmMaskInterrupts () doSomething(); if ((oldState & PWM_INTMASK) != 0) { vPwmUnmaskInterrupts (); } }

The code in doSomething() could quite happily have the same sort of sequence. There is no need to have a special stack - normal locals suffice.

If you have a complex system with particular interrupts (rather than global ones) enabled and disabled from different threads in a program, then local variables like that won't be sufficient. But I'd first suggest thinking hard about the structure of the program - there are probably better, clearer and therefore safer ways to get the effect you want. However, if you want to do something like that, you don't need a stack - you just need a counter variable tracking the number of times the interrupt has been disabled. Of course, you need to disable global interrupts around access to this counter...

Vote

M

Mark Piffer 16 years ago

he

I remember reading somewhere, something like Motorola documentation departments being the only US organizations running with CMMI level 5. Their documents suck with a high standard definitively. :)

Mark

Vote

C

ChrisQ 16 years ago

The product is typically a 128 x 192 x 2 colour led pixel graphics display, driven via serial comms at 9.6k. The leds are driven from ti led drivers, which handle 8 leds each, and use a set of serial control lines to clock and latch data and switch between data transfer and self test modes. Intensity is done via pwm on an enable line, which itself is one of the control lines. The absolute requirement to avoid any flicker at all on the display means that most control functions and data transfers can only be done in interrupt context, synchronised to the pwm off period. From a programming point of view, it's not ideal but that's the hardware that we had to work with. There's no rtos, rather state driven, with more state machinery in the comms, data transfer and data self test areas.

The software was developed over 2 years and now runs on many variants. Problem was that there were many layers of code and an overall lock requirement that was difficult to resolve across the layers. The only way to keep track was to save and restore context at each point. Yes, you could do this with a local variable in each function, but it is more code and more bug prone than a single line function call for disable and enable. The functionality is encapsulated and I don't have to think about the internals, nor remember how many instances there are in the code if it needs to be changed. Run time speed is not an issue either, so an added function call or so is irrelevant.

Ideal world rarely maps onto real products in practice :-)...

Regards,

Chris

Vote

S

Stefan Reuther 16 years ago

Yep. The declaration should've been void atomic_swap(node*volatile* a, node*volatile* b, node* c); I was just too lazy to figure out where to place the 'volatile's :-) That, together with cli/sti macros acting as memory barriers (honestly, cli/sti which are not such barriers are useless) fixes the routine for a single processor system. Multiprocessor is a little harder.

Actually, our compiler moved the accesses even though I used 'volatile'. Fortunately the vendor fixed it quickly, but in the meantime I've made an asm version to fix it forever.

Stefan

Vote

H

Hans-Bernhard Bröker 16 years ago

That won't work, because it has the exact same problem that the concept of a critical section is supposed to work against. Interrupt state can change between these two instructions, and that would leave you restoring an out-of-date oldState.

Ultimately, you need atomic query-and-change operations to resolve this.

Vote

Disabling interrupts to protect data

Join the Discussion

Didn't find your answer?