Compiler Support for Ensuring that a Statement is Atomic

- Datesfat Chicks

Sat, Jan 16, 2010 12:52 AM

This is slightly off-topic, as it involves implementation rather than the C language.

It frequently comes up in embedded systems that one wishes to ensure that a C-language statement is atomic. One might have available functions or macros to disable or enable interrupts, so one might write:

volatile int sempahore;

DI(); sempahore++; EI();

Whether an increment can be handled atomically depends on the processor, where in memory the variable is located, its size, and on specific decisions the compiler makes.

Other operations for which one might want to guarantee atomicity include assignments, bitfield assignments, tests, etc.

For example, in the code:

volatile unsigned long x;

if (x > 29204) ...

there might be a problem if reading the variable takes more than one instruction: an asynchronous process may update the variable partway through the read, so the test sees a mix of old and new bytes.

In most cases, one can protect the statements of concern with DI()/EI() or find a way to guarantee atomicity. The issue is that this method of protection can be inefficient. It consumes memory, consumes CPU cycles, and affects the timing of interrupts.
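For concreteness, the DI()/EI() protection of the comparison above could be sketched as follows. DI() and EI() are simulated here with a flag (irq_disabled is a made-up name) so that the fragment compiles on a desktop; on a real target they would be whatever interrupt macros the compiler or vendor supplies.

```c
/* Hypothetical stand-ins for the target's interrupt macros. On real
   hardware these would map to the disable/enable-interrupt instructions. */
static volatile int irq_disabled;
#define DI() (irq_disabled = 1)
#define EI() (irq_disabled = 0)

volatile unsigned long x;

/* Copy x into a local while interrupts are off, so that every byte of
   the copy comes from the same value of x. */
static unsigned long read_x_consistent(void)
{
    unsigned long snapshot;

    DI();
    snapshot = x;
    EI();
    return snapshot;
}

/* The comparison itself then runs on the private copy, outside the
   critical section. */
int x_exceeds_limit(void)
{
    return read_x_consistent() > 29204UL;
}
```

Only the copy sits inside the critical section, which keeps the interrupts-off time as short as this approach allows.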

Here is my question:

Has anyone seen any compilers that support guaranteeing that a certain statement or test is atomic?

What I'd be looking for is something like:

#pragma ASSERT_ATOMIC_STATEMENT_START
semaphore++;
#pragma ASSERT_ATOMIC_STATEMENT_END

The pragmas would advise the compiler to:

a) Try to compile the statement atomically (using a single instruction), and

b) Create an error if it can't be compiled atomically.

This would save the expense of DI()/EI(), or generate an error if the statement can't be compiled atomically.

Is there anything like this in the world?

Thanks, Datesfat

- Niklas Holsti

Sat, Jan 16, 2010 6:09 AM

Hey, that is exactly how I usually mistype "semaphore". Must be something in the qwerty layout...

The C standard (C99, at least, I believe) defines the type "sig_atomic_t" which is guaranteed to be accessible (read/write) as a whole, so it solves the problem with the ">" test, but not the problem with increment operations. However, "sig_atomic_t" is guaranteed only to be at least 8 bits wide, not "long". In practice "sig_atomic_t" is probably as large as the processor's memory interface allows, so it could be as wide as "long".
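A minimal sketch of the sig_atomic_t idiom, assuming a hosted C environment where signal() and raise() stand in for a real interrupt source. The names event_pending, on_signal, install_handler and take_event are made up for illustration.

```c
#include <signal.h>

/* Flag written by the handler and read by the main-line code. Each
   individual read or write of a volatile sig_atomic_t is done as a whole. */
static volatile sig_atomic_t event_pending = 0;

static void on_signal(int signo)
{
    (void)signo;
    event_pending = 1;          /* one whole write */
}

void install_handler(void)
{
    signal(SIGINT, on_signal);
}

/* Returns 1 if an event was pending, 0 otherwise. The read and the
   clearing write are each atomic, but the pair is not: a signal landing
   between them would be lost. That is the read-modify-write gap that
   sig_atomic_t does not close. */
int take_event(void)
{
    if (event_pending) {
        event_pending = 0;
        return 1;
    }
    return 0;
}
```

A test such as "if (x > 29204)" on a single sig_atomic_t is safe in this sense; an increment is not.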

I haven't, except for sig_atomic_t, which does less than you are asking for.

A good suggestion. But I suspect that compiler writers would find it easier to introduce compiler-specific "built-in" functions for things that can be done atomically, such as increment/decrement of a memory word, or swapping the contents of a memory word and a register. Maybe some such functions could be standardised? For example, a function called "atomic_increment" (decorated with suitable '_' marks to make it a reserved identifier). The compiler would accept or reject a call of such functions depending on the types of the parameters and on the capabilities of the target processor.

GCC has a set of built-in atomic memory access functions

[link elided]

It seems that they don't support counting semaphores but do support locks (binary semaphores).
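For what it is worth, the same family of GCC builtins also includes add-and-fetch and compare-and-swap operations, so a counting semaphore may be within reach after all. The following is only a sketch: it assumes GCC or a compatible compiler and a target for which the builtins are actually implemented, and the function names are made up.

```c
/* The __sync builtins are a GCC extension, not standard C. */
static volatile int semaphore;

/* Atomically add one to the count; returns the new count. */
int semaphore_post(void)
{
    return __sync_add_and_fetch(&semaphore, 1);
}

/* Atomically take one unit if any is available.
   Returns 1 on success, 0 if the count was zero. */
int semaphore_try_wait(void)
{
    for (;;) {
        int seen = semaphore;

        if (seen <= 0)
            return 0;
        /* Store seen - 1 only if the count still equals seen. */
        if (__sync_bool_compare_and_swap(&semaphore, seen, seen - 1))
            return 1;
        /* Another context changed the count in between; retry. */
    }
}
```

On a small microcontroller the compiler may expand these into a library call or fail to link, so the generated code is still worth inspecting.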

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .

- William Ahern

Sat, Jan 16, 2010 10:20 PM

IIRC, C++0x has a standard atomics header. The proposal also put forth stdatomic.h, which I think is being considered for inclusion in C1x.

There were/are actually many similar proposals, but I believe the one that was actually settled on was this one:

[link elided]
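The stdatomic.h header did end up in C11. A rough approximation of the "compile atomically or give an error" request can be sketched with it, assuming a C11 compiler. Note that the check below only guarantees a lock-free implementation, which is weaker than "a single instruction"; the name semaphore_increment is made up.

```c
#include <stdatomic.h>

/* Build-time check: compilation fails unless atomic operations on int
   are always lock-free on this target. Standard C has no way to demand
   the stricter "single instruction" property. */
_Static_assert(ATOMIC_INT_LOCK_FREE == 2,
               "atomic_int is not always lock-free on this target");

static atomic_int semaphore;

/* Atomic increment; returns the new value. */
int semaphore_increment(void)
{
    return atomic_fetch_add(&semaphore, 1) + 1;
}
```

Whether the increment becomes one instruction, a load-linked/store-conditional loop, or something else is still up to the compiler and the architecture.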

- Boudewijn Dijkstra

Mon, Jan 18, 2010 8:34 AM

On Sat, 16 Jan 2010 01:52:21 +0100, Datesfat Chicks wrote:

For you maybe.

Imagine how a program with DI/EI all over the place can wreak havoc on your worst-case interrupt latency.

Don't do that then. Use an OS service that ensures exclusive data ownership, like a messaging interface.

No.

Assuming that the architecture guarantees that any instruction so generated is atomic.

And then what? Try a different compiler? Switch architectures? Redesign the software?

--
Made with Opera's revolutionary e-mail program:
http://www.opera.com/mail/
(remove the obvious prefix to reply by mail)

- Niklas Holsti

Mon, Jan 18, 2010 9:04 AM

For me, too, although only occasionally, not "frequently". But then I don't spend much time programming in C.

The effect on the worst-case interrupt latency does not depend on the number of DI/EI pairs "all over" the program, nor on how often they are executed. It depends only on the duration of the DI/EI pair that has the most code between the DI and the EI, in terms of the execution time of that code.
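As a small worked illustration of that point: if the interrupts-off duration of every DI/EI section is known, the contribution to worst-case latency is the maximum of those durations, not their sum. The sketch assumes the sections never run back to back; the function name and the cycle counts in the usage note are invented.

```c
#include <stddef.h>

/* Longest single interrupts-off section, in cycles. This is the amount
   that DI/EI sections add to the worst-case interrupt latency, provided
   no two sections execute back to back. */
unsigned long worst_case_masked_cycles(const unsigned long *section_cycles,
                                       size_t count)
{
    unsigned long worst = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        if (section_cycles[i] > worst)
            worst = section_cycles[i];
    }
    return worst;
}
```

With three sections of 12, 40 and 7 cycles, the function returns 40; adding a hundred more 7-cycle sections would not change the result.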

But why do you, Boudewijn, criticize Datesfat for mentioning DI/EI, when Datesfat's very aim is to *avoid* DI/EI by making the compiler use an instruction that is intrinsically atomic?

Such OS services tend to have longer interrupt-disabled regions than a simple "semaphore++" and thus more effect on the worst-case latency. And then we have the bare-board systems that use just background + interrupts with no OS to speak of. Finally, Datesfat's question and proposal are also applicable to the internals of OS services, which also could benefit from using intrinsically atomic instructions instead of DI/EI.


- Datesfat Chicks

Mon, Jan 18, 2010 8:44 PM

You are correct in the mathematical analysis. All that matters is the worst case for any one critical section. The number of critical sections is irrelevant for a worst-case latency analysis.

Most critical sections are very short (by design), but why use them at all if you can select intrinsically atomic instructions?

The systems that I work on tend to have about 32K of FLASH memory and 2K or so of RAM. There is no formal operating system that provides such services. The logic is typically just to execute a loop at a constant rate, with interrupts happening at the same time based on timer compares or overflows, received SCI characters, etc.

The "no OS to speak of" describes the type of system I work on.

This is also correct.

I've also done some multi-threaded work on Windows. Some of the Windows documentation guarantees that reads and writes of certain types of 32-bit variables will always be atomic. I believe that, but I still like to look at the generated assembly language.

Additionally, some computer instructions are made just for the purpose you described--to efficiently implement operating system primitives and/or solutions to classic IPC problems.

One example that comes to mind is this one:

[link elided]

The instruction is there specifically to address a specific IPC problem. An instruction like TSL serves no purpose of any kind in a single-threaded framework.
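In portable C, the nearest thing to a test-and-set instruction is probably C11's atomic_flag. A sketch, assuming a C11 compiler; the names busy, lock_try_acquire and lock_release are made up, and whether this compiles down to a TSL-style instruction depends on the target.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag busy = ATOMIC_FLAG_INIT;

/* Atomically set the flag and report whether it was clear beforehand.
   Returns true if the caller now owns the lock. */
bool lock_try_acquire(void)
{
    return !atomic_flag_test_and_set(&busy);
}

/* Give the lock back by clearing the flag. */
void lock_release(void)
{
    atomic_flag_clear(&busy);
}
```

The test and the set happen as one indivisible step, so two contexts can never both see the flag clear.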

Datesfat

- D Yuniskis

Mon, Jan 18, 2010 10:47 PM

[attributions elided]

Not quite. In the (admittedly unlikely) case that one critical region follows another, they can become cumulative (it depends on when your processor actually enables interrupts).

E.g., I typically use critical regions to manipulate shared structures at the beginning and end of a "routine" (function). So, you might see a "ret" immediately following an "EI()" (using your notation). The routine that it returns to may then immediately enter another critical region.

I have no qualms about nesting critical regions. Sometimes it makes the code a *lot* easier to deal with.

In other cases, it is possible that your critical regions *never* impact interrupt latency. E.g., consider:

spinwait(foo); EI() blah; DI()

where "foo" is signalled by an ISR. I.e., the interrupt has just *finished* when the EI() commences...

You can't always do this. And, sometimes the instructions that you use to implement this may be nonintuitive (unlikely for a compiler to choose).

Even a messaging service has, at some level, a need for atomic operations. Putting it under a bushel doesn't cause it to cease to exist.

This "guarantee" is undoubtedly related to the machine's architecture. E.g., it probably does not hold for block moves (pretend a 40-byte structure is a data object).

Well, that's not entirely true, either. One can imagine having a piece of code that manipulates a boolean. And, some other piece of code EXECUTING IN THE SAME THREAD wants to check to see if that boolean is set and, regardless, set it thereafter.

- Datesfat Chicks

Tue, Jan 19, 2010 3:22 AM

You are correct. I didn't think of that scenario.

Datesfat

- Boudewijn Dijkstra

Tue, Jan 19, 2010 3:58 PM

Absolutely right. But consider the differences between a solution written and maintained by specialized people, and one by people with much less experience.

That sounds as if there was no decision process for hardware & tools. I would have expected something along the lines of: "it was decided to keep hardware cost low and not use an OS."

Since in your case an OS means richer hardware, you might actually reduce latency by having a faster processor. ;)

And some OSes do utilize these. Not by depending on the compiler, but with plain assembler code.


- Boudewijn Dijkstra

Tue, Jan 19, 2010 4:23 PM

Right. But every DI/EI pair does increase _average_ interrupt latency for all interrupts, even those that might be really important and cannot even access that particular piece of shared data.

I merely intended to point out a disadvantage of using DI/EI (which horribly failed). In fact, in the context of his question, the widely used DI/EI is very illustrative and it would be unwise to not mention it.


- Mr. C

Wed, Jan 20, 2010 2:39 PM

Nobody has mentioned the problem with using the DI/EI pair in code segments where interrupts can sometimes be enabled and other times be disabled. The code above will always enable interrupts, even if they were not enabled before the DI was executed.

- Datesfat Chicks

Wed, Jan 20, 2010 4:46 PM

This problem is normally very easy to address. It is normal to examine the interrupt mask and make the decision about whether to enable interrupts again based on whether they were enabled originally.

It is of course easy to do in assembly language, but some compilers allow testing of the interrupt mask as well.

It might go something like this:

BOOLEAN im;

im = imask(); //Built-in function supported by the compiler.

DI(); semaphore++; if (!im) EI();

This is a fairly easy problem, and very common.
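A self-contained version of the fragment above, for anyone who wants to try it on a desktop. The interrupt mask is simulated with a plain variable, and imask(), DI() and EI() are stand-ins for whatever the target compiler really provides, so treat this as a sketch of the pattern rather than working target code.

```c
#include <stdbool.h>

/* Simulated interrupt mask: true means interrupts are disabled. */
static volatile bool irq_masked;

/* Hypothetical equivalents of the compiler built-in and macros. */
static bool imask(void) { return irq_masked; }
#define DI() (irq_masked = true)
#define EI() (irq_masked = false)

static volatile int semaphore;

void semaphore_increment(void)
{
    bool was_masked = imask();  /* remember the state on entry */

    DI();
    semaphore++;
    if (!was_masked)            /* re-enable only if enabled on entry */
        EI();
}
```

Called with interrupts enabled, the function leaves them enabled; called with interrupts already disabled, it leaves them disabled.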

Did I miss something?

Datesfat

- Hans-Bernhard Bröker

Wed, Jan 20, 2010 11:36 PM

Easy, yes. But potentially quite wrong, too. You can't implement a semaphore using a sequence of operations (store current interrupt mask, then disable interrupt) that itself needs to be protected by one to avoid untimely outside interference. That's an exercise in pulling yourself out of the pond by yanking on your own hair.

And you'll be SOL if the interrupt mask gets changed by an interrupt happening in between those two instructions.

- David Brown

Thu, Jan 21, 2010 12:43 AM

It's an unusual system that will change the interrupt mask during an interrupt, and not restore it on exit - and the situation can only be an issue on processors with multiple interrupt levels and no global interrupt flag (such as the m68k). There might be occasions when this sequence is dangerous, but it is normally considered safe.

- Datesfat Chicks

Thu, Jan 21, 2010 1:16 AM

In most software architectures, the scenario you proposed can't occur. The interrupt mask is typically a part of the PSW or CC register (same animal, depends on who is doing the naming), and it is normally saved and restored by the hardware when an interrupt occurs and when the ISR exits. In most small microcontrollers, the interrupt mask is in the same register as the carry flag, zero flag, overflow flag, etc., so it has to be restored on exit from an ISR.

I suppose an enterprising ISR could reach into the stack and modify the value that will be restored when the interrupt exits, but this is not done by sane humans.

Generally, can't happen.

Datesfat

- Nobody

Thu, Jan 21, 2010 7:05 AM

It's not a problem if you only use DI/EI around very small sections of code so that you know that DI/EI will never nest.

For larger sections of code, you should probably be using mutexes, semaphores, RCU, or whatever (which may need to use DI/EI in their implementation).

- James Dow Allen

Thu, Jan 21, 2010 7:22 AM

I almost posted something like the following, but was busy stamping out flames in another ng. :-)

Nothing to add, except a True Anecdote(tm). Development of a peculiar system(*) was getting behind schedule due to a very intermittent bug. I came across the responsible programmer sitting next to a logic analyzer reading a comic book or something. "Can't do anything until the analyzer triggers again" was his answer. "Problem occurs only once every several hours."

I showed him how to research by writing (the equivalent of) assert(imask() == ENABLED); before the DI(); and thus make the system "fail" several times a second instead of once every several hours. I'm sure he then figured out the fix himself.
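The debugging trick might look roughly like this. The interrupt state is simulated, and ENABLED, imask(), DI() and EI() are placeholders for whatever the real system offered, so this is only a guess at the shape of it.

```c
#include <assert.h>

enum irq_state { DISABLED, ENABLED };

/* Simulated interrupt state and hypothetical access macros. */
static volatile enum irq_state irq = ENABLED;
static enum irq_state imask(void) { return irq; }
#define DI() (irq = DISABLED)
#define EI() (irq = ENABLED)

static volatile int semaphore;

void semaphore_increment(void)
{
    /* Trap any caller that arrives with interrupts already off: the
       unconditional EI() below would wrongly turn them back on. */
    assert(imask() == ENABLED);

    DI();
    semaphore++;
    EI();
}
```

The assertion fires on every bad call, not only on the rare occasions when the early EI() actually causes visible damage.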

(* -- IIRC, it was a processor emulating a disk controller, connected to a disk controller emulating a processor, or some such. :-)

James Dow Allen