Shared Data Problem

Ian Bell 21 years ago

What is the current thinking on the best way to solve the shared data problem in embedded systems?

Ian

Ian Bell

Robert Scott 21 years ago

What's the problem?

-Robert Scott Ypsilanti, Michigan

Ian Bell 21 years ago

Accessing a shared variable in a non-atomic way can give erroneous results if an ISR that also changes it runs partway through the access.

Ian

Ian Bell

CBFalconer 21 years ago

What is new about that? Standard concurrency problem. The simplest way on single processor systems is to disable interrupts during access, i.e. create a critical region. You can also use semaphores, or monitors. Watch out for deadlocks. If you have an OS it will probably provide appropriate techniques. Read up on priority inversion.
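The critical-region approach can be sketched as follows. This is a minimal illustration, not any particular vendor's API: `irq_save()` and `irq_restore()` are hypothetical stand-ins for whatever intrinsic the target toolchain provides, and are simulated here with a plain flag so the example builds on a host.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the target's interrupt-control intrinsics.
   On real hardware these would save and restore the interrupt mask. */
static int irq_enabled = 1;
static int  irq_save(void)       { int was = irq_enabled; irq_enabled = 0; return was; }
static void irq_restore(int was) { irq_enabled = was; }

/* Shared between the ISR and the main loop. On a small micro this is wider
   than one bus access, so an unprotected read-and-clear could be torn. */
static volatile uint32_t event_count;

/* Would be called from the interrupt vector. */
void event_isr(void)
{
    event_count++;
}

/* Main-loop side: read and clear the counter inside a critical region. */
uint32_t take_events(void)
{
    int was = irq_save();       /* enter the critical region */
    uint32_t n = event_count;
    event_count = 0;
    irq_restore(was);           /* leave it, restoring the previous state */
    return n;
}
```

Restoring the previous state, rather than unconditionally re-enabling, keeps the function safe to call from code that already runs with interrupts off.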

"If you want to post a followup via groups.google.com, don't use the broken "Reply" link at the bottom of the article. Click on "show options" at the top of the article, then click on the "Reply" at the bottom of the article headers." - Keith Thompson

Robert Scott 21 years ago

It is bad program design to have a variable that is changed by both an ISR and a main program. In almost every case it can be avoided.

If you can't figure out how to avoid that situation, then you may be forced to use one of the more heavy-handed solutions, as indicated already by other responders. But in response to your general question, I would say that every situation is different, and there is no "current thinking" on the issue. There is just a collection of various solutions that you can pick from as needed. Get to know all the tools in the toolbox and you will pick the right one when the time comes.

-Robert Scott Ypsilanti, Michigan

Bryan Hackney 21 years ago

Don't use super-atomic accesses in interrupts.

Paul Keinanen 21 years ago

On many non-virtual-memory CISC architectures it is even simpler to use a single non-interruptible instruction to manipulate a shared variable, such as a single-instruction memory location increment or decrement. To be on the absolute safe side, the data size should be no larger than the physical memory bus width, and the variable should be aligned in such a way that only a single read-modify-write memory cycle is needed.

The situation with virtual memory systems is trickier, since each instruction must be interruptible or restartable in order to handle a (potential) page fault during each memory reference. Even on virtual memory systems, some interlocking mechanisms are usually available, such as the LOCK prefix on x86 systems or the "add with interlock" instruction on VAX systems. The shared variable must be naturally aligned, i.e. it can be accessed in a single memory reference and it does not cross virtual memory page or cache line boundaries. On some systems it might even be a good idea to make dummy references to the variable, and to the instruction stream just after the interlocked instruction, to preload any missing pages before attempting the actual interlocked access.

The situation is much more complex with RISC architectures that have only load and store memory access instructions, so only one task or interrupt service routine should write to any given variable. Even in this case the data size should be no larger than the memory width and the variable properly aligned, which is usually a requirement in RISC architectures anyway.
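In C, the single-instruction read-modify-write described above is usually requested through C11 `<stdatomic.h>` rather than written by hand. The sketch below assumes a toolchain that provides C11 atomics and implements them lock-free for `unsigned` on the target (worth checking, since a lock-based fallback would not be safe in an ISR). The names `pending`, `rx_isr` and `take_one` are illustrative only.

```c
#include <assert.h>
#include <stdatomic.h>

/* Counter shared by an ISR and the main loop. */
static atomic_uint pending;

/* ISR side: one indivisible increment. */
void rx_isr(void)
{
    atomic_fetch_add(&pending, 1u);
}

/* Main side: indivisible decrement, performed only if work is pending.
   Returns 1 if an item was claimed, 0 if the counter was already zero. */
int take_one(void)
{
    unsigned cur = atomic_load(&pending);
    while (cur != 0) {
        /* On failure, cur is reloaded with the current value and we retry. */
        if (atomic_compare_exchange_weak(&pending, &cur, cur - 1u))
            return 1;
    }
    return 0;
}
```

How the compiler implements these calls (a locked instruction, a load/store-exclusive pair, or something else) depends on the target.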

Paul

Ian Bell 21 years ago

Did I say it was new?

I know.

So what you are saying is the current thinking has not changed in the last decade.

Ian

Ian Bell

Ian Bell 21 years ago

In other words atomic operation is likely to be viable in a CISC architecture but not in a RISC one?

Ian

Ian Bell

Ian Bell 21 years ago

A very interesting reply, thank you. Almost all references I have come across go straight to atomic operations, semaphores and so on. None mentions avoidance, which I intuitively feel to be the first and best option, to be abandoned only as a last resort. However, I have not found any reference to avoidance techniques. Can you give some examples or pointers to references?

Thanks

Ian

Ian Bell

Ian Bell 21 years ago

I am not familiar with the phrase 'super-atomic' but I will assume it means non-atomic. Unfortunately the interrupt is the one place where a non-atomic operation is allowed, because the interrupt will always have a higher priority than the task it shares data with. The task is where non-atomic operations can lead to data corruption.

Ian Bell

Vadim Borshchev 21 years ago

It naturally depends. ARM, starting from architecture version 2, has swap instructions that operate on both 32-bit (swp) and 8-bit (swpb) semaphores.
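A swap-based semaphore of this kind can be sketched portably with C11 `atomic_exchange`. This is an illustration under assumptions: whether a given compiler emits swp/swpb for it is not guaranteed (on later ARM cores compilers typically use exclusive load/store instructions instead), and the names `lock_byte`, `try_lock` and `unlock` are made up for the example.

```c
#include <assert.h>
#include <stdatomic.h>

/* 0 = free, 1 = taken. A single byte, matching the swpb-style usage. */
static atomic_uchar lock_byte;

/* Swap 1 into the lock and look at what was there before.
   Returns 1 if we obtained the lock, 0 if it was already taken. */
int try_lock(void)
{
    return atomic_exchange(&lock_byte, 1) == 0;
}

/* Release the lock with a single store. */
void unlock(void)
{
    atomic_store(&lock_byte, 0);
}
```

Because the swap is one indivisible step, two contenders can never both see the old value 0.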

Vadim Borshchev

CBFalconer 21 years ago

I don't believe so. What may have changed is the prevalence of certain types of failings, i.e. there are more multi-processor systems about, and more systems operate under formal OSs, etc.

Thad Smith 21 years ago

As others have mentioned, the easiest way is usually to disable interrupts around the non-atomic variable access.

Another technique that I once used in a situation when I couldn't afford the increase in latency caused by disabling interrupts is to have the background routine check for an interrupt during the non-atomic read and perform the read again if an interrupt occurred:

interrupt:
    do stuff
    var = new value
    intp = 1
    return

background:
    do
        intp = 0
        vcopy = var
    while (intp == 1)
    use vcopy
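A C rendering of the retry loop above might look like the following sketch. The two-word `sample_t` type and the `sim_fire` switch are assumptions added for illustration: `sim_fire` exists only so the mid-read interrupt can be simulated on a host and would not be present on a target.

```c
#include <assert.h>
#include <stdint.h>

/* Two words, so a plain copy is not atomic on a 16-bit (or smaller) bus. */
typedef struct { uint16_t lo; uint16_t hi; } sample_t;

static volatile sample_t var;
static volatile uint8_t  intp;   /* set by the ISR on every update */

/* Simulation only: when nonzero, one update "interrupts" the next read. */
static int sim_fire;

/* ISR side: store the new value, then flag that an update happened. */
void sample_isr(uint16_t lo, uint16_t hi)
{
    var.lo = lo;
    var.hi = hi;
    intp = 1;
}

/* Background side: repeat the copy until no interrupt hit during it. */
sample_t read_sample(void)
{
    sample_t vcopy;
    do {
        intp = 0;
        vcopy.lo = var.lo;
        if (sim_fire) {                 /* simulated interrupt mid-read */
            sim_fire = 0;
            sample_isr(0xBBBB, 0xBBBB);
        }
        vcopy.hi = var.hi;
    } while (intp == 1);
    return vcopy;
}
```

The loop only terminates quickly if interrupts arrive less often than the copy takes, which is the usual situation for this technique.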

This is the same general technique that is often used for reading a multi-byte timer on the fly:

do
    hb = timer.msby
    lb = timer.lsby
while (timer.msby != hb)
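The timer read can be written in C roughly as below. `TIMER_MSB` and `TIMER_LSB` are hypothetical stand-ins for the two 8-bit halves of a free-running 16-bit hardware timer. On a real part they would be memory-mapped registers rather than variables.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two 8-bit halves of a 16-bit timer. */
static volatile uint8_t TIMER_MSB;
static volatile uint8_t TIMER_LSB;

/* Read both halves, retrying if the high byte changed underneath us. */
uint16_t read_timer16(void)
{
    uint8_t hb, lb;
    do {
        hb = TIMER_MSB;
        lb = TIMER_LSB;
    } while (TIMER_MSB != hb);   /* high byte moved: low byte may have rolled over */
    return (uint16_t)((hb << 8) | lb);
}
```

The check assumes the high byte cannot go all the way round between two reads, which holds for any realistic timer rate.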

Another technique that you might use, for the situation in which the interrupt is setting the data and the background is reading it, is to have two locations, one for realtime update, the other for reading:

dataStructure buf[2];

interrupt:
    buf[i] = data

background:
    do stuff
    i = 1-i   /* flip buffers */
    work with buf[1-i]

In this case, when the background flips the buffers (indicated here by an array subscript, but could be a boolean), it allows access to the latest value preceding the flip, while allowing interrupt update in the other buffer. I assume that setting i is atomic.
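The two-buffer scheme might be sketched in C as follows, assuming a single processor where the ISR always runs to completion before the background resumes. The `reading_t` type and the function names are illustrative, and `wr_idx` plays the role of `i` above.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { int16_t x, y, z; } reading_t;

static reading_t buf[2];

/* Index of the buffer the ISR writes to. One byte, so the store that
   flips it is a single access. Only the background ever writes it. */
static volatile uint8_t wr_idx;

/* ISR side: always writes to the buffer currently selected by wr_idx. */
void adc_isr(reading_t data)
{
    buf[wr_idx] = data;
}

/* Background side: flip the buffers, then read the one the ISR has left. */
reading_t latest(void)
{
    wr_idx = (uint8_t)(1u - wr_idx);   /* flip buffers */
    return buf[1u - wr_idx];           /* the ISR no longer touches this one */
}
```

If no interrupt arrived since the previous flip, the buffer returned holds an older value, so this suits data where "most recent before the flip" is good enough.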

Thad

Ian Bell 21 years ago

Now that is interesting. Would you say that although systems as a whole have become more complex, the complexity per processor is about the same? Or has that gone up too because of more powerful processors, or is the greater power simply being used to run more powerful algorithms?

Ian

Ian Bell

Paul E. Bennett 21 years ago

If I am reading the nature of your question in the way I think you mean then I was expecting better of you Ian. I would have thought you would know that two processes/tasks writing to the same variable are bound to get their collective knickers in a terrible twist. It would, surely, be better to use separate variables and combine them elsewhere, out of reach of the interrupt routine.

If this is part of a multi-processor system then you may have to sit down and give some careful consideration to a proper dialogue between processors. This may have to include some hardware handshaking as well.

********************************************************************
Paul E. Bennett
Forth based HIDECS Consultancy
Mob: +44 (0)7811-639972
Tel: +44 (0)1235-811095
Going Forth Safely .... EBA. http://www.electric-boat-association.org.uk/
********************************************************************

Ian Bell 21 years ago

snip

Thanks, Thad, for those useful ideas.

Ian

Ian Bell

Ian Bell 21 years ago

Glad you think so highly of me ;-) though I am not sure I deserve it.

Indeed I do know that and my question was what is the *current* thinking about how to address the problem.

You will have to explain how that solves the problem. *Any* code outside the interrupt runs the risk that it may be interrupted as it operates on the variable.

I was thinking more of single processor systems.

Ian

Ian Bell

Robert Scott 21 years ago

OK, here are some specific examples.

In a very primitive micro, an ISR generates a PWM signal based on an on-time and off-time that are calculated by a main program. There are several parameters that define the output signal, and if the main program had updated only part of them when an interrupt hit, then the ISR would generate a very inappropriate pulse width. For this particular application, I could have disabled interrupts during the time that the main program transferred the parameter set to the variables used by the ISR. But that could have delayed the ISR and thus added jitter to the output pulse. So what I did was to have the main program poll an interrupt count until it saw it change. The main program then knows that an interrupt has just happened, and therefore another interrupt is not going to happen for a known length of time. So now it is safe to transfer the parameters. Like most avoidance solutions, this one is highly application-dependent. It relied on the fact that interrupts never stop, that there is only one kind of interrupt in this system, and that the main program has nothing better to do than to wait around for an interrupt. But if all those things are true, then this is a solid solution.
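The "wait for an interrupt, then transfer" idea can be sketched like this. Everything named here (`pwm_params_t`, `irq_count`, `pwm_update`) is hypothetical, and `wait_hook()` exists only so that a host build can simulate the periodic interrupt. On a target the polling loop body would be empty. The sketch also assumes the copy is much shorter than the interval between interrupts.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint16_t on_ticks; uint16_t off_ticks; } pwm_params_t;

static volatile pwm_params_t isr_params = { 10, 90 };  /* read only by the ISR */
static volatile uint8_t      irq_count;                /* bumped by the ISR */

/* Periodic PWM interrupt: count it, then use isr_params for the next edge. */
void pwm_isr(void)
{
    irq_count++;
    /* ... program the next edge from isr_params here ... */
}

/* Simulation only: stands in for time passing while the main program polls. */
static void wait_hook(void)
{
    pwm_isr();
}

/* Main program: wait until an interrupt has just happened, then copy. */
void pwm_update(pwm_params_t next)
{
    uint8_t seen = irq_count;
    while (irq_count == seen)
        wait_hook();                 /* on a target: just spin */

    /* The next interrupt is now a known time away, so the ISR cannot
       observe a half-updated parameter set. */
    isr_params.on_ticks  = next.on_ticks;
    isr_params.off_ticks = next.off_ticks;
}
```

If the interrupt ever stops, `pwm_update()` spins forever, which is the dependence on "interrupts never stop" mentioned above.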

Another problem is that of reading a timer together with a software extension of that timer maintained by an ISR. I believe there was a thread in this forum recently about that topic. The difficulty is in grabbing a consistent high part and low part from the ISR-maintained extension and from the hardware timer. The heavy-handed approach would be to disable interrupts during the access, and even then the solution is not trivial. But there is no need to do that. Read the hardware low part, then read the software extension. If the hardware low part has the sign bit clear, then an overflow interrupt is not going to happen soon, so the subsequent read of the software extension is consistent with it. If that sign bit was set, then an interrupt may have happened, so read the hardware low part again. It will be consistent with the software extension.

This one has nothing to do with interrupts, but it does relate to shared data. If you have a system with cooperative multitasking (not preemptive), then implementing a shared resource is easy. Check an "in-use" flag. If the in-use flag is clear, then set it and take possession of the resource. After you are done with the resource, clear the in-use flag. This works because this task is guaranteed that no other task will check the in-use flag in between the time that this task checks it and when this task sets it. During the period of ownership of the resource, the task may give up control to the multitasking scheduler. If the multitasker is preemptive, then you have to rely on help from the OS and use a real semaphore. But many simple embedded systems use the cooperative model, so the simple in-use flag works.
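Under a cooperative scheduler the in-use flag needs no special instructions at all, as this small sketch shows. The resource (a UART) and the function names are made up for illustration. The code is only correct if no task switch or ISR can touch the flag between the test and the set.

```c
#include <assert.h>
#include <stdbool.h>

/* True while some task owns the (hypothetical) UART. */
static bool uart_in_use;

/* Returns true if the caller now owns the resource. Safe only under
   cooperative scheduling: there is no yield between check and set. */
bool uart_try_acquire(void)
{
    if (uart_in_use)
        return false;
    uart_in_use = true;
    return true;
}

/* Called by the owning task when it is finished with the resource. */
void uart_release(void)
{
    uart_in_use = false;
}
```

A task that fails to acquire the resource would normally yield to the scheduler and try again on its next turn.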

I suppose if I had to come up with some general principle of shared data, I would say it is this: The design of the total system should make it unambiguously clear which entity has the right to modify a variable. Which entity that is (a particular task, or an ISR) may change in the course of the running of the program, but there should never be any doubt as to who that entity is at any point in time. Several entities may have read-access, but only one should have write-access. Whatever means you use to guarantee this condition, be they systematic or ad hoc, the desired result can be achieved. What distinguishes embedded systems from general-purpose systems in this regard is that in embedded systems you often have more control over what is going on.

-Robert Scott Ypsilanti, Michigan

Jim Granville 21 years ago

The best model here would be a 'Software FIFO', for instances where atomic access is not supported or not possible. That trades a small added latency for a guarantee that the variable does not change during a read.
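One common way to build such a software FIFO is a single-producer, single-consumer ring buffer in which each index has exactly one writer. The sketch below assumes a single processor, byte-sized indices that are read and written in one access, and an ISR as the only producer. The size and names are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FIFO_SIZE 8u   /* power of two; holds FIFO_SIZE - 1 items */

static volatile uint8_t fifo[FIFO_SIZE];
static volatile uint8_t head;   /* written only by the producer (ISR) */
static volatile uint8_t tail;   /* written only by the consumer (main) */

/* Producer side, e.g. called from the ISR. Returns false if full. */
bool fifo_put(uint8_t v)
{
    uint8_t next = (uint8_t)((head + 1u) & (FIFO_SIZE - 1u));
    if (next == tail)
        return false;
    fifo[head] = v;
    head = next;        /* publish only after the data is stored */
    return true;
}

/* Consumer side, called from the main loop. Returns false if empty. */
bool fifo_get(uint8_t *out)
{
    if (tail == head)
        return false;
    *out = fifo[tail];
    tail = (uint8_t)((tail + 1u) & (FIFO_SIZE - 1u));
    return true;
}
```

Since neither side ever writes the other's index, no interrupt masking is needed. On multi-core parts additional memory barriers would be required.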

-jg
