Cortex-M: share an int between two tasks

P

pozz 6 years ago

I'm working on a Cortex-M4 MCU and using FreeRTOS.

One task:

uint32_t wait_period; while(1) { // make some things vTaskDelay(pdMS_TO_TICKS(wait_period)); }

The following function can be called from another task:

void set_waiting_period(uint32_t new_period) { wait_period = new_period; }

In this case, is it needed to protect the access of the shared variable wait_period with a mutex? I don't think, we are dealing with an integer variable that should be accessed (reading and writing) atomically.

Vote

D

David Brown 6 years ago

You don't need to use a mutex, as you won't have race conditions when accessing it (race conditions happen when at least one thread can be writing, and at least one thread reading, and the accesses are not atomic). However, you /do/ need to make the accesses volatile, or there will be no guarantee that the variable will be read or written in the spots where you have accessed it.

Vote

P

pozz 6 years ago

Ok, I got the point.

Let me explain just to be sure I got it.

If wait_period is not volatile, in the task endless loop, the compiler could make the assumption the variable doesn't ever change, so for example it could "cache" its value in a register. In this case, when set_waiting_period() is called from another task, the variable in RAM is changed, but the value in the register doesn't change, so the task will continue using the old value.

I think the access could be volatile only in the loop and not in set_waiting_period:

uint32_t wait_period; while(1) { // make some things volatile uint32_t *vwait_period = (volatile uint32_t *)&wait_period; vTaskDelay(pdMS_TO_TICKS(*vwait_period)); }

void set_waiting_period(uint32_t new_period) { wait_period = new_period; }

Vote

D

David Brown 6 years ago

Exactly right.

(The same can work when writing variables in a loop - the compiler knows that it has full control of any normal variables it is using, so there is no need to write out the variable to ram until later, if that would give more efficient code.)

Bad idea - get in the habit of making such writes volatile too. If the "set_waiting_period" function is compiled and linked separately, then the compiler will have to generate the write instruction before the return. But if it has access to the definition (because it is in the same file as the one using it, or you have link-time optimisation), and you write something like:

set_waiting_period(100); // more code set_waiting_period(1000);

then the compiler can skip the first write to "wait_period" if it knows "more code" does not use it.

Most problems with missing volatiles are seen from a failure to do volatile reads in a loop. The second most common failure (IME) is assuming that volatile accesses place an order on non-volatile accesses (they don't). But failure to use volatile on writes occurs too, and is often more subtle, with code working fine until you change something else that appears independent.

Vote

P

pozz 6 years ago

Il 13/03/2020 15:38, pozz ha scritto:

Ok the example was extremely simple. If I have something a little more complex:

volatile uint32_t wait_period; while(1) { // make some things if (wait_period > 0) { vTaskDelay(pdMS_TO_TICKS(wait_period)); } else { vTaskSuspend(myTask); } } void set_waiting_period(uint32_t new_period) { wait_period = new_period; if (wait_period > 0) { vTaskResume(myTask); } }

Vote

D

David Brown 6 years ago

When you have more complex examples, you start needing synchronisation of some sort (it can be a disabled interrupt block, a lock like a mutex, or message passing, or something similar). Sometimes it can be handled with lock-free methods.

Generally, if you have one reader and one writer, volatile will be enough - but you have to think about exactly how you want to arrange the accesses.

I find one of the best ways to get good software design is to think about it like hardware. You don't have one hardware signal controlled by various different signals scattered around the design - if you have an LED that can be controlled by different signals, you put these signals into a multiplexer or logic gate (or wired or / wired and). And you collect them all together in one place in the design to do that.

The same applies to software. Don't change a variable from a dozen different places around the program. Don't access the same resource from different places. Put in queues, multiplexers, etc., so that you have control.

In a case like this, the idea that you have your "myTask" being suspended from one thread and woken up by a different thread is your flaw. Fix the design - the suspension and resumption of "myTask" should happen from /one/ place in /one/ thread - and your problems of synchronisation go away.

Vote

P

pozz 6 years ago

Just to think of a more complex example. If you share a data structure with some members and they must be consistent (all members must be changed atomically avoiding reading only a part changed), volatile is not enough. You need mutex, right?

I see. I think the safest approach is async messages sent in a queue.

I have two tasks. Task A can be suspended and resumed. Because Task A can't resume itself from suspended state, only Task B should suspend and resume Task A.

// Task A while(1) { // make some things vTaskDelay(pdMS_TO_TICKS(wait_period)); }

However I want to suspend Task A, if needed, after "make some things" or during vTaskDelay(). I don't want to suspend Task A in the middle of "make some things".

I think I need a semaphore:

// Task A while(1) { xSemaphoreTake(...); // make some things xSemaphoreGive(...); vTaskDelay(pdMS_TO_TICKS(wait_period)); }

// Task B if (taskA_must_be_suspended) { xSemaphoreTake(...); vTaskSuspend(taskA); xSemaphoreGive(...); } else { vTaskResume(taskA); }

Or at least a simple boolean, but volatile, flag:

volatile bool taskA_blocked;

// Task A while(1) { // make some things taskA_blocked = true; vTaskDelay(pdMS_TO_TICKS(wait_period)); taskA_blocked = false; }

// Task B if (taskA_must_be_suspended) { while(!taskA_blocked) ; vTaskSuspend(taskA); } else { vTaskResume(taskA); }

Vote

D

David Brown 6 years ago

You need something to make the accesses atomic, so that each reader or writer sees a consistent picture. (This applies also to data that is bigger than the processor can read or write at one time, and for read-modify-write accesses.) Mutexes are one way to make atomic access, but they are not the only way. With C11/C++11, atomic access is part of the language - but support in libraries can be poor or even wrong (a common implementation I have seen could lead to deadlock). Other locks and synchronisations will work, as will a simple interrupt-disable block. You can also use indicator flags that are atomic to mark when a block is being updated, or have more than one copy of the block and use an atomic pointer to the current block. Many of these techniques can be simpler and more efficient than mutexes, but are only correct if certain assumptions hold (like there only being one writer thread).

If you can arrange for your data to be transferred between threads using message queues of some sort (most RTOS's support a few different types), then this is often the easiest and cleanest method of communicating between threads.

But you are right that volatile is not enough!

Yes, it is often good. It is often easier to see that the code is correct when you use message passing rather than mutexes.

If you are using "suspend" and "resume" directly, you are almost certainly doing it wrong - no matter which threads are doing what.

Suspension of a task should be for a reason. If you are waiting for some time, use vTaskDelay (or similar functions). If you are waiting for data from another thread, wait for that - receiving from a queue, waiting on a semaphore, a condition variable, etc.

Resumption of a task should be for a reason - send it a message in a queue, set the semaphore or condition variable, etc.

"suspend" and "resume" are low-level "beat it with a hammer" approaches.

Vote

A

antispam 6 years ago

To me the above looks like very natural anti-pattern. Namely, code testing 'wait_period' _and_ execution of 'vTaskSuspend' logically form a single ciritical section. AFAICS any placement of locks above leads to races.

My solution to the above is have _whole thing_ in scheduler, so that testing flags and switching process is atomic. In your case messages look like step to solution, but you still need to think about messaging protocol...

Waldek Hebisch

Vote

H

Hans-Bernhard BrÃ¶ker 6 years ago

Am 18.03.2020 um 09:09 schrieb pozz:

You could do that, but you absolutely should not.

You make the _variable_ volatile, and that's that.

Just looking at that line should tell you that this cannot possibly be the right way to do anything.

The primary rule about explicitly casting pointers in C code is: don't do it. All the evidently safe casts are already performed implicitly by the compiler. The perceived need for less safe ones almost invariably results from an incorrect design decision a few steps earlier.

Vote

D

David Brown 6 years ago

I disagree. I think it is vital to remember that it is /accesses/ that are volatile, not variables. Declaring a variable volatile simply means that all accesses to it must be volatile (and don't need anything extra to make them volatile).

Often, you /do/ want all accesses to such a variable to be volatile, and thus declaring it as volatile is useful. And you can't make correct code wrong by having an extra volatile - at worst, you might loose a little efficiency.

True - the cast should be hidden inside a macro like "ACCESS_ONCE".

Not entirely true.

Certainly quite a few casts that people use are not safe. In particular, it is, in general, /not/ safe to access objects as a different type just using a cast. But there are exceptions - and adding a "volatile" in a cast is one of them. Casts of pointer types should definitely be treated with caution, but that does not mean they are always wrong.

(The C standard has been unclear about the meaning of using a cast-to-volatile like this. However, all known C implementations, and all members of the C standards committee, have been happy that a cast-to-volatile-pointer does what you expect, at least for simple read or write accesses. And this is now codified in C18.)

Vote

U

upsidedown 6 years ago

The concept of global variable or global struct "owner" is hardly in this respect. Only let the "owner" update the volatile global variable and let one or more other tasks only read this variable.

If multiple writers are required, send an update request (by e.g. some queue mechanism) to the owner, which will make the update.

This is especially important for global structures, it is important that only the owner updates fields in the struct.

A version number field should be added to the struct, which is incremented each time the owner has updated one or more fields of the struct. The readers should first read the version number, copy all required fields to local variables and then reread the version number. If the version number is still the same, use the copied data. If the version has been incremented, copy again the fields to local variables, until the version number doesn't change. Of course, watch up for pathological cases :-).

If the RTOS supports messages as big as the struct, just ask the owner to send the most recent version of the struct.

Some OS support also waiting data from an other thread in which also a timeout parameter can be used. If you wait for an event that will never occur, the timeout parameter can be used to make a fixed delay.

Yes, absolutely.

"Suspend" can be usable if done e.g. with low power Wait For Interrupt instruction. When an interrupt occurs, the interrupt will be served and then the scheduler is executed to find the first high priority task in READY state and it starts executing i.e. WaitForSignificantEvent. If the scheduler starts to run the original suspender, it should check if it should wake up or start a new "Suspend" state.

Vote

D

David Brown 6 years ago

(typo - "handy", I assume)

Yes. But be careful that this alone is not enough if the variable is too big to be written atomically by the processor.

The "owner" here refers primarily to the owning thread or context, but the "owning" module, function, class (for C++), etc., is also a useful concept for good structure. And the ideas apply to all data, not just "global" data.

Yes, that is a good way to do it.

That is certainly one way to handle atomic reads. It is by no means the only way, but it can certainly be a good way for data that is rarely changed. However, you must be careful to use volatile reads for the struct (the local copy does not have to be volatile), or use memory barriers, to ensure that the reading really takes place in the correct manner.

For smaller data - such as a 64-bit tick counter - it is sufficient to simply read the whole data repeatedly until you get the same value in two successive reads. That avoids the need of a separate version field. (Don't use that for anything that may suffer from an ABA problem.)

Often good, yes - I do that often. When sending more complex data involving pointers, be very careful about the lifetime of the data pointed at, and who (if anyone) is responsible for clearing it up.

True, but usually the OS will have a specific call for a timed delay which will be clearer in the code.

Vote

P

pozz 6 years ago

I didn't know that suspend/resume mechanism shouldn't be used in normal cases like mine.

I simply wanted a management taskB to stop and restart sensor sampling done by taskA. taskA uses vTaskDelay() for sampling at precise timings.

Suspend/resume appeared to me the most natural way to implement this stop/restart feature.

Thank you for your suggestions. I think I will use a queue and events pushed from taskB and read from taskA.

After many years using cooperative schedulers, it appears to me incredible how difficult, error prone and resource consuming is to implement even simple mechanisms like this one with a preemptive scheduler.

There are many subtle things that could happen in a preemptive scenario and you must open your eyes very well in each part of code to avoid race conditions that will be manifested only to the user and not during testing.

Is there a way to test for the presence of dangerous race conditions? Or is this demanded to design only?

Vote

P

pozz 6 years ago

Il 18/03/2020 19:37, snipped-for-privacy@downunder.com ha scritto:

Interesting. Could you explain?

You write "messages" so you aren't thinking of a simple function like this, are you?

// taskB calls get_struct() to retrieve mys // The owner of mys is taskA void get_struct(mystruct *s) { *s = mys; }

Indeed this simple function suffers from the problems of multitasking and non-atomic accesses (considering the whole data structure).

I think you mean using two mono-directional queues: one for taskB->taskA communication, one for taskA->taskB communication.

--- taskA.c --- // taskA process events on its queue qA static queue_t qA; static mystruct mys; // taskA is the owner of mys while(1) { .. if (waitForMessage(qA, &msg) == pdTRUE) { if (msg.type == GET_STRUCT) { push_message(msg.qdst, mys); // mys is *copied* in the message } } }

// Another task (taskB) could request mys void request_struct(queue_t qdst) { msg.type = GET_STRUCT; msg.qdst = qdst; push_message(&msg); }

--- taskB.c --- // taskB should do something similar when needs mys static queue_t qAB; ... request_struct(qAB); waitForMessage(qAB, &msg, pdPORT_MAX_DELAY); mys = msg.data; // mys in taskB is a thread-safe copy of mys of taskA

Is this what you mean? I admit, the code is convoluted. taskA should be coded without long delays otherwise taskB could not receive mys in time. Or taskA should block on qA too *always*.

Vote

D

David Brown 6 years ago

That is very roughly the idea. But usually you can structure the code better than this. What you are describing is a more general "remote procedure call" - the task that owns the data has to receive commands, carry them out, and pass the results back. In an RTOS system in an embeded systems, you have full control of all parts of the system, and thus you can make it a lot more efficient. When writing TaskA you know that TaskB needs to see changes to "mys" - so you send it over as soon as the change is made. When a task wants to ask another task to do something, it sends the request - it doesn't usually need to wait for an answer, as there is no risk of the communication failing.

As for the timings, that's a matter of getting thread priorities right.

Vote

D

David Brown 6 years ago

You want vTaskDelayUntil, not vTaskDelay.

It would be possible, but it does not look good to me. I'd be looking at an atomic flag (volatile bool, pre-C11) to indicate task A should be shutting down after the current sample, then wait on a semaphore. But I'd need a clearer picture of the whole system and interaction between threads.

That can usually work, but might not be the most efficient. I don't know if efficiency is important here. "Direct to task notifications", "Event groups", and "Message buffers" are topics you might want to read about on the FreeRTOS pages. In particular, communication that only goes from one known task to another known task can be a lot more efficient than general queues.

Multi-threading /is/ more difficult, and you need to understand a lot more. It is very, very easy to make something that is completely wrong, and hangs at unexpected times, and is difficult to debug. On the other hand, it is also easier to write the parts of the system in separate blocks, living their own lives without having layers of state machines or always thinking about breaking work up into little bits. The keep point is to be very clear about the communication between threads and any shared resources - keep this all to a minimum. As much as possible, make your threads self-contained - a C file contains only one thread, and all functions it calls. Any common functions in common files should be re-entrant.

An RTOS has a lot of benefits, but it is not a free lunch.

You can never use testing to show the lack of bugs - only the presence of bugs. Race conditions often only trigger in rare timing coincidences, so you can't rely on tests. You have to design the code so that race conditions don't occur.

Vote

P

pozz 6 years ago

Oh yes.

This is what I thought, but I'm not able to implement it without any race conditions.

// TaskA static volatile bool stopped; static volatile unsigned int delay_ms;

stopped = false; while(1) { // get sample and process if (stopped) { (1) vTaskDelay(portMAX_DELAY); (2) } else { vTaskDelay(pdMS_TO_TICKS(delay_ms)); } }

void set_delay(unsigned int new_delay) { // This is TaskB if (new_delay == 0) { stopped = true; } else { stopped = false; delay_ms = new_delay; vTaskDelayAbort(taskA); } }

This code has some race conditions if both tasks can be interrupted in any point by the other. For example, if stopped is true and taskB enters immediately after (1) and before (2) with set_delay(3), I will have taskA blocked forever!!

Could you suggest a solution with a simple volatile bool flag?

I'm starting understanding this.

Vote

N

Niklas Holsti 6 years ago

[snip long discussion]

[snip]

It's not so hard once you absorb a set of high-level solution patterns and apply them consistently and with a high-level design approach.

I realize that this response may not be helpful, but here is how it can be done in Ada, as an example of a high-level approach.

The SW will have a subsystem that does "sampling", and you want some other "managing" subsystem to be able to set the sampling period and to start/stop sampling. The sampling subsystem interface (API) is then:

package Sampling is procedure Set_Period (To : in Duration); procedure Start; procedure Stop; end Sampling;

That Ada text is put in its own file (by current convention named sampling.ads).

The body (implementation) of the sampling subsystem is put in its own file (conventionally named sampling.adb) and will start with

package body Sampling is

The rest of this post will add the parts of the package body one by one.

The sampling package will have a concurrent task, called Sampler for example, and declared as:

task Sampler;

The task body is shown later below.

The data to be communicated to the Sampler task will be the sampling period and whether sampling should be active. These shared data are placed in a "protected object" to ensure mutually exclusive access. The actions on these shared data are the following:

- Actions from the "manager" (exterior to the Sampling package):

-- set the sampling period -- start sampling -- stop sampling.

- Actions from the Sampler task:

-- get the current sampling period-

Moreover, the Sampler task should be suspended when sampling is stopped, which is convenient to implement with an Ada "entry" operation that suspends the caller if its "guard" expression is false. The declaration of the protected object is then:

protected Sampling_Control is procedure Set_Period (To : in Duration); procedure Start; procedure Stop; entry Get_Period (Giving : out Duration); private Running : Boolean := False; Period : Duration := 1.0; end Sampling_Control;

The shared but protected data are thus the Running flag, which will be the guard for the Get_Period entry, and the current sampling Period. Note that sampling is initially stopped (Running is initially False) and the initial sampling period is one second.

The implementation is rather trivial:

protected body Sampling_Control is

procedure Set_Period (To : in Duration) is begin Period := To; end Set_Period;

procedure Start is begin Running := True; end Start;

procedure Stop is begin Running := False; end Stop;

entry Get_Period (Giving : out Duration) when Running is begin Giving := Period; end Get_Period;

end Sampling_Control;

Note the guard "when Running" on the Get_Period entry. This means that a task that calls Get_Period when Running is False is suspended, at the point of the call, until Running becomes True. If Running is True, the call succeeds at once.

Note that Set_Period does not automatically start sampling. If that is desired, add the statement "Running := True" in Set_Period.

The Sampler task body can now be implemented simply:

task body Sampler is Period : Duration; begin loop Sampling_Control.Get_Period (Giving => Period); -- Take the sample, etc. delay Period; end loop; end Sampler;

(This code is a bit sloppy, because the use of a "relative" delay statement brings lag and jitter into sample timing. It is better to use the "absolute" delay statement ("delay until"), but this complication is not relevant to the task interactions so I skipped it.)

The public (API) operations of the Sampling package just call the corresponding operations of Sampling_Control (this circumlocation can of course be avoided by making Sampling_Control itself public, but that would harm encapsulation and increase inter-module coupling):

procedure Set_Period (To : in Duration) is begin Sampling_Control.Set_Period (To); end Set_Period;

procedure Start is begin Sampling_Control.Start; end Start;

procedure Stop is begin Sampling_Control.Stop; end Stop;

The implementation of the Sampling subsystem is now complete, and the file sampling.adb ends with:

end Sampling;

The "manager" part of the SW accesses the Sampling subsystem by saying

with Sampling;

and can then use the Sampling interface, for example as

Sampling.Set_Period (To => 0.1); -- 10 Hz Sampling.Start; ... Sampling.Set_Period (To => 2.5); ... Sampling.Stop;

and so on.

To map this Ada code into some C+RTOS combination, one should use a mutex to make the executions of the protected operations (Sampling_Control.Set_Period, .Start, .Stop, .Get_Period) mutually exclusive, and use some "condition" variable or semaphore to implement the guard (caller suspension) of Sampling_Control.Get_Period.

That last point (caller suspension) has some subtleties that depend on the actual set of RTOS primitives available. But I recommend a simpler solution: instead of suspending/resuming the sampling task, just let it run normally at the sampling period, but make the actual sampling (and sample processing, I assume) conditional on a "Running" flag (by returning this flag, too, from Get_Period). Usually the processor effort saved by actually suspending tasks that normally run periodically is not significant.

Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ .

Vote

P

pozz 6 years ago

Ok, where do I study those "high-level solution patterns"? Because I didn't find good docs that explain non-trivial real examples.

Unfortunatley I don't know Ada.

Yes, I didn't imagine that controlling a task from another task was so difficult with a full-features RTOS like FreeRTOS. So your suggestion to let the sampling task always run even when it should be suspended is good. It's a pity, because I have many sensors, but only a few of them are active at any time.

Vote

Cortex-M: share an int between two tasks

Join the Discussion

Didn't find your answer?