C++ threads versus PThreads for embedded Linux on ARM micro

We're starting an embedded Linux C++ project with an ARM micro and using GCC V7.  Can anyone suggest pros and cons of using C++ threads versus PThreads (POSIX threads)?


Re: C++ threads versus PThreads for embedded Linux on ARM micro
On 20/07/18 13:01, snipped-for-privacy@gmail.com wrote:

C++ threads are always a wrapper around an underlying library.  So if  
you are using C++ on Linux, the C++ threads /are/ pthreads.  These are  
the points I can think of for preferring C++ threads:

+ You have a nice class/template library with C++ threads, instead of a  
C function interface.

+ You have RAII classes for locks and other synchronisation objects.

+ You have consistency with other C++ thread systems.

+ Your compiler may understand that your code is threaded.

- You need at least C++11 (but that has huge advantages anyway, compared  
to older C++).

- It is marginally more fiddly if you need the underlying thread details  
for features not supported by the C++ thread library.
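
As a rough illustration of the RAII and "underlying thread details" points above, here is a minimal sketch. The names `Counter`, `count_in_parallel` and `query_policy_via_native_handle` are made up for this example, and it assumes GCC's libstdc++ on Linux, where `std::thread::native_handle()` hands back a `pthread_t`.

```cpp
#include <cassert>
#include <future>
#include <mutex>
#include <thread>
#include <vector>
#include <pthread.h>
#include <sched.h>

// Hypothetical shared counter. std::lock_guard releases the mutex when it
// goes out of scope, so no explicit unlock call is needed.
class Counter {
public:
    void add(long n) {
        std::lock_guard<std::mutex> lock(mutex_);
        value_ += n;
    }
    long value() {
        std::lock_guard<std::mutex> lock(mutex_);
        return value_;
    }
private:
    std::mutex mutex_;
    long value_ = 0;
};

// Starts `threads` workers that each increment the counter
// `increments_per_thread` times, then returns the total.
long count_in_parallel(int threads, int increments_per_thread) {
    assert(threads > 0 && increments_per_thread >= 0);
    Counter counter;
    std::vector<std::thread> workers;
    for (int i = 0; i < threads; ++i) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int k = 0; k < increments_per_thread; ++k) counter.add(1);
        });
    }
    for (auto& w : workers) w.join();
    return counter.value();
}

// The "more fiddly" case: reaching a pthread-only feature through
// native_handle(). Returns the pthread error code (0 on success).
int query_policy_via_native_handle(int& policy_out) {
    std::promise<void> release;
    std::future<void> gate = release.get_future();
    std::thread worker([&gate] { gate.wait(); });  // kept alive until released

    sched_param param{};
    pthread_t handle = worker.native_handle();     // assumption: a pthread_t here
    int rc = pthread_getschedparam(handle, &policy_out, &param);

    release.set_value();
    worker.join();
    return rc;
}
```

The same handle could be passed to other `pthread_*` calls that have no C++ equivalent, such as setting a real-time scheduling policy, which usually needs extra privileges.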


Re: C++ threads versus PThreads for embedded Linux on ARM micro
On Saturday, July 21, 2018 at 6:19:31 AM UTC+12, David Brown wrote:

That's great, thanks.

Re: C++ threads versus PThreads for embedded Linux on ARM micro
snipped-for-privacy@gmail.com writes:

Also, ask yourself if you really need threads in the first place.
Depending on what you're doing, you may be better off with multiple
processes.  That gets rid of a lot of lock and race hazards, and if the
processes can communicate through sockets, that improves scalability by
making it easier for you to distribute your program across multiple
machines if you run out of cpu cores on your original machine.

Re: C++ threads versus PThreads for embedded Linux on ARM micro
On Saturday, July 21, 2018 at 1:28:02 PM UTC+12, Paul Rubin wrote:

Thanks for the suggestion.  The micro is an ARM9 LPC3250 SOM (we're forced
to use this at the moment) which I believe is single core (it's hard to
find out for some reason) but it could easily change in future.  Based on
a previous project, race conditions and deadlocks are a major headache so
I'm hoping the core data will be written to by one thread only, maybe with
lock-free queues.  The CPU data cache is 32KB and it's probably "write
through".  We would have to do some performance tests to see if
multi-processes and sockets is viable.

Re: C++ threads versus PThreads for embedded Linux on ARM micro
On 21/07/18 08:58, snipped-for-privacy@gmail.com wrote:

There are certainly tasks that are better handled as multiple processes  
rather than multiple threads.  (But note that it is not an either/or  
choice - often the best solution uses both.)

Quoted text here. Click to load it

No, it does not - it merely changes them.  If your separate threads of  
execution need to synchronise, communicate, or agree about shared  
resources, then there is no theoretical difference about the types of  
hazards, races, or other such problems if you use multiple threads or  
multiple processes.  The details change, and the types of
synchronisation objects used can change, but they do not go away.
Some may be handled by the OS rather than the application, however - for
example, a pipe between processes will let you communicate without worrying
about locks for the underlying shared data structure, at the cost of
being a lot less efficient than shared memory in threads.
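
To make the pipe example concrete, here is a minimal sketch in which a child process sends a short message to its parent with no application-level lock. The function name `message_from_child` and the 512-byte limit are choices made for this illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that writes `msg` into a pipe; the parent reads it back.
// Returns the text received by the parent, or an empty string on failure.
std::string message_from_child(const std::string& msg) {
    // Keep the sketch to small messages so a single write() call is enough.
    assert(msg.size() <= 512);

    int fds[2];                        // fds[0] = read end, fds[1] = write end
    if (pipe(fds) != 0) return std::string();

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return std::string();
    }
    if (pid == 0) {                    // child: the writer
        close(fds[0]);
        ssize_t written = write(fds[1], msg.data(), msg.size());
        close(fds[1]);
        _exit(written == static_cast<ssize_t>(msg.size()) ? 0 : 1);
    }

    close(fds[1]);                     // parent: the reader
    std::string received;
    char buf[128];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        received.append(buf, static_cast<std::size_t>(n));
    close(fds[0]);
    waitpid(pid, nullptr, 0);          // reap the child
    return received;
}
```

The kernel serialises access to the pipe buffer, which is the "handled by the OS" part; the cost is a copy into and out of the kernel for every message.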

Multiple processes have higher resource costs, and they make it a lot  
harder to use tools such as "-fsanitize=thread" to find problems.  On  
the other hand, they make it easier to break the problem down into  
separate tasks that are handled independently and tested independently.  
That helps if you have different developers - or even different  
programming languages.

Quoted text here. Click to load it

True.

This can also be useful during development when you might have some of  
the bits running on your target system, and other bits running on your  
host computer (perhaps under a debugger).

Quoted text here. Click to load it


That doesn't matter for the choice of threads, processes or both.

Quoted text here. Click to load it

Multiple processes are slower than multiple threads, and sockets are  
much slower than in-process queues.  But the sockets are more flexible.  
You might find you want an abstraction that can use either method as a  
backend, and change during different stages of development.
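
One possible shape for such an abstraction is sketched below. The `Channel` interface and both backends are hypothetical names for this illustration; the socket backend uses a local `socketpair` so the example stays in one process, whereas a real system would place the two ends in different processes or machines.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

constexpr std::size_t kMaxMessage = 1024;   // arbitrary limit for this sketch

// Hypothetical interface that application code would program against.
class Channel {
public:
    virtual ~Channel() = default;
    virtual bool send(const std::string& msg) = 0;
    virtual bool try_receive(std::string& out) = 0;  // false if nothing is waiting
};

// Backend 1: in-process queue protected by a mutex.
class QueueChannel : public Channel {
public:
    bool send(const std::string& msg) override {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push_back(msg);
        return true;
    }
    bool try_receive(std::string& out) override {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.empty()) return false;
        out = queue_.front();
        queue_.pop_front();
        return true;
    }
private:
    std::mutex mutex_;
    std::deque<std::string> queue_;
};

// Backend 2: datagram socket pair, one datagram per message.
class SocketChannel : public Channel {
public:
    SocketChannel() {
        if (socketpair(AF_UNIX, SOCK_DGRAM, 0, fds_) != 0) fds_[0] = fds_[1] = -1;
    }
    ~SocketChannel() override {
        if (fds_[0] >= 0) close(fds_[0]);
        if (fds_[1] >= 0) close(fds_[1]);
    }
    bool send(const std::string& msg) override {
        assert(msg.size() <= kMaxMessage);
        ssize_t n = ::send(fds_[0], msg.data(), msg.size(), 0);
        return n == static_cast<ssize_t>(msg.size());
    }
    bool try_receive(std::string& out) override {
        char buf[kMaxMessage];
        ssize_t n = ::recv(fds_[1], buf, sizeof buf, MSG_DONTWAIT);
        if (n < 0) return false;       // nothing waiting, or an error
        out.assign(buf, static_cast<std::size_t>(n));
        return true;
    }
private:
    int fds_[2];
};

// Code written against Channel does not care which backend it gets.
bool round_trip(Channel& channel, const std::string& msg) {
    std::string received;
    return channel.send(msg) && channel.try_receive(received) && received == msg;
}
```

Swapping `QueueChannel` for `SocketChannel` changes the cost per message but not the calling code, which is what makes it practical to measure both on the target.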




Re: C++ threads versus PThreads for embedded Linux on ARM micro
You can also communicate among processes through shared memory (e.g. mmap).

To look at it another way, processes require explicit sharing, threads share
implicitly.

On an embedded system, the heavier cost of process switching may be
important.
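
A minimal sketch of the explicit-sharing point, assuming Linux (`MAP_ANONYMOUS`): only the mapped integer is visible to both processes, while an ordinary variable stays private to each. The names `SharingResult` and `share_with_child` are invented for this example.

```cpp
#include <cassert>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct SharingResult {
    bool ok;             // false if mmap or fork failed
    int shared_value;    // what the parent sees in the shared mapping
    int private_value;   // what the parent sees in its ordinary variable
};

// The child writes `value` both to a shared mapping and to an ordinary
// variable; only the former is visible to the parent afterwards.
SharingResult share_with_child(int value) {
    assert(value != 0);  // 0 is the "untouched" marker in this sketch

    void* mem = mmap(nullptr, sizeof(int), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return SharingResult{false, 0, 0};
    int* shared = static_cast<int*>(mem);
    *shared = 0;
    int private_value = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(mem, sizeof(int));
        return SharingResult{false, 0, 0};
    }
    if (pid == 0) {                // child
        *shared = value;           // lands in memory both processes can see
        private_value = value;     // changes only the child's own copy
        _exit(private_value == value ? 0 : 1);
    }

    waitpid(pid, nullptr, 0);      // the child's exit is the only synchronisation
    SharingResult result{true, *shared, private_value};
    munmap(mem, sizeof(int));
    return result;
}
```

Anything beyond this one-shot hand-over needs its own synchronisation, for example a process-shared mutex or a semaphore placed in the mapping.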


Re: C++ threads versus PThreads for embedded Linux on ARM micro
On 21.7.18 09:58, snipped-for-privacy@gmail.com wrote:



For the number of cores on your system, if you have Linux running on the
target, have a look at /proc/cpuinfo:

tauno@pi2:~ $ cat /proc/cpuinfo
processor    : 0
model name    : ARMv7 Processor rev 5 (v7l)
BogoMIPS    : 38.40
Features    : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt  
vfpd32 lpae evtstrm
CPU implementer    : 0x41
CPU architecture: 7
CPU variant    : 0x0
CPU part    : 0xc07
CPU revision    : 5

processor    : 1
model name    : ARMv7 Processor rev 5 (v7l)
BogoMIPS    : 38.40
Features    : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt  
vfpd32 lpae evtstrm
CPU implementer    : 0x41
CPU architecture: 7
CPU variant    : 0x0
CPU part    : 0xc07
CPU revision    : 5

processor    : 2
model name    : ARMv7 Processor rev 5 (v7l)
BogoMIPS    : 38.40
Features    : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt  
vfpd32 lpae evtstrm
CPU implementer    : 0x41
CPU architecture: 7
CPU variant    : 0x0
CPU part    : 0xc07
CPU revision    : 5

processor    : 3
model name    : ARMv7 Processor rev 5 (v7l)
BogoMIPS    : 38.40
Features    : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt  
vfpd32 lpae evtstrm
CPU implementer    : 0x41
CPU architecture: 7
CPU variant    : 0x0
CPU part    : 0xc07
CPU revision    : 5

Hardware    : BCM2835
Revision    : a01041
Serial        : 0000000064d34ba1

---

The above is from a Raspberry Pi 2.
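
The same check can be done from a program. The sketch below counts the lines that start with "processor" in the format shown above; the function names are hypothetical, and some older ARM kernels print a differently capitalised "Processor" line instead, so treat the result as a hint. `std::thread::hardware_concurrency()` is the portable alternative and may return 0 when the count is unknown.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>
#include <thread>

// Counts the lines of `in` that begin with `prefix`.
unsigned count_lines_with_prefix(std::istream& in, const std::string& prefix) {
    assert(!prefix.empty());
    unsigned count = 0;
    std::string line;
    while (std::getline(in, line)) {
        if (line.compare(0, prefix.size(), prefix) == 0) ++count;
    }
    return count;
}

// Number of "processor" entries in /proc/cpuinfo, or 0 if it cannot be read.
unsigned cores_from_proc_cpuinfo() {
    std::ifstream in("/proc/cpuinfo");
    if (!in) return 0;
    return count_lines_with_prefix(in, "processor");
}

// Falls back to the standard library when /proc/cpuinfo gives no answer.
unsigned detected_core_count() {
    unsigned n = cores_from_proc_cpuinfo();
    if (n == 0) n = std::thread::hardware_concurrency();  // may also be 0
    return n;
}
```

For the Raspberry Pi 2 listing above, `cores_from_proc_cpuinfo()` would be expected to report four entries.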

--  

-TV


Re: C++ threads versus PThreads for embedded Linux on ARM micro
On 07/20/2018 11:58 PM, snipped-for-privacy@gmail.com wrote:

3250 is single-core.  Happens to be a part we use a lot around here,  
although we always go bare-metal rather than running Linux.  Cache is  
programmable through the page-table as to whether it's write-through or not.

The reason you're having trouble determining much of this information is  
that NXP bought large chunks of that chip wholesale from ARM without  
anyone there actually understanding it.  So the NXP documentation is  
spotty and occasionally wrong (let me tell you of our I2C-based woes).

There's a document available directly from ARM, ARM DDI 0198E, that is  
specifically the ARM926EJ-S Technical Reference Manual.  Getting into  
the details on the 3250 is nearly impossible without it.

--  
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order.  See above to fix.

Re: C++ threads versus PThreads for embedded Linux on ARM micro
snipped-for-privacy@gmail.com wrote:

I'd suggest thinking about a design for how you'd measure which works
in context for you. If I run pthreads on a big Linux machine it'll be  
different from running them in a VM. Similarly, it'll be different on
a RasPi 3 sized ARM computer.

--  
Les Cargill

Re: C++ threads versus PThreads for embedded Linux on ARM micro
Traditional threads, whichever way you package them (as C++ threads,
p-threads or any other thread library), typically correspond to the
"shared-state concurrency and blocking" approach. This approach is known
to be problematic, and many experts in concurrent programming recommend
to drastically limit both sharing and blocking according to the following
three best practices:

1. Keep data isolated and bound to threads. Threads should hide
(encapsulate) their private data and other resources, and not share them
with the rest of the system.

2. Communicate among threads asynchronously via messages (event objects).
Using asynchronous events keeps the threads running truly independently,
without any further blocking on each other.

3. Threads should spend their lifetime responding to incoming events, so
their mainline should consist of an event-loop that handles events one at
a time (to completion), thus avoiding any concurrency hazards within a
thread itself.

The set of these best practices is collectively known as the Active Object
design pattern (a.k.a. Actor). While this pattern can be applied manually
on top of traditional threads, a better way is to use an Active Object
framework.

The main difference is that when you use "naked" threads, you write the
main body of the application (such as the thread routines for all your
tasks) and you call various thread-library services (e.g., a semaphore or
a time delay). When you use a framework, you reuse the overall
architecture and write the code that it calls. This leads to inversion of
control, which allows the framework to automatically enforce the best
practices of concurrent programming. In contrast, "naked" threads let you
do anything and offer no help or automation for the best practices.

Re: C++ threads versus PThreads for embedded Linux on ARM micro
Thanks.  What is an "event object"?  What is the best way to pass data
asynchronously using a queue, on Linux?  I've read that lock-free data
structures are easy to get wrong and best avoided and that the C++ thread
library doesn't have any lock free data structures - mainly because
there's too many variations to have a generalized data structure.

Can we use the "libcds" library and be confident that it will work
correctly?

https://github.com/khizmax/libcds


On Saturday, July 28, 2018 at 3:52:02 AM UTC+12, StateMachineCOM wrote:
> Traditional threads, whichever way you package them (as C++ threads,
> p-threads or any other thread library), typically correspond to the
> "shared-state concurrency and blocking" approach. This approach is
> known to be problematic, and many experts in concurrent programming
> recommend to drastically limit both sharing and blocking according to
> the following three best practices:
> [snip]

Re: C++ threads versus PThreads for embedded Linux on ARM micro
> What is an "event object"?

Just a message that you pass from one thread to another.

> What is the best way to pass data asynchronously using a queue, on Linux?

I'd probably just use std::deque with a lock.  
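
A minimal sketch of that suggestion: a `std::deque` guarded by a mutex, with a condition variable so the consumer can sleep while the queue is empty. `LockedQueue` and `sum_through_queue` are names made up for this example.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>

template <typename T>
class LockedQueue {
public:
    void push(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push_back(std::move(item));
        }
        not_empty_.notify_one();   // wake one waiting consumer, if any
    }
    // Blocks until an item is available, then removes and returns it.
    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        T item = std::move(items_.front());
        items_.pop_front();
        return item;
    }
private:
    std::mutex mutex_;
    std::condition_variable not_empty_;
    std::deque<T> items_;
};

// One producer (the caller) and one consumer thread exchange 1..count.
long sum_through_queue(int count) {
    assert(count >= 0);
    LockedQueue<int> queue;
    long sum = 0;
    std::thread consumer([&queue, &sum, count] {
        for (int i = 0; i < count; ++i) sum += queue.pop();
    });
    for (int i = 1; i <= count; ++i) queue.push(i);
    consumer.join();               // join makes `sum` safe to read afterwards
    return sum;
}
```

The lock is held only for the push or pop itself, so contention stays low unless the messages are very frequent.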

You might look at seastar-project.org for some inspiration.

Re: C++ threads versus PThreads for embedded Linux on ARM micro
On Friday, July 27, 2018 at 5:53:28 PM UTC-4, snipped-for-privacy@gmail.com wrote:
> What is an "event object"?  What is the best way to pass data
> asynchronously using a queue, on Linux?

I can tell you how this is done in the QP/C++ framework, which I've
designed and refined for almost two decades now. But before I get to the
technical details, I need to make a full disclosure that QP is a
dual-licensed (open-source/commercial) product of my company (see
https://www.state-machine.com ), so I do have a commercial interest in
promoting it.

So, now going back to your question, "event objects" are messages that
threads send to each other via event queues. But a naive implementation of
copying messages to and from the queues is expensive and hurts real-time
performance. So, in the QP framework, the events are allocated from
fixed-size pools and only pointers to events are kept in the event queues.
The framework maintains the copy-by-value semantics as much as possible,
while event objects are really shared under the hood. The framework also
automatically recycles events that are processed.
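
The general idea of pool-allocated events can be sketched as follows. This is an illustration only and not QP's actual API or implementation; `SensorEvent`, `EventPool` and `pool_round_trip` are invented names, and recycling is done by hand here rather than automatically.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical event type.
struct SensorEvent {
    int channel;
    int reading;
};

// Fixed-size pool: all storage is reserved up front, so allocating an
// event never touches the heap.
template <typename T, std::size_t N>
class EventPool {
public:
    EventPool() {
        for (std::size_t i = 0; i < N; ++i) free_[i] = &storage_[i];
        free_count_ = N;
    }
    T* allocate() {                       // returns nullptr when exhausted
        std::lock_guard<std::mutex> lock(mutex_);
        return free_count_ > 0 ? free_[--free_count_] : nullptr;
    }
    void recycle(T* event) {
        std::lock_guard<std::mutex> lock(mutex_);
        assert(event != nullptr && free_count_ < N);
        free_[free_count_++] = event;
    }
    std::size_t available() {
        std::lock_guard<std::mutex> lock(mutex_);
        return free_count_;
    }
private:
    std::mutex mutex_;
    std::array<T, N> storage_{};
    std::array<T*, N> free_{};
    std::size_t free_count_ = 0;
};

// Sends one event through a queue of pointers and recycles it afterwards.
// Returns the reading seen by the receiver, or -1 if the pool was empty.
int pool_round_trip() {
    EventPool<SensorEvent, 4> pool;
    std::deque<SensorEvent*> queue;       // stands in for a receiver's queue

    SensorEvent* event = pool.allocate();
    if (event == nullptr) return -1;
    *event = SensorEvent{2, 17};
    queue.push_back(event);               // only the pointer is copied

    SensorEvent* received = queue.front();
    queue.pop_front();
    int reading = received->reading;
    pool.recycle(received);               // done by hand in this sketch
    return reading;
}
```

Because every event has the same size and comes from preallocated storage, allocation time is bounded, which is the property that matters for real-time behaviour.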

Specifically, in the POSIX port of QP/C++, which has been available for
over 15 years now, each active object runs in its own p-thread. These
threads are organized as an event-loop (according to the best practice I
listed in my previous post), so they block only in one place--when the
event queue is empty. The queue internally uses a p-thread mutex and a
condition variable to implement blocking on an event queue and signaling
the queue. But the application programmer does not need to know any of it,
because the main point is that the framework does the heavy lifting of
thread-safe asynchronous event exchange. The application threads (active
objects) only process the events one at a time (to completion), but they
don't need to worry about any low-level mechanisms like mutexes or
condition variables.

The design also allows you to avoid sharing of anything (except events)
among the threads, which is another best-practice of concurrent
programming. This means that you don't need to use any synchronization
objects. In this sense, the RAII benefits of synchronization mechanisms in
the C++ threads don't matter.

There is of course much more to an active object framework like QP/C++
than can be captured here. For example, the framework supports
Hierarchical State Machines to implement the internal behavior of active
objects. There is also a free modeling tool (QM), with which you can
design your HSMs graphically and generate production-code automatically.
But all of this requires a paradigm shift from the traditional
sequential-programming with blocking to event-driven programming without
blocking or sharing. To learn more, you might read about the key concepts
here:

https://www.state-machine.com/doc/concepts

Miro Samek
state-machine.com


Re: C++ threads versus PThreads for embedded Linux on ARM micro
On 27/07/18 23:53, snipped-for-privacy@gmail.com wrote:


Lock-free data structures can range from very simple to very difficult,
and there can be huge differences depending on the details of the
structure.  A single-writer, single-reader fixed size queue is /easy/ -
it's just two atomic counters for "head" and "tail" and an array, with a
little care about memory ordering.  For single core embedded processors,
it's usually sufficient to just use "volatile" - for bigger systems,
C++11 or C11 atomics handle the details.
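
A minimal sketch of the single-writer, single-reader case described above, using C++11 atomics for the two indices rather than `volatile`. The class name `SpscQueue` and the helper `transfer_sum` are made up for this example; one slot is deliberately left unused so that "full" and "empty" can be told apart.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>

template <typename T, std::size_t Capacity>
class SpscQueue {
    static_assert(Capacity >= 2, "one slot is always left empty");
public:
    // Call from the single producer thread only.
    bool try_push(const T& item) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % Capacity;
        if (next == tail_.load(std::memory_order_acquire)) return false;  // full
        slots_[head] = item;
        head_.store(next, std::memory_order_release);   // publish the item
        return true;
    }
    // Call from the single consumer thread only.
    bool try_pop(T& out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) return false;  // empty
        out = slots_[tail];
        tail_.store((tail + 1) % Capacity, std::memory_order_release);  // free the slot
        return true;
    }
private:
    T slots_[Capacity];
    std::atomic<std::size_t> head_{0};   // next slot to write (producer-owned)
    std::atomic<std::size_t> tail_{0};   // next slot to read (consumer-owned)
};

// Pushes 1..count from the calling thread and sums them in a consumer thread.
long transfer_sum(int count) {
    assert(count >= 0);
    SpscQueue<int, 8> queue;
    long sum = 0;
    std::thread consumer([&queue, &sum, count] {
        int value = 0;
        for (int received = 0; received < count;) {
            if (queue.try_pop(value)) {
                sum += value;
                ++received;
            } else {
                std::this_thread::yield();   // queue empty: let the producer run
            }
        }
    });
    for (int i = 1; i <= count;) {
        if (queue.try_push(i)) ++i;
        else std::this_thread::yield();      // queue full: let the consumer run
    }
    consumer.join();
    return sum;
}
```

The release store on one index paired with the acquire load on the other side is the "little care about memory ordering": it ensures the slot contents are visible before the index that advertises them.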

On the other hand, a queue that can have variable size, and more than
one reader or writer, quickly gets really complicated to handle
lock-free, and often it is much simpler, safer and cheaper to use a
lock.  On the third hand, if you have multiple cores you might want
lock-free again for scalability.

There is no simple answer here, and much depends on the details of
exactly what you are wanting.  As long as you ask general questions,
you'll only get general answers.



Re: C++ threads versus PThreads for embedded Linux on ARM micro
On 27/07/18 17:51, StateMachineCOM wrote:

Encapsulation is always a good principle, but don't take it too far.  If
two parts of the system need to share data of significant size, then you
want shared data, not "messages" or other synchronisation mechanisms.
(You use the messages or other synchronisation to communicate metadata -
such as who owns the real data space at any given time - but not the
data itself.)
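
One way to read this in C++ terms is sketched below: the message carries ownership of a large buffer (a `std::unique_ptr`), so only a pointer travels through the mailbox while the data stays where it is. `Frame`, `FrameReady`, `Mailbox` and `handoff_keeps_buffer_in_place` are names invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

using Frame = std::vector<std::uint8_t>;

// Hypothetical message: it says "this frame is now yours" and carries the
// owning pointer, not a copy of the bytes.
struct FrameReady {
    std::unique_ptr<Frame> frame;
};

class Mailbox {
public:
    void post(FrameReady msg) {
        std::lock_guard<std::mutex> lock(mutex_);
        messages_.push_back(std::move(msg));
    }
    bool try_take(FrameReady& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (messages_.empty()) return false;
        out = std::move(messages_.front());
        messages_.pop_front();
        return true;
    }
private:
    std::mutex mutex_;
    std::deque<FrameReady> messages_;
};

// Returns true if the receiver ends up owning the very same buffer the
// sender filled, with no copy of the payload along the way.
bool handoff_keeps_buffer_in_place(std::size_t bytes) {
    assert(bytes > 0);
    Mailbox mailbox;

    std::unique_ptr<Frame> frame(new Frame(bytes, 0xAB));
    const std::uint8_t* original_address = frame->data();

    mailbox.post(FrameReady{std::move(frame)});
    if (frame) return false;   // the sender has given up ownership

    FrameReady received;
    if (!mailbox.try_take(received)) return false;
    return received.frame->data() == original_address &&
           received.frame->size() == bytes;
}
```

With `std::unique_ptr` the question of who owns the data space at any given time is enforced by the type system: after the post, the sender's pointer is empty.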


Blocking is fine with threads.  If you have a single core cpu - or more
threads than cores - then blocking is often more efficient than
attempting to continue.  After all, if thread A is asking thread B to do
something (via a message, actor call, or whatever) then thread B can't
get started in doing the work A wants until A has taken a break.  It's
cheaper to have a voluntary break (yield, or blocking call) than to wait
for a scheduling change.

So use blocking calls whenever they fit naturally in the progression of
the code - and non-blocking calls whenever /that/ is the more natural
fit.  Don't make the mistake of thinking that one is inherently "better"
or necessarily more efficient - /measure/ the /real/ effects if
efficiency is vital.


Actor designs can certainly have their advantages - equally certainly,
they are not the best design for all uses.  Whenever someone says "this
is the best way to do it", it's unlikely to be that simple - and
whenever they say so without knowing exact details of the problem at
hand, they are almost certainly wrong.



Re: C++ threads versus PThreads for embedded Linux on ARM micro
@David Brown: Absolutely, if you stick to the traditional sequential
programming paradigm with shared-state concurrency and blocking threads,
the three best practices I listed in my previous post can all be
questioned, relaxed, and ultimately dismissed.

That's because they represent a different, event-driven ("reactive")
programming paradigm. The distinction is important, because the two
programming paradigms do NOT mix well, certainly not inside the same
thread. So it is important to always realize which paradigm you are using
in which thread, to avoid confusion and mixing the two.

To back up this point, I'd like to recommend the article "Managing
Concurrency in Complex Embedded Systems" by David Cummings
(http://www.kellytechnologygroup.com/main/concurrent-embedded-systems-website.pdf ).
The author presents general guiding principles of structuring threads,
which he found particularly useful and which he applied in the NASA Mars
rovers and other mission-critical systems. The paper starts with the
description of the general thread structure, which can be immediately
recognized as the event-loop. The bulk of the paper then focuses on
discussing several scenarios in which designers might be tempted to apply
thread BLOCKING, followed by explanations why blocking is always a BAD
idea. Again, I repeat, that this conclusion applies to the "event-driven"
thread structure, which the author started with.

Re: C++ threads versus PThreads for embedded Linux on ARM micro
On 01/08/18 00:16, StateMachineCOM wrote:

I haven't read the link yet (I will do so), but I do agree that blocking  
is a very bad idea in an event-driven thread.


Re: C++ threads versus PThreads for embedded Linux on ARM micro
Il 01/08/2018 00:16, StateMachineCOM ha scritto:

Thanks for this reference... it is very, *very* instructive material.


Re: C++ threads versus PThreads for embedded Linux on ARM micro
On Tue, 31 Jul 2018 15:16:07 -0700 (PDT), StateMachineCOM wrote:

It seems Cummings has reinvented the wheel :-).  

Those principles were used already in the 1970's to implement real
time systems under RSX-11 on PDP-11. Later on these principles were
also used on real time systems under RMX-80 for 8080 and similar
kernels.

