Task, process, thread, etc.

I've been refactoring some of my RTOS documentation.  Comments from
the reviewers suggest there's still some confusion as to terms
(despite the fact that I explicitly define them!  :< )

All seem to understand the notion of a "thread".

And, to a lesser extent, an "application" (this one's a bit
harder as there's often no clear-cut distinction; do you
tie it to a "pre-packaged set of algorithms"?).

I had opted to use "task" instead of "process" to describe
resource containers.  Too many folks with single-threaded
process experience brought that baggage to their understanding.
"Task" lets me avoid that.

Beyond that, I describe "jobs" -- collections of tasks to
implement a specific goal/service/etc.  E.g., speech synthesis
is a *job* that uses several "tasks", each of which supports
several (possibly concurrent) threads, to solve that particular
problem.

A job is smaller than an application, but bigger than a
single task (even though a task isn't an active entity).

Other legacy terms further add to the confusion:  should
IPC be renamed ITC?  Are non-synchronous RPCs worthy of
a different name?  etc.

Is there some more widely accepted taxonomy that can be
referenced?  Or, just rely on explicit definitions (as
I've done) and not sweat the confusion that folks might
experience with legacy definitions?

[Alternatively, invent completely new bogo-terms just
to ensure my formal definitions are consulted?]

Re: Task, process, thread, etc.
On Wed, 31 Mar 2021 16:31:12 -0700, Don Y

No old farts?  

Historically "thread" referred to control flow (execution patterns)
within a "program" (the code) and had nothing to do with concurrency
or parallelism.

The notion of "thread" as a scheduler entity dates from the early 60s,
but this alternate meaning of "thread" really did not enter general
use until the 90s.

The notions of "continuation passing", "tail calling", etc., represent
the last vestiges of the historical meaning of "thread".  Outside of
assembly programming - where e.g., "threaded interpreters" are known -
the modern terminology does not actually include the word "thread".


Many (most?) people today realize that an "application" is an abstract
concept that may involve multiple "programs" cooperating to achieve a
common goal.



Few people today understand what is meant by "multi-programming" vs
"multi-processing", or "concurrency" vs "parallelism", and fewer still
know the historical meaning of "thread", let alone understand how it
relates to "multi-threading".  

"Multi-tasking", as applied to computing, may refer to any combination
of the "multi-" terms above.  It is somewhat remarkable that it is so
widely understood whilst simultaneously being so utterly non-specific
in meaning.


"Job" is more generic than "application" - however, in context, I
would argue that they could (should?) be considered synonymous.


Just retcon the acronym to be "inter-PROGRAM". Problem solved.
8-)


No. If anybody bothers to look, they will discover that the literature
recognizes both synchronous and asynchronous forms.


In the past you have - sometimes vehemently - opposed the use of
conventional terminology on the grounds that it might imply something
not true of your particular system.

I have argued that people are going to try to look up things they
don't understand, and using unconventional terms hinders their
learning.  If people can't understand how conventional meanings apply
(or not) to your system, then they are incapable of programming it.


May I suggest using random alphabetic sequences so there is no
possibility of confusing your defined terms with actual words.


YMMV,
George

Re: Task, process, thread, etc.
On 3/31/2021 10:07 PM, George Neuner wrote:

I don't think age -- or history -- is a primary source of the problem.
Concepts that I was taught 50 years ago have evolved.  Or, been replaced.
I'm not tied to an initial understanding/explanation of a term or concept.

I think recent experience plays a far bigger role.  And, for folks with
more narrow ranges of experience, that can often delude them into thinking
they have the One True Explanation.

I don't see anyone complaining about the use of task/process over
the use of "team"!  (And, team suggests different connotations)

The problem with that is defining a "program" and defining a
"cooperating set of applications".

A single thread can be a complete program.

Alternatively, it can be seen as an application (one wherein everything is
folded into the single executable).

A group of threads can be seen as implementing a (more complex!) program.
When does it become an "application"?

How do you differentiate between any one of the constituent applications
and the collection of them?

Is my speech synthesizer an application?  (it can be free-standing!)
Are each of the threads within it "programs"?  Or even applications?
(I can see an "application" that takes free-form text and normalizes
it to equivalent text, free of abbreviations, digits, special
symbols, etc.  If a thread is performing these actions, is it JUST a
thread?  A program?  An application??)

What happens when I tie the synthesizer to the answering machine.
Is it still an application?  Within an application?

I.e., where is the distinction between them (thread, program, application,
job, etc.)?

Or, do the terms have such ambiguous meanings that you may as well call
them Tom, Dick and Harry?

I see folks who've only worked in "single process space" environments
thinking every "executing entity" (what I would call a thread) is a
"task".  This differs significantly from my definition; tasks aren't
active!

This quickly leads to the situation, above.

A collection of tasks (my definition) suggests certain dependencies
(on other tasks as well as resources, containers, etc.).  This
is more than what task (process) or thread would imply.

Yet, the collection may not make sense as a free-standing entity;
it may not "do meaningful work".

But two threads can be two "programs".  The inherent aspect
of IPC is the fact that it crosses protection domains -- it
bridges *between* resource containers.  That's not necessary
with threads-as-programs.

But the issue you (I) are trying to draw attention to is
the fact that the IPC is now across host boundaries.
I.e., it's a different level of IPC and has different
requirements and consequences.

This is exactly the case, here.  There are only so many *meaningful*
terms that can be applied to general concepts.  It would be silly to
call a "task" a "banana"!  And, presumptuous to call it a "mission".

That's why I've taken the time to define each term that I use.
The problem is one of coercing folks to abandon their preconceived
notions (which may be entirely accurate or appropriate FOR THEIR PAST
EXPERIENCES) and READING what is there before them.

I can always correct "misunderstandings", gently.  But, there's
a limit to how available I will be for that, in the future.
So, the documentation has to clearly state my intent; yet do
so in a way that doesn't burden people with completely
foreign terms.

They'd have to be pronounceable; try reading your code to
someone when all of the symbols/identifiers are random
sequences of letters/symbols!  :-/

Re: Task, process, thread, etc.
On Wed, 31 Mar 2021 16:31:12 -0700, Don Y

To me, "job" smells like batch processing.  The JOB card was the first
card ahead of a deck of punched cards. The program was run to
completion before the next JOB card was processed and executed.


Re: Task, process, thread, etc.
On 01/04/21 07:49, snipped-for-privacy@downunder.com wrote:

Either that or one unit of work that is submitted externally
and executed in an application by however many resources are
necessary.

Think of a "job shop"...
https://marketbusinessnews.com/financial-glossary/job-shop/


Re: Task, process, thread, etc.
On 3/31/2021 11:49 PM, snipped-for-privacy@downunder.com wrote:

The problem with *every* term is that they all have preexisting
definitions/expectations -- in ways that are unique to the
particular environment in which they are encountered.

Everyone brings their past experiences to new projects and,
if the lexicon is vague (or, worse, if folks ASSUME they understand
certain terms), then there's a mismatch between what is being
conveyed and how that is being *interpreted*.

Re: Task, process, thread, etc.
On 01/04/21 12:03, Don Y wrote:

If you go much further down that path, you end up
in Humpty Dumpty land.

The "principle of least surprise" is a useful concept,
coupled with
  - being explicit about what is meant
  - being explicit about what is not meant
  - examples of use

Re: Task, process, thread, etc.
On 4/1/2021 5:02 AM, Tom Gardner wrote:

That's the point of formally defining each.

THIS is a thread. See how it has state?  See how it executes
code?  See what it implies is required to have MORE of them?

THIS is a task.  See how it CONTAINS threads?  See how it
warehouses resources (memory, objects, threads, etc.)?  See
how it doesn't (inherently) impose constraints on how many
of each resource it can support?

See how the task can't execute code -- but, rather, contains
code?  See how the threads don't contain/own resources
(besides their state/stack)?

Note how task A and task B do not overlap?  See how an IPC (ITC?)
is the means by which something in task A can communicate with
something else in task B?
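
The thread/task split above can be pictured with a rough C sketch. Every name here (`thread_t`, `task_t`, `task_add_thread`, `needs_ipc`) is hypothetical and chosen only for illustration; none of it is taken from the actual RTOS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only; not the actual RTOS API. */

struct task;  /* forward declaration */

/* A thread is the only active entity: it has execution state and a stack,
 * but it owns no other resources. */
typedef struct thread {
    void          *stack_pointer;   /* saved execution state */
    void         (*entry)(void *);  /* code the thread executes */
    struct task   *owner;           /* the container it lives in */
    struct thread *next;            /* link in the owner's thread list */
} thread_t;

/* A task is a passive resource container: it holds threads and memory,
 * but never executes code itself. */
typedef struct task {
    thread_t *threads;       /* intrusive list, so no built-in limit */
    size_t    thread_count;
    void     *memory_base;   /* start of the task's private memory */
    size_t    memory_size;
} task_t;

/* Place a thread inside a task.  The task remains the owner of the
 * resources; the thread merely runs inside it. */
void task_add_thread(task_t *task, thread_t *thread)
{
    assert(task != NULL && thread != NULL);
    thread->owner = task;
    thread->next  = task->threads;
    task->threads = thread;
    task->thread_count++;
}

/* Two threads can share data directly only if the same task contains
 * both; otherwise they must use IPC to cross the task boundary. */
int needs_ipc(const thread_t *a, const thread_t *b)
{
    assert(a != NULL && b != NULL);
    return a->owner != b->owner;
}
```

With two threads added to task A and one to task B, `needs_ipc` is false for the pair inside A and true for any pair that straddles A and B.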

See how these tasks -- containing these threads (and other
resources) -- can implement a calculator?   See how that calculator
can have freestanding value as an "application" (sans user I/O)?
See how that calculator can act as an expression resolver *in*
the scripting language *application*?  etc.

See how this thread can convert "Dr. Smith lives at 123 Smith Dr."
into "Doctor Smith lives at one hundred twenty three Smith Drive"?
See how that likely doesn't have value as a freestanding application
outside of its role *in* the speech synthesizer?

See how these varied applications can act together (or separately)
in this *system*?

Resorting to novel terms just means everyone starts off in the same state
of confusion.  But, reusing "existing" terms -- many of which have been
defined and redefined at different times and in different domains -- means
some number of readers *will* be surprised.

I don't see any way out -- other than documenting everything and
providing hotlinks to each term's definition, regardless of the document
in which it is encountered (I already draw attention to "special"
terms with distinct formatting; I'd just have to add the additional
links to a glossary, index and/or "defining document").  The downside
is this almost forces the reader to read the documents electronically
(it's too cumbersome to annotate each such reference with "See foobar
on page 13 of Document blahdeeblah.")

Re: Task, process, thread, etc.
On 4/1/2021 2:31, Don Y wrote:

Hi Don,

I can only say what I have done with DPS - the "task" related
wordings have not changed much, if at all, since the early '90s.

Back then I called a "task" what is a ...task, code running
with its own stacks (user/system) and being put into use
by the scheduler. Whether it runs on this or that core is
irrelevant. And I took a decision to call a "process"
a group of tasks having the same common data section
(each task points to one). Obviously this is very different
to what people would think of as a "process" on other
systems where they call a process what I call a task,
I think interchangeably. So I try to phase that term out
by not using it.

A thread is a dangerous term to use, to me it means a thread
in a multi-threaded processor core. Which is much more
hardware than software related, a virtual core can just run
yet another task to another virtual core within the same
physical one.

Other than that I see no need for other names. Keeping it
simple is nice, especially when it comes down to the
basics like the ones I talk about here.

There are more complex things to think and talk about
of course, but these I leave to being various "objects"
(dps has its inherent runtime system of objects) and to
whatever "actions" you can "do" by/with these (I don't
use the words "methods" and "apply", I just did not know
these when I wrote the first implementation of the dps
object system back around 1995, and my words seem to
better describe what I have written anyway).

Wait a second, you may be asking from an end user perspective.
Well, my reply does not apply then... :-). I was thinking of
people like the population of this group while I wrote it...

Dimiter

======================================================
Dimiter Popoff, TGI             http://www.tgi-sci.com
======================================================
http://www.flickr.com/photos/didi_tgi/




Re: Task, process, thread, etc.
On 4/1/2021 5:59 AM, Dimiter_Popoff wrote:

This is what I've called a thread.  Threads are the only things
that execute code.  A thread executes code on a hardware *processor*.
(a processor only supports a single thread -- more later).

Ditto.  Threads are schedulable entities, regardless of which
*processor* they run on (again, using my terminology)

This is roughly similar to what I call a task -- resources
exist in RAM (memory, stack, thread state, "file handles",
etc.).  But, additionally, I consider resources things like
scheduling parameters, access permissions (capabilities), etc.

The problem with "process" is that many folks think of the one-thread,
one-process computing model (of days gone by).  Hence, my desire to
avoid "process" for fear of it conjuring up a single execution entity.

I call the hardware that "runs" a thread's code a processor.
A "core" can contain multiple processors.  A "host" can
contain multiple cores.  A node can contain multiple hosts.

E.g., a node is likened to a PCB with some number of "CPUs"
on it.  Each CPU "chip" is a host.  Each host can contain more
than one core -- each of which can contain more than one
processor.

The distinction is important because I treat each of these
things as formal objects.  Code can manipulate those objects
if the code has the proper "permissions" (capabilities).

So, a thread running *somewhere* can diddle with the scheduler
for a particular *processor* -- even if it doesn't reside on
the same core, host or node!

In this way, code can bind specific "threads" to specific
processors in specific cores on specific hosts on specific
nodes, etc.
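
As a rough illustration of that containment hierarchy, here is a small C sketch. The type names, the fixed counts and the `find_processor`/`bind_thread` helpers are all assumptions made up for this example, not the real object model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names and sizes, for illustration only. */
enum { PROCESSORS_PER_CORE = 2, CORES_PER_HOST = 4, HOSTS_PER_NODE = 2 };

typedef struct thread { int id; } thread_t;

/* A processor runs the code of exactly one thread at a time. */
typedef struct processor {
    const thread_t *bound;   /* thread bound to this processor, or NULL */
} processor_t;

typedef struct core { processor_t processors[PROCESSORS_PER_CORE]; } core_t;
typedef struct host { core_t cores[CORES_PER_HOST]; } host_t;   /* one CPU chip */
typedef struct node { host_t hosts[HOSTS_PER_NODE]; } node_t;   /* one PCB */

/* Walk node -> host -> core -> processor.
 * Returns NULL if any index is out of range. */
processor_t *find_processor(node_t *n, size_t host, size_t core, size_t proc)
{
    assert(n != NULL);
    if (host >= HOSTS_PER_NODE || core >= CORES_PER_HOST ||
        proc >= PROCESSORS_PER_CORE)
        return NULL;
    return &n->hosts[host].cores[core].processors[proc];
}

/* Bind a thread to one specific processor.  Returns 0 on success, or -1
 * if the processor is missing or already has a thread bound to it. */
int bind_thread(processor_t *p, const thread_t *t)
{
    if (p == NULL || t == NULL || p->bound != NULL)
        return -1;
    p->bound = t;
    return 0;
}
```

The sketch leaves out the capability check entirely; it only shows how a thread can be addressed to a processor by walking the hierarchy from the node down.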

Likewise, if I want to shutdown a node and migrate all
of its resources and threads to some *other* node, I can
just stall the scheduler(s) on the node and wait for
everything to naturally (or, maybe UNnaturally?) idle.
Then, while quiescent, move the resources over to some
other node that I've preselected.  And, finally, remove
power from (or otherwise repurpose) the original node.
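
The shutdown sequence can be sketched as a tiny state machine in C. This is only a model under stated assumptions: `node_t`, its fields and the three functions are invented for the example, and "waiting for idle" is reduced to a counter that the caller decrements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a node; field and function names are assumptions. */
typedef enum { NODE_RUNNING, NODE_STALLED, NODE_OFF } node_state_t;

typedef struct node {
    node_state_t state;
    int active_threads;   /* threads that have not yet idled */
    int resources;        /* count of resources hosted on this node */
} node_t;

/* Step 1: stall the scheduler so no new work is dispatched. */
void node_stall(node_t *n)
{
    assert(n != NULL);
    n->state = NODE_STALLED;
}

/* Stand-in for one thread reaching an idle point.  A real system would
 * wait for this to happen rather than be told about it. */
void node_thread_idled(node_t *n)
{
    assert(n != NULL);
    if (n->active_threads > 0)
        n->active_threads--;
}

/* Steps 2-4: once the source is quiescent, move its resources to the
 * preselected destination and power the source off.
 * Returns 0 on success, -1 if it is not yet safe to migrate. */
int node_migrate(node_t *src, node_t *dst)
{
    assert(src != NULL && dst != NULL && src != dst);
    if (src->state != NODE_STALLED || src->active_threads != 0)
        return -1;                  /* still running or not yet idle */
    if (dst->state != NODE_RUNNING)
        return -1;                  /* destination cannot accept work */
    dst->resources += src->resources;
    src->resources = 0;
    src->state = NODE_OFF;
    return 0;
}
```

The ordering is the point: `node_migrate` refuses to move anything until the scheduler has been stalled and every thread has idled.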

See above.


In my world, EVERYTHING is an object!  Threads, tasks,
processors, cores, nodes, hosts, memory, etc.  I deal
with them via "handles" that the OS manages for tasks
(tasks own resources, the threads don't!).

In each handle is a reference to a particular object and a
set of methods that the handle-holder can invoke on the
object (capabilities).  This is enforced client-side, by the
OS.  So, you can't waste an object's "time" by sending
spurious ILLEGAL requests to it (DoS attack) and expecting it
to decide if your request is "allowed"; the kernel will
ensure that you waste *your* time doing that (and time is a
resource that the kernel tracks; abuse yours and you cease
to exist -- but the intended "victim" object is unaffected)
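
A minimal C sketch of that gatekeeping idea follows. The capability bits, the `time_budget` accounting and the `kernel_invoke` name are assumptions for illustration; the real kernel interface is not shown anywhere in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch; names, bit assignments and the budget scheme are
 * assumptions made for illustration. */
enum { CAP_READ = 1, CAP_WRITE = 2, CAP_DESTROY = 4 };

typedef struct object {
    unsigned requests_served;   /* work that actually reached the object */
} object_t;

typedef struct handle {
    object_t *target;    /* the object this handle refers to */
    unsigned  allowed;   /* bitmask of methods the holder may invoke */
} handle_t;

typedef struct caller {
    int time_budget;     /* time is a resource the kernel tracks */
} caller_t;

/* Kernel-side gatekeeper: every attempt costs the caller one unit of
 * time, but only permitted requests are ever delivered to the object.
 * Returns 0 if delivered, -1 if rejected. */
int kernel_invoke(caller_t *who, const handle_t *h, unsigned method)
{
    assert(who != NULL && h != NULL && h->target != NULL);
    if (who->time_budget <= 0)
        return -1;                  /* caller has exhausted its budget */
    who->time_budget--;             /* the caller always pays */
    if ((h->allowed & method) != method)
        return -1;                  /* rejected before reaching the object */
    h->target->requests_served++;
    return 0;
}
```

An illegal request therefore drains the caller's budget while `requests_served` on the target stays unchanged, which is the property described above.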

No, you answered in the spirit I intended!  :>

Re: Task, process, thread, etc.
On 4/1/2021 6:39 AM, Don Y wrote:

(sigh)  I've been overruled (despite it being *my* codebase!  :< )

OTOH, there are more of *them* than there are of *me* so it's probably
easier for me to adapt to their desired terminology than stick with mine!
As long as they are willing to take responsibility for understanding
the distinction(s)!

Process = resource container.

Now, "task" has no meaning.  (I'm not sure introducing it in place
of "thread" would be well received; but, I can try!)

I'll probably wait to edit all the references, function labels, etc.
until I get a final verdict on all this...

Re: Task, process, thread, etc.
On 4/4/2021 23:15, Don Y wrote:

Don,
I think that since you see everything as an object you can afford
more descriptive names, not necessarily made of one word.
I do that with objects in DPS. Here is how they can go.
The simplest object - the root of all objects - is called
"something", i.e. everything is something, LOL.
Then there is a "piece of memory", which carries more information
about itself.
Then there is a "generic object", which carries some more information
which one will need anyway - like how to set/get a parameter in a
standardized way, where it is allocated/who has allocated it (that
would be its "container" - for example, a "memory pool" which is
a "piece of memory"). And from there you go further, you have a
"file reference", "directory reference", "directory view details",
"directory view icons" etc. etc., the point is the names can be
self explanatory. Now I never had the need to treat a task as an
object itself but obviously one can think of a more self explanatory
word for that, too. I have tasks referred to by objects - say,
the ip_link refers to the IP input task (defragmentation etc. sort
of thing), the tcp_connection object refers to a tcp input
task (reordering/linking incoming segments etc.).
I suppose "tcp_connection" is a nice example of a self explanatory
name. You could call what you have as a "task" (which you call
a "thread") say a "running program" or something (just trying to
give an example, I don't like it particularly well, just can't
come up with anything better but I am sure you can manage that
if you give it some more time).

Dimiter

======================================================
Dimiter Popoff, TGI             http://www.tgi-sci.com
======================================================
http://www.flickr.com/photos/didi_tgi/



Re: Task, process, thread, etc.
Hi Dimiter,

On 4/4/2021 2:03 PM, Dimiter_Popoff wrote:

Yes, of course.  But, some things (objects) already have a
"naming history" that people are comfortable with.

Call a thread an "execution unit"?  The processor on
which it executes an "executor unit"?  <grin>

Would it be strained to call a process a "Resource Container"?

I have "memory objects" which are treated as entities.
E.g., I can pass a memory object to a function (or process;
that is, a thread that will act on it in a process) in much
the same way that I can pass an "int".

There are "exception handlers", "deadline handlers", "deadlines",
"exceptions", "scheduling criteria", etc.  These are easier to
name because there's no "legacy" that you have to overcome.

A process is referenced as:

process_t MyProcess;

Using MyProcess in a method that is defined for a process_t will
invoke the code associated with the method *on* that process_t.

I've found coming up with MEANINGFUL names for things is tiring.
There are many concepts/objects that are incredibly similar.
So, you end up finessing terms that could easily be interchanged
for each other (yet can't as they are actually different things)

The biggest problem comes in documentation (once folks are USING
things, they learn what those things mean and how they are to be
used).  You don't want to have to employ qualifiers on generic
terms:
- an RPC can take no arguments and NOT return a result
- an RPC can take no arguments and return a result
- an RPC can take arguments and NOT return a result
- an RPC can take arguments and return a result
- an RPC that doesn't return a result can "return" without
   confirmation that the remote procedure has actually been
   invoked!

[And, substitute IPC for RPC]

Do you add these qualifiers to each use of the "RPC" term in the
documentation?  Or, do you come up with different terms for each
*type* of RPC?
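
One way to weigh the two options is to write the qualifiers down as data and see how many distinct classes fall out. The C sketch below does that. The class names (`RPC_CALL`, `RPC_COMMAND`, `RPC_POST`) are invented purely for the example, and the rule that a result implies confirmation is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical descriptor: one flag per qualifier in the list above. */
typedef struct rpc_kind {
    bool takes_args;       /* caller passes arguments */
    bool returns_result;   /* callee sends a value back */
    bool confirmed;        /* caller learns that the callee was invoked */
} rpc_kind_t;

/* Hypothetical class names, one per behaviour the caller can observe. */
typedef enum {
    RPC_INVALID,   /* a result was requested without confirmation */
    RPC_CALL,      /* confirmed, returns a result */
    RPC_COMMAND,   /* confirmed, no result */
    RPC_POST       /* unconfirmed, no result ("fire and forget") */
} rpc_class_t;

/* Assumption of this sketch: a result cannot arrive unless the remote
 * procedure ran, so "returns a result" implies "confirmed".  Whether
 * arguments are passed does not change the class. */
rpc_class_t rpc_classify(rpc_kind_t k)
{
    if (k.returns_result)
        return k.confirmed ? RPC_CALL : RPC_INVALID;
    return k.confirmed ? RPC_COMMAND : RPC_POST;
}
```

Seen this way, the five bullet points collapse to three caller-visible behaviours, which suggests that three terms (or one term plus one qualifier) might be enough.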

A process (Resource Container) can contain any number of threads,
including zero.  If you're discussing the process in the context of
being a server for a particular class of objects, do you refer
to it as a single-threaded server?  Multiple-threaded?  etc.

Etc.

BTW, Happy Easter!  (I think you may still have a few hours left...)

Re: Task, process, thread, etc.
On 4/4/2021 2:42 PM, Don Y wrote:

I floated that idea... and it was **promptly** shot down!

Amusingly, folks like to talk about a process as if it is an
active entity - even KNOWING that the process is the container
FOR the active entities and not the executing code!

And, the notion of using "Resource Container" in a sentence
as if it was in any way "active" fell on deaf ears.

<shrug>  It's amusing to see the implicit baggage that comes
with our choices of terms!

(admit defeat; tweak the code/docs and be done with it!)

Re: Task, process, thread, etc.
On 4/5/2021 8:29, Don Y wrote:

I am not surprised at that :-). My reaction was similar.
I call "container" any object which has allocated the memory
for another object - so when an object is told to "getlost"
it asks its container to deallocate it apart from whatever
else it has to do (close file(s), connection(s) etc.).
But I think of "container" as of a jar with a lid you know.

If I understand correctly, what you want to name is a group of tasks
(which you call threads); why not just call it a "group of
tasks" or something?  Then gradually let the language migrate towards
just "group" on its own, that is, in a natural way.

Dimiter



Re: Task, process, thread, etc.
On 4/5/2021 12:12 PM, Dimiter_Popoff wrote:

Yes.  But, amusing as the *container* isn't an active entity!
Yet, we speak of it as if it was -- as a proxy for the threads
it contains!

So, it's important that folks discussing an implementation
have a shared lexicon; someone talking about "processes" may
or may not be thinking in terms of "threads"!

It's not just a "group of threads"  (trying to avoid mixing terms);
it's a group of threads executing in an isolated address space sharing
a set of capabilities for a set of objects and access to specific
code/data...

[By contrast, a (non-specific) "group of threads" can span multiple
"processes" to solve a particular problem (hence "job").]

It is the sharing and coupling of the threads in that "container" to
which you want to draw attention.  E.g., two such threads can stomp on
each other's data -- hence the need for mutexes/locks or some other
cooperative sharing algorithm.  Two such threads can access the same
set of resources (e.g., thread A can acquire a resource yet thread B
can actually be the one who uses it without the need for any special
"protocol" to exchange "ownership").  In the context of an object
server, *any* thread *could* service a request for any object contained
in its "process"; or, specific threads could be assigned to specific
objects' requests; etc.  But, a thread in ANOTHER *process* couldn't
service that request (for THAT object)!
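
The "thread A acquires, thread B uses" point can be modelled in a few lines of C. This is a single-threaded sketch with invented names (`acquire`, `may_use`, a fixed-size `held` table) and it deliberately omits the locking that real concurrent threads would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sketch: names and the fixed table size are assumptions. */
enum { MAX_HANDLES = 8 };

typedef struct process {
    int    held[MAX_HANDLES];   /* object ids this process may use */
    size_t held_count;
} process_t;

typedef struct thread {
    process_t *home;            /* the process that contains this thread */
} thread_t;

/* A thread acquires an object, but the handle is recorded against its
 * process, not against the thread.  Returns false if the table is full. */
bool acquire(thread_t *t, int object_id)
{
    assert(t != NULL && t->home != NULL);
    process_t *p = t->home;
    if (p->held_count == MAX_HANDLES)
        return false;
    p->held[p->held_count++] = object_id;
    return true;
}

/* Any thread in the holding process may use the object; no ownership
 * hand-off between threads is needed. */
bool may_use(const thread_t *t, int object_id)
{
    assert(t != NULL && t->home != NULL);
    for (size_t i = 0; i < t->home->held_count; i++)
        if (t->home->held[i] == object_id)
            return true;
    return false;
}
```

If threads `a` and `b` share one process and `c` lives in another, an object acquired by `a` is usable by `b` but not by `c`.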

*Within* a process, you have to be more disciplined, as a developer.
But, you have greater leeway in terms of what you can do -- and get
away with!  The OS isn't playing policeman *inside* the process as it
would *between* processes.  Communication can be nearly instantaneous;
you just agree as to how each thread will access a particular shared
structure/buffer.  No need for the OS to get involved in moving
information across protection boundaries.  All threads in a process
are colocated on the same *host*.

OTOH, a "group of threads" (cuz threads are the only things that
actually *do* anything!) executing in different processes can reside
on different hosts to achieve a common goal.

It's all the hidden assumptions (what I call baggage) that lead
to misunderstanding.  If you have to resort to overly precise
language, then it becomes difficult reading.  :<  Hence, trying to
find terms that "feel" natural to people.

If I was describing a particular *algorithm* (that is likely to
evolve, over time), I could be a bit looser in my prose.  But,
when discussing fundamentals (which are likely invariant), there's
less wiggle room.  If I say "5", I mean *5*... not 4 or 6!

<shrug>  I'll post you a draft copy when I get some of the
illustrations finished (busy picking oranges this past week+)

Re: Task, process, thread, etc.
On 4/5/2021 23:44, Don Y wrote:

OK, "container" may be the right word but to me (and apparently to
other people) it sounds more static than what you describe. I mean
subconsciously one thinks that a container has some static contents....
Call it an "aquarium", lol. On a second thought, it may do the job
exactly because it sounds funny....

Dimiter


Re: Task, process, thread, etc.
On 4/5/2021 1:54 PM, Dimiter_Popoff wrote:

Yes, the container *is* static!  It's the threads that actually do work!
(wherever they may actually reside)

Yet, we talk about the process (i.e., container) as if *it* was
actually doing the work!  Imprecise language:  the threads IN the
process are doing the work.

We talk of "killing" a process (or "starting" a process) -- as if *it* was
an active entity.  But, what we really mean is to stop all of the threads
in the process (container), free all of their resources and then delete/destroy
the process (container).
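
Spelled out as a C sketch, "killing a process" is three separate steps. The types and the `process_create`/`process_spawn`/`process_kill` functions below are hypothetical, and a single heap block stands in for all of the resources a container might hold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical sketch; the types and function names are assumptions. */
typedef struct thread {
    bool           running;
    struct thread *next;
} thread_t;

typedef struct process {
    thread_t *threads;   /* the active entities inside the container */
    void     *memory;    /* stand-in for all other owned resources */
} process_t;

/* Create an empty container holding `bytes` of memory (bytes > 0).
 * Returns NULL on allocation failure. */
process_t *process_create(size_t bytes)
{
    process_t *p = calloc(1, sizeof *p);
    if (p == NULL)
        return NULL;
    p->memory = malloc(bytes);
    if (p->memory == NULL) {
        free(p);
        return NULL;
    }
    return p;
}

/* Start a new thread inside the container.  Returns NULL on failure. */
thread_t *process_spawn(process_t *p)
{
    assert(p != NULL);
    thread_t *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->running = true;
    t->next = p->threads;
    p->threads = t;
    return t;
}

/* "Kill the process": stop every thread, free what the container holds,
 * then destroy the container.  Returns the number of threads stopped. */
int process_kill(process_t *p)
{
    assert(p != NULL);
    int stopped = 0;
    while (p->threads != NULL) {
        thread_t *t = p->threads;
        p->threads = t->next;
        t->running = false;   /* step 1: stop the thread */
        free(t);
        stopped++;
    }
    free(p->memory);          /* step 2: free the owned resources */
    free(p);                  /* step 3: destroy the container */
    return stopped;
}
```

The container itself never runs; `process_kill` acts on the threads first and only then on the container that held them.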

Re: Task, process, thread, etc.
On 4/6/2021 1:05, Don Y wrote:

So aquarium is not that bad, especially if we choose to call a
task a fish etc. :D.
While I don't think anybody (except perhaps me...) would find
"aquarium" an acceptable term, following that line of thought
a "bowl" might even be OK. You pour everything out of it and eventually
break it into pieces -> it goes into dust...

Dimiter



Re: Task, process, thread, etc.
On 4/5/2021 3:14 PM, Dimiter_Popoff wrote:

I'm going to stick with "process".  But, I'm going to include references
to how it is often "misused" in this way -- to clarify what is meant by
such misuse.
