Suggested features for CASE to protect shared stores against conflicting accesses

Thank you to all of you who have commented on my post "How do you want to access shared resources?". We are currently working to implement the following kind of solution to the mutual exclusion / shared resource protection / critical region issue.

Further comments are welcome. In particular, operation names and syntactic issues are still under consideration.

The following text describes:

1) the needs behind the new features
2) a short description of the notation to be improved
3) the new features for the ReaGOS kernel
4) the new features for ReaGeniX

The features are to be included in the next version of CASE-tool & RTOS combination ReaGeniX & ReaGOS.


The needs driving the development of the new features:

  • Effectiveness of internal communication of the target software
  • Reliability of the target software developed by our tools

First, a short description of our notation, to give an idea of where the new features will be added.

Our graphical notation is a structured one. The basic idea is to define and interconnect functional blocks. The functional blocks have interfaces with strongly typed asymmetric connectors. There are two kinds of diagrams for defining new functional block types: architecture diagrams and state diagrams. In architecture diagrams, instances of functional blocks and data stores (kinds of boxes) are connected to each other and to the diagram interface by lines. In state diagrams there are hierarchical states (nested boxes) and transitions (lines). In transitions, textual conditions and actions refer to interface items. The notation differs slightly from Harel's Statecharts and was designed from the start for generation of efficient target code.

In the topmost architecture diagram the functional blocks can be annotated with priority numbers or "driver". The annotated architecture diagram is called a priority diagram and it can be used to generate/configure the ReaGOS microkernel.

The new features for the ReaGOS microkernel are:

  • Shared store - a store symbol in priority diagram
  • New connectors for functional block interfaces to connect to a shared store:
    - shared: for read access to a shared store
    - shared_W: for read/write access to a shared store
  • Locking operations based on priority ceiling protocol

The shared stores with locking operations improve the communication between tasks. No server tasks are needed to protect the data.

The priority ceiling protocol keeps the locking operations simple, avoids priority inversion, and avoids deadlocks.

The new features of the ReaGeniX graphical language and code generator are:

  • New connectors shared and shared_W to support ReaGOS or another OS. Separate shared connectors for inter-task data and store connectors for intra-task data allow generation/compile-time assurance that inter-task data is locked during access, while no overhead is introduced for intra-task store access.
  • Separate store and store_W (read and read/write) connectors for data stores. Earlier only a read/write connector was available. This makes it easier to keep track of who is updating (and possibly corrupting) a store.
  • A locking operator "lock(X)", to be used in a transition condition if the expression refers to a shared store connector X (see the sketch after this list).
  • A release operator "unlock_all" to release all shared stores locked in the condition. It is to be used at the outermost block level of the transition action.
  • A locking block construct "begin_lock(X) ... end_lock(X)" to protect X from modification (read lock).
  • A locking block construct "begin_lock_W(X) ... end_lock_W(X)" to protect X from all other accesses (write lock).
  • A mechanism to discourage accesses of a shared store outside of locks and to discourage modification of a shared store outside of write locks.
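
As an illustrative sketch only (the names rgx_lock, rgx_unlock_all, STORE_X, tank and raise_alarm are invented, not necessarily what our generator will emit), one transition that uses lock(X) in its condition and unlock_all in its action could compile to C along these lines:

    /* Hypothetical sketch, not actual ReaGeniX output. */

    typedef struct { int level; int alarm_limit; } tank_store_t;

    extern tank_store_t tank;            /* shared store behind connector X    */
    extern void rgx_lock(int store_id);  /* e.g. raise priority to X's ceiling */
    extern void rgx_unlock_all(void);    /* restore the task's base priority   */
    extern void raise_alarm(int level);  /* the transition's action            */

    #define STORE_X 0

    void transition_level_alarm(void)
    {
        rgx_lock(STORE_X);                    /* lock(X) in the condition      */
        if (tank.level > tank.alarm_limit) {
            raise_alarm(tank.level);          /* transition action             */
            rgx_unlock_all();                 /* unlock_all, outermost level   */
        } else {
            rgx_unlock_all();                 /* condition false: also release */
        }
    }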

All comments are welcome!

Regards,

Ari Okkonen OBP Research Oy


Reply to
Ari Okkonen

On Wed, 18 Nov 2009 17:09:39 +0100, Ari Okkonen wrote:

Compared to what? How did you come to this conclusion? Which inter-task communication methods did you investigate?

If you can lock, you can have deadlock. What if a process dies before it has the chance to unlock?

Is it possible that, at any given time, one can determine which source code line or which model element has caused a store to be locked?

Is this operator inserted by the code generator or by the user?

What if I write inside a read lock or read inside a write lock?

This mechanism sounds quite interesting. Could you provide some more details?

Reply to
Boudewijn Dijkstra

An example is a measurement data block in a mechanical or chemical process control system. Several measurement tasks update hundreds of values in the data block, each at its own pace. Several control tasks read these values, each at its own pace as well. There would be lots of messages sent back and forth if these tasks could not share the same data store. Using messages to transfer and query values seems more complicated than just assigning or reading the values, even with the locking included.
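
As an illustrative sketch (the data layout and the locking call names are invented here, not actual ReaGOS calls), the shared-store version is essentially plain assignments and reads wrapped in locks:

    #define N_TEMPERATURES 64

    typedef struct {
        float temperature[N_TEMPERATURES];
        float pressure;
        float flow;
    } measurement_block_t;

    measurement_block_t meas;           /* the shared store */

    extern void meas_lock_w(void);      /* read/write lock (shared_W) */
    extern void meas_lock_r(void);      /* read lock (shared)         */
    extern void meas_unlock(void);

    /* A measurement task: just assign the new value under a write lock. */
    void temperature_task_update(int channel, float value)
    {
        meas_lock_w();
        meas.temperature[channel] = value;
        meas_unlock();
    }

    /* A control task: just read the values it needs under a read lock. */
    float control_task_compute_setpoint(void)
    {
        float t, p;
        meas_lock_r();
        t = meas.temperature[0];
        p = meas.pressure;
        meas_unlock();
        return 0.5f * t + 0.1f * p;     /* some control law */
    }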

In the priority ceiling protocol there are no specific locks for the resources. The priority of the task is raised so high that no other task having access to that resource can get scheduled. If the task dies, its priority has no meaning anymore. See e.g.

formatting link
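
A minimal sketch of this idea in C, assuming invented function names rather than actual ReaGOS calls, and ignoring nested locks to keep it short:

    typedef struct {
        int ceiling;   /* highest priority among all tasks that use this store */
    } shared_store_t;

    extern int  current_task_priority(void);
    extern void set_current_task_priority(int prio);  /* may invoke the scheduler */

    static int saved_priority;

    void store_lock(shared_store_t *s)
    {
        saved_priority = current_task_priority();
        if (s->ceiling > saved_priority) {
            /* From here on, no other task that could ever access this store
             * can preempt, because its priority is at most the ceiling.  No
             * lock word, wait queue or deadlock handling is needed. */
            set_current_task_priority(s->ceiling);
        }
    }

    void store_unlock(shared_store_t *s)
    {
        (void)s;
        /* Drop back to the base priority; a scheduler call at this point
         * lets any task held off by the ceiling run. */
        set_current_task_priority(saved_priority);
    }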

No specific mechanism is planned for that; it can, however, be done with a debugger at a breakpoint. The main idea is that if a functional block has been given only read access to a store, you get a generation/compile-time error if the block tries to modify the store.

It is inserted by the user. If it is missing, an error will be given. We think that you (and the review team) should clearly know what you are locking.

If you write inside a read lock, you get an error. Reading inside a write lock is allowed.

Because the locking is block structured, it is possible to check the accesses at generation/compile time and give errors if needed.
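
One possible way to get such generation/compile-time errors - an assumption for illustration, not necessarily how our generator does it - is to hand a block with only read access a const-qualified pointer to the store, so the C compiler rejects any write through it:

    typedef struct {
        int level;
        int alarm_limit;
    } shared_data_t;

    /* Connector handed to a block with read (shared) access. */
    typedef const shared_data_t *read_connector_t;

    /* Connector handed to a block with read/write (shared_W) access. */
    typedef shared_data_t *write_connector_t;

    int reader_block_step(read_connector_t x)
    {
        int over = (x->level > x->alarm_limit);  /* reading is fine          */
        /* x->level = 0; */                      /* would be a compile error */
        return over;
    }

    void writer_block_step(write_connector_t x, int new_level)
    {
        x->level = new_level;                    /* writing is allowed here  */
    }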

Reply to
Ari Okkonen

The priority ceiling protocol works for single-processor systems where only one task can be scheduled at a time. Do you intend to support multi-processor (multi-core) systems? If so, how will you make the priority ceiling protocol work for them? The multi-processor extensions of the priority ceiling protocol seem to need the same kinds of task queues and real "locks" that the single-processor PCP neatly avoids.

Reply to
Niklas Holsti

We have to do some redesign of our kernel for multi-processor systems anyway. Any pointers to texts on multi-processor PCP (or the multi-processor mutual exclusion problem in general) are highly appreciated.

Reply to
Ari Okkonen

Before writing my comment above, I looked at this report:

Jim Ras and Albert M. K. Cheng, "An Evaluation of the Dynamic and Static Multiprocessor Priority Ceiling Protocol and the Multiprocessor Stack Resource Policy in an SMP System," Proceedings of the 15th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS 2009), pp. 13-22, 2009.


Reply to
Niklas Holsti

Looks can be deceiving. I hope your design process does not base its conclusions on mere assumptions and hearsay.

Your assumption seems to be that process synchronization is achieved by the scheduler alone, without the addition of explicit locking and unlocking. What if a process holding a resource has to sleep for a millisecond or so, giving another process the chance to access that resource? Havoc would ensue!

So, explicit registration of resource ownership still has to take place. Refer to my question above.

I don't see how that would help. How do you envision one should debug performance problems?

Reply to
Boudewijn Dijkstra

Boudewijn Dijkstra wrote: [snip]

Our design process utilizes experience and experimental coding of solution choices. About experience: I have implemented and used both mechanisms. (My history with real-time started with some maintenance work on the Honeywell H1642 time-sharing OS in 1973. Then I read Per Brinch Hansen's Operating System Principles and wrote my first RTOS microkernel for the Honeywell H316 in 1974 as a hobby project. I have never given up real-time/embedded since then.)

Application engineering considerations:

If using shared stores with locking, it is quite easy:
1) lock the stores you need,
2) read and/or update them,
3) release them.

To do the same with messages you have to:
1) define messages for data queries and updates,
2) write a task or tasks to contain the data and to answer the messages,
3) prepare for interleaved new events and server responses (explained below),
4) for data writes, encode and send the messages,
5) for data reads, encode and send query messages, define a wait state for waiting for the answer, and decode the answer.

In addition, every operation involving several shared values must be programmed into the data-containing task in order to keep system-wide consistency.

It is possible that a new event arrives for processing while your task is waiting for a response from a data server task. You have to implement a mechanism to cope with that. (E.g. SDL has a save operation for that issue.)

So, it seems (measured by words, at least) that applying the message-based solution is more complex.
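
For contrast, a sketch of the message-based read with a made-up message API; the message types, the wait state, and the saving of interleaved events are the extra machinery listed above:

    typedef enum { MSG_QUERY_LEVEL, MSG_LEVEL_REPLY, MSG_OTHER_EVENT } msg_type_t;

    typedef struct {
        msg_type_t type;
        int        value;
    } msg_t;

    extern void send_to_data_server(const msg_t *m);
    extern void receive_next_event(msg_t *m);     /* waits for the next event      */
    extern void save_for_later(const msg_t *m);   /* like the SDL "save" operation */

    int query_level_via_server(void)
    {
        msg_t query = { MSG_QUERY_LEVEL, 0 };
        msg_t event;

        send_to_data_server(&query);              /* encode and send the query     */

        for (;;) {                                /* wait state for the answer     */
            receive_next_event(&event);
            if (event.type == MSG_LEVEL_REPLY) {
                return event.value;               /* decode the answer             */
            }
            save_for_later(&event);               /* interleaved event: postpone   */
        }
    }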

Run-time considerations:

The first experimental codings suggest that the locking and unlocking operations together amount to about 12 simple C statements plus a scheduler call in unlock. (We have not measured clock cycles yet.)

The message solution requires at least 900 clock cycles on an ARM7, including two scheduler calls - switching tasks back and forth.

So, it seems that running the message-based solution takes more time.

Our kernel is of the run-to-completion type. There are no operations available that can put a process to sleep. In any case, holding a resource while sleeping or waiting for something may invite trouble in hard real-time.

Scenario: a certain high-priority response is sometimes late, and you have to find out where the time goes in that specific case. The investigation is quite hardware dependent: you need debugging writes to output ports, oscilloscopes, and/or logic analyzers - or a good in-circuit emulator. I don't know of any generic software help for that. Perhaps the locking operations could provide a hook where hardware-dependent measurement code could be attached.
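
For example (the names and the register address in the comment are made up for illustration), such a hook could be a pair of user-definable macros called by the lock and unlock operations, so probe code can be attached without touching the generated code:

    #ifndef LOCK_TRACE_ENTER
    #define LOCK_TRACE_ENTER(store_id)  ((void)0)   /* default: no overhead */
    #endif
    #ifndef LOCK_TRACE_EXIT
    #define LOCK_TRACE_EXIT(store_id)   ((void)0)
    #endif

    /* A user could instead define, before this header is included, e.g.:
     *   #define LOCK_TRACE_ENTER(id)  (DEBUG_PORT |=  (1u << (id)))
     *   #define LOCK_TRACE_EXIT(id)   (DEBUG_PORT &= ~(1u << (id)))
     * so an oscilloscope or logic analyzer shows how long each lock is held. */

    void store_lock_traced(int store_id)
    {
        LOCK_TRACE_ENTER(store_id);
        /* ... the normal priority-ceiling lock would go here ... */
    }

    void store_unlock_traced(int store_id)
    {
        /* ... the normal unlock and scheduler call would go here ... */
        LOCK_TRACE_EXIT(store_id);
    }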

Reply to
Ari Okkonen

Or, an event queue.

Indeed. Using messages implies that you think beforehand about your transactions. It enforces data encapsulation. Defining wait parameters forces you to think about real-time constraints, something you should also do when considering resource availability.

If you just mindlessly lock and unlock without knowing how you are influencing other tasks, you will (in a sufficiently complex system) introduce problems at a late stage of the development process, which is well known to be an expensive time to solve errors. Also, a messaging philosophy forces you to work with a client-server model, which has the advantages of being easy to distribute and, more importantly, of promoting re-use of its loosely coupled elements.

Talking about distributed systems, is ReaGOS able to lock and unlock remote resources? Or should one rather use a message interface there?

If a lock implies a wait, then also a task switch is needed. In messaging, the task switching follows the data flow.

Then what happens when trying to lock a resource that is in use? Surely the process will wait until it is available?

Yes, but not all operating environments are run-to-completion; synchronous I/O calls typically involve waiting.

Reply to
Boudewijn Dijkstra

Boudewijn Dijkstra wrote: [snip]

Yes, true. It is easy to miss the real-time responsiveness if you don't know what you are doing. The same goes for many other aspects: priorities, algorithmic complexities, etc.

Exactly. Locking a resource is comparable to disabling interrupts: you must know what you are doing, because it delays some other response. You must know the affected processes and ensure that they have enough slack not to miss their deadlines. In the design process you could define maximum locking times for the resources, if needed - and test them.

Yes, the client-server model makes many issues clearer in complex systems. However, it may make many issues more complex and slower in otherwise simple systems. The new mutex feature in ReaGOS does not prevent you from using client-server arrangements when needed; it allows you to use a straightforward and simple solution where speed and simplicity are important.

ReaGOS is not a distributed kernel. Messaging is needed between processors.

When the priority ceiling protocol is used, locking does not imply a wait. You can always lock if you are scheduled. After you have locked, no other task that could even think about locking the same resource can be scheduled. Such tasks can be scheduled again after you have released the lock.


As explained in the priority ceiling protocol reference above, a process that could try to lock the resource is not even scheduled while the resource is locked.

The problem is avoided because ReaGOS does not support any synchronous calls: there is no call that can cause waiting. (That may require re-thinking some usual design patterns.)

Reply to
Ari Okkonen

Thanks for your explanations. I certainly got the feeling that I've learned something.

Reply to
Boudewijn Dijkstra

Thank you for the reference. I got the paper. There seem to be several design decisions to make for a multicore system.

The current state of our project is:

  • Locking seems to work (for single-processor systems)
    - Priority ceiling protocol -> no deadlocks, no priority inversion
    - Separate read-only and read/write locking
    - You cannot inadvertently access shared data without proper locking; the locking policy is enforced by the CASE tool, the design language, and code generator support
  • We are now finalizing the installer for the development system.
  • The system will be released as ReaGeniX/ReaGOS version 2.2.4

Many thanks to all of you who have posted comments and tough questions for us to consider. Contact me if you are interested in evaluating or exploring the solution.

Best Regards Ari Okkonen


Reply to
Ari Okkonen
