Resource revocation

Don Y · 2013-07-25T19:23:46+00:00

Hi,

What are the current "best practices" regarding asynchronous notifications (in a multithreaded environment)?

I have a system wherein "tasks" (omit a formal definition) request resources from a service that meters out their use, waiting until the resource has been "officially" granted to them (in some cases, this is all trust based). When done, they surrender the resource to the service, where it can be reused by other consumers.

But, there are times when the service must revoke a granted use of a particular resource. In some cases, it "asks" for the resource back (giving the current consumer time to tidy up before releasing it). In other cases, it just *seizes* the resource -- and notifies the consumer after-the-fact.

Presently, I use signals to notify the consumer when this sort of thing is happening. But, my personal experience is that folks have problems writing this sort of code: *remembering* that they have to register a handler for the signal; remembering that said handler can be invoked at any time (including immediately after it has been registered); etc.

Is there a new "safer" way of implementing these types of notifications?

Thx, --don

Don Y 12 years ago

In my cases, services are often physical ("real world") resources. E.g., client1 would like to use water (at a certain rate of flow) to irrigate some portion of the landscape. Service grants this request. Some time later -- before client1 has finished using the resource -- client2 would like to use water to "take a shower".

The service knows (because of codified rules) that taking a shower is a higher priority (in terms of "importance") than watering the yard (because the system has defined it to be so!). In an ideal world, service could notify client1 and *command* that it release the resource, "now". Await acknowledgement that it has released the resource, then grant it to client2. Presumably, client1 will immediately issue a request for the resource -- which the service will defer -- because client2 is now using it and has a higher priority (think preemption) than client1!

"Service" can perform neither client1 nor client2's actions with that resource -- doing so would require intimate knowledge of their respective applications.

The flaw in this "cooperative" approach is client1 need not comply with that "command". Or, might be sluggish in complying. Then, client2 sees an artificial delay in when it can legitimately use the resource. Similarly, it might be reluctant to relinquish it when the fire suppression system client *needs* it. (i.e., these are all just different "priorities"/importances to the service.)

How long do you allow the current user to hold onto the resource after notification that it MUST be relinquished? What if your message got lost on the network -- does your protocol include time for a reissued "command" to be delivered SO client1 CAN COOPERATIVELY RELINQUISH THE RESOURCE? What happens *in* client1 when it "discovers" the resource has been withdrawn?

(I.e., I contend client1 has to be able to handle the asynchronous revocation of the resource *without* notification, regardless.)

So, you have to implement a fail-safe mechanism whereby you can forcibly revoke the service from a noncompliant client -- and deal with the consequences of that.
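A minimal sketch of that fail-safe, in Python purely for illustration: the service first asks for the resource back, waits a bounded grace period, and seizes it if the holder has not complied. The names `Client`, `Service`, `on_release_request`, `on_revoked` and the grace period value are all hypothetical, not part of the system described in this thread.

```python
import threading

class Client:
    """Hypothetical client; `cooperative` controls whether it obeys release requests."""
    def __init__(self, name, cooperative=True):
        self.name = name
        self.cooperative = cooperative
        self.holds = False
        self.revoked = False                  # set on forced seizure
        self.released = threading.Event()

    def on_release_request(self):
        # A cooperative client tidies up and hands the resource back.
        if self.cooperative:
            self.holds = False
            self.released.set()

    def on_revoked(self):
        # After-the-fact notification: the resource is already gone.
        self.holds = False
        self.revoked = True

class Service:
    """Grants one resource to one holder; preempts it for a more important requester."""
    def __init__(self, grace_seconds=0.05):
        self.holder = None
        self.grace = grace_seconds

    def grant(self, client):
        self.holder = client
        client.holds = True
        client.released.clear()

    def preempt_for(self, new_client):
        old = self.holder
        if old is not None:
            old.on_release_request()                # ask nicely first
            if not old.released.wait(self.grace):   # grace period expired
                old.on_revoked()                    # seize, then notify
        self.grant(new_client)
        return old
```

The grace period bounds how long client2 can be delayed by a sluggish client1. Note that the holder must cope with `on_revoked` either way, which is the point made above.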

[For other examples, consider a client that wants to operate the air conditioner compressor while another has already been granted use of a certain amount of AC power for the electrically fired kiln. While air conditioning seems like it should be a high priority/importance task, turning off the kiln before the glaze is completely baked will probably alter the appearance of the piece -- possibly rendering the work "useless" (along with all of the electricity expended to bring it to that point).

Or, imagine the resource is the MIPS available on some processor that is underutilized in the system. Or, some portion (KB/MB) of a fixed size, persistent store ("Yeah, I know you'd love to store your children's birthdates. And, I know we told you, previously, that you could use this space for that. But, we *really* need to store this critical log message until someone has a chance to review it!")]

See above. Server is just metering out resources so the only "action" it can provide is granting or denying access to that resource.

But that's how I see these things. In much the same way that a scheduler invocation INTERRUPTS a currently executing task's access to the "CPU".

The difference is the interrupted task has a contract with the scheduler that it will *try* to resume the task WHERE IT LEFT OFF at some later point in time. AS IF the interrupt had never happened (unless the task is actually watching the wall clock or anything else that changes with the passage of REAL time!). I.e., the task is not asked for its *consent* to be interrupted -- even if it might never be resumed!

Similarly, a task can be killed asynchronously. There is no prerequisite to notify it, in advance (pre-notification can make its shutdown more orderly -- but, also requires more elapsed time *and* coding!)

Or, whose messages/replies aren't being delivered due to network congestion, delays, etc. (uniprocessor and SMP techniques don't map directly into the distributed world where these guarantees are a lot harder to meet *in* a timeliness context!)

[Remember, a message has to get from its originator, through that scheduler (i.e., when is the originator allowed to execute?), through the network stack (is it an RT stack??), onto the media up through the receiving stack (again, RT?) and dispatched to a task running at some unknown LOCAL priority before that task can even *see* it.]

By contrast, notification (e.g., signal) that the resource has ALREADY been revoked has an implicit "fait accompli" -- regardless of how long the message took to transit.

Neither is a particularly great way of dealing with the problem. But, cooperation just seems too ripe for abuse. Even well-intentioned clients might be busy when the request (command) arrives. And, may have complex requirements to wind down their use of that resource in a timely fashion. E.g., they might be acting as an agent/server for some related "resource" that some *other* client is currently using. And, that client will need to be "notified" that it must release that resource... before the original client can release the resource it has been commanded to release.

(How long do you sit patiently -- deferring the needs of that original client2 -- while you try to be polite/accommodating of client1 et al.?)

I'm willing to live with signals *if* I can come up with a clean framework that makes it easy for developers to implement them, correctly. (so a client isn't chugging along in ignorance thinking that it still owns that resource that has ALREADY been revoked and repurposed).
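One shape such a framework might take, sketched in Python under stated assumptions: the grant hands back a lease object that cannot be constructed without a revocation handler, and every use of the resource goes through a validity check. `Lease`, `Revoked`, `revoke` and `check` are hypothetical names for illustration only; this is not a signal-based implementation, just the bookkeeping idea.

```python
import threading

class Revoked(Exception):
    """Raised when a client touches a resource it no longer owns."""

class Lease:
    """Hypothetical handle returned by a grant; the handler is mandatory."""
    def __init__(self, on_revoke):
        if not callable(on_revoke):
            raise TypeError("a revocation handler is required")
        self._on_revoke = on_revoke
        self._lock = threading.Lock()
        self._valid = True

    def revoke(self):
        # Called by the service, possibly from another thread.
        with self._lock:
            if not self._valid:
                return            # already revoked; notify only once
            self._valid = False
        self._on_revoke()

    def check(self):
        # Clients call this before each use instead of assuming ownership.
        with self._lock:
            if not self._valid:
                raise Revoked()
```

Because the handler is an argument of the constructor, a developer cannot "forget" to register it, and there is no window between the grant and the registration. The handler can still run at any time, so it has to be written with that in mind.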

[E.g., you *think* you are watering the yard but, in fact, there is no water flowing through *your* pipes! Or, you *think* a CPU is busy working on some aspect of your "job" but it has been reallocated to some other purpose... you'll be waiting *forever* for it to complete!]

Yet another example of how bending requires more thought/work than simply throwing excess capacity at a problem! :-/

--don

Don Y 12 years ago

I should add (though it complicates the discussion by adding another agency to the mix) that your scenario sort-of happens currently. There are "trusted tasks" at different places in the system that act as gatekeepers to these physical resources.

E.g., a trusted irrigation task has actual control over the irrigation valves (hardware). So, the "water resource manager" can expect it to comply with *requests* to relinquish that "service" granted to it (e.g., in case someone wants to run a shower).

But, the implementation of a particular irrigation algorithm is delegated to an untrusted task. I.e., that task requests water and, when granted, "tells" the trusted irrigation task to activate the valve associated with that untrusted task (access control). So, that "service" can be regarded as acting on behalf of the (untrusted) client instead of having that untrusted client actually controlling the valve (which would require some upstream mechanism to allow the water to be "reclaimed" from a noncompliant client)

[The same is true of things like compute resources. A trusted service actually meters out access to the resource because there is no other way of ensuring its release/redeployment]

But, the untrusted client still needs to be aware of when it has and doesn't have that resource!

--don

Anders.Montonen 12 years ago

Your description reminds me a bit of the capability-based resource management model used in the seL4 kernel. There's a bunch of papers available at

-a

Don Y 12 years ago

Yes, under Mach you used ports/TxRx "rights" as a convenient capability system. Since these were unforgeable objects "secured" by the kernel, you could predictably and verifiably ensure only intended clients accessed particular resources at particular times (e.g., if you didn't CURRENTLY have the capability, then you couldn't CURRENTLY access the resource -- even if you could do so a few microseconds ago!)

There, a resource could be modeled by a port (communication object) and a capability by a *right* (to access that port). The capability allowed you to *ask* for something to be done on your behalf (by the service that was listening on that port... i.e., "implementing" that resource). Creating multiple ports for a single "logical" resource allowed different actions to be possible with that resource.

E.g., the server can listen on port A for (preauthenticated!) requests to read *the* "file" represented by the resource. (i.e., read_file(portR) directs the IPC/RPC for a "read" request on the file represented by that "portR"). Attempts to write that same file might only be honored when the file (resource) is accessed via some *other* port (portW). So, incoming requests received by the server handling that file resource that arrive on portR are ignored (or, trapped!) if they contain the "write file" message -- because portR intentionally doesn't handle write requests. Thus, only clients holding a right for the portW resource (which is really the same file referenced by portR!) can legitimately send write requests to the server for that file.
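The portR/portW arrangement can be mimicked in a few lines of Python. This is only a model of the idea: in Mach the kernel enforces the rights, whereas here a plain set stands in for that enforcement. `FileServer`, `grant` and `send` are hypothetical names, and treating portW as write-only is an assumption.

```python
class FileServer:
    """Hypothetical server: one file reachable through two ports with different abilities."""
    # Which operations each port is willing to handle (portW as write-only is assumed).
    ALLOWED = {"portR": {"read"}, "portW": {"write"}}

    def __init__(self, contents=""):
        self.contents = contents
        self.rights = {"portR": set(), "portW": set()}   # port -> holders of a right

    def grant(self, port, client):
        self.rights[port].add(client)

    def send(self, port, client, op, data=None):
        if client not in self.rights[port]:
            raise PermissionError("no right for this port")
        if op not in self.ALLOWED[port]:
            raise PermissionError(f"{port} does not handle {op!r}")
        if op == "read":
            return self.contents
        self.contents = data      # only "write" reaches this point
        return None
```

A client holding only the portR right can read the file but gets an error if it sends a write message there; writes succeed only through portW, even though both ports name the same file.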

[There were other ways of doing this as well]

But, this presents the same sort of problem. When you want to revoke that capability (resource privilege), you either engage in a cooperative exchange ("request/command" its return) *or* you revoke it unceremoniously and count on some notification mechanism to inform its previous "holder" of your action.

E.g., you can ask the client to relinquish the portW/portR rights that it holds (since you know these). And, wait for their return in an orderly fashion before doling them out to some other client (this assumes you have mutually exclusive access to the resource).

Or, you can *silently* revoke the capability -- letting the client THINK it still has access to the resource (file) even though it doesn't! The next time the client tries to access the resource (i.e., tell the server to do something on the client's behalf), the server can return an error. The client then fields this error and tries to figure out *why* the operation couldn't complete as expected.
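A rough Python sketch of the silent variant, using a generation counter so that a stale capability fails on its next use. The generation-counter technique is my choice for the illustration, not something described in the thread, and `CapTable`, `StaleCapability`, `grant`, `revoke` and `use` are hypothetical names.

```python
class StaleCapability(Exception):
    """The capability was revoked some time before this use."""

class CapTable:
    """Hypothetical server-side table: a capability is (resource, generation)."""
    def __init__(self):
        self._gen = {}                       # resource -> current generation

    def grant(self, resource):
        gen = self._gen.setdefault(resource, 0)
        return (resource, gen)

    def revoke(self, resource):
        # Bump the generation; the previous holder is NOT told.
        self._gen[resource] = self._gen.get(resource, 0) + 1

    def use(self, cap):
        resource, gen = cap
        if self._gen.get(resource) != gen:
            raise StaleCapability(resource)  # client discovers the loss only now
        return f"ok:{resource}"
```

The client learns of the revocation only when it next calls `use`, which may be long after the fact, and the error by itself does not say *why* the operation failed.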

Or, you can asynchronously notify the client (via its exception handler) of this action and let the client recover *when* it has been revoked.

Experience with the first (cooperative) approach has shown response can get sluggish -- there are more actions in the critical path (to giving the resource to another client!). And, it required the clients to be benevolent and cooperative. E.g., before they undertook any sort of blocking/synchronous action, they had to ensure they had spawned another thread to watch for these notifications and act on them while the "main work thread" was blocked.

But, this is essentially what an exception handler does! It fields these asynchronous notifications on behalf of the thread (process, task, job, etc.) to which it is associated. So, why have two mechanisms to do the job of one?

E.g., if you divide by zero and haven't registered an exception handler to deal with this possibility ("catch"), then your task is killed (the "default" exception handler!)

Tanenbaum uses cryptographic tokens for capabilities and, thus, has to physically pass them around (how do you "retrieve" a token that you have revoked if the holder doesn't want to -- or *can't* -- relinquish it?)

I'm trying to protect against malevolent/sloppy/lazy developers without penalizing everyone for this "protection". As well as deal with inevitable changes in resource assignments as the execution environment changes!

Thanks, I'll have a look through them!

--don

Paul Rubin 12 years ago

Who claims that?

That sounds like a misstatement. I can believe that verification is much harder in general if you use dynamic allocation, and therefore some certification methodologies preclude its use. That doesn't say reliable RT with dynamic allocation is impossible. It just means if you use it, you're going to have to deal with either a less rigorous or a more complicated approach to certification. By restricting the program's dynamicness (e.g. like SPARK Ada does) you can do some powerful verification with fairly easy to use tools.

Roberto Waltman 12 years ago

And for SW maintenance, knowing how to conjure the pixies of QIOs, ASTs, and SYSGEN...

Roberto Waltman [ Please reply to the group, return address is invalid ]

Roberto Waltman 12 years ago

While at the same time discouraging the use of preprocessor macros ;)

It is common knowledge that Stroustrup developed "C with classes"/C-front as a tool to work on simulation projects for which the Simula language would have been ideal, except for performance issues.

But I never read anything referring, directly or indirectly, to how/when the switch was made from "Let's make a tool to solve a problem" (or set of problems) to "Let's make a general purpose programming language".

If this had been the goal from the beginning, the language may have been quite different.

upsidedown 12 years ago

ASTs (Asynchronous System Traps) are essentially Windows callbacks.

Most interesting QIO parameters are available in Windows callback parameters.

SYSGEN was essential for PDP-11 system building. On VAX/VMS, the question was really whether you could get some essential performance improvement by doing some kernel mode tweaking; on Windows NT 3.x, VM tweaking was quite hard.

Boudewijn Dijkstra 12 years ago

Why be so nice? Surely most clients are perfectly OK with getting cut off "now" and receiving a notification "circa now".

In this case I'd say the action is "deliver water", and requests could specify a definite amount of water or time, or leave it undefined.

Normally fire suppression is more important than any damage caused from taking the resource by force.

Before any HRT deadline, I suppose.

Absolutely.

Or forcibly revoke by default and allow clients to indicate whether they'd like an advance warning (which is not guaranteed to be timely in case of network hiccups).

So airco has high priority and kiln has low priority initially, but changes to critical priority once baking has started. (Which is still below emergency priority.) If you were illustrating a problem, I don't see it.
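The scheme in the previous paragraph amounts to a priority that depends on the consumer's state. A tiny Python sketch, with the level names taken from the text and everything else (`Priority`, `kiln_priority`, `may_preempt`, the numeric values) assumed for illustration:

```python
from enum import IntEnum

class Priority(IntEnum):
    LOW = 1
    HIGH = 2
    CRITICAL = 3
    EMERGENCY = 4

def kiln_priority(baking_started):
    # Low until the glaze is in the kiln, critical once baking has begun.
    return Priority.CRITICAL if baking_started else Priority.LOW

def may_preempt(requester, holder):
    # Only a strictly more important request may take the resource.
    return requester > holder
```

With these definitions the air conditioner (HIGH) can take power from an idle kiln but not from one that is baking, while an EMERGENCY request still wins in both cases.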

In these cases the clients need to be able to detect the availability of the resource, either via a guaranteed notification or some other sense.

It all depends on how damaging each decision is. In case of water or electric power the clients might detect when the resource is going away. They might have an internal buffer for winding down or they might have a special procedure in case of immediate cut-off.

Indeed.

Don Y 12 years ago

Well, to cover the widest class of apps and implementations, it would be nice to be able to receive notification so you could do any cleanup, etc. E.g., if the resource was "primary power", you might want to spin down any mechanisms that are currently in motion before the power is forcibly removed.

For a compute resource, you might like to be able to checkpoint its progress instead of losing ALL of its state when it is KILL'ed. Etc.

At the other end of the spectrum are some of the cases I illustrated. E.g., you *have* to be able to cope with a loss of power, water, etc. -- whatever the resource -- because the system itself has no guarantees that these resources will be available for its own use! (i.e., if you can't handle losing power asynchronously, how will you handle a REAL power outage/"blackout"?)

Yes, but that's a trivial case (and, if you look at my followup where I qualify how I actually manage "water" with a "trusted" intermediary, you can see that this is how things work AT ONE LEVEL in that particular application).

OTOH, imagine the resource is computational. Or, associated with some particular I/O device that isn't directly attached ("serviced") by the "server" (e.g., a persistent data store). The server may not *have* the resource to be able to *supply* it. E.g., the workload manager doesn't have gobs of spare MIPS. Rather, it keeps track of where excess capacity exists in the system and allows clients to avail themselves of that capacity (resource) -- based on *its* metering of the resource.

Yes. But you could pick other pairs of uses and get different results (e.g., the ACbrrrr vs. kiln, below)

(Ignoring S/H issues...) but the task being *asked*/commanded to release the resource has no knowledge of the deadlines (priority, importance, etc) of the prospective new consumer of that resource!

Nor should it! (?)

I.e., a preemptive scheduler (anything beyond a simple RR) embodies the knowledge of priorities, deadlines, etc. in a system -- not the tasks themselves. Hence, we don't *ask* a task to yield the CPU; we just *take* it. And the tasks know that this is a condition of their execution (in this environment).

For a client to know that it must release a particular *use* of a particular resource in a specific time, it would need to understand other aspects of the system beyond its own (e.g., the requirements of the client that will NEXT be using this resource).

But, this means every service (or agency) has to support both options: unceremoniously revoke a resource that it wants/needs to dole out to another client/consumer *plus* provide advance warning (and somehow fold notification of the timeliness constraints for *this* revocation into that notification) if the client so desires.

I.e., instead of "burdening" the client with being able to *just* handle the asynchronous notification of a resource's revocation, you now require all services/agencies to implement both capabilities!

(Note that an "app" can just as easily be a "client" or a "service"/agency! So, now there is even more you have to hope the developer "gets right")

One possible approach would be to allow the service to advertise its "revocation policy" (e.g., "I do not provide advance warning of revocation"; "I will provide advance warning -- within some limits that I might not advertise, here"; "I will never asynchronously revoke a granted resource"; etc.). And/or allow clients to condition their acceptance of a resource (e.g., "I want this resource but only if you will provide me XXX prior notification of its future revocation"; "I want this resource but only if you can guarantee it to me for "; etc.)
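To make the idea concrete, here is a small Python sketch of an advertised policy being matched against a client's conditions. The field names and the matching rule are assumptions made for the example (`RevocationPolicy`, `ClientTerms`, `acceptable` are hypothetical); a real scheme would need more dimensions, such as guaranteed hold times.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class RevocationPolicy:
    """What a service advertises. warning_seconds=None means no advance warning."""
    may_revoke: bool
    warning_seconds: Optional[float] = None

@dataclass(frozen=True)
class ClientTerms:
    """What a client insists on before accepting a grant."""
    min_warning_seconds: float = 0.0
    tolerate_revocation: bool = True

def acceptable(policy, terms):
    if not policy.may_revoke:
        return True                  # never revoked: any client is satisfied
    if not terms.tolerate_revocation:
        return False
    offered = policy.warning_seconds or 0.0
    return offered >= terms.min_warning_seconds
```

For instance, a client that wants at least one second of warning would decline a grant from a service advertising `RevocationPolicy(True, None)` and accept one advertising `RevocationPolicy(True, 2.0)`.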

[This can quickly get unwieldy -- though you could build a library/service to manage this sort of thing in a reliable manner ON BEHALF OF other services!]

If you embody "policy" in the service provider (i.e., let *it* decide how it can forcibly revoke previously granted resources), how much of the details of a particular consumer's use of that resource does it need to be aware of? How do you allow for the introduction of new knowledge as new consumers are developed or brought on-line?

I.e. you *really* would like the client to have some influence over this policy (future-safe). Yet, at the same time, you don't trust him to be impartial in this voice!

"Oh, it's REALLY REALLY REALLY important that I never be denied access to a resource! I am SUPER IMPORTANT. Trust Me. You have my word on it. -- Joe Isuzu"

[Forgive the cultural reference]

It's available when it was (past tense) granted to the client. And, the client went on its merry way "knowing" that it owned that resource.

Now, later, you change your mind and revoke that resource (possibly without any forewarning -- the client might NOT have a "local copy" to fall back on, etc.)

[I'm just trying to illustrate the different ways *each* approach can screw you]

Yes, but deciding what the current "consequences/priorities/importances" of a given resource assignment are gets tricky. Do you put all of the decision in the service/agency (i.e., the mechanism that is doling out the resource(s))? Leave it up to the consumers to *declare* (and/or implement) their requirement constraints? Or, some combination of the two?

E.g., in a trusted environment, you can freely share this responsibility so that you better find the performance/reliability "sweet spot" for a given set of resources and consumers.

But, when you don't inherently trust a consumer to "be fair", it gets much more difficult.

And, I haven't even addressed things like potential for deadlock! I.e., how do you implement a "safe" mechanism ACROSS A VARIETY OF SERVICES implemented on numerous different servers to ensure "tasks" (which may be consumers and producers) don't get into deadlock WITHOUT KNOWLEDGE OF OTHER tasks in the system?

There just doesn't seem to be much prior art to address this sort of issue... And, the size/distributed nature just makes it that much more difficult to "nail down" systemically.

--don

upsidedown 12 years ago

I do not understand why you insist on giving the resource to the client and then want to revoke it. Why not simply let the resource stay under the control of the server and only act upon it at client request?

Why not divide the water into quanta, say 10 liters/quantum? The irrigation system must then ask for more water when the previous quantum has been consumed. In the simplest case, a shower request will be granted once the previous irrigation quantum has been consumed; in a more complex case, the irrigation quantum is pre-empted and the delivered amount is returned to the irrigation client.
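A minimal Python sketch of the simplest case, assuming the server is asked for one quantum at a time and serves the most important pending request at each quantum boundary. `QuantumMeter`, `request` and `grant_next` are hypothetical names, and the priorities are plain integers chosen for the example.

```python
import heapq

class QuantumMeter:
    """Hypothetical meter: hands out fixed-size quanta, highest priority first."""
    def __init__(self, quantum_liters=10):
        self.quantum = quantum_liters
        self._queue = []      # entries are (negated priority, arrival order, client)
        self._seq = 0

    def request(self, client, priority):
        heapq.heappush(self._queue, (-priority, self._seq, client))
        self._seq += 1

    def grant_next(self):
        # Called each time the previous quantum has been consumed.
        if not self._queue:
            return None
        _, _, client = heapq.heappop(self._queue)
        return client, self.quantum
```

In this simple form the shower waits at most one quantum's delivery time, and no revocation notice is ever needed because the irrigation client holds nothing beyond the quantum it already consumed.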

Three remote controlled valves in a water feed labeled "1", "2" and "3" are quite generic and can be used in many different ways.

For instance you could connect a fire hose to valve 1, shower hose to valve 2 and the irrigation hose to valve 3.

The server does not have to know, what is behind those valves. It just has to obey the rule that valve 1 has priority over valve 2 which has priority over valve 3.

George Neuner 12 years ago

Hi Don,

Abridged "pocket sized" dictionaries - which students carried in the days before computers were required for kindergarten - have conflated SINCE with BECAUSE for decades. I have a 30-some year old dictionary from high school that does so.

e.g., Since: adverb : because; inasmuch as: Since you're already here, you might as well stay.

I believe this usage now has found its way into all dictionaries, although a *good* one will indicate that the usage is informal and conversational and should not be preferred.

I recall reading a newspaper article - maybe 10 years ago - about how Oxford and Webster were among the last holdouts and were finally "correcting" their collegiate and unabridged editions to "bring them into line" with common usage.

To old farts [like us] who know the difference, it may be confusing when a use of SINCE leaves out the temporal referent, but it is not "completely wrong" BECAUSE the meaning of SINCE subsumes the meaning of BECAUSE.

Moreover, it can be argued that there is an unspecified temporal referent in the use of BECAUSE ... at least if your physics demands that cause precede effect. This argues that SINCE and BECAUSE are related closely enough that they might be considered synonyms.

[ I'm just messing with you - it's very hard to resist a good "since for because" flame fest 8-)

I was schooled with the notion that the dictionary was the arbiter of language ... not a reflection of it. IMO it should be very difficult to get a new word or meaning into the dictionary, but that doesn't seem to be the case any longer.

I wouldn't mind so much if the informal and colloquial meanings of words were clearly marked. However, many dictionaries simply assume readers understand that the ordering of entries in a definition gives precedence to the meanings and indicates their appropriateness to be used in formal writing.

Indeed readers might understand had they read the dictionary's howto preface pages. But most people who have used a printed dictionary have never read the preface, and many students today have never even looked at a printed version. Many web dictionary sites I have seen don't even have a page explaining how to read definitions ... they simply assume users already know. And that is a problem because most users don't know. ]

As always, YMMV. George

Don Y 12 years ago

Because the needs and priorities of the system may dictate *new* "priorities/importances" for the use of said resource!

You are thinking in terms of a uniprocessor -- where the service runs on the same hardware host as the "client". What happens when the service runs on hardware that doesn't have physical control of the resource?

E.g., wrt water:

The irrigation "system" consumes water. The washing machine. The dryer. The dishwasher. The ice maker. The water softener. The swamp cooler. Each sink in the house. Each hose bibb (outside?). Each toilet. Tub & shower.

I can't control the flow of water to all of these (without significantly replumbing the house). But, some are a lot easier to control because they already have electromechanical valves in place.

Do you run wires from each of these "loads" (clients) to one central place so *one* piece of software can control ALL of the valves?

Do you put a little bit of the "water service" in each physical device (MCU) that has control over a valve?

Do you delegate responsibility for the valve to a TRUSTED agent running on each such device ON BEHALF OF the water service?

(Hint: pick door #3!)

But, even if you pretend that trusted agent *is* the water service, how do you inform the UNTRUSTED client (app) that it is no longer allowing water to flow as expected by the client? I.e., "Yeah, I told you I would turn the water on -- *this* water -- but I have now changed my mind".

What if the client, having asked (and been granted) for the water to be turned on, is now busy watching soil moisture levels? When the moisture level reaches a threshold, it will ask for the water to be turned off.

But, it never reaches that level! Because it's asynchronously been turned off!
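A sketch of how such a client might avoid waiting forever, in Python for illustration. It assumes the revocation notice arrives as a `threading.Event` set by whatever delivers the notification, and it adds a poll limit as a backstop in case the notice is lost. `water_until_moist`, `read_moisture` and the parameters are hypothetical.

```python
import threading

def water_until_moist(read_moisture, threshold, revoked,
                      poll_seconds=0.01, max_polls=1000):
    """Return "done" on reaching the threshold, "revoked" if the grant was
    withdrawn, or "timeout" if neither happened within max_polls checks."""
    for _ in range(max_polls):
        if revoked.is_set():
            return "revoked"
        if read_moisture() >= threshold:
            return "done"
        revoked.wait(poll_seconds)    # sleeps, but wakes early on revocation
    return "timeout"
```

The loop blocks on the revocation event rather than on a plain sleep, so the client reacts promptly when the water is taken away instead of watching a sensor that will never change.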

What if the resource is a compute resource? I.e., a client needs some extra horsepower to perform its mission. It requests some "available MIPS" from wherever in the system there might currently be surplus capacity -- including the possibility of bringing some capacity "on line".

The client is granted the resource, dispatches the desired workload and goes on its merry way. But, The System later decides there is a more important use for that compute resource. Perhaps the hardware associated with it has direct control of some I/O resource that was previously unneeded but now *is* needed. E.g., the processor controlling a PTZ camera wasn't needed (perhaps powered off!) when the initial MIPS request was received and had been allocated to this client, initially. Now, however, someone has rung the doorbell and the processor is needed to interface to the camera and push compressed video to a display, somewhere.

Do you, instead, give the requesting client "so many MIPS for a fixed time quantum" and expect him to renew that request, again. Just so you can artificially insert synchronous "revocation points"? And, resign yourself to *not* revoking the resource at any time other than these?

Isn't this like a scheduler only allowing preemption on the jiffy? Or, coincident with the invocation of any system call? (and nowhere else)

But what if the water load is connected to valves Q and %? (see above)

Again, the problem of "notification" -- when and how -- still remains.

Richard Damon 12 years ago

As others have pointed out, yes they have (at least many of them). The big question here is how do we determine the meaning of words in a language, and in general it is by how they are used in a manner accepted by the general population as correct. In some cases, groups are set up to define a "standardized" definition of a word in specific technical context (like ISO) to allow things to be talked about with better precision. Going back to the web site you pointed to, the author seems to have no claim to this, so that when the author of that site starts by saying that the community has mis-defined key terms and he is going to assign different labels to what those terms did mean, and new meaning to the labels, he has admitted that he is going to abuse the language of the field and has put his work into a land of gobbledygook, and NONE of his results can safely be moved out of his frame to anything else. It goes back to the old standard of logic that

    if A -> B
    not A
    therefore ??? (to claim not B is a fallacy).
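The fallacy (usually called denying the antecedent) can be checked by brute force over the four truth assignments. A short Python sketch; the names `implies` and `counterexamples` are mine:

```python
from itertools import product

def implies(a, b):
    # Material implication: "A -> B" is false only when A is true and B is false.
    return (not a) or b

# Assignments where both premises hold (A -> B, not A) and yet B is true.
counterexamples = [(a, b) for a, b in product([False, True], repeat=2)
                   if implies(a, b) and not a and b]
```

The list is not empty (it contains A false, B true), so "not B" does not follow from the two premises.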

To use proper language, the normal claim is that you SHOULDN'T use dynamic memory. Note that as a GUIDELINE, it IS a good rule, (provided you understand why), the biggest issues are:

1) Programs that use dynamic memory, especially on the smaller machines common in real-time systems, MUST be prepared for the allocation to fail. If a critical operation might fail due to a failed memory allocation, then we need to either be ready to handle that failed case, or be able to mitigate the failure.

2) Dynamic memory acquisition can have a poorly bounded time, which can make it hard to prove that a deadline WILL be met.

As with most "rules" for programming, violation with proper justification/documentation should be acceptable.

I myself have used dynamic memory in real-time critical systems. In the last case the mitigating factor was that I HAD preallocated enough buffers (with safety margins) to handle the system's rated load. In the case when requests were coming in above our rated maximum rate, the system might exhaust the preallocated buffers, so it would allocate additional ones from the heap (and mark them so they would eventually get returned to the heap when done). Allowing these heap allocations reduced the level of degradation when overloaded (the choices were to either fail the requests for lack of buffer space, or do the dynamic memory operation and fail only if the heap is now exhausted). That heap space was also used by a few non-critical routines, which would only be used when the system was under a light load.
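The bookkeeping behind that scheme might look roughly like the following. Python is used only to show the structure; it says nothing about timing, and a real implementation would live in C or similar. `BufferPool`, `acquire`, `release` and the counter are hypothetical names, not the code from the system described.

```python
class BufferPool:
    """Preallocated buffers for the rated load; heap fallback only on overload."""
    def __init__(self, count, size):
        self._all = [bytearray(size) for _ in range(count)]   # keeps pooled buffers alive
        self._free = list(self._all)
        self._pooled = {id(b) for b in self._all}             # marks pool membership
        self._size = size
        self.heap_allocations = 0

    def acquire(self):
        if self._free:
            return self._free.pop()          # normal case: no allocation at all
        self.heap_allocations += 1           # overload: fall back to the heap
        return bytearray(self._size)

    def release(self, buf):
        if id(buf) in self._pooled:
            self._free.append(buf)           # pooled buffers go back to the pool
        # Heap buffers are simply dropped, i.e. returned to the heap.
```

Under the rated load `heap_allocations` stays at zero, so the timing argument for the critical path never involves the allocator; the counter also gives a cheap way to see how often the system ran above its rating.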

Note that depending on undocumented behavior of a processor is a good way to find your program "broken" on a later revision. Sounds like some of the "undocumented instructions" that people discovered playing with the processors (the designers didn't waste logic to try and detect some illegal op-codes). The programmers who coded using these found out that their programs no longer behaved on the next "upwards compatible" processor. It is very common that you only document the interface that you want to be responsible to maintain in the future, and not behavior, even if useful, that is accidental and you don't want to commit to maintain.

Note that even for ordinary English, it isn't a "Majority" rule, but more of a "Generally considered as correct" with a dash of inertia that requires changes to slowly be approved (first often with a "slang" tag). If the minority is preserving a well established meaning, then that is reasonable (and part of the inertia of language); if the minority is trying to redefine things to mean something other than what others could reasonably be expected to know them as, it is an abuse of language.

I am not saying that this is a totally worthless way to look at a system, but it does NOT follow the accepted definitions of Hard and Soft real time as used in the industry. In fact, he seems to define what is generally thought of as "Hard Real Time", where there are solid requirements that are expected to be met, as a different class of system (time-critical). This means the class of systems he is addressing is "non-time-critical", which may well be a useful class to talk about, but he poisons his presentation by abusing language. If he took the effort to develop a truly descriptive language for his cases, then perhaps his presentation would be more effective.

You don't seem to understand the concept of HARD requirements, or how to prove you meet them. These sorts of requirements typically list the set of operating conditions, and under these conditions you need to be able to guarantee that your system will meet a defined set of critical results, often including completion within specified periods of time. There normally are additional, but looser, requirements for operation under conditions beyond those for which the system has been specified with required performance, with things like best effort, and limiting problems that the system itself can cause to others.

As an example, a system might be specified to be able to stop a single missile coming in (and for this level of complexity, you are often given a required success rate instead of having to be perfect), but if in review it comes out that if two missiles come in you just give up, you are going to have to explain why you can't attempt to intercept at least one. Systems are normally required to gracefully degrade. There is a BIG difference between a system that has failed (failed to meet a requirement) and a system that has become "non-operational", especially during the qualification phase (where you are going to be made to fix it or you might not get paid), vs a failure under deployment, when the system needs to do its best to limit damage, and you will need to solve the problem later.

No, it can NOT. The problem is that, as far as I can see, the methodology assumes you can apply a zero value to a task that just isn't done, while in actuality, when you have hard requirements, this isn't a valid assumption. Of course, he has gotten around this by defining systems with this sort of hard requirement as "time critical" instead of real time.

The second problem with the methodology is that for systems where assigning values might make sense, in many cases the value really is only realized at the end of a whole operation, which may well depend on a number of sub-steps being successfully completed. These sub-steps need to get scheduled, but you can't really assign value to the individual steps, as they ALL need to complete on some sort of schedule for any "value" to be earned. This means that the technique being described may be helpful for higher level "planning" for the system, but may not be as useful for the fine scale scheduling of the individual sub-tasks (but the planning stage might be able to profitably influence that scheduling).
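A small sketch of that all-or-nothing point, in C. The types and names here (`substep_t`, `operation_value`) are invented for illustration and are not taken from the methodology under discussion; time is just an abstract tick count.

```c
#include <stdbool.h>
#include <stddef.h>

/* One sub-step of a larger operation (hypothetical representation). */
typedef struct {
    bool completed;     /* did the sub-step finish at all? */
    long finish_time;   /* tick at which it finished, if completed */
} substep_t;

/* The operation earns its full value only if EVERY sub-step completed
 * by the deadline; one late or missing step makes the whole thing worth 0.
 * There is no meaningful per-step value to hand to a scheduler. */
long operation_value(const substep_t *steps, size_t n, long deadline, long value)
{
    for (size_t i = 0; i < n; i++) {
        if (!steps[i].completed || steps[i].finish_time > deadline)
            return 0;
    }
    return value;
}
```

Splitting `value` evenly across the sub-steps would let a value-maximizing scheduler "earn" most of it by finishing all but one step, even though the operation as a whole delivered nothing.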

As to going "on the record", I will note that you have chosen to hide behind the mask of anonymity, and are NOT going on the record for yourself. As such I can't tell if you are really "competition", but if you are working in the domain that these techniques are aimed at, I doubt it, as I tend to have requirements that MUST be met.
