By "controller" I assume you mean "supervisor" and not the "irrigation controller" I've been talking about?
If the "supervisor" dies, the shit has hit the proverbial fan. What do you do if *the* CPU in your single-CPU product catches fire? The supervisor is one of those "critical resources". Potentially, someone could redesign the system to implement redundancy with a protocol that allows the clients to elect/select a new master. But, that's beyond my level of interest.
When/if the supervisor fails, there is nothing to dole out work to the nodes. When they finish doing whatever they have been tasked with doing, they just "generate heat". I could potentially add a daemon that automatically powers down idled nodes but I suspect that would be a rare enough event that it wouldn't buy me much ("OK, the nodes are now ineffective but at least I'm not wasting power keeping them UP!" :-/ )
If you meant (above) the *irrigation controller*, then it depends on what part of the irrigation controller has died. E.g., part of the controller (a *virtual* part!) must reside on the physical node that is connected to the solenoid valves -- because it needs to be able to actuate the individual valves as part of "controlling irrigation".
But, the code that implements each of the zones can reside on any processor in the system. Potentially, the workload manager can dispatch the controller for zone 3 to processor node 8; zone 4 to node 14; etc. If one of these dies, the workload manager can sense this fact (think: keepalives) and dispatch the task to another, functioning node -- at the same time, instructing the *physical* irrigation controller node to ignore communications from that "failed" node that had previously hosted this zone.
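The keepalive-based failure sensing above can be sketched in C. This is a hedged illustration only; `keepalive_tick`, `KEEPALIVE_LIMIT`, and the `node` struct are invented names, not the actual workload manager's API:

```c
#include <stdbool.h>

#define KEEPALIVE_LIMIT 3   /* missed beats before a node is declared dead (assumed) */

struct node {
    int  missed;    /* consecutive keepalives not answered */
    bool alive;
};

/* Called once per keepalive period for each node; 'answered' says whether
 * the node responded this round.  Returns true at the moment the node is
 * declared failed, so the caller can redispatch its tasks elsewhere and
 * tell the physical controller to ignore the failed node. */
bool keepalive_tick(struct node *n, bool answered)
{
    if (answered) {
        n->missed = 0;      /* node is healthy; reset the count */
        return false;
    }
    if (++n->missed >= KEEPALIVE_LIMIT && n->alive) {
        n->alive = false;   /* sensed failure: time to redispatch the zone */
        return true;
    }
    return false;
}
```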
First, the network is dedicated -- no "other" traffic on the wire. *Anywhere*. Second, it is physically secure -- you can't just plug/unplug something. It's not like a NoW where you could conceivably unplug a particular workstation while it is actively part of the grid. (The same is true of power to these satellite nodes, to a large extent). And, finally, the network/protocols are hardened. So, even if you could inject traffic onto the network, the best you could hope to do is deny service to *one* node -- the node whose network drop you have infiltrated.
[This is effectively the same as unplugging the node since it can no longer communicate with the rest of the system. For some nodes, they can continue to fulfill their roles in a fail-secure mode. E.g., the HVAC controller will continue to keep the house in the "habitable" zone though it might not be particularly *comfortable* to occupants if left in this mode indefinitely. It wouldn't necessarily know when to alter the temperature based on occupancy if it can't *sense* occupancy -- reported by some other node in the system! But, at least the pipes won't burst in winter and the pets won't collapse from heat exhaustion in the summer!]
If the network is unreliable, then something is broken. You treat it as any other failure. I.e., if you had some timing constraint on when you received the results of a query from the RDBMS and that constraint was violated, you had some recovery procedure in place. (E.g., HVAC system asks RDBMS what the setpoint temperature should be at this time of day and gets "no reply". "OK, I'll play it safe and pick XX. This might not be ideal in terms of comfort or economy but it's better than a NoOp!")
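That "play it safe" recovery can be sketched in C. A minimal sketch, assuming a hypothetical `query_setpoint` wrapper around the RDBMS call; `SAFE_SETPOINT` and all names are invented for the illustration:

```c
#include <stdbool.h>

#define SAFE_SETPOINT 20.0   /* conservative fallback, degrees C (assumed value) */

/* Hypothetical RDBMS query: returns true and fills *setpoint on success,
 * false if no usable reply arrived within the timing constraint. */
static bool query_setpoint(double *setpoint, bool simulate_timeout)
{
    if (simulate_timeout)
        return false;        /* constraint violated: no reply in time */
    *setpoint = 22.5;        /* pretend the RDBMS answered */
    return true;
}

/* Recovery policy: use the reply if it arrived in time, else fail safe. */
double effective_setpoint(bool rdbms_timed_out)
{
    double sp;
    if (query_setpoint(&sp, rdbms_timed_out))
        return sp;           /* timely answer: use it */
    return SAFE_SETPOINT;    /* "OK, I'll play it safe and pick XX." */
}
```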
I can log these sorts of failures. Unfortunately, I have not been able to come up with a strategy beyond this! I.e., what do you report to the homeowner? (I'm not as worried about commercial/industrial deployments because they will typically have someone skilled in these "physical plant" issues) What do you suggest as possible remedies? Or, do you resort to a (useless) "Check Engine" light?? :< ("Yeah, I checked the engine. It's still there!")
[I also am completely at a loss regarding how to deal with the privacy issues involved. Your *phone* leaks information about you via the countless apps you install. Imagine when your
*house* starts leaking information!! "He went to the bathroom at 3:27AM. Watched the following TV shows. Ate supper at 6:22P. Slept for 4.5 hours. etc." How do you give a "normal Joe" control over all that information and what apps are allowed to see and consume? In a meaningful way that doesn't just have him blindly consenting to each app's desire to poke at specific data?? :< ]
It's nice if you can quantify your performance. But, I look at it as: the deadline has passed; is there anything I can do to still meet my goal? If not, it's HRT.
In the case of the tape transport, the value of obtaining the data from the head diminishes rapidly after the deadline has passed -- especially as that 6us operation turns into a 2sec operation (backspace, reverse, read again). If the transport couldn't backup, then it would have become an HRT problem: once the spacecraft has left the solar system, no use trying for orbital insertion! :>
The important distinction is that it has nothing to do with the "magnitude" or "frequency" of the times involved. "Hard" and "soft" are not synonyms for "fast" and "slow" (or "often" and "seldom").
I am always puzzled by those "definitions" (for lack of a better term) that seem to ignore the obvious: real time is about windows/time slots in which an action is effective. I.e., too early is as bad as too late.
For a few trivial examples:
In a package sorting installation, the diverter that pushes out a box from a conveyor belt must operate when the box is in front of it, not after, not before.
Firing a rocket to slow down a space craft for reentry must happen at the proper time, not "before" a certain time.
Pushing down the cork in an automated bottling facility works best when the bottle is below it.
That seems very reasonable to me for a large, important, useful class of problems.
SRT perf stats of the type I mentioned are common within the telecom industries, where performance, end-user irritation, and (in limiting pernicious cases) correctness are appropriately described by "mostly met" time intervals.
More ambiguous would be the case where if your bid arrives after a competing bid, then it is useless. It matches your HRT definition, but the time interval cannot be specified in advance. Such considerations are paramount for the high frequency trading brigade to the extent they are spending the best part of $1bn for their own dedicated trans-Atlantic fibres which will enable them to shave 30ms off message latency!
There is a bit of sleight of hand, here. A task cannot complete before it has begun. So, if you alter the release time of the task (the time at which it becomes *available* to execute), you can shift the earliest completion point for the task. I.e., the start of your "window". Then, the deadline becomes the *end* of the window.
Typically, when you are encountering these sorts of applications, you design so that you have the "answer", reliably, some time "around" the start of the window and then defer its effectivity until after the formal start of the window.
The delaying action ensures the result is never *applied* before the window start while the deadline delimits the definition of "too late".
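The window discipline above can be sketched in C. A hedged illustration only; the times are abstract ticks and all names (`apply_in_window`, the `verdict` enum) are invented for the sketch:

```c
/* The result is computed early, but its *application* is held off until
 * the window opens, and it is rejected once the deadline has passed:
 * too early is as bad as too late. */
typedef enum { HOLD, APPLY, TOO_LATE } verdict;

verdict apply_in_window(long now, long window_start, long deadline)
{
    if (now < window_start)
        return HOLD;       /* we have the answer, but it isn't effective yet */
    if (now > deadline)
        return TOO_LATE;   /* missed the window: late answer = wrong answer */
    return APPLY;          /* inside the window: the action is effective */
}
```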
We do this in hardware designs, too. Set up signals and propagate them into their intended circuits "at the next clock edge", etc.
Ah, makes sense. Yes, here telco services (at least LAND lines) are regulated to the level of performance they must provide. (I don't think there are any such guarantees for cellular service?) And, this is evident in people's expectations of that service!
I.e., if you pick up a phone (land line) and *don't* get dial tone (I think 2 seconds is the target?) you are really puzzled. OTOH, if you turn on your cell phone and get "no bars", you just grumble and do a pirouette hoping, magically, that the reception will improve somewhere during that spin! :-/
[I once lost service on a landline for a prolonged period of time. It was an unnerving feeling. You *expect* power outages but not *phone* outages!]
Assuming the third party *acts* on the earlier bid (?)
I do not understand why these misconceptions persist. Don't folks teach these things anymore? Or, have so many folks resorted to "real fast" examples that the message becomes distorted: people thinking "real fast" is the issue and not "timeliness"?
I see this in lots of places. Folklore displacing science. (e.g., don't use dynamic memory allocation; don't use recursion; etc.)
Worse yet, the abundance of resources in modern hardware makes people oblivious to the costs of those resources (and ways to economize on them).
They may not be regulated, but internally inside networks the latency is still specified - and measured so that one subsystem supplier can push responsibility onto another supplier.
More importantly, it is impossible to regulate the vagaries of the radio propagation channel. People often complain that multipath reception causes problems, but in mobile comms it is *required* in order to get reception in non-line-of-sight positions!
Two reasons:
- "those that can, do; those that can't" teach the next generation
- as employment expands and becomes "old hat", the brightest minds no longer go into the field and the top 1% are replaced by the top 10%
If I was starting out again, I'd go into synthetic biology.
The "correct reason" for avoiding malloc and recursion is to enable the possibility of correctness proofs. But you know that.
More subtly, interpreted = slow. To thoroughly confuse them, I point them to HP's Dynamo experience where an emulated processor running C outperforms the same non-emulated processor running optimised C (because the emulation enables the system to optimise the code that is actually executed, rather than the code that the compiler is forced to conservatively guess will be executed).
That's because time isn't as symmetric as you make it out to be.
We can always delay delivering a result that we already have before it's needed, but there's no way we can deliver a result that we don't have yet.
Which makes the diverter operation synchronized, rather than realtime. The algorithm that figures out whether the diverter ought to be fired, on the other hand, is a realtime task. If that algorithm completes half an hour before that package comes by, that's no more than a minor nuisance. But if it runs a millisecond late, it has failed.
When I set out, I looked for a mainstream language that was reasonably safe and efficient that would act more like a "scripting" language. I.e., most of the heavy lifting that an application would need would be available *to* the application as system services. The application would "simply" (ha!) tie those services together in a meaningful way.
I settled on Limbo as it is sort of a "safer C" -- easy enough for folks conversant in C to pick up quickly. But, also including inherent support for IPC, strong type-checking, support for concurrency, etc.
And, under Inferno, it provides a VM platform that makes things like pushing a process to another processor relatively painless. As well as affording some protection mechanisms for collocated tasks (and their remote communications!)
There are a few things about the language and environment that I'm not thrilled with, but, so far, it seems to have been a good choice.
I opted for a full-fledged RDBMS (currently, PostgreSQL) as it lets tasks push a great deal of work back into the server. E.g., let the server implement the joins, triggers, checks, etc. instead of requiring the client/app to do all that detail. (remember, clients are reasonably strapped for resources!)
It also helps ensure *all* clients of a particular DB follow the same constraints applied to the data *in* each table. So, some client doesn't add an entry that is incorrect and screw up some *other* client who expects, e.g., "person.age > 0" to have been enforced *in* the data! Otherwise, each client would have to implement more comprehensive checking on the data -- and, be capable and consistent in reporting any problems it encounters *to* the user! (i.e., "Age must be greater than zero" reported by one client's tests while another client opts to say "Not old enough", etc.)
(It also lets me avail myself of advances in the development of the RDBMS just by "upgrading" my implementation to "-CURRENT" as appropriate)
I'm happy with the software environment. But, having more problems than I anticipated with the instrumentation! :<
Ah, OK. You are referring to networks in the more modern sense (I was interpreting your comments in the classic telecom sense of "PSTN")
Yup. But, here, there is a silent push for telecom providers to move away from "wired land lines" as they are more costly (?) to maintain *and* subject to stricter regulation on QoS than wireless services.
E.g., rather than replace existing/damaged land lines, providers would like to provide residences with "immobile cellular stations" that give them access to the wireless network at their "fixed" home-line.
Actually, I was fortunate to have reasonably good instructors. But, I think it was a different era -- where you had to exert more discipline over your "art" than the current reliance on automated tools, "fleshy" languages, bloated libraries, etc.
Yes. And employers look to find ways to "dumb down" the requirements for the folks they "need" in those positions. Software bloat, hardware overkill, etc. Makes you wonder what will happen when we eventually *do* hit a limit and suddenly have to relearn Ye Olde Ways to make continued advances! :< (hopefully, that'll be past *my* time! :> )
I like mechanisms. (I guess you can argue that creating an organic mechanism would be comparable). Hence, I am not usually interested in "desktop software":
"Oooooh! Look! I made it blink!" "BFD. Watch me cut a pentagonal hole in this piece of steel...".
Actually, I think more often it is just fear of "common" bugs. (then why not learn how to use these things more effectively?)
I happen to be a huge fan of recursion as it makes so many algorithms *so* much easier to implement correctly, out-of-the-box
*and* easier to understand (i.e., the equivalent iterative solutions always look like there is a lot of housekeeping going on *in* the code -- lots of opportunities for "little errors"). But, I'm also careful to make sure something in the data implicitly controls the depth of the recursion.
For example, I use a recursive "matching" algorithm in one of my TTS systems. But, rather than letting the "input" drive the recursion, I let the *template* do so -- since I can ensure all of the "const" templates have implied limits to the recursion (but I can't make that same guarantee when processing arbitrary "input")
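The template-driven idea can be sketched in C. A minimal sketch, not the actual TTS matcher: `match` and the `'?'` wildcard convention are invented for the illustration. The point is that the recursion consumes the (fixed, trusted) *template*, so its depth is bounded by the template's length by construction, regardless of what arbitrary input arrives:

```c
/* '?' in the template matches any single input character; everything else
 * must match literally.  Each recursive call advances the template by one
 * character, so depth <= strlen(template) -- a compile-time-known bound
 * for "const" templates. */
static int match(const char *tmpl, const char *input)
{
    if (*tmpl == '\0')
        return *input == '\0';          /* template exhausted: input must be too */
    if (*input == '\0')
        return 0;                       /* input ran out before the template */
    if (*tmpl != '?' && *tmpl != *input)
        return 0;                       /* literal mismatch */
    return match(tmpl + 1, input + 1);  /* recurse on the *template* */
}
```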
The malloc argument I think is just a failure of folks to think about how memory is consumed in their application and, how the heap is administered/implemented in their runtime. E.g., I can often guarantee that malloc handles requests in *constant* time. So, why *not* use it?
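One way a constant-time guarantee can be arranged is a fixed-size block pool behind the allocator. A hedged sketch, with invented names and sizes; alloc and free are each a couple of pointer operations, and there is no fragmentation because every block is the same size:

```c
#include <stddef.h>

#define BLOCK_SIZE  64    /* illustrative sizes */
#define NUM_BLOCKS  16

static union block {
    union block   *next;                /* link while on the free list */
    unsigned char  payload[BLOCK_SIZE]; /* storage while allocated */
} pool[NUM_BLOCKS];

static union block *free_list;

void pool_init(void)
{
    /* Thread every block onto the free list. */
    for (size_t i = 0; i < NUM_BLOCKS - 1; i++)
        pool[i].next = &pool[i + 1];
    pool[NUM_BLOCKS - 1].next = NULL;
    free_list = pool;
}

void *pool_alloc(void)                  /* O(1): pop the free list */
{
    union block *b = free_list;
    if (b)
        free_list = b->next;
    return b;                           /* NULL when the pool is exhausted */
}

void pool_free(void *p)                 /* O(1): push back onto the free list */
{
    union block *b = p;
    b->next = free_list;
    free_list = b;
}
```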
Yup. Or, "hand optimized" is better than the compiler's optimizer. Or, vice versa. Too much folklore and not enough hard/empirical data! Yet, people rely on these assumptions to make significant design decisions... :<
Although the usual easy definition of hard real-time is: "Late answers are wrong answers" there are a few occasions when doing something too early is also wrong. Hard real-time is really about doing things at the right times.
Stephen Pelc, stephenXXX@mpeforth.com
MicroProcessor Engineering Ltd - More Real, Less Time
Oh, but "the library [or worse, framework] takes care of all of that for us". Including solving the byzantine generals' problem - I think not. :(
It has been pushed back for a *few* years by the cheap implementation of multicore processors. But that fails for more than a few cores due to inescapable memory bandwidth and latency and coherence problems.
Long term I think it will have to go down the non-coherent memory plus message passing route.
Wow, I made something that'll outlast the human species :) or :(
Problems arise when the system has been designed and/or implemented by multiple companies/teams/people. Look at the problems inherent with using libraries in C++! (If C++ is the answer, I want to know what the question was!)
Yeah, Erlang is worth understanding even if you don't actually use it. Some of its principles can be used in other languages.
Interesting, I didn't realize anyone was using it. I might look into it. One of the authors later developed Go, as you're probably aware. I've been looking at Go a little bit, and some of my co-workers have been using it.
And libraries never have bugs? (how many *digits* in a microsoft release number??) :<
I think these "dumbing down" exercises suffer from the problem of allowing developers to be ignorant of what's inside the box. Even on a conceptual level!
One reason I enjoy writing in C so much is that I can get a real feel for what resources are being used, how complex my algorithm is, etc. Some of these other "languages" have too much smoke and mirrors obfuscating what's going on at the pins of the CPU!
Yup. Hence my approach to distributing the automation system (NoRMA, etc.). Of course, it's the sort of application (or, "application set") that lends itself well to decomposition.
[Folklore, malloc, recursion, etc.]
*** AND POORLY DOCUMENTED! ***
Three of the big (huge!) problems I see with many FOSS projects are:
- no "ownership" (no one takes responsibility for ensuring the quality and consistency of the "product")
- no formal testing (where's the regression suite? Do you expect every developer who touches the codebase to implement his/her own test suite? Replicating the work of others? Or, do *none* of them take on this task?? "Leave it to the users to find the bugs!")
- no formal documentation (what is the product *supposed* to do? Does anyone know? Or, is it "self-documenting": it does what it does!)
PostgreSQL has been a refreshing defiance of these problems! :>
C++ was good for pushing the OOP paradigm into mainstream thought. But, it tends to be too heavy-handed in how much it does for (to?) you -- all the while claiming it is making your life easier!
OTOH, it has made it much easier for folks examining my (C) codebase to get used to the structure that I embed in my data, "objects", etc. ("Why all these damn structs all over the place???")
One cute feature I enjoy in Limbo is the use of "tuples". So, I can make the return of sophisticated types and pseudo types more obvious: [silly example] (x, y, radius) := describe(a_circle) is a bit more obvious than:
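For contrast, a hedged C sketch of the out-parameter style the tuple replaces; `describe` and its circle representation are invented for the illustration:

```c
/* The C equivalent forces out-parameters (or a one-off struct), and the
 * reader has to chase the prototype to learn what comes back: */
void describe(const double a_circle[3], double *x, double *y, double *radius)
{
    *x = a_circle[0];
    *y = a_circle[1];
    *radius = a_circle[2];
}
```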
I.e., the original tuple example has implicit in that one statement the declaration of the types of the returned parameters -- as well as exposing those parameters to the reader directly! (you don't have to chase down the definitions of the "result")
(there are loads of things that I *don't* like in Limbo so I revel in the few that I *do* like! :> )
PSTN and ISDN are essentially circuit switched (or TDMA) with a guaranteed throughput and constant latency.
Being downgraded from some ISDN 2B+D service to some packet switching system is definitely a loss.
At least in Europe, service providers put 4 cellular phone links into a single 64 kbit/s B channel.
Going from circuit switching to packet switching saves a lot for the service provider, when no capacity is needed, when the customer is "silent".
I have absolutely nothing against using malloc(), but I definitely refuse to use free() in a critical industrial control system :-) :-).
At least in industrial control systems the main reason for avoiding malloc is the dynamic memory pool fragmentation over decades of operation.
Such systems are designed to work 10 (contractual requirement) to 50 years continuously without reboots. In many practical systems, the next available boot time for clearing a fragmented memory pool might be possible at Feb 29th, 2016, when some big mechanical updates are done at the plant.
At least that was the situation, when dedicated high-reliability (and expensive) hardware was used.
Unfortunately these days, quite often some unreliable COTS hardware is used, forcing the use of double or triple redundant systems.
With a redundant system, rebooting a node is not a big deal, so memory-leaking applications, or applications that fragment the memory pool, can be tolerated.
A redundant system makes it possible to hide some long time design bugs, which in the long run is a serious situation.
While in the 1970s it was relatively easy to write an assembler program based on the original specifications, when trying to hand-code some tested/working FORTRAN programs I usually failed (speed/size).
With current processors, the compilers do a much better job. However, in some rare cases, it is important to be able to tell what exact code (such as a machine instruction) should be generated.