Managing "capabilities" for security

Hi George,

>> E.g., "23" interpreted as a file descriptor in task (process) A can

But there is no *requirement* that this be the case! I.e., if you printed the value of a file descriptor in three instances of an app running on the same machine -- along with three instances under some other OS -- you could expect up to 6 different "local names" for the same physical object.

By contrast, under Amoeba, 0xDEADBEEF12345678 pertains to exactly *one* object regardless of *who* has that value present in their local memory (as a capability_t). The identity of the object (along with the allowed permissions) is encoded *in* that value.

"23" has no such encoding (as a file descriptor *or*, in my case, a Handle)

I.e., Amoeba tasks can pass capabilities to each other via any number of communication methods WITHOUT THE CONSENT OR INVOLVEMENT OF THE KERNEL!

So, for example, a task can display a hex representation of a capability ON A CRT. A human can read that value and type it into another terminal. The task attached to *that* terminal can then store this in memory and present it AS A VALID CREDENTIAL with exactly the same permissions as the original!

No record of this transfer exists.

In Mach, all that first task could do is display the equivalent of a file descriptor to the user. If the user typed that into another process, it would be meaningless, there. You have to tell the kernel to transfer the capability (or a copy of it). It's not just "data".

But not invalidated if possessed by another. I.e., I can send copies of a capability to 100 other tasks. They have exactly the permissions that I had. The kernel is unaware that I have made these transfers -- it's just data! (that was the goal of Amoeba -- to get the kernel out of that business, contrast Mach).

Because the kernel is NOT involved in the duplication of existing capabilities (same permissions), it never knows how many such outstanding references exist.

Yes, as can mine. Same limitations -- the kernel is involved in creating *new* capabilities (different permissions on same object or different objects entirely). But, Amoeba allows exact duplicates to be made "for free". Mach doesn't. The kernel is involved in all Handle (port) operations. Even literal copies. So, it knows how many copies of each Handle exist at any given time (fudging the truth here in the distributed case).

For example, Mach had a "send-once" right. It allowed the holder to send *one* message to a port. It could not be copied by the holder (had to be created by the holder of the corresponding *receive* right) even though holders of regular send rights could ask the kernel to copy (or move/transfer) them.

So, a task could be given 38 send rights for a particular port.

38 messages later, the port stops working.

In Amoeba, once you know the ticket's "binary value", you can use it as many times as you want (unless the server backing it has imposed some limitation on its use -- which a Mach server could also layer atop any send right!).

I can *move* that send-once right to another task of my choosing thereby giving it the "permissions" that were tied to it (by its server). But, in doing so, *I* lose the ability to use it! In Amoeba, we would *both* retain the capability to use it!

Yes. My point was it allows a name to be revised without being "changed". The title of the book remains the same though its content/role has been altered.

Last year's pass for the park doesn't work this year.

But the reference is the same regardless of who "uses" it! If you give me a binary copy of a ticket, the kernel doesn't know that I now possess that capability (object, permission).

I can freely create a "constrained" capability from this and hand it to someone else. Who, in turn, can *duplicate* that and hand it to yet another.

The only way for any of us to know a ticket (capability) has been revoked is to try to *use* it. If I use mine or you use yours, we get told THAT capability has been revoked. But these other (nameless) two guys in my example have no idea that this has happened (unless one of us knows to tell BOTH of them).

Yes. But literal duplicates of a ticket can be made "for free".

Correct. Ditto Mach.

Correct, also ditto Mach.

In mach, deleting the port (Handle) deletes all instances of it. And, since the kernel knows where those instances are, those Holders can be notified when that happens.

It needs to know the current version of each capability that it will honor. Because actors can present stale tickets.

In Mach, the Handle is *gone*. The Handler never sees it again. Nothing to "remember".

This is the Mach approach -- a Handle is a Handle (a port is a port). The only version that exists is the current version. If I don't want you to have a Handle, I delete that Handle -- there's no way you can reconstruct it "from memory".

The way the "permissions/authorizations" associated with a particular handle are interpreted can change -- without the consent of the Holders (also true in Amoeba). E.g., if the File Server is flushing all buffers to disk because of an impending shutdown, it can choose to ignore the "write" permissions that had previously been granted in EXISTING Handles.

If it *wants* to notify each Holder of this, it can (because it knows which tasks are holding those Handles *now* -- no duplicates without the kernel's consent).

Yes! This is the single biggest drawback! But, it is the only way to enforce communication restrictions, etc.

E.g., in Amoeba, even bad tickets have to be processed by the backing server. On *that* server's dime (timeslice/resources). In Mach, bad Handles (no send rights) are caught at the local server on the offending actor's dime!

If a Handler sees an actor abusing his privilege (whatever that means), he can revoke the Handle (because the Handler has the receive right for the port!) and shut the actor up *at* the actor's kernel.

In Amoeba, the server is forced to validate (invalidate) each request as it comes in -- regardless of how often, etc. And, tie up RPC/IPC resources as well!

In Amoeba's case, RPC could be reasonably cheap because Amoeba expected processor *pools* -- lots of SBC's in a big box. You could afford fat pipes between boards, shared memory cards, etc. regardless of whether you created them initially. Mach could also support SMP & NUMA multiprocessors but handles NORMA by proxy. Invoking the communications for RPC is expensive. If you can determine that the connection shouldn't be attempted and inhibit it locally, you gain a lot.

(Imagine all the network traffic if a rogue/corrupt/adversarial actor kept issuing RPC's to an Amoeba server that was just going to repeatedly say NO_PERMISSION!)

[I am obsessed with DoS attacks as I see this as the achilles heel that will always remain vulnerable in interconnected systems. Witness the extremes I've taken in the design of my router/switch! :< Don't even *make* a connection if it's not supposed to exist!]

It's just an IPC! Amoeba won't let tasks issue IPC's without a capability?? Who validates *those* (i.e., acts as handler/server for IPC)?

It's still an IPC. Something on the other end has to validate on *its* dime. In my case, that's the kernel EXECUTING IN YOUR TIMESLOT. In Amoeba's case, the kernel has no role in IPC other than to pass the data along to the service encoded in the ticket -- wherever it may currently reside. That service has the only and final word on whether it wants to honor the ticket (at this time).

Am I missing some *new* addition to Amoeba that allows the local kernel to act as selective gatekeeper?

Correct. It also has to track the current state of each executing task, current memory allocations, runtime priorities, etc. I consider communication "rights" to be a key aspect of providing integrity to applications.

Think about it. How can an actor take down my system? It has the final say in only those objects that it backs. In all other cases, it is subservient to others on whom it depends. You can't abuse a service unless the service chooses to let you. If you become abusive, the service can revoke your ability to even *talk* to it!

It amounts to the same thing. They chased after UN*X because it would give them a "real application" to demonstrate. Hardly anyone would be interested in watching an "inverted pendulum" controller implemented on yet-another-OS.

So, much of the work went into supporting mechanisms that had no use outside of the UN*X context. E.g., the "environment server". And, many hooks were created just for these things! (e.g., the list of special ports provided to newly created tasks should only have been a *single* port from which all others could be derived).

The "BSDSS" (Single Server). Ditto POE (DOS single server). Then MachUX and eventually MachUS (or perhaps I've got the multiserver mnemonic backwards). Along with the RTMach and Mach4 branches. (I believe I probably have the largest "private" Mach archive in existence. I'm sure many of the older releases are no longer available on *any* public servers!)

NeXT followed this route with 2.5

But even the first "UX" server had too much of a performance hit precisely because of the "kernel congestion" you mentioned above. That's the mistake with pursuing a UN*X goal -- even temporarily. The mechanisms don't fit the application as well as existing UN*X "monolithic kernels" did.

You're gaining something (additional isolation, multiprocessing support, distributed processing, user-level services, etc.) and there's obviously going to be a cost. UN*X itself didn't have (nor apparently *want*) those features so it could be more efficient.

The RTMach crew just built on these same assumptions -- instead of saying, "UNIX is great -- for the problems that UNIX solves THE WAY UNIX SOLVES THEM. We're going after a different class of problems and a different methodology for solving them."

Exactly. But, the Mach API *under* another layer of abstraction just makes things worse, not better. Also, there were some problems with the possibility of cross contamination *within* the library -- no doubt another sacrifice to efficiency.

I think the Mach (and Amoeba) focus on application to existing usage paradigms is the fatal flaw. E.g., applications there are transient and invoked by a multitude of users on a multitude of objects. BIG objects.

OTOH, applying those mechanisms to a *control* system where:

- connections tend to persist for a LONG period of time (24/7/365)

- isolation means real dollars (task A doesn't end up scrapping all the product produced by task B)

- processors can have multiple cores

- as well as being physically distributed

- and compute costs decrease considerably faster than memory or communication costs

THEN, burning cycles in a kernel (cached) to save references to memory (local or remote) AND unwanted communication costs can have a big payoff.

But, I think you have to approach the OS with these things in mind and not "files" and TTYs, etc. Do you *really* need all these traps? All these function arguments? etc.

At the opposite extreme, see how things like L4 struggle to "run with the big boys". Yah, you can be lean and efficient and fast -- but, what does it cost to implement an expensive feature with a few primitives? And, what level of skill to do so with the same sort of efficiency? ("Wow! Kernel traps are twice as fast!" "Yeah, but you need twice as many...")

Reply to
Don Y

Hi George,

Brief (ha!) description of how objects come into being and how actors get access to them. And how their interactions are controlled and revoked.

[mixture of Mach-speak and each of the other lexicons we've loosely employed]

Sketching as you read may help!

When a task is created (by its creator), in addition to the typical resources packed in that container (memory, executable, etc.) there is a *Handle* to a Namespace_t. The Handle is effectively a send right to a particular port_t. That Handle is *handled* by the Namespace Server (Handler). I.e., the Namespace Handler holds the receive right for that port_t.

(remember the task is an object and *also* is represented by a Handle!)

Any communications directed to that port_t (Handle) end up *at* the Namespace Handler -- arriving on that specific port ("assigned" to that particular task). I.e., the Handle represents the task's Namespace.

All Namespace operations (methods) expect a Handle BACKED BY SOMETHING THAT UNDERSTANDS NAMESPACE IDL as an argument. Other arguments vary with the operation (method) being invoked. For example, lookup(name), create(name), destroy(name), etc.
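
As a rough sketch -- these stub names and types are mine, not the actual IDL output -- the client-visible face of the Namespace interface might look like:

  typedef int handle_t;     /* stand-in for a send right to a port_t */
  typedef int status_t;

  status_t ns_lookup (handle_t ns, const char *name, handle_t *object);
  status_t ns_create (handle_t ns, const char *name, handle_t  object);
  status_t ns_destroy(handle_t ns, const char *name);
  status_t ns_rename (handle_t ns, const char *old_name, const char *new_name);

Every method names the Namespace it operates on by its Handle; which methods actually *succeed* depends on what that Handle was created to allow.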

The creator that caused this Namespace object to be created (evident below) may have prepopulated the namespace with certain names that the task will recognize, by convention (can vary with task; e.g., may also include a name like "[parent]").

The creator may elect to allow the task to modify this namespace (add, delete, rename, etc.). Or, it may *lock* the namespace from the task's perspective.

Once the task is executing, it can lookup() particular names in its namespace that have been previously agreed upon, by convention. [note that the conventions need only be consistent between a task and the creator of its namespace! one task can know an object by the name "trunk release" and another could know an object by the name "boot opener". Names are just convenient mnemonics.]

Invoking the lookup() stub with the args required causes an IPC/RPC message to be created, args marshalled, etc. and then delivered to the port *in* the Namespace Handler that is associated with that task's Handle (Namespace).

*Because* the task holds a valid send right for the port, the kernel allows the message to be delivered.

The Namespace Handler receives the message, examines the method being sought and decides if messages of this type (method) are permitted ON THIS PORT! I.e., if the Handle had the appropriate "authorizations" defined when this binding was created. If not, it returns FAIL (NO PERMISSION).

Assuming the Handle for the task's namespace ALLOWS the lookup() method to be invoked by that task (WTF?), the args are unmarshalled and the operation (method) attempted. A result (if required) is then propagated back to the calling task (using a *reply* port that the task has included in the message -- i.e., you can support async methods if you plan on watching that reply port *later* for the reply to this operation!)

[the IDL allows various types of "methods" to be designed -- including "routines" that return no value.]
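
A sketch of that receive-side logic, with placeholder names -- msg_recv(), the method ids, and the per-port "allowed methods" mask are all mine, not literal Mach calls:

  /* One thread servicing the Namespace Handler's ports (or port set).  */
  void namespace_service_loop(void)
  {
      for (;;) {
          message_t  msg;
          port_t     port = msg_recv(&msg);      /* blocks until traffic */
          binding_t *b    = binding_for(port);   /* i.e., which task's
                                                    Namespace this is    */

          /* Is this *method* permitted on this particular Handle?       */
          if (!(b->allowed_methods & (1u << msg.method))) {
              msg_reply(&msg, NO_PERMISSION);    /* sent via reply port  */
              continue;
          }

          switch (msg.method) {
          case NS_LOOKUP:  do_lookup(b, &msg); break;  /* unmarshal,     */
          case NS_CREATE:  do_create(b, &msg); break;  /* act, reply     */
          default:         msg_reply(&msg, BAD_METHOD);
          }
      }
  }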

In the event of a lookup(), the task is often trying to establish initial contact with some other Handler -- to create an object of the type Handled by the Handler. Pretend we're interested in Files (bad choice but easy to spell! :> )

[Of course, objects of that type may already have been created by and placed into task's Namespace before the task began executing! The Handles for any of these will already have been bound to appropriate names in the task's Namespace. E.g., "log".]

Now it gets clever...

For initial contact with a named Handler, you need a "public" access method that is enabled "for all". The Handles into this service perhaps just implement create() -- for the type of object managed by that Handler (server). So, a task can contact the File Server to create a new file!

For untrusted tasks, when the creator builds the initial Namespace for the task, he doesn't bind the "trusted" public Handle into the server. But, instead, has the Handler create *another* such Handle. Same permissions (typically). Then, binds the name for this Handler into the Namespace that it has been building for the task. So, "__File_Server__" is bound to a particular port *in* the File Server.

And, this port may be different for each such task!

All such "public" Handles into the File Server are serviced by one or more threads. I can assign a thread per port (Handle), a thread for the entire service INTERFACE (not counting threads that actually BACK specific Files), or lump groups of ports into port sets so a single thread can handle requests coming in on ANY of these ports.

The task creates a File, for example. This causes the File Server to create a new port for that File and pass a send right (Handle) to that "object" back to the task in the reply to the create() method.

Note that this object now exists yet has no *name*. No problem! The task knows it by its handle. It can read and write it and even destroy it -- without ever giving it a formal *name*! Furthermore, it can pass the Handle on to someone else and *they* can read and write it as well (assuming the task doesn't tell the File Server to reduce the authorizations on a *new* Handle passed instead!). As long as there are Handles referencing the File, it exists.
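
In client terms (again, made-up stub names -- just to show the shape of it):

  handle_t fs, file;
  char     buf[] = "scratch data";

  ns_lookup(my_namespace, "__File_Server__", &fs); /* public Handle     */
  file_create(fs, &file);        /* reply carries a brand new Handle    */

  file_write(file, buf, sizeof buf); /* usable without ever naming it   */
  send_handle_to(helper, file);      /* helper can now write it, too    */
  /* ...and the File lives as long as *some* Handle to it still exists. */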

If it wants to *name* the File, it can (assuming its creator gave him that authorization for HIS Namespace) bind a Name to the object. THIS NAME ONLY MEANS SOMETHING IN THE TASK'S NAMESPACE! Meanwhile, some other task holding a copy of that Handle can create a different name for this same object (File) in *its* namespace! Or, remove the name, etc.

[Namespaces are just convenience objects. BUT, have the effect of limiting what you can talk to!! If you can't reference it, you can't talk to it!]

Now, assume someone holding a Handle to this File starts misbehaving. Keeps trying to write to the file. Doing this repeatedly when you have already been told (error return) that you don't have permission is abusive -- wastes the File Server's resources!

But, because that offender holds a valid send right to this object, the kernel will keep passing along those RPC's!

The File Server can take matters into its own hands and destroy the port on which the RPCs are arriving. The kernel will then know not to pass the RPC along -- the task has effectively lost the right to send to that port!

Ah, but a determined adversary just reconnects to the File Server via the public port and starts the process all over again!

*BUT*, the only way the adversary can talk to the File Server is via the public port that was created JUST FOR HIM! I.e., __File_Server__ maps to a specific port that, effectively, identifies *him* (his "public" connection).

When the File Server gave permission to access that particular File that was later abused, it knew which public port was associated with it! So, the File Server can delete *that* port thereby severing the task's only means of talking to the File Server publicly.
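
From the File Server's side, that escalation is just bookkeeping (a sketch, my names; the essential part is remembering which public port each File Handle was issued through):

  void punish_abuser(file_t *f)
  {
      client_t *c = f->issued_via;       /* public port that spawned it */

      port_destroy(f->port);             /* kill this File Handle...    */
      if (++c->strikes > STRIKE_LIMIT)
          port_destroy(c->public_port);  /* ...then sever his only route
                                            for asking for new ones     */
  }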

[You can also create initial "credentialed" connections whereby each actor identifies itself with a unique token -- a *port* that it creates exclusively for that particular connection creation operation. As ports are unforgeable, only a task holding this port can act *as* this actor!]

He can still exercise any Handles that he has outstanding for other File objects -- until he gets abusive with those, etc.

So, a server can isolate itself from its clients, selectively. (remember, everything is essentially a server -- for *something*!)

From the other side, when everything drops (destroys, discards) every Handle to an object, that object can no longer be accessed! So, it can be deinstantiated and its resources freed. (this of course is done by the Handler for that type of object)

E.g., when the last Handle into a particular Namespace is dropped, the Namespace itself can disappear (which, in turn, causes handles to all the objects referenced in the namespace to go away... etc.)

Similarly, when the last Handle into the Namespace Handler (!!) goes away, the Namespace Handler itself can go away!!

[an unreferenceable object is just a waste of resources!]

This is where the notification of Handler death comes into play. Automatic self-GC without a sweeper!

So, "init()" can do its job and then release the Handles that it held to all these "top level servers" *knowing* that the last task to reference any of them will effectively determine their fate -- without init() acting like a zombie clinging to them.

As I said, a picture makes things more intuitive.

Reply to
Don Y

No I didn't.

Amoeba doesn't have any notion of reference counting BECAUSE reference counting doesn't scale [nor does garbage collection, but that's a different discussion]. Amoeba was designed for campus area networks with an eye toward wide area networks.

If every owner ticket is lost, a persistent capability exists until an administrator manually deletes it (or the object it is associated with). That's why administrators get the big bucks.

George

Reply to
George Neuner

That's only true of the original implementation. Later versions expanded the ticket to include an "authorized user" field which variously could be a process, host or host:process identifier, or a admin level user/group id. The user field was protected by a 2nd crypt signature. This enabled servers to restrict who could use a ticket.

In Amoeba, such things are entirely up to the server. It can choose to issue single use tickets, or "numbered" tickets if it desires. The rights field is completely server defined - it can mean anything. And nothing prevented a server from associating additional state with a capability. [That, in fact, was facilitated by the capability handling library.]

There's no inherent limitation - just a different implementation.

However, AFAIK, Mach did not allow copying/marshaling port handles to another host: i.e. to enable a remote task, a server had to accept connections from anyone on the remote host. [Same model as Amoeba: first connect, then accept or reject.]

Only if the server allows it.

About 4 times now I have told you that Amoeba could restrict the user of a ticket.

No, the kernel doesn't know who has tickets or how many copies exist. Yes, a server can tell who is presenting it a ticket and decide whether the user is authorized.

They get told when they try to use it.

That's an administrative issue which has little to do with mechanism. In your system, revoked ports just evaporate leaving clients equally clueless unless *somebody* deigns to tell them why.

Your clients don't know their port privileges have been revoked until they try to use their handles. Someone could be sitting on a useless handle for hours, days or weeks (how long since last reboot?). How is that *substantively* different from a cached ticket in Amoeba?

Coding to force periodic communication isn't an answer.

Only if they were coded to respond to the appropriate signal. And Mach didn't allow signaling a task on a remote host.

That isn't substantively different. Amoeba servers don't have to remember capabilities for stale tickets.

And again, a Mach host serving tasks on a remote machine couldn't simply pull the plug on the remote machine. Mach was a bit more efficient at preventing unwelcome *local* communication, but the remote situation (sans proxies) was the same as with Amoeba. (And proxies could be done under any system).

"Remembering".

Not exactly - that's a system design issue.

Remember that an Amoeba task needs a ticket to access the network service. It's perfectly reasonable to assign virtual channels to remote services each having their own access tickets.

The result is more or less the same: "this party does not accept collect calls, enter your phone card digits now". No card, no call.

That's just shifting the load - in either case *somebody* has to do the work to reject the invalid attempt.

As I said above, Amoeba was perfectly able to isolate a process on its own host - it's a matter of system design unrelated to the capability mechanism.


Bootstrap tickets were provided by IDL, compiled into the code. Dynamic tickets accumulated as the process executed had to be managed by the program.

All the system calls had 2 versions: one that used the default IDL ticket and another that required an explicit ticket parameter [or sometimes multiple ticket parameters, depending].

There's a *BIG* difference between local IPC and remote IPC. By giving each "channel" separate access, the _local_ communication server gates usage.

So you save 1/2 a local IPC call. I don't see that as a tremendous benefit. Local IPC can be made extremely lightweight: e.g., you can do it without message copying.

Was it really the SS version? I may be misremembering because I played with different Mach versions, but ISTM that a server instance could run processes under multiple user ids ... you just couldn't run very many processes over any particular server instance.

Cool.

My recollection of v3 was that, so long as you stayed away from features that relied on Un*x signals, performance was quite good. ISTM that signaling was the big stake they couldn't drive.

YMMV.

I agree. There is no mechanism that is universally applicable.

George

Reply to
George Neuner

Actually, Amoeba *does* GC! Each server is responsible for discarding "unreferenced" objects whenever a GC cycle is invoked for/on that server.

The difference is, Amoeba doesn't know where all the "references" (handles/tickets) are. So, it has to, instead, implement GC by noticing which objects haven't been *referenced* (actively operated upon).

So, a ticket holder has to periodically tickle the object for which he's holding the ticket -- lest the server decide the object is no longer being referenced ("of interest") to anyone.

Amoeba had a "notice" (can't recall the name of the actual method) operation for each object for this express purpose. By moving the capabilities out of the kernel, *everything* associated with them was moved out of the kernel as well. Including the GC!

The "campus area network" goes to my other point re: the "mistake" Mach, Amoeba made in targeting "UNIX" (or, generic time-sharing service). It means they could assume things about the deployment environment (a sysadm) which, in turn, ensured they could only be applied to such environments.

Imagine (c.a.e) having to clear some accumulated references *manually* in an embedded device that runs 24/7 ("Ah, there's a resource leak in my HVAC controller. Guess I'll have to CYCLE POWER to clear those stale handles!" :> )

Contrast that with an approach where (kernel) is doing it continuously as a matter of its natural function.

[Note *my* assumptions about how I intend to deploy my system also place limits on how/where it can be applied. It would be an inefficient solution to a distributed time-sharing system implementation (though, given the resources available in "real computers" nowadays, that might not be a deal-breaker). All my design choices are oriented towards systems that have to stay up "forever" AND be unattended.]
Reply to
Don Y

OK. But, did the *kernel* make the decision as to whether to pass the message (IDL) on to the service (encoded in the ticket) based on an examination of that field? Or, did the kernel pass it on to the service and rely on the *server* to decide "You aren't the right person"? (in addition to "You don't have the right permission(s)"?)

I.e., as the kernel doesn't treat tickets as anything other than "data", it can't prevent me from sending "copies" of a ticket to lots of tasks, none of whom is the "authorized user" of that ticket. Who prevents them from pestering the server with this "inapplicable" ticket?

There are two different levels of "rights" (permissions, authorizations, etc.) in this discussion.

In Mach, there are "rights" (their term) that apply to ports. I.e., to communication. A task can own the "receive right" for a port (only one such right exists for each port). A task can be given

*send* rights for a port -- this essentially allows it to pass messages to whomever is in possession of the *receive* right for that particular port, at the present time (ports can migrate).

I.e., for a client to talk to a service, it needs a SEND right to a port on which the service is actively listening -- holding a RECEIVE right. Without the send right, the kernel won't even implement the IPC.

A "send once" right is a send right that is automatically destroyed by the kernel when used. I.e., send *one* message, no more.

For example, if I contact a service and want it to return an object to me, I can give it a port ON WHICH I AM LISTENING (holding the receive right). The service can (whenever -- incl async I/O) deliver that object to me (or, whomever I may have passed that port's receive right along to!) via that send right.

But, I am expecting ONE message from that service. I shouldn't have to delete that "send right" once the reply has been received. Nor should I have to leave that communication path open! If the right persisted, the service could keep pushing messages down the pipe that I would have to handle!

*Above* this set of "communication rights", there are "permissions", "authorizations", that the object's server implements (and enforces) for each object type and operation. The kernel has no involvement in these. If task A has the RIGHT to send to a port being received by task B, then the kernel arranges for whatever messages A opts to send to be passed along.

So, task B (server) could implement an "allow 38 create()s" capability -- and enforce it, itself (counting the number of create() IPC's that it handles for that particular sender).

Or, it could pass out 38 instances of a "send once" right knowing that the task on the other end can only invoke the service 38 times before the *kernel* blocks further communications.

I.e., where the counting is done is up to the service implementor. I can let you read my password *exactly* once.
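
Sketched from the server's side (placeholder names), the first option is nothing more than a counter the server keeps against each sender:

  /* Server-enforced quota: the counting -- and the refusing -- happen
   * on the *server's* dime.  The send-once alternative pushes that
   * refusal down into the sender's local kernel instead.              */
  status_t handle_create(client_t *c, message_t *m)
  {
      if (c->creates_used >= 38)
          return NO_PERMISSION;   /* kernel still delivered message #39 */
      c->creates_used++;
      return do_create(c, m);
  }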

Yes. The same is true for Mach servers. I.e., the thread that handles a particular port (which is being fed by some other client) can implement some particular set of methods and some particular set of limits on the arguments for each of those individual methods. Permissions aren't a "bit set" that you can add to or subtract from. I.e., there's no limit to the number of individual permissions implemented by a single "Handle". Nor do the permissions associated with one Handle have to be the same as the set supported for some other Handle (of the same object type).

Policy resides in the servers.

The NORMA server made ports transparent. You didn't know whether an IDL was going to result in an "IPC" or an "RPC". You just "talked to a port for which you had send right(s)" and the kernel did the rest.

No separate permission required to use the network.

No. The send right is managed by the kernel. If I *have* it, I can "give" it to someone else. But, since it is a one-time use, doing so means *I* can't use it! (contrast this with giving a *copy*).

I've made changes to my implementation that allow me to condition the operations ("methods"!) that can be applied to ports. I.e., I treat the ports (Handles) as objects themselves! Not just references to OTHER objects!

But you haven't said that the kernel ignores any tickets presented to it by anyone other than the "intended owner"! Sure, the *server* can decide to ignore the ticket -- AFTER it has passed through the kernel *to* the server!

Does the "enhanced" Amoeba implementation make this restriction?

-------^^^^^^^^

See my point?

No. There are two port-related "notification mechanisms" in Mach (i.e., concerned with things that happen in the normal course of a port's life).

When the last "send right" *to* a particular port dies (is destroyed, etc.), a "No More Senders" notification is delivered to the task holding the receive right. I.e., no one is interested in the object that you represent any longer. THERE IS NO WAY FOR ANYONE TO REFERENCE IT (other than you!). Please feel free to deinstantiate it and GC it's resources.

When the receive right to a port dies (i.e., when the server that is handling the incoming messages SENDed to that port dies or destroys the port), a "Dead Name" notification is sent to holders of send rights. I.e., the object that you were referencing no longer exists. Don't bother trying to access it!
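
How a task might react to those two notifications, sketched with illustrative ids and helpers (not the literal Mach/MIG names):

  void dispatch(message_t *m)
  {
      switch (m->id) {
      case NOTIFY_NO_SENDERS:    /* nobody can reference this object now */
          deinstantiate(object_for(m->port));   /* safe to GC it         */
          break;
      case NOTIFY_DEAD_NAME:     /* the object *we* referenced is gone   */
          forget_handle(m->port);    /* or ignore it and discover the    */
          break;                     /* fact on the next attempted use   */
      default:
          handle_method(m);          /* ordinary traffic                 */
      }
  }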

In practice, I don't use this but, instead, rely on the task who deleted the port to explicitly notify the necessary clients.

Or, let them figure it out at their leisure (when they try to invoke their "send rights", they will see the port is now a "dead name" -- this Handle is no longer attached to anything!)

Above.

[though that's how Amoeba handles GC! :> ]

If a task doesn't care to deal with "port death", it doesn't have to. If it is important to the task's proper operation, it will!

Mach allows any object (port) in the "system" to be accessed from anywhere else in the "system". E.g., if a debugger task running on host A wants to stop execution of some task running on host B, it can do so -- assuming it has been granted access to the port that *represents* that task (tasks are objects, too!). If a task on host B wants to access a file BACKED by a server on host C, it can do so. AND NOT BE AWARE THAT AN RPC IS BEING INVOKED instead of a simple, local IPC.

This is important to me. I want to be able to move the endpoints of communication links freely within the system. The task at one end of a link shouldn't need to know that I've moved the other end to a different task -- OR to a different task on a different HOST!

Likewise, I want to be able to migrate the actual *servers* (executable) while the system is "hot". I can't afford to signal all clients and say, "I'm going to change the network address of the server that will be handling your future requests". Instead, I bring up a new copy of the server on the intended host, then *move* the ports that the old one was listening to (for incoming requests) over to that new task.

Just as if it was a second copy of the task executing on the same host as the first!

See above.

In Mach, the network isn't visible to the tasks. Everything just looks like an IPC -- even if, in fact, it is implemented *now* as an RPC! (hence my original RPC/IPC references -- tired of maintaining that level of detail)

I actually have no way for a task to determine where the "other end" of a communication (IPC) is handled -- locally or remotely.

Doing the work locally means you don't incur the cost of the RPC (cuz any "send" could potentially end up being an RPC instance).

And, the only way to "punish" rogues and adversaries (if they don't feel the cost of their actions, then the *system* must be feeling it!) is to make the cost of those actions be accountable *in* the actor's resource budget.

E.g., if an actor can cause *you* to consume gigabytes of memory, then *you* bear the cost and it can keep pounding on you. OTOH, if *it* tries to do so directly, then it bears the cost ("Sorry, your memory constraints do not allow that amount")

The difference is Mach doesn't treat the network as "special". It has no (convenient) way of telling you when you are using network resources. So, *I* don't want you to be able to use them (as they are scarce -- in addition to bandwidth, network services are typically their own "tasks" and your use of its resources interacts with other users of that same resource) unless you *should* be able to use them.

I.e., nothing prevents you from picking up the phone and calling every number possible. The folks on the other end can choose to hang up on you -- AFTER they have been bothered to answer the phone, etc.

CID prevents you (in theory) from being a victim of this sort of "abuse" (DoS)

Again, if a service moves (which was facilitated in Amoeba by things like FLIP), then how does the task wanting to talk to it NOW KNOW that it's a "long distance call"? That he NOW needs permission to use the network to contact that service? ("WTF? I was just talking to it 12 microseconds ago...")

No! I save the server having to process that incoming message (remember, messages take up space in queues; LEGITIMATE tasks opting to use those queues end up waiting; etc.) AND the possibility of network services coming into play if the service is not local. I.e., a round trip to the remote host (through two network stacks, TWICE) just to say, "NO_PERMISSION" -- when the kernel could have done so locally ON YOUR DIME.

You have to look graphically at where each test is being imposed and watch to see how the data flows through the system wrt locations of different services, etc.

Remember, I'm *deliberately* using lots of physically distributed processors instead of *any* "big processors". So, I *expect* lots of chatter on the wire.

Bad choice of names.

Think of single server as "monolithic". Sort of like "real-time Linux" (no, the kernel hasn't been rewritten in a more deterministic fashion! Rather, it runs as a task UNDER an RTOS...).

The final Mach "Unix" implementation was the "multiserver". It was what Mach was initially intended to facilitate.

I.e., a "tty" was an object. Anything related to tty's was handled by the tty server. A directory was an object (list of "files"). Directories were handled by the directory service. Files (within directories) were objects -- handled by File Server. Processes were objects handled by a process server. etc.

And, all of these operating in user-space.

So, a much bigger performance hit as accessing each "service" meant kernel traps, IPC's, etc. By contrast, the BSDSS was "just a big app" -- far more efficient.

(sigh) A lot of digging to gather up all the pieces over the years! Unfortunately, a paper that I have been looking for recently is not present in the archive. This leaves me uneasy wondering what else may have leaked away (it is entirely possible that it was a print copy and I never had an electronic version). The CMU "reports" archive no longer seems to be accessible.

Mach tried to address "signals"/exceptions from the very beginning. (UNIX signals seem to be an afterthought) Lots of things could raise exceptions -- without really being "bugs" (e.g., the "no more senders" notification).

As Mach was multithreaded from the outset, the whole notion of how to *handle* exceptions had several possible solutions. (In a single thread UNIX *process*, the choice is pretty obvious!)

E.g., a signal is delivered to a task (contains threads and resources). Who should "handle" it? Just interrupt one thread at random (sort of like a UNIX process handles signals)? The thread that caused the exception? (what happens if it wasn't really associated with any ONE thread -- e.g., the no more senders).

So, you can nominate a thread to handle your exceptions for the entire task. This is GREAT! The thread need not concern itself with anything other than exceptions! And, its resources can be apportioned such that it *can* handle those exceptions WITHOUT having to also worry about the "normal work" (e.g., what happens when a UNIX process is at its peak stack penetration and a signal comes along?)

But, it means that the exception handler has to be coded to handle *all* these exceptions! You can't just take small mouthfuls like in UNIX (register different handlers for different signals, etc.)

Exactly! I latched onto Mach in the early/mid 90's NOT because it would be a novel (better?) way of building big time-sharing systems (which is what all their literature described, at the time). But, rather, for how it could be applied FROM THE GROUND UP to the sorts of embedded systems I foresaw -- cheaper memory, faster MPU's, then MCU's and SoC's, etc.

I.e., I don't have to have a physical disk drive to implement a file system. It can be a FLASH card. Or BBRAM. Or ROM!

I don't have to set aside big buffers to move data out of and into those I/O devices -- I can just define a 512 byte "page" that a SERVICE uses to mmap() 512 bytes of a FLASH chip into userland. (yeah, its slow, but it doesn't consume any resources!)

I don't have to worry about task A stomping on task B's resources -- MMUs *will* be available and affordable.

And, I can make the system dynamically reconfigurable so load balancing doesn't become a nightmare ("Crap! I exec()'d the binary on the wrong host! In hindsight, that other host over there has nothing to do, now!")

Of course, there is a steep cost for all this (complexity, etc.). But, CPUs are getting faster more quickly than memory, comms, etc. So, burning "100" instructions (assuming some sort of cache) isn't a big deal compared to "50" -- esp if it saves a trip down the network, etc.

I am *sure* this isn't exactly how things will be done in the future. There need to be more economical ways of getting these results.

But, I'm *reasonably* sure "systems" will be more physically distributed. More processors. etc. And, forcing people to use things like Berkeley sockets for all those comms is like forcing a sprinter to wear scuba flippers! (Limbo/Inferno was the final piece of the puzzle to fall into place. Many of its ideas map naturally to this framework -- though with an entirely different implementation! :< Now, I just have to beat it all into submission! :> )

Reply to
Don Y

Not exactly.

Amoeba's distributed filesystem is comprised of peer/proxy servers. When a non-local file is opened, the local server obtains a copy of the file's capability to accelerate future accesses. Then it acts as a proxy to the remote server, performing access validations locally and relaying requests/data over a trusted peer channel.

The file servers time out and remove cached non-local capabilities if they aren't used. They do NOT time out capabilities for their own locally hosted objects.

AFAIK, the filesystem is the only included peer/proxy service - all the others either are host local or are simply peer. I'm not aware that any of the other services time out capabilities.

George

Reply to
George Neuner

Yes and no. Remember, to make ANY system call requires a ticket.

Normally, a process is created with a set of default tickets that enable it to access basic services (filesystem, network, etc.). The default tickets are held by the kernel and can be chosen by the parent when the process is created.

The default ticket for any particular call is assumed unless the process explicitly presents a different one when the call is made.

However, the kernel IPC mechanism doesn't validate the message other than to check that the server address on the ticket is good. E.g., if the process has no ticket for the socket service, it will be prevented from even trying to contact the socket service.

You can do similar with numbered use tickets/capabilities in Amoeba.

Everything you can think of has an implementation analogue in either system. You're just hung up on kernel enforcement.

E.g., you could design an Amoeba based system where all the system capabilities were held by [or accessible to] a "validation server" to which all client calls are directed. The validation server would authenticate the client's ticket and then hand off the "connection" to the appropriate server.

Permissions in Amoeba aren't a "bit set" either ... unless you want them to be. You have a 32 bit field in the ticket to use however you want. The field can represent a bit map, or an enumerated value, or be divided into multiple fields.

If you don't need to validate ticket holders, you've got another 96 bits in the v2 user id and 2nd signature field to play with.

There is no *practical* limit to the set of privileges an Amoeba capability can administer or that a ticket can claim.
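
Purely for illustration (the layouts are mine -- the whole point being that those 32 bits mean whatever the server says they mean):

  #include <stdint.h>

  /* One server might treat its rights field as independent flag bits... */
  #define RGT_READ     (1u << 0)
  #define RGT_WRITE    (1u << 1)
  #define RGT_DESTROY  (1u << 2)

  /* ...another might pack a use-count and an enumerated role into it.   */
  struct my_rights {
      uint32_t uses_left : 16;     /* a "numbered" ticket                */
      uint32_t role      :  4;     /* owner / group / world, say         */
      uint32_t reserved  : 12;
  };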

That's not my point.

To do much of anything, a Mach server has to register a public service port with send rights - which any task in the network can scan for and try to connect to. Aside from limiting the count of available send rights to the port, there is no way to prevent anyone from connecting to it.

Only *after* determining that the connection is unwanted can the server decide what to do about it. Using MIG didn't affect this: you couldn't encode port ids into the client using IDL - the only way for a client to find the server was to look up its registered public port.

That tells the server all existing clients have disconnected ... it does NOT tell new clients they can't connect.

This tells the server or client the other side has disconnected. It does not tell them why.

You assume that both tasks [server and client] have implemented notify_server and do-mach-notify-. Unless both sides are using MIG, you can't assume any notifications will be sent, received or handled - it isn't *required*.

If a task doesn't implement the relevant notification handler, it won't know anything has happened until it gets an error trying to use the port.

You're assuming again.

Ok. I misunderstood you to mean it was BSD's single user kernel - not simply Mach's "single server" model.

I really don't like network agnosticism. It simplifies coding for simple minded tasks, but it makes doing real time applications like audio/video streaming, and remote monitoring/control a b*tch because you don't have visibility to adjust to changing network conditions.

I hate RPC for the same reason. I often use loopback connections to communicate between co-hosted tasks. At the very least, using a network API reminds me that the tasks may, in fact, be separated by a network. Sometimes it doesn't matter, but sometimes it does.

I had to deal with an extreme example of this in a component SRT system I worked on back before high speed Internet was common. The system was designed for use on a LAN, but the components were separable and sometimes were used that way. The system could be run successfully over dial-up lines [but it had to *know* which of its components were on dial-up].

The problem was, at that time, businesses often interconnected LANs using dial-up bridge routers: an apparently local LAN address could, in fact, be in another city! [Or even another state!!] Many of these bridges could be configured to ignore keep-alive packets and hang up if you weren't sending real data. The next real message then would be delayed - sometimes for many seconds - while the dial-up link was reestablished. To save on phone bills, IT managers tended to configure their bridges to shut down phone links quickly.

The story is long and I won't bore you with it ... suffice to say that things became a lot more complicated having to deal with the potential for a customer to configure the system for a LAN and then try to run it distributed over dial-up bridges. That caused me [and others] an awful lot of headaches.

George

Reply to
George Neuner

Hi George,

["pro b> >

Ticket encodes "server port" in a virtual sense. I.e., for a "name service" analogy, the "server" portion would be "DNS" (not a specific IP address). The system was responsible for locating and tracking which "processor" was currently "hosting" that service.

(i.e., DNS might be "29" -- today! And, running on 10.10.2.27 -- NOW!)

If you needed a special ticket to use the socket service, how did you talk to *any* service as you had no knowledge of what was "local" vs. "remote"? I.e., that was the appeal of FLIP!

I.e., the kernel had to invoke the RPC if the service port IN YOUR TICKET was located on a remote host. (recall Amoeba used smallish processors originally. 16MB nominal size physical memory and NO paging!)

Exactly! -------------------------^^^^^^^^^^^^^^^^^^

You could implement some silly VM scheme whereby any task (thread) could write anywhere in physical memory -- and, give the owner of that piece of memory the power to allow or inhibit the write. This would allow you to implement protected memory domains whereby task A can't stomp on task B's memory.

It would give you an even finer grained (i.e., BETTER!) scheme than current systems use! A task could opt to let another task write on one *byte* and not the byte next to it! You could share objects at whatever granularity you choose!

BUT, IT WOULD REQUIRE THE "owning" TASK TO PROCESS EACH INCOMING WRITE ATTEMPT! If tasks all were well-behaved, not a problem -- each write attempt would be a DESIRED one!

OTOH, a rogue/adversary could intentionally sit there doing "bad" things. Your memory integrity won't be compromised because you were implemented correctly. But, he's SPENDING YOUR DIME! His actions are causing you to have to burn cycles invalidating his accesses. You can't "block" his actions (in this hypothetical model).

So, while it is fine-grained and *could* give you great performance advantages in a BENEVOLENT environment, it proves to be a vulnerability in an environment where adversaries (or, just "buggy" programs) reside.

I can put a "perfect" firewall on my home internet connection. But, that doesn't stop adversaries from consuming my incoming (and, as such, outgoing) bandwidth! If that firewall executes *in* my host, then I am burning CPU cycles AT SOMEONE ELSE'S BIDDING.

Perfect! *If* that validation server executes using the *client's* resources! I.e., you present a (bogus) credential and *it* executes in your time slot (e.g., as a library instead of a server). So, if you want to waste YOUR time presenting bogus credentials, fine. You don't tie up network resources or resources in the object's server just to invalidate your request.

*That* is my goal. You and only you pay for your actions -- whether they are careless mistakes or deliberate attacks! I.e., you can't "steal" from the system.

Note any *server* that is used to do this can potentially be overwhelmed by "non-benevolent" actors presenting it with bogus credentials KNOWING that the server will have to spend CPU cycles (that it *could* have spent on other, LEGITIMATE services for "benevolent" actors!)

"As is", I don't see any way of doing this in Amoeba. It was *intended* to have the servers validate tickets/capabilities -- so the kernel didn't have to understand "what they meant", just "where to send them".

In my scheme, the permissions are implied *in* the Handle. If the server backing a particular Handle is willing to implement a set of N methods for *that* Handle, then the Holder has those permissions. If it chooses to only implement M methods -- possibly even *different* methods -- for another Handle, there is no inconsistency.

Everything has a context and only makes sense in that context.

No. That's the way Mach was *applied* -- because they were looking towards "UNIX as an application". *Obviously* (?) you need to be able to find services!

Or, *do* you?

What's to stop me from creating a task with the following namespace:

  CurrentTime
  Display

AND NOTHING MORE!

Further, the Handles that I endow the task with are:

  (CurrentTime, read)
  (Display, write)

I.e., even though the IDL for the "Time" object supports a "set_time()" method, the task can't use it (it can *try* to invoke it but the server that backs the Time object will return NO_PERMISSION if it does!)

This task just spends its life running:

  while (1) {
      now = get_time();                   // uses CurrentTime
      segments = decode_7segments(now);
      display(segments);                  // uses Display
  }

But the client doesn't have to even have access to a particular server's public port!

E.g., there is no reason you ("application") should have access to the device driver for the disk subsystem. Why should I even let you try to *find* it?

In *your* namespace (that YOUR creator sought fit to establish KNOWING WHAT YOU SHOULD BE RESPONSIBLE FOR DOING), there is no entry called "DiskSubsystem"!

Because there is no entry, you can scream all you want but never get a message to that server! No connection possible!

If you *should* have the ability to connect to a server (e.g., the server backing the Time object, above), *it* decides whether to allow particular requests (method invocations) based on "permissions" that have been associated with the "Handle" (incoming port) on which your messages arrive.

YOU CAN BE SEEN TO ABUSE THIS. Abuse is something that a server determines -- not the client *nor* the system! 500 attempts to write() a file that has been closed may not be abuse. An attempt to turn off the system's power, OTOH, *may* be! (i.e., you have permission to EXAMINE the current power status: on, battery, off -- not change it! Your attempt to do so indicates a significant bug or a deliberate attack!)

In that event, the server can pull the plug on your Handle -- i.e., kill the port on which YOUR requests are received. If you manage to acquire another Handle, it can kill the port FROM WHICH your Handles were issued (even if some other task actually was responsible for creating them and passing them on to you -- an accomplice? Or, just someone who "trusted you more than they should have!").

The point is, the system can take steps to isolate your potential for damage. Can prevent you (eventually) from abusing resources.

Ah, but it does! ALL THE REFERENCES TO THE OBJECT ARE GONE! The object still exists IN THE SERVER, but no one can *talk* to "it" (the representation of the object) any more!

I.e., it's like:

  object = &foo;
  ...
  object = NULL;    /* or, some random number */

*object = ... won't work (at least, not on *foo*!)

So, why *keep* foo around if no one can access it?

It *can* tell the clients that the *server* is no longer backing the object (i.e., the object no longer exists).

Mach used to have a "dead name notification" whereby all actors holding ports (Handles) that referenced that port (object) were notified (sent an exception) of the death.

But, this uses a lot of resources and is seldom necessary. I.e., why not just wait for the actors to "discover" this fact when they try to operate on that corresponding "port"? (the port is *marked* as "dead" but the task isn't "informed" of the death).

Imagine some well known port dying and having to notify hundreds of actors of this fact. Chances are, most will say, "Gee, that's too bad! Where should I send the FLOWERS??"

I.e., there's usually nothing they can DO about it. All you've told them is that, IN THE FUTURE, when/if they try to contact that port/Handle/object, they will receive a DEAD_NAME result in whichever method they invoke.

OTOH, for cases where it is important to notify a "Holder" of this sort of "revoked capability", the server that was holding that "receive right" (i.e., the port that just died) can *choose* to asynchronously notify the actor on the other end.

The actor on the other end can also choose to ignore this notification.

AND, things still work "predictably" in each case! If/when you try to access the "dead name", the method fails. Too bad, so sad.

The kernel does the no-more-senders notification. Of course, the server doesn't have to *use* this information. If it has plenty of resources, it might just choose to mark the port and GC it later. Or, it might decide to free all the resources associated with backing that object. E.g., if the object is a file and it has writes pending, it might mark the buffers to be flushed, then return them to the heap. If it has read-ahead in anticipation of future read() methods, it can just drop that memory (because no one can send read() methods to that object any more!)

Dead names have more flexibility.

In either case, kernel just provides means to tell folks what is happening. Amoeba can't do this because it doesn't know where tickets are "hiding".

For a "dead name", yes, exactly. If a task can *tolerate* not knowing until that point, then why bother *telling* it? You could also have tasks that *never* try to talk to an object even if they have been GIVEN a Handle to it!

E.g., if you never write to (your) stderr, what do you care if the file (or process!) it represents has gone away?

I *said* so! :> "If a task doesn't care to deal with 'port death'..." OTOH, if it *does*, then there needs to be a mechanism whereby the task can be *told* about this -- because it can't "see" what is happening to a port unless it deliberately tries to "use" it.

BSDSS was an old BSD (license required) kernel ported to run basically as a single Mach "task". I.e., slide Mach *under* an existing BSD UNIX. You gain "nothing" -- except the ability to run other things (even other OS's!) ALONGSIDE that "UNIX"! E.g., you could run BSDSS and POE (DOS server) on the same hardware at the same time -- not "DOS under BSD".

Mach-UX was a more modern, less encumbered, kernel running in much the same way. Recall, about this time, the USL lawsuits were making life difficult for the UN*X clones.

Lites was a similar project.

Mach-US was the "application" that set out to illustrate why the mk approach would be "better" for systems than the monolithic kernel approaches had been, historically. It decomposed "UNIX" into a bunch of different "services" -- typically executing in UserLand. Then, tied them together *via* the microkernel -- instead of letting them rid in a single container atop it (as the single servers had).

Unfortunately, Mach-US was too late and folks had already moved on. Trying to retrofit a reasonably mature system (UN*X) into a totally different framework (mk) was a losing proposition from the start. *Obviously*, the mechanisms developed *in* UNIX had been tweaked to fit its idea of how a kernel should be built.

To realize any significant advantages, you'd have to rethink what a "UNIX" should *be* -- not what it *is*!

I'm treating the network in much the same way that PC designers treat the PCI/memory busses.

Could you design a "centralized" (non-networked) system with predictable responses if anyone could open the machine and add any BUS MASTERING hardware they wanted? :> How could you reliably know when/if you could access particular I/O's, etc.

I.e., don't design for an "open" network onto which random, uncontrolled entities can be added at will. Instead, use it as a "really long bus" with devices having known characteristics along its length. Just like when you design a system KNOWING which PCI cards will be inside!

OTOH, you don't want a bunch of different mechanisms to access different devices *on* that bus! I.e., memory is memory regardless of whether it's a buffer being used on a SCSI controller or an mbuf in the kernel!

If your system had a *requirement* for a particular transit time, then it's up to the implementors to guarantee that. You can't expect me to spool real time HD video to a 360K floppy disk! :>

OTOH, you don't have to record video on *tape* (historically)! Avail yourself of whatever new technologies offer -- knowing what *new* capabilities this affords AND new limitations!

I.e., splicing acetate film was a lot easier than digital video! All you needed was a sharp knife and some tape -- no high speed computers and fancy software...

[C is ready. Off we go! :> ]
Reply to
Don Y

Argh! Had to go wade through notes -- which then coerced me to dig through boxes of books that *HAD* been on their way to be donated. Now, of course, having "re-noticed" their actual titles, I will be strongly tempted to start cherry picking through the boxes... and the cycle will repeat. :<

Anyway, _Distributed Operating Systems_ (Tanenbaum -- of Amoeba fame) describes "Objects and Capabilities in Amoeba" (section 7.2).

"7.2.1 Capabilities" covers the idea of capabilities (128b) and the fields therein (server port, object, rights, check).

"7.2.2 Object Protection" describes cryptographic seal and how impossible to forge added rights.

"7.2.3 Standard Operations" deals with operations (methods) that can be invoked on objects. Fig 7-5 tabulates "The standard operations valid on most objects". These are: Age perform a single GC cycle (!) Copy duplicate the object Destroy destroy object and reclaim storage Getparama get params associated with server (backing the object) Info ASCII string describing the object Restrict Produce a new *restricted* capability for the object Setparams set params associated with server Status current status information from the server Touch pretend the object was just used

Second paragraph begins: "It is possible to create an object in Amoeba and then lose the capability, so some mechanism is needed to get rid of old objects that are no longer accessible. The way that has been chosen is to have servers run a garbage collector periodically, removing all objects that have not been used in /n/ garbage collection cycles. The AGE call starts a new garbage collection cycle. The TOUCH call tells the server that the object touched is still in use."

I.e., the system has no way of knowing where all outstanding tickets for an object may be. So, it can't try to *resolve* references and use that to identify EVERY object that remains of interest. Even if the kernel were to grep() each process's memory, it would never be able to say, "Aha! Here's another ticket! Let's see what it REFERENCES and MARK that object as STILL OF INTEREST." There's no way to *recognize* a ticket in memory -- it's just bits!

So, instead, Amoeba has servers maintain a time to live field for each object. Whenever an object is acted upon, the TTL is reset. An explicit TOUCH operation also resets it -- though without doing anything "actively" to the object (sort of a NOP -- this is what I previously erroneously called "notice")

Whenever an AGE operation is invoked on an object, the TTL decreases by one unit. Once it reaches "0", the server managing the object (i.e., the only entity that is aware of things like TTL) can delete the object. No one has "noticed" it during the course of its TTL (i.e., those /n/ GC cycles -- invocations of the AGE operation).
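Server-side, the whole aging scheme amounts to something like this sketch. The table, N_LIVES and the object layout are made up -- only the AGE/TOUCH/TTL behavior comes from the text:

#define N_LIVES 3                     /* the "/n/" cycles an object survives */

struct object {
    int ttl;                          /* time-to-live, counted in GC cycles  */
    int in_use;
    /* ...whatever state backs the object... */
};

void op_touch(struct object *o)       /* TOUCH: "still interested!"          */
{
    o->ttl = N_LIVES;
}

void op_invoke(struct object *o)      /* any real operation also resets TTL  */
{
    o->ttl = N_LIVES;
    /* ...perform the requested method... */
}

void op_age(struct object *table, int count)   /* AGE: one GC cycle          */
{
    for (int i = 0; i < count; i++) {
        if (table[i].in_use && --table[i].ttl <= 0) {
            table[i].in_use = 0;      /* nobody cared for N_LIVES cycles...  */
            /* ...reclaim the backing storage here                           */
        }
    }
}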

So, if an actor holds a ticket that references an object (regardless of what type of object it may be) and just *sits* on it, when some *other* agency has invoked the GC on the server that manages said object those /n/ times, the object evaporates! "Use it or lose it".

Note that the actor has no idea that /n/-1 GC cycles may have been performed in the time since he acquired the ticket. (remember, each object's TTL can be different at any instant in time -- it's not like *all* "foo" objects will expire at the same instant or GC cycle so no way for the system to alert everyone of an upcoming sweep: "TOUCH anything you want to keep cuz there's a GC cycle up ahead!".)

In practical terms, this means that you have to artificially use every object (ticket) that you hold "often enough" to prevent the system (actually, the individual servers backing each of those objects) from deciding that no one is interested in the object any longer and GC'ing it.

So, for example, if you have stderr wired to "something" and haven't had any "errors", so far, in your execution (i.e., no need to write() to stderr), you risk having the object that your stderr is attached to (file, pty, etc) being GC'ed *before* you eventually have a need to use it!

UNLESS you deliberately tickle that object periodically.

Remember, if the object you are tickling is backed by a server on another node/host, that "tickle"/TOUCH means an RPC across the network! For *every* object that you want to "keep alive".

In addition to this waste, how can an application designer (resource consumer) RELIABLY design to protect against such ASYNCHRONOUS, UNNOTIFIED revocations when designing his app?

What is /n/ for EACH particular type of object? How often will the server for each type of object be called upon to perform a GC cycle (AGE)? How much time will my application spend NOT BLOCKED (i.e., able to execute code) in that interval? To be "safe", do I need to write my code like:

payroll()
{
    ...
    wages = hours * rate;
    tickle_objects();
    benefits = lookup(my_options);
    tickle_objects();
    taxes = (wages - benefits) * tax_rate;
    tickle_objects();
    net = wages - (benefits + taxes);
    output(net);
    ...
}
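...where tickle_objects() has to be something like the following -- potentially one RPC per held ticket, every time it's called. The ticket table and touch() wrapper here are placeholders, just to show the cost:

struct ticket { unsigned char bits[16]; };    /* opaque 128-bit capability   */

extern struct ticket held_tickets[];   /* every ticket this task holds       */
extern int           n_held;
int touch(struct ticket t);            /* std TOUCH op -> RPC to its server  */

void tickle_objects(void)
{
    for (int i = 0; i < n_held; i++)
        (void)touch(held_tickets[i]);  /* best effort; one RPC per object    */
}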

Tanenbaum continues: "When objects are entered into the directory server, they are touched periodically, to keep the garbage collector at bay." I.e., for *named* objects and a server whose role it is to maintain that (name, object) tuple, a process periodically runs through that service TOUCHing every object therein. I.e., every named object is accessed continuously just to make sure it doesn't evaporate.

Who handles anonymous objects? Or, objects that may not need to be accessible beyond some group of cooperating actors?

"TOUCH"

It's got nothing to do with local v remote. In Amoeba, you simply have no way of knowing if there are outstanding references to an object. You (Amoeba implementor) have two options on how you handle this:

- *expect* any object that has not been explicitly destroyed to be of some interest to someone, somewhere.

- expect people interested in objects to periodically RE-express that interest and, after a time, decide that anything for which an interest has not been expressed (TOUCHed or otherwise acted upon) is of no interest to anyone!

The latter is Amoeba's approach, as the former would result in "capability leakage" -- objects persisting long after the capabilities involving them have passed -- with no way of harvesting these.

But, as you can see, it means extra (network!) traffic as a certain level of "interest" must be shown in EVERY object in the distributed system lest some objects be PREMATURELY harvested!

And, it still provides no guarantee that this premature harvest won't occur! No way for an application designer to guarantee that he has been "interested enough" in the objects for which he may be the sole ticket-holder! Nor any way for him to be sure he has had enough "CPU time" to *express* that interest!

I can see apps being "tuned" (effectively) as system load changes. E.g., "Hmmm... we'll have to increase /n/ as some tickets are timing out too soon and we can't afford to rewrite all those apps that *use* those objects!"

Sort of like arbitrarily increasing a thread's stack each time you get a SIGSEGV... always *hoping* this latest increase is "big enough".

In Mach's case, the kernel (kernelS) know where all outstanding port rights (send/send-once) are. So, they can tell a server when no one is interested in a particular object (receive right) any longer. As soon as that interest goes away (as soon as the last send right is destroyed).

Mach burdens the kernel with this sort of thing so the actors don't have to worry about it. So they don't have to be "tuned" to a particular server implementation (there's no /n/!)

And, of course, time for my morning tea! :>

Reply to
Don Y

Hi Don

It probably would be faster to read the Amoeba documentation rather than to ask me these questions. 8-)

Hopefully the following will all be clear. ---

I covered already how every process has a set of tickets - either default or explicitly acquired - that authorize it to make particular "system calls". Technically Amoeba has only a handful of kernel calls which support message based IPC, however messaging to/from the standard servers largely is hidden by a library that presents a Un*x-like "system" API.
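Roughly, each library "system call" is a thin wrapper that stuffs the relevant ticket into a request and performs a transaction with whichever server backs it -- something like this sketch (all identifiers are illustrative, not the actual Amoeba library):

struct ticket  { unsigned char bits[16]; };     /* the 128-bit capability   */
struct request { struct ticket cap; int op; long offset; long count; };

/* kernel IPC: send the request, wait for the reply (local or remote --
 * the library never knows or cares which) */
long rpc_transaction(const struct request *req, void *reply, long reply_max);

#define OP_READ 1

long lib_read(struct ticket file_cap, void *buf, long count, long offset)
{
    struct request req = { file_cap, OP_READ, offset, count };
    return rpc_transaction(&req, buf, count);
}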

FLIP itself is a service for which every process needs a ticket. However, FLIP is special because it is involved in *every* IPC call, so access to the local FLIP server is unconditional and the ticket specifies only whether FLIP should permit the process to talk to remote servers.

FLIP isn't part of the kernel, it is a separate task. However, FLIP is tightly integrated with the kernel and has direct access to certain kernel functions. If you're familiar with Minix, in that model FLIP would exist as a "driver task".

Each process has a bi-directional message "port" by which it talks to FLIP. The message port can have associated with it an optional service identifier [which is a network wide UUID]. Every program which wants to be a server requires a service identifier in order to be located. Every instance of a particular service will use the same identifier, which also is used as the port-id in their service tickets.

The kernel IPC calls copy (relatively) small request/response messages between the FLIP server and these process ports.

All other communication occurs *within* FLIP. Requests may specify optional data buffers which FLIP memmaps for direct access or remaps for local handoff. For remote messages FLIP copies data between the local process and the remote host. FLIP itself implements a large datagram service within the Amoeba network - foreign transport protocols such as TCP/IP are implemented at higher levels using a normal service. [The network driver is separate from FLIP.]

So how is it done?

For each local process, FLIP creates a "FLIP port", yet another UUID which is associated with the process's message port. FLIP locates servers based on their service ids, but because there may be multiple instances, to support stateful servers FLIP maintains associations between clients and servers based on their unique FLIP ports.

The FLIP port globally identifies a particular process within the Amoeba network and it can be used to track the process through host migrations (if they occur).

FLIP implements a distributed name service: it locates servers by looking up the port id in the service ticket associated with the client request. If the service is unknown, FLIP broadcasts a name resolution query specifying the port id to its FLIP peers to locate servers within the network. Replies (if any) identify the hosts and FLIP ports of the server processes, which then are associated with the service's port id entry.

Once a server's FLIP port is known, FLIP copies messages between the local client and the server. If the server also is local, FLIP copies the client's message directly. If the server is remote, FLIP packs the message into a datagram addressed to the server's FLIP port and sends it to its peer on the server's host. [Analogous events happen for the server->client response.]

A send to a remote process may fail because the process has migrated (the old host will say it doesn't exist). If this happens, FLIP issues a new name resolution query specifying the specific FLIP port of the target process to try to find it again. If the process can be located, FLIP will retry the send. [Processes cannot migrate while sending a message, so migration of the remote process will never be the cause of a receive failure.]
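Pseudocode of that locate/cache/retry flow, with all names invented just to illustrate the steps described above:

struct flip_port { unsigned char id[8]; };      /* per-process/service UUID  */

struct route {                                  /* cached location           */
    struct flip_port port;
    int              host;                      /* LOCAL_HOST => this node   */
};

#define LOCAL_HOST (-1)

struct route *lookup_route(struct flip_port service_id);       /* cache      */
struct route *broadcast_locate(struct flip_port service_id);   /* find any   */
struct route *broadcast_locate_port(struct flip_port p);       /* find one   */
int copy_to_local_process(struct flip_port p, const void *m, long n);
int send_datagram(int host, struct flip_port p, const void *m, long n);

int send_request(struct flip_port service_id, const void *msg, long len)
{
    struct route *r = lookup_route(service_id);
    if (r == NULL)
        r = broadcast_locate(service_id);       /* ask FLIP peers            */
    if (r == NULL)
        return -1;                              /* no server anywhere        */

    if (r->host == LOCAL_HOST)
        return copy_to_local_process(r->port, msg, len);

    if (send_datagram(r->host, r->port, msg, len) == 0)
        return 0;

    /* Old host says the process isn't there -- it may have migrated:
     * re-query for that specific FLIP port and retry once. */
    r = broadcast_locate_port(r->port);
    return r ? send_datagram(r->host, r->port, msg, len) : -1;
}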

Name entries for local services (processes) are removed when the last instance terminates. Entries for remote services are kept separately and eventually time out and are removed if not used.

Nothing stops *your system* from doing anything it wants ... however, we _were_ talking about Mach.

George

Reply to
George Neuner

I'd *love* to read them! Anything beyond what I've *already* read, that is! :> (remember, I tend to keep big archives).

So, I went looking for more recent documents/sources (my Amoeba archive ends in the mid/late 90's). '"256-bit" Amoeba' seemed like a safe search criteria!

This has been an amusing experience! :>

The Wikipedia page doesn't describe (nor mention!) capabilities but offers a clue: "Each thread was assigned a 48-bit number called its "port", which would serve as its unique, network-wide "address" for communication." I.e., this is consistent with the 48-bit "server port" mentioned in other descriptions of 128-bit capabilities (though nothing about that precludes a larger capability implementation!)

In (forgive the wrap), Tanenbaum et al. claim, in describing Amoeba *3.0*:

"The structure of a capability is shown in Fig. 2. It is 128 bits long and contains four fields. The first field is the server port, and is used to identify the (server) process that manages the object. It is in effect a 48-bit random number chosen by the server."

Later, when discussing planned changes:

"Amoeba 4.0 uses 256-bit capabilities, rather than the 128-bit capabilities of Amoeba 3.0. The larger Check field is more secure against attack, and other security aspects have also been tightened, including the addition of secure, encrypted communication between client and server. Also, the larger capabilities now have room for a location hint which can be exploited by the SWAN servers for locating objects in the wide-area network. Third, all the fields of the new 256-bit capability are now all aligned at 32-bit boundaries which potentially may give better performance."

OK, this lends strength to your comment that capabilities were "enlarged" to 256 bits.

But wait, there's more! :>

For example (Tanenbaum et al. in "The Communications of the ACM", Dec 1990), in describing "the current state of the system (Amoeba 4.0)" claims:

"The structure of a capability is shown in Figure 2. It is 128 bits long and contains four fields. The first field is the /server port/, and is used to identify the (server) process that manages the object. ..."

Later, it states:

"Amoeba 5.0 will use 256-bit capabilities, rather than the 128-bit capabilities of Amoeba 4.0. The larger Check field will be more secure against attack. Other security aspects will also be tightened, including the addition of secure, encrypted communication between client and server. Also, the larger capabilities will have room for a location hint which can be exploited by the SWAN servers for locating objects in the wide-area network. Third, all the fields of the new 256-bit capability will be aligned at 32-bit boundaries, which potentially may give better performance."

Sure looks like someone recycled a "paper" :> Did you catch the changes in the versions mentioned in each? ;)

OK, so I went through *my* archive. The latest documentation I have is *5.3*! Everything there mentions 128-bit capabilities. Textbook (previously quoted), papers, user/system/programming manuals, etc.

Hmmm... perhaps a "consistent typo" (where have I seen *that* sort of thing before?)

Dig through my copy of the sources. From amoeba.h (note am_types.h defines the other types mentioned here without surprises):

------8<------

FLIP isn't part of the kernel, it is a separate task. However, FLIP is tightly integrated with the kernel.

"All low-level comunication in Amoeba is based on FLIP addresses. Each process has exactly one FLIP address: a 64-bit random number chosen by the system when the process is created. If the process ever migrates, it takes its FLIP address with it. If the network is ever reconfigured, so that all machines are assigned new (hardware) network numbers or network addresses, the FLIP addresses still remain unchanged."

I.e., each "local" process registers itself with the FLIP layer. Any *local* message goes through FLIP to see that the destination "port" (FLIP address) resides on the local host.

Similarly, if the FLIP layer has "been in contact" with some remote FLIP address(es), it caches the information needed to reconnect with them in the future.

If ever it is unable to contact the expected (or, unknown) host, it resorts to a (series of ever widening) broadcast queries.

This is no different from how much of IP works. Except for the added layer of these mobile, "virtual" addresses (ports).

[I spent a lot of time wading through the FLIP implementation as the same sorts of issues have to be addressed in *any* distributed OS. E.g., Amoeba's portable "service ports" are similar to Mach's *transferable* ports (esp receive rights). And, as it is far more common for a Mach port to be "moved" than an Amoeba *process* (to which the FLIP address is bound!), the costs of locating and tracking such entities across the system are of greater concern in a Mach-based approach!

Broadcast queries, self-publishing, periodic tokens, etc. Lots of ways to deal with systemwide objects WITHOUT a central "control". But all of them have drawbacks and consequences so picking what's right for a particular *application* (no, not OS!) is tricky]

We don't disagree on our understandings of how FLIP is implemented. What I *don't* see is any "gating function" that prevents tasks from accessing a server that is remote (assuming I have a valid ticket bearing that server's "server port" -- which *may* migrate from local to remote in the course of the ticket's lifetime in my context).

I don't see anything that imposes additional "authorization" checks on local vs remote transactions in the sources.

So, you're reading from a different play book than me. I just want to peek over your shoulder and read along! :>

Ah, but there's no practical difference between the two in that regard! :>

Mach "out of the box" can do exactly the same thing! It's just a matter of which port you pass to each task as you create the task that defines it's "namespace". I.e., instead of using a SINGLE, SHARED, GLOBAL namespace "for all", you could just as easily build one for each task. Or, different ones for different *types* of tasks. I.e., it is trivial to pass (send rights) fot *different* ports to each task on instantiation and have the receive rights for each of those handled by the same "netmsgserver". But, as that

*NAME* server would be able to identify on which port any particular name "lookup()" request (IPC/RPC) was issued, it could tesolve the name in a context DEFINED FOR and associated with the task(s) that have send rights *to* that particular port!

E.g., all "user" tasks could have "/dev" removed from their namespaces simply by eliding those names from the "tables" that you build in the netmsgserver to service the ports (send rights) *given* to "user tasks".

But, the UNIX-orientation crept in, yet again. UNIX has a single shared namespace so why not implement a single shared namespace?! :< It apparently never occurred to them that separate namespaces are a powerful tool! E.g., "UNIX namespace", "DOS namespace", "task 1's namespace", etc. And, all *might* map to similar/shared objects, concurrently!

"COM1:" in the DOS namespace maps to the same device that "/dev/cuaa0" maps to in the UNIX namespace -- neither of which exist in task 1's namespace because he's not supposed to be using that sort of I/O...

[C's email machine has died. I'll have to fix it tonight lest I get the pouty face come tomorrow! (sigh) Sure would be nice NOT to have to be an IT department!! :< ]
Reply to
Don Y

Hi Don,

Hope you and C had a nice Thanksgiving.

Sorry for the confusion.

When I played with Amoeba - circa ~1990 - the basic OS had 128-bit capabilities, but extended capabilities were available as patches. At that time there were a number of competing versions (with slightly different structure) providing additional and longer fields [I actually worked with 2 different ones].

I've tried to search some for the patches, but most of the old Amoeba archives seem to be gone now. When I was in school, there were dozens of universities using Amoeba.

I didn't pay a lot of attention to it after leaving school, but since Tanenbaum announced that (some version of) extended capabilities were to be in version 4, I assumed that they had agreed on what they wanted. From what I can see now, it appears that nothing ever happened.

FWIW: the most up to date version of Amoeba appears to be the Fireball distribution

formatting link
It's based on 5.3 with a number of additional goodies. However it still doesn't include any version of extended capabilities.

Unless they changed things dramatically, it should be apparent in the code for exec() and in the kernel IPC code. There was an extended exec call which took flags and an array of tickets to set as defaults for the child process.

There wasn't any explicit FLIP "ticket" because FLIP wasn't an addressable service. IIRC, the same-host restriction was a bit in the process's local message port identifier (which the kernel provides to FLIP when the process makes an IPC call).

When I used Amoeba, FLIP ports were 48-bits, same as server ports.

It could be something that simply faded away. Tanenbaum and Mullender famously disagreed about having special support for (semi-)private workstations and servers. Their groups created early versions of Amoeba that had different extensions.

George

Reply to
George Neuner

Yup -- homemade pizza! :) I assume "mom" refused all (most) efforts for help?

OK.

As I said, I wasn't able to find anything -- I figured "256 bit amoeba" would be about as *vague* as I could devise for a search criteria (not even mentioning "capabilities"!)

So, a learning opportunity is lost. It would have been informative for them to crank out another paper explaining what the problems with the 256 bit implementation were that caused it not to be pursued "formally".

I'd also like to have seen how they handled passing the capability to a surrogate (e.g., how do you interpose a "debugger" agent without the actor being debugged having to "cooperate", etc.)

Yes. I pulled down the sources and see not much has *significantly* changed. Actually, appears the guy working on that release has gone in directions that AST had "belittled" in previous pubs. :>

I can see the "decision" where the "local" branch is invoked.

I think what I have to do is extend the "authorizations" that are implied (by the server backing the object) to also provide other "authorizations" for the underlying port/Handle itself. I.e., if I want to *know* who is Holding a particular Handle, then disable the ability to copy and/or propagate that Handle when I give it to the "Holder". Then, I know any activity on that particular Handle *must* be coming from that Holder (because the kernel is involved in the duplication process -- unlike Amoeba's tickets).
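I.e., something like kernel-enforced transfer flags applied when the Handle is granted. A sketch of the intent; the API names here are invented:

#define H_COPY      0x1    /* Holder may duplicate the Handle               */
#define H_TRANSFER  0x2    /* Holder may move it to another task            */

typedef int handle_t;

handle_t grant_handle(handle_t h, int holder_task, unsigned xfer_rights);

void give_to_holder(handle_t h, int holder_task)
{
    /* Neither H_COPY nor H_TRANSFER: the kernel refuses both operations,
     * so the Handle stays pinned to this Holder -- unlike an Amoeba ticket,
     * which is "just bits" and can be copied behind the kernel's back.     */
    (void)grant_handle(h, holder_task, 0);
}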

Dunno. I'll have to see what sorts of problems this presents. And, the costs as tasks migrate (IIRC, Amoeba really didn't migrate tasks as much as "picking a suitable INITIAL HOST" for a particular task)

Biscotti, tonight. Have to start tackling the holiday chores :<

--don

Reply to
Don Y

There was quite a bit they did agree on. AFAIK the main arguments were over how much to increase signing strength and whether increased flexibility justified adding a second signature.

I worked with 2 of the proposed extensions:

48: server port
32: object
32: rights
48: user port
64: signature
32: reserved

and

48: server port
32: object
32: rights
64: signature_1
48: user port
64: signature_2

The first was Tanenbaum's own proposal, which actually defined only 224 bits but reserved bits for future strengthening of the signature.

The second was Queensland's proposal. It defined 288 bits (36 bytes) which was an unwieldy length but featured independent signing of the user port field which made delegation simpler: a surrogate could take an existing ticket and fill in a new user without needing the object server to resign the rights.
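If I'm remembering it right, the win was that only the user-port seal ever needs recomputing -- roughly like this sketch. The layout follows the field list above; how signature_2 is produced, and with whose key, is glossed over here (sign_user() is a placeholder):

#include <stdint.h>
#include <string.h>

struct qx_ticket {                 /* the 288-bit (36 byte) proposal        */
    uint8_t  server_port[6];       /* 48                                    */
    uint32_t object;               /* 32                                    */
    uint32_t rights;               /* 32                                    */
    uint64_t signature_1;          /* 64: seals object + rights             */
    uint8_t  user_port[6];         /* 48: who may present this ticket       */
    uint64_t signature_2;          /* 64: seals user_port                   */
};

uint64_t sign_user(const uint8_t user_port[6], uint64_t key);  /* placeholder */

struct qx_ticket delegate(struct qx_ticket t, const uint8_t new_user[6],
                          uint64_t key)
{
    memcpy(t.user_port, new_user, 6);
    t.signature_2 = sign_user(t.user_port, key);   /* only this seal changes */
    /* t.signature_1 -- and therefore the rights -- is left untouched, so
     * the object server never has to re-sign anything.                      */
    return t;
}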

There also was talk of making Amoeba ids 64 bits, which Tanenbaum's structure could accommodate. Queensland's structure would have grown to 320 bits, but in either case all the fields would have been 32-bit aligned (so hopefully quicker to work with).

And it was expected that both memory sizes and network speeds would be significantly increased in the near future, so nobody really was worried about ticket sizes.

When Tanenbaum announced 256-bit capabilities for v4, I assumed that dual signatures had lost because everyone previously had agreed that the existing 48-bit signing was insufficient. It didn't seem likely that Queensland's dual signatures would be squeezed into 256 bits.

??? It's all academic at this point.

Simple: you couldn't. Not that it would have been impossible, but Amoeba didn't provide any way to do it.

FLIP made connections based on *private* identifiers and maintained specific client:server pair associations for stateful services. It was possible for a debugger to impersonate the *public* port id of either endpoint (or even both), but it was not possible to break into an existing connection, nor could the debugger steer new connections to itself if the endpoint was already running ... FLIP would simply see the debugger as yet another instance of the requested target.

WRT surrogates, I'm not sure what really is the question. You either connect to the surrogate directly and pass the ticket in a message, or name the ticket and stick it in the directory where the surrogate can find it.

Yes. Amoeba didn't do task migration out of the box - it simply provided location transparency so that migration services could be added (relatively) easily on top of it.

George

Reply to
George Neuner

So, if they had one (or two) implementations, why not release either/both of them? And/or a paper(s) describing the pros/cons of each? Academics seem to *live* to write papers!! :>

Yes. Perhaps I will try writing to AST to see if there are any odds and ends hiding in a private archive. From past experiences, that has been a workable means of getting at things that weren't formally "released" or that may have had too many blemishes and too little time to clean up.

In Mach, I can slip an agent into any "communication path" (which, after all, is what all these ports represent) by manipulating the task (which, of course, is just another "object" -- represented by the same sorts of mechanisms!) directly (using an actor of very high privilege -- i.e., one holding send rights to the port that represents the task being modified!)

Imagine a file (ick). "Owner" owns the file. He creates a restricted capability (read+write access, but no delete) for that file and passes it to "Task". "Task" is charged with scanning the contents of the file (perhaps it is a structured array of data samples for a waveform) and altering them based on some particular criteria.

"Task" wants to invoke a service that is very good at analyzing waveforms -- "Analyzer". But, there is no reason "Analyzer" needs to modify the file -- analysis doesn't require the ability to alter the file's contents! So, "Task" wants to hand the file (as represented by it's "capability") off to "Analyzer".

However, when Analyzer tries to access the contents of the file (which it does by talking to the server that *backs* that file object), the server notices that "Analyzer" is not the entity for which the capability was created/signed.

Furthermore, "Task" can't create a (*further*) restricted capability for the file because the capability that "Task" holds is not the OWNER capability (at least one of the rights bits has been cleared... giving him *limited* -- read+write; no delete -- access to that object).

"Analyzer", in turn, may want to pass the file on to some other actor to perform some particular analysis aspect on behalf of Analyzer (who is doing this on behalf of Task).

So, you want to be able to take *any* object "Handle" (returning to my terminology) and adjust the "authorizations" downward... regardless of whether you are the "paramount" holder of that object. And, keep passing less_than_or_equal rights along the chain.
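In other words, something like a narrow() that *any* holder can apply, with rights only ever shrinking down the chain. A sketch of what I'm after -- not Mach or Amoeba as they exist:

typedef unsigned rights_t;
#define R_READ   0x1u
#define R_WRITE  0x2u
#define R_DELETE 0x4u

struct handle { int port; rights_t rights; };

struct handle narrow(struct handle h, rights_t wanted)
{
    struct handle out = h;
    out.rights = h.rights & wanted;   /* never more than the caller holds   */
    /* the kernel/server records the derivation, so the backing server sees
     * Analyzer presenting a read-only view rather than Task's handle       */
    return out;
}

/* Owner -> Task: read+write (no delete); Task -> Analyzer: read only.
 *   struct handle task_h     = narrow(owner_h, R_READ | R_WRITE);
 *   struct handle analyzer_h = narrow(task_h,  R_READ);                    */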

Well, it's still a fair bit of work bottling up an executing task, all its state and shipping it off to another CPU in a potentially heterogeneous execution environment! :> One of Mach's big IPC/RPC performance hits came from having to convey type information in a neutral fashion across the network. (Not just big/little Endian-ness but, also, representations of floats, etc.)

That's where my use of a VM (at the application level) is a win.

Reply to
Don Y
