Managing "capabilities" for security

Don Y · 2013-11-01T20:34:05+00:00

Hi,

Not sure exactly how I want to ask this question; i.e., how best to differentiate the examples where X should be allowed vs X should be prohibited.

I have a capabilities based security model. Each capability has "authorizations" associated with it (trying to avoid using the word "capability", again :< ). These authorizations are defined by the entity that creates the capability based on the "authorizations" that *it* has available to it!

I.e., if I own a resource, I have all of the authorizations conceivable for that resource. I can give all or part of those authorizations to entities (actors) of my choosing. E.g., if the resource is a file, I could elect to give A, B and C read access to that file and write access only to B and D. Similarly, if the resource is a mechanism, I might give the ability to move it RIGHT to A, B and D; the ability to move it LEFT to B and C; the ability to power it OFF to only A; etc.

It's important to be able to give subsets of your "authorizations" to others -- that you presumably trust (whatever that means). This allows them to act on your behalf. E.g., if I have read & write access to a file, I might want to give *read* access to that file (principle of least privilege) to someone who will encrypt its contents for me (returning an encrypted copy of the file but not altering the original; I trust him enough to *see* the file's contents but not enough to allow him to *alter* them -- to, for example, replace the file with its encrypted form... *I* can do that with my write authorization).

Similarly, I might want to give subsets of my authorizations to several different actors concurrently -- so each can do "whatever" to the resource without requiring me to serialize their accesses to it (multiprocessing).

And, I may also want to *forfeit* my authorizations -- possibly after passing them on to someone else.

OK? Some of the trickier issues I'm trying to address include:

- "revoking" an authorization that I have previously given ...
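For readers who want something concrete, here is a minimal sketch of the "give away a subset of what you hold" rule described above. The class name `Capability`, the method `delegate`, and the file name are made up for illustration; nothing here is taken from the actual system being discussed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """One resource plus the authorizations this holder has for it (hypothetical)."""
    resource: str
    rights: frozenset

    def delegate(self, rights):
        """Return a new capability carrying only a subset of our own rights."""
        requested = frozenset(rights)
        if not requested <= self.rights:
            extra = sorted(requested - self.rights)
            raise PermissionError(f"cannot grant rights not held: {extra}")
        return Capability(self.resource, requested)

    def allows(self, right):
        return right in self.rights


# The owner holds everything; the encryptor gets read access only.
owner = Capability("report.txt", frozenset({"read", "write"}))
encryptor = owner.delegate({"read"})
```

The encryptor can read but cannot write, and cannot hand out a write authorization it never held. Forfeiting and revoking are not modelled here.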

Don Y 12 years ago

Ah, OK.

Some of the old Burroughs machines were wonky like that. Makes you wonder why those mechanisms disappeared over time... (illusion that increased speed renders cleverness less important?)

Tom Gardner 12 years ago

Ivan Godard wrote the Burroughs DCAlgol compiler :)

Have a look at

formatting link

"The Mill is a new CPU architecture designed for very high single-thread performance within a very small power envelope. It achieves DSP-like power/performance on general purpose codes, without reprogramming. The Mill is a wide-issue, statically scheduled design with exposed pipeline. High-end Mills can decode, issue, and execute over thirty MIMD operations per cycle, sustained. The pipeline is very short, with a mispredict penalty of only four cycles."

Don Y 12 years ago

OK, then his mind is already "sufficiently warped" in this regard.

Ah, that explains all the posts (that I "killed" :< ) mentioning "Mills"! :< I will have to see how to "unkill" them in Tbird...

Richard Damon 12 years ago

Hopefully the design phase for the first release is over before the first release is released! (How else can you see if it is correct?)

And, yes, the third party can, if he wishes, open back up the spec and change it, but then HE takes on the responsibility to verify that all previous code can work with the new spec, and if he can't then he can't change the spec! Generally later modifications want to be backwards compatible to avoid this problem.

Actually, in this case there isn't really a problem for the revoker. If the spec is that I need to give the Holder 100ms to reply, and it takes 200 ms to send (and presumably receive) a message, then I just need to send the notice 500ms before I actually revoke the privilege, that will give the Holder the required time to respond.
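The 500ms figure is just outbound transit plus the reply window plus return transit. A tiny sketch of that arithmetic, assuming the 200ms applies in each direction (the function name is hypothetical):

```python
def notice_lead_time_ms(one_way_ms, reply_window_ms):
    """Notice must go out, the Holder gets its window, the reply must come back."""
    return one_way_ms + reply_window_ms + one_way_ms


lead = notice_lead_time_ms(one_way_ms=200, reply_window_ms=100)
```

This only works if the one-way delay is known ahead of time, which is the assumption being leaned on here.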

And yes, if a system is built on certain assumptions, trying to move to a less capable environment often requires looking at lots of the system. What you really want to do is think when you first make the assumptions as to what you really need as assumptions, and what you don't need to assume. In this case, I would likely have made the time allowed to notify as something configurable/negotiated, and if the grantor really needs that 100ms, then yes, you can't use the satellites, but if it doesn't then perhaps the grantor can be told that the link is slow so it needs to be patient.

First, I never said that all revocations need to have notification! It sounds like in your case here, because there is a real chance of resources going away asynchronously from external causes, asynchronously removing permissions should not cause significant issues. My comment was just that this is not always the case, so there are some situations where asynchronous revocation is not the right way to do things.

I wasn't talking about the memory physically going away, but some process first granting another process the right to access some chunk of memory and then suddenly and without warning revoking that permission and removing the access rights. Since the normal result of this would be aborting that process, this can be very bad. The only way that process can reliably operate would be to use some form of operation that atomically checks the rights, and does the access and returns an error flag that needs to be tested. This will very likely greatly slow down the process defending itself from privilege revocation just because the grantor is unwilling to first send notice and wait a reasonable time before actually revoking the right.

Many things are very unlikely to "just happen" at random. Presumably the grantor of the privilege is doing so because there is a reason to grant the privilege. It doesn't make sense to burden the process being granted the privilege with unneeded problems.

Generally if you grant a privilege to an actor, and it is subject to a revocation request, they will reply back that they are done with the privilege (a "self revoke"), perhaps because there may be a limit to how many people this privilege will be given to at a time. You also can learn that they aren't there anymore when you signal them that you are preparing to revoke.

You normally can't revoke access to a file once the other process has opened it.

Many times privilege is managed, not by force of kernel, but by cooperation of the actors (this presumes that the system can be assumed free of hostile actors). Actors ask for permission, not because they could do the operation without it, but because the permission is needed to do it correctly.

Of course there are catastrophic conditions like loss of power where the crashing of a given task is minor compared to other effects that are happening, and many normal promises aren't going to be met, hopefully the emergency recovery system will work to minimize the damage.

If the holder really needed the permission, then it would have waited until it got it. Many operations get MUCH more complicated when you have to worry continuously about every possible failure mode. To casually convert an error condition that normally would be indicative of a major hardware failure (and thus major software failure isn't unreasonable) to something that really might happen and should be dealt with REALLY makes programs much harder to write correctly and even harder to test to make sure they are correct. All this because the designer figures it is ok to define that authorization carries no promises that it will continue?

I pity your team if this is how you think of them. First, you should only be granting permission for things that you are willing to give it for.

If the system is theirs, they have the right to be greedy, and if it causes problems, it is their problems.

If the system is yours, why are you giving them permission in the first place? If they aren't giving you the value you want, then kill them.

If they are paying for the access, make sure you charge them for their usage (and shame on you for not requiring them to meet the design requirements for their pieces).

Then you also should consider that you are making your "friends" bear much higher costs to do what you want them to.

Don Y 12 years ago

That was my point! The design phase *is* done (from your standpoint) before the third party starts adding that new feature. From *his* standpoint, he would like to think the system can accommodate *his* goals, as well. *He* wants the design phase to overlap *his* activities so he has a "say". (sorry, too late. sooner or later, you've got to "shoot the engineer")

In reality, that's not practical. Will Apple let you revise iOS to suit your needs? Or, will they say, "sorry, too late"?

So, you want to address *likely* needs without dragging in the kitchen sink (only to discover that no one *uses* the sink anyways!)

That assumes *you* can delay when you revoke it! Or, can tell in advance when you will need to so you can give the early warning. It also assumes you *know* that the transport delay will be as long as it happens to be -- which you might not be aware of until after the message is actually sent (the route a network message takes can vary over time based on availability, bandwidth, error conditions, etc.)

Exactly. What can you (reasonably) *expect* your users/developers to accommodate -- given the other design criteria that you have to address.

But all this adds complexity. And, at the end of the day, the holder will still have to be able to deal with the case where his authorizations "don't work" (e.g., what if the object is deleted??)

See what I'm saying? If you're going to have to deal with this possibility anyway, then why complicate things with other mechanisms that might not work?

Mach has a concept called a "port". It's a communication mechanism that is probably the single most important item/concept in the design. I.e., it's not some "afterthought" shoehorned in at the 11th hour.

Mach includes a provision whereby you can request a given port be "renamed". Sort of like saying, "I want file descriptor 2 to hereafter be called 73". There are some potential advantages to allowing a client to have such a change made on its behalf.

But, the request isn't guaranteed to be handled. It may simply not be possible -- today. As a result, the client has to be able to live with the port having its original "name" (presumably, there was a reason the client did NOT want to do this!). So, all of the code that is in place to exploit accessing *renamed* ports sits idle -- and the code to handle unrenamed ports remains in play.

THEN WHY IS THIS FACILITY IMPLEMENTED?

There's never a perfect fit for all cases. But, if you try to include provisions that make *every* case "easy", you end up with a more complex system/implementation. And, questionable "returns".

E.g., the Mach system call is a privileged operation. It's more expensive to implement things "in the kernel" (kernel gets bigger, mistakes can have dramatic consequences, etc.). If you can't be guaranteed that it will be usable, why go to this effort?

Why does that have to be the "normal result"? The consumer could check to see if his operation "succeeded" at the end of his use of that region. Or not. Then, roll back whatever part of his activities has (may have been) compromised in the process.

If you are dealing with something as basic as memory, then you would presumably have hardware supporting memory objects.

If it's just a mutex governing a shared object, that's below the granularity of what I am discussing. How long you hold a lock isn't the same issue.

If, OTOH, I opt to protect all of *my* files and deny you access to them (filesystem analogy), I should be able to enforce this restriction even if you currently have several of them "open" (i.e., the permission need not ONLY be implemented "on open()" but on *any* reference to the object). I could, conceivably, remove the *image* of each open file that you are actually operating on at the time (e.g., un-mmap() them)

Why does it have to "check the rights"? Just *do* what you intended to do. Your request will either succeed (which implicitly tells you that you *held* a valid capability and that the capability was still valid while your request was being processed) or fail (which tells you that you either had a bad capability *or* that it was revoked before your request could be completed).

[Remember, you have to present the capability in order to perform *any* operation on the object. Everything is mediated by the object's "Handler"]
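A rough sketch of the "just *do* it and look at the result" style argued for above, with a Handler mediating every operation. All names (`Handler`, `CapabilityError`, `try_read`) are invented for illustration, and the integer capability IDs are a simplification: a real capability would need to be unforgeable.

```python
class CapabilityError(Exception):
    """The presented capability was never valid, or was revoked."""


class Handler:
    """Mediates every operation on one object (hypothetical sketch)."""

    def __init__(self, data):
        self._data = data
        self._valid = set()
        self._next_id = 0

    def grant(self):
        self._next_id += 1
        self._valid.add(self._next_id)
        return self._next_id

    def revoke(self, cap):
        self._valid.discard(cap)

    def read(self, cap):
        # The check and the access happen together, inside the Handler.
        if cap not in self._valid:
            raise CapabilityError("capability not honored")
        return self._data


def try_read(handler, cap):
    """No separate 'is it still valid?' query: the result code says it all."""
    try:
        return ("ok", handler.read(cap))
    except CapabilityError:
        return ("failed", None)


handler = Handler("sensor log")
cap = handler.grant()
first = try_read(handler, cap)    # capability still honored
handler.revoke(cap)               # grantor changes its mind
second = try_read(handler, cap)   # same capability, now refused
```

The holder cannot tell a revoked capability from a bad one; it only learns that the request did not go through.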

But a privilege (capability) can be granted *hours* or days before it is ever used! There's no "freshness seal" imposed on capabilities. Surely you don't want the holder to have to periodically check to see that the capability is "still valid".

Likewise, if you force a client to defer requesting a capability until just before it is needed, then the client risks a delay in beginning his activity as the capabilities are negotiated, etc.

Often, a task is spawned with the knowledge of what it is intended to do. It makes sense to endow it with the capabilities that it is going to eventually need when it is created -- instead of having it remain connected to its parent *just* so it can request those capabilities when they are needed.

That's not an assumption you can make. You are assuming a parent always hangs around to watch its children die. It might, instead, create its offspring and *then* die -- knowing they have the tools that they need to perform their tasks (what other role does the parent have -- cheerleader?)

In the systems *you* may be familiar with. That doesn't mean it's a *rule*! (see above example). As all accesses to the object (file) have to involve the capability/ticket/key/Handle, I can choose to not let you read/write another sector/byte. If I maintain the backing store for the file system, I can opt to replace that page with a page full of 0x00. The *file server* defines the contract for the files that it handles. Clients use its services with this in mind.

I'm using capabilities for the express purpose of preventing rogue/malfunctioning actors from "doing things they shouldn't". That includes "doing things they have been TRICKED into doing".

See my email_address_t example.

Let the holder wait until he needs it. He asks. And is told "no". Now what?

He is told *yes*, presents the capability for his first access to the resource. All is well. A moment later, presents same capability for second access and the request *fails*. (capability revoked; resource deleted; service unavailable; etc.) Now what?

You have to expect these sorts of failures -- especially in complex systems. "Network is down; try again later"

Not my "team". Rather, folks who will maintain this after me. If there is an easy and a right way to do things, I'm willing to bet "easy" is going to win out. And, all it has to do is win out *once* and it will invariably have consequences that make lots of other "right" decisions harder. It's *really* hard going into an existing "mess" after-the-fact and trying to fix it... especially when you've been tasked with doing something else, entirely!

My goal is to make the "easy" way the *right* way.

No! It's not *theirs* any more than the systems you design belong to *you*!

Shooting people is frowned upon in the FOSS world -- just because someone's code isn't up to snuff. :>

"Friends" can see having *two* ways of doing something as "extra work": "Oh, great! Now I have to code for the early notification case *and* the asynchronous revocation case..."

You can't force people to be good designers. But, you can put tools in place that make it a lot more likely that you get the results that you want.

SWMBO tracked construction expenses at a large local hospital. "The Guys" (construction/maintenance staff) would complain that when they needed "a few things", they had to fill out a lot of paperwork which took a lot of time: "The bathroom is flooded *now*! We can't wait for purchasing to approve the supplies to repair the leak!"

So, they created a policy whereby they could use credit cards issued in the hospital's name.

Suddenly, there are no more formal purchase orders -- even for the long term, *big* projects! And, everyone has thousands of dollars on their credit cards each month. Needless to say, the folks in purchasing are pissed -- cuz they have been cut out of the loop; accounting is pissed because they have no control over the monies; management is pissed because they have no idea what sort of progress and budgetary constraints have been applied. The only guys who are happy are The Guys (construction/maintenance).

Hmmm... can't disallow the credit cards as there will always be "emergencies". So, take their initial complaints and fold them back on themselves:

"OK, you can sidestep the paperwork process and the delays that are associated with it by using the credit cards. However, you have to file the paperwork *after* the purchase in order for *your* card to be paid off -- forget to file, and your credit is automatically turned off. And, so that 'we' know why you chose to use the expedited credit card approach instead of the normal purchasing procedure, we need you to prepare an analysis of the factors that went into this decision and include that with the abovementioned paperwork. Surely, that ALL makes sense, right?"

Suddenly, credit cards see a lot less usage (construction workers, plumbers, electricians, etc. aren't real keen on writing up "reports"). Much easier to just fill out a purchase order for "normal stuff" and let it go through normal channels! The "right" way is now the *easy* way!

Don Y 12 years ago

One of the goals of the automation system I am deploying here is to address the needs of folks with (various) "disabilities" (whatever that means).

So, the UI is very abstract. Unlike most systems, it isn't implicitly assumed to be "visual" (my preferred means of interacting with it is aural -- so I can keep my eyes and hands free to do other things! I'd hate to have to set down something I was carrying *just* so I could pick up a "display" to ask for the lights in the room to be turned on!).

But, I expect others will write more (user specific) code than I. I've invested a lot in the infrastructure and core services (along with hardware/firmware). How do I "entice" others to embrace this same "neutral UI" approach that I have adopted? I suspect their first inclination is to "draw" some pretty control panel... then figure out how they will resize it for different output devices, etc.

Along the way, support for other non-visual interfaces will go away. Because folks tend to focus on *their* needs first (and often "move on" thereafter). Will they "back fill" the support for audio interfaces? haptic ones? etc. (wanna bet the answer is "I don't have the time... besides, the visual one is really COOL looking! I've even got flying toasters in the background!!")

In my case, I don't provide *any* tools that make visual displays easy to create. Instead, the user interface is just a set of available commands for any particular situation that are presented in whatever output modality is selected. I.e., for a visual display, this may just be a bunch of large rectangular buttons with text legends. For an aural display, it may be a spoken menu. etc.
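To illustrate the idea of one command set presented in whatever modality is selected, here is a small sketch. The renderer functions and the command strings are hypothetical and are not the templates from the actual system.

```python
def render_visual(commands):
    """Large plain buttons with text legends, one per command."""
    return [f"[ {c.upper()} ]" for c in commands]


def render_aural(commands):
    """A spoken menu (returned as the text that would be spoken)."""
    return "Say one of: " + ", ".join(commands)


RENDERERS = {"visual": render_visual, "aural": render_aural}


def present(commands, modality):
    """Application code supplies commands only; the system picks the form."""
    return RENDERERS[modality](commands)


commands = ["lights on", "lights off"]
spoken = present(commands, "aural")
buttons = present(commands, "visual")
```

The application never touches layout, so adding a haptic renderer later would not require changing any application code.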

I can't *prevent* someone from developing a fancy GUI. But, they'll then find that the rest of the system doesn't "fit" into that. So, they'll also have to reengineer the existing UI's. etc.

Easier to just follow the (code) templates that I create and *know* that the system will present them to the user in whatever form the user needs!

The "easy" way is the "right" way. Exploit laziness.

Richard Damon 12 years ago

Don, I started to do a point by point rebuttal, but realized that we were losing the forest by classifying every tree.

My complaint was to your statement that the ONLY proper way to revoke a permission is asynchronously. My position is that you can't make such a statement and that you need to apply design to the situation, and that in some conditions revocation should have a defined notification before it is to be revoked.

Let me put a real world example: generally one's right to drive a vehicle is a privilege granted by the government, and the government has the power to revoke this privilege, but if it does so, there are notification requirements so that you do know your privilege is being revoked. This means that once you have gone through the procedures to get the privilege to drive, you can safely do so, knowing that if for some reason your privilege is revoked, you will be given sufficient notice so that you don't get in trouble.

Imagine instead, that the government reserved the right to revoke your privilege without notice (but did give you a way for you to check if your right has been revoked), also check points were established at random to check that you DO have current privilege to drive, and that driving without privilege was a capital offense. Would you want to drive? IF you did have to, I bet you would want to spend a lot of effort checking that you haven't been revoked.

This is exactly like the case that can happen for some forms of privilege, like access to shared memory. If this sort of access is to be revoked asynchronously, it generally means that the process doing the access will be aborted, or the process needs to not treat it as shared memory but use some sort of kernel call to check the permission and do the access atomically (instead of just accessing the memory).

I agree, that in SOME cases, the asynchronous revocation is a good model, but not all. In most cases where a notification/cooperative revocation system makes sense, for reliability concerns, a backup asynchronous method makes sense, to allow you to revoke a malfunctioning process, but at that point, since it is already malfunctioning (since it didn't complete the cooperative revocation method), the problems imposed on the task are likely reasonable. This doesn't mean that the cooperative method was worthless.
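A minimal sketch of that two-tier scheme: ask first, and fall back to a forced revoke only when the holder does not cooperate. The grace period is collapsed into a single synchronous call here, and the names `Holder` and `revoke` are assumptions made for the example.

```python
class Holder:
    """A process holding a privilege (hypothetical sketch)."""

    def __init__(self, cooperative):
        self.cooperative = cooperative
        self.has_access = True

    def on_revoke_notice(self):
        """Return True if the holder wrapped up and released the privilege."""
        if self.cooperative:
            self.has_access = False
            return True
        return False  # malfunctioning, or never saw the notice


def revoke(holder):
    """Cooperative revocation first, asynchronous revocation as the backup."""
    if holder.on_revoke_notice():
        return "released cooperatively"
    holder.has_access = False  # forced; the holder was already misbehaving
    return "revoked by force"


good = Holder(cooperative=True)
stuck = Holder(cooperative=False)
good_result = revoke(good)
stuck_result = revoke(stuck)
```

Either way the privilege ends up withdrawn; the difference is whether the holder got the chance to finish cleanly.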

Also, non-backwards compatible specification changes ARE expensive. That is just the way things work, at least if you want to be able to talk about software having correctness. This does mean that you do want to put some effort into defining your requirements, to put into them the things you need to verify/prove correctness, but not things that you don't need that add unneeded future limitations.

Don Y 12 years ago

Sorry, that wasn't my intent. What I was trying to address was the *practical* aspect of all this.

*I* have to create the mechanisms that will ultimately be used throughout the system. Run the thought experiment(s):

-Imagine I make a system that notifies, waits "some" time, then revokes.

-Imagine I make a system that just revokes -- and notifies after the fact.

I then tried to present possible scenarios for what *might* happen in each case. I.e., notification gets lost/delayed/ignored -- or he was "blocked" while the notification came in. In each case, how does that affect the eventual actions of the "holder"? Ans: he has to deal with NOT having the capability when he opts to use it. I.e., he can't blindly *assume* it will "work".

So, he has to code for both cases: that he received the notification and is going to try to comply in an orderly fashion; and, that he didn't have enough warning (or *any* warning) when the notification arrived and has effectively *lost* the resource before/during its intended use.

My *opinion* is that this extra complexity -- both "in the system" and in the "applications" -- will end up wasted. That to be effective, it would require even *more* mechanism than we have discussed (e.g., negotiating an "early warning" interval, deciding how to handle the case when that interval can't be met, etc.).

Given that "holders" will have to tolerate the case of the capability "going away", it seems easier to just handle that case and make folks aware of it in the API.

Remember, these are "exceptional conditions". You *expect* to be able to hold a capability that you have requested and been granted. I'm just not willing to make that a *guarantee*. So, I need a way to "change my mind" -- BECAUSE I HAVE A GOOD REASON FOR DOING SO, NOW (just like I can preempt your execution if I have a good reason!).

What if I can't *guarantee* that notification arrives sufficiently early for you to do anything about it? If you will be able to cope with this (by implementing your algorithm differently -- even if it requires a complete rollback), then why shouldn't I opt for this as the normal behavior?

If you *insist* on this, then I may need other concessions from you to ensure the level of performance is met. E.g., maybe you can only hold a capability for a fixed period of time -- that way, *I* know all I have to do is wait and I get it back automatically? But, this complicates your work in other ways...
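The "hold it only for a fixed period" concession could look something like the sketch below. The class name is hypothetical, and the clock is passed in explicitly so the example stays deterministic.

```python
class LeasedCapability:
    """A capability that lapses by itself after a fixed period (hypothetical)."""

    def __init__(self, resource, granted_at, duration):
        self.resource = resource
        self.expires_at = granted_at + duration

    def valid(self, now):
        # The grantor never has to chase the holder: it only has to wait.
        return now < self.expires_at


lease = LeasedCapability("motor", granted_at=0.0, duration=5.0)
```

The holder's extra work shows up elsewhere: it has to finish, or renew, before the lease runs out.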

Ah, but they will only *try* to give you notification! If that notification doesn't make it to you (you've moved, were out of town, etc.) and you later encounter a police officer, you're in the same situation as if you *had* been notified and chose to ignore it.

[Sure, you could go to court and hope you get a rational judge but it's not The State's responsibility to ensure you have been notified -- only that they "made a concerted attempt".]

But you're assuming there *is* some "really bad consequence". What if you rarely drive? What if there are few police officers in the parts of town that you frequent?

What if you are approached by a cop while *walking* and he asks for an ID. He sees that your license is expired and confiscates it.

Or, you go to cash a check at a bank and the bank officer does this on behalf of The State?

I.e., any time you would *normally* use that credential you run the risk of it NOT being honored -- even if you aren't "punished" for this (your "punishment" is not being allowed to USE it)

Protect shared memory with a mutex. Hold it as long as you want. If I want to control that with a capability, I can wrap the mutex access with the capability: so, you can't *take* the lock without permission but, once held, I can't interfere with your holding the lock.
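Wrapping the mutex with a capability, as described above, might be sketched like this. `GuardedLock` and the string tokens are illustrative stand-ins; real capabilities would not be guessable strings.

```python
import threading


class GuardedLock:
    """A mutex whose *acquisition* is gated by a capability (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._valid = set()

    def grant(self, token):
        self._valid.add(token)

    def revoke(self, token):
        self._valid.discard(token)

    def acquire(self, token):
        # Permission is checked only when taking the lock.
        if token not in self._valid:
            raise PermissionError("capability not honored")
        self._lock.acquire()

    def release(self):
        self._lock.release()

    def held(self):
        return self._lock.locked()


guard = GuardedLock()
guard.grant("cap-A")
guard.acquire("cap-A")
guard.revoke("cap-A")        # revocation while the lock is held...
still_held = guard.held()    # ...does not disturb the current holder
guard.release()
try:
    guard.acquire("cap-A")   # but the next attempt to take it is refused
    reacquired = True
except PermissionError:
    reacquired = False
```

Revocation takes effect at the next acquire, so the holder is never yanked out of its critical section.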

It's a capability. I can make it "control" whatever I choose. And, implement whatever *else* I choose to ensure that this control makes sense.

E.g., if I revoke access to a piece of memory, I could opt to *suspend* your process at the same time. Then, make a copy of the memory while someone else accesses it. Then, restore the original before resuming your process (and restoring your capability).

I.e., you are *always* at the mercy of the kernel. I just have to ensure that I uphold any contracts that I have agreed to with you. And vice versa (of course, if *you* cheat, I can bitch-slap you! :> )

I'm not claiming "one is good" and "the other is bad". I'm just trying to look at the realistic consequences of each approach. How to balance complexity, resources, etc. against "convenience" (for want of a better word :< ) I suspect most folks will just code as if they could lose a resource prior to using it or *while* using it. I imagine the result code from accessing the service/resource will be *all* they look at. And, that any signal handler for "resource revocation" will simply be undefined. It's just the least effort approach (it should be obvious that I expect folks to be lazy in their implementations!).

When faced with this sort of condition, I *also* expect these folks to just report "FAIL" for their activities and not even *try* to get things "right" (i.e., "as good as possible in the circumstances")

This is why I am spending the effort *now* considering how various scenarios are likely to be handled. I don't want to have to make a change down the road because I "discovered" something that "can't work".

I'm not vain enough to think I can come up with the Right way to handle every situation. But, I *do* think I can come up with a practical way that handles most situations economically and *all* situations "properly", even if not efficiently.

I can always decide *not* to revoke a capability! Then, *none* of the mechanism gets invoked.

George Neuner 12 years ago

In the original version yes ... later they went to a 256-bit ticket to include more end-point information and better crypto-signing.

In your case, kernel resources are consumed. 6 of one ...

And unless you can prevent A from even connecting to B there will be "wasted" effort on B's part anyway.

I may be misunderstanding, but ISTM that you're trying to pack too much into the meaning of capabilities [or possibly too much stock into prior authorization].

Regardless of how capabilities are implemented (user vs kernel), every system I have read about would divide the credentials and authorizations involved in this problem among multiple capabilities:

- X(H) is a legal operation on H
- B administers H
- A can perform X(H)
- A can connect to B
- B can perform X(H) as a proxy
- B can perform X(H) as proxy for A

etc.

It seems as if you want to go straight to the final one - but the question is: how do you get there?

Who grants to A that final capability that implies all the others? To get that capability presumes that A can talk to B (or some other granting authority) in the first place ... which you seem to want to prevent.

Obviously, B can tell the kernel that B administers H ... but how does the kernel know what A wants with B? How can A try to access H directly? "URN: A doesn't know about B." Ok, but then can B act as a proxy for anyone, or just for "authorized" users? Who decides A is authorized for H? B? How does B (or anyone else) know A wants access to H if A can't even ask?

Amoeba and others solve the problem by letting B administrate. A connects to B, asks for access to H. A can present a ticket for H if it has one, or B can issue a ticket to A if A is allowed but doesn't have one. [Amoeba servers have a public access API which anyone can connect to ask for a ticket granting specific access to a managed object. After first getting the ticket, they can connect to actually perform the allowed operations. Getting access then is a 2 step process.]
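A loose sketch of that two step pattern: first ask the server for a ticket, then present the ticket to perform the operation. This is NOT Amoeba's actual ticket format or protocol; the HMAC signature is a stand-in for its check field, and the `Server` class, the ACL table and the names A, H and X are assumptions for the example.

```python
import hashlib
import hmac


class Server:
    """B: administers objects and issues signed tickets (hypothetical sketch)."""

    def __init__(self, secret, acl):
        self._secret = secret   # known only to the server
        self._acl = acl         # {(client, obj): set of allowed operations}

    def _sign(self, obj, rights):
        msg = f"{obj}|{','.join(sorted(rights))}".encode()
        return hmac.new(self._secret, msg, hashlib.sha256).hexdigest()

    def request_ticket(self, client, obj, rights):
        """Step 1: public API; anyone may ask, only allowed clients succeed."""
        rights = frozenset(rights)
        if not rights <= self._acl.get((client, obj), set()):
            return None
        return (obj, rights, self._sign(obj, rights))

    def perform(self, ticket, op):
        """Step 2: present the ticket with the actual request."""
        obj, rights, sig = ticket
        if not hmac.compare_digest(sig, self._sign(obj, rights)):
            raise PermissionError("forged or altered ticket")
        if op not in rights:
            raise PermissionError("operation not covered by ticket")
        return f"{op}({obj}) done"


server = Server(b"server-secret", {("A", "H"): {"X"}})
ticket = server.request_ticket("A", "H", {"X"})
result = server.perform(ticket, "X")
```

Because the server can recompute the signature, it does not need to remember which tickets it has issued; a client that edits the rights in its ticket simply fails the check.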

None of this requires free roaming user-space capabilities ... it all can be with handles referencing secure capabilities kept by the kernel or another credential server (Kerberos model).

How does the kernel know H belongs to B? How does A know to ask for H in the first place?

What "transaction"? The set of possible objects and the actions that might need to be performed on them both are unbounded.

A generic "do-it" kernel API that can evaluate every possible action on any object is a major bottleneck and a PITA to work with. Even if the high level programmer has a sweet wrapper API, the low level programmer has to deal with absolutely anything that can be pushed through the nonspecific interface.

For decades, Unix has been moving toward more verbose APIs and away from trying to cram everything into ioctl(). [How many options do sockets have now? And how many different parameter blocks?]

Linux, OTOH, went back-ass-wards with its new driver model in which every operation is performed by reading/writing some special file.

A common directory service is fine, but I'm not particularly a fan of uniform "file" interfaces. I rather like the idea of being able to ask an object (or its managing proxy) what functions are available.

Unfortunately, doing this generically is a PITA (so no one does it). If you are familiar with COM or Corba, it amounts to the server returning an IDL specification, and the program [somehow] being able to interpret/use the IDL spec to make specific requests.

Yes. However it is necessary. If you no longer trust Q, then, by transitivity, you no longer trust anyone Q may have delegated to.

Yes. But as you said to someone else, every program must deal with the possibility of permission being denied. Under those circumstances, notification can be deferred until attempted use.

System-wide synchronous revocation is impractical, but revocation can be done asynchronously if master capabilities are versioned and derived capabilities indicate which version of the master was in force when they were issued.

It suffices for the owner/manager to be able to say "all capabilities for H [or better, X(H)] issued prior to CurVer(H) are no good".

It also can be done with time stamping, but that presupposes a system wide synchronized notion of time. In practice, versioning is simpler.
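A minimal sketch of the versioning idea, in Python. The class and method names (`Master`, `Derived`, `revoke_all`, `check`) are hypothetical and not taken from Amoeba or any other real system; the point is only that a derived capability carries the master version under which it was issued, and the owner invalidates everything older by bumping one counter.

```python
# Hypothetical sketch of version-based revocation; all names are illustrative.
from dataclasses import dataclass


@dataclass(frozen=True)
class Derived:
    obj: str            # the object this capability addresses (H)
    rights: frozenset   # operations it permits
    version: int        # master version in force when it was issued


class Master:
    def __init__(self, obj):
        self.obj = obj
        self.cur_ver = 1

    def issue(self, rights):
        return Derived(self.obj, frozenset(rights), self.cur_ver)

    def revoke_all(self):
        # "All capabilities for H issued prior to CurVer(H) are no good."
        self.cur_ver += 1

    def check(self, cap, op):
        # Validation happens lazily, at the moment of use; nobody is notified.
        return (cap.obj == self.obj
                and cap.version == self.cur_ver
                and op in cap.rights)


m = Master("H")
old = m.issue({"read", "write"})
m.revoke_all()
new = m.issue({"read"})
```

Nothing here needs a system-wide clock, and a holder that was offline during the revocation simply finds out when it next presents the stale capability.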

So? In your system, host kernels exchange capabilities and proxy for one another. How are you going to notify a host that's powered down?

The analogy is semi-flawed: capabilities shouldn't be thought of as student key cards that open some subset of the doors on campus.

Properly a capability opens only one lock [i.e. addresses one object]. A rejected capability is known to be useless, so there's no point to keeping it.

The "one lock" principle is applicable to replicated services: every instance of a particular service should answer to the same set of capabilities.

Obviously a capability system *can* provide key card functionality, but you need to look at the situation in the opposite way: i.e. the student's key card doesn't open a group of locks, but rather a group of locks share capability to admit the card.

Semantics ... but important semantics.

But hosts may be offline: powered down or network partitioned. How long do you keep the "expiration" of a capability? That just clutters up your store.

At some point, you have to accept that a remote host may try to use a capability the resource's host no longer honors.

But who decides what permissions A and B have wrt the service?

That's a nice feature. Amoeba didn't have this, but other capability systems did.

Again, this is a scenario of replicated service: local proxies should be considered an instance of the remote service. The user's capability to access the service lets it access the proxy. The proxy itself should have a separate capability to access the remote service so that the chain of trust remains valid.

George

Don Y 12 years ago

Hi George,

[elid> In the original version yes ... later they went to a 256-bit ticket to

OK. But that just changes the size of the copy. It still allows you to create as many copies as you want -- without anyone knowing about them. And, makes "a certain bit pattern" effectively the same as another copy of that capability!

Yes. No free lunch. *Big* limitation but, I'm hoping, one with worthwhile tradeoffs!

A user (task) somehow gets a set of "authorizations" to a particular object (an object may actually be a service, another task/thread, etc.). This could come from a "parent" task handing the authorizations and object reference -- together called a Handle, in my lexicon -- to the task. Or, from the task requesting that (object,authorization) from some chain of "directory" services -- ultimately terminating at a service that is responsible (and capable!) of satisfying this request.

The user then wants to invoke a method supported by that object. The Handle (which indicates the object and the authorizations thereof FOR THIS INSTANCE OF THE HANDLE) is presented to the kernel in an IPC/RPC request (wrapper for the method to be invoked).

If the user doesn't have the *right* to connect to the "service" that implements that object, then the RPC fails before it gets started. I.e., a task can't talk to anything that it doesn't have the *right* to talk to (this is a more fundamental "permission" than the "authorizations" implemented in the capability/Handle).

I.e., I can disconnect your Handle from the service that backs it and you're just a spoiled brat crying in a sandbox. Nothing you can do about it -- even if you *had* the authorizations to do grand and wonderful things! I've just "unplugged" the cable tying you to that service.

Once the kernel has decided that you *can* "talk" to that service (the one that backs the object in question), the IPC/RPC proceeds (marshal arguments, push the message across the comm media, await reply, etc.).

On the receiving end, the service sees your request come in. It knows the object to which it applies (because of which "wire" it came in on), identifies the action you want to perform (because of the IPC/RPC payload) and *decides* if you have been allowed to do that!

It does so by noting what permissions it has *recorded* for your Handle when it *gave* you that Handle (or, when someone else gave it to you on its behalf). If the recorded permissions/authorizations allow the action that you have requested to proceed, then the service implements those actions and completes the IPC/RPC accordingly (possibly returning ERROR if some OTHER, non-permission-related aspect of the action fails).

As the Handler makes the *final* determination as to whether or not it wants to *do* whatever you've asked it to do to the referenced object, it is free to define any number of such actions -- and any number of arbitrary constraints on them!

E.g., it may let *you* write numbers into a file but someone else can only write *letters* -- to that same file! (I have no idea why this would be important :> ) So, unlike Amoeba and other ticket-based systems, the number of "authorizations" isn't defined by a bitfield *in* the "ticket/key". Rather, it's whatever the Handler considers to be important.

"I'll let you send a message to this email_address_t -- but, it has to be a short one."

"I'll let you send a message to this email_address_t -- but it can't have any attachments!"

"I'll let you send a message to this email_address_t -- but it can't contain any profanity"

"I'll..."

Much of the implementation is Mach-inspired. Think of Handles as port+authorizations. Handles that don't implicitly have *send* rights to the receiving port (which is held by the "Handler") can't reference it (remembering that send rights can be revoked. I.e., the holding task can be "disconnected" if the Handler decides he is being abusive, etc.)

I.e., there is an IDL for X(H)

... and task B holds the receive rights for the port that references H (so, any references to H USING THAT HANDLE will end up in B's lap)

... because "someone" told B to allow those permissions for requests coming in on the port assigned (given) to A by which it can access object H

... because A (still) holds a send right to the port for which B is the receiver

... because it is B's job to implement X on H (or, to know how to get *other* agents to perform portions of that operation) A doesn't know *how* to "read a file", "turn on a motor", etc. I.e., the methods associated with H

As above.

In the Beginning, ... :>

Kernel doesn't *care* what A's intentions are! Doesn't *want* to care! It wants *H* to determine what can be done -- on H! Expects "someone" (task) to implement those actions -- call him B, Q or Elephant.

All kernel does is let these two parties talk to each other. And, prevent others from talking that don't have the "right" (deliberate choice of word) to talk to each other. The Handler for an object ultimately implements the permission(s) and actions ("Sorry, I don't want to do that for you and you can't make me!")
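To make the mechanism/policy split concrete, here is a rough Python sketch. Everything in it (`Kernel`, `grant_send`, `revoke_send`, `rpc`, `motor_handler`) is a made-up name for illustration, not the actual implementation: the kernel checks only whether the caller holds a send right to the port, and never looks at the method or its arguments.

```python
# Hypothetical sketch: the kernel checks only the send right, then routes.
class NoSendRight(Exception):
    pass


class Kernel:
    def __init__(self):
        self._receiver = {}   # port -> callable holding the receive right
        self._send = set()    # (task, port) pairs holding a send right

    def create_port(self, port, receiver):
        self._receiver[port] = receiver

    def grant_send(self, task, port):
        self._send.add((task, port))

    def revoke_send(self, task, port):
        # "Unplug the cable": later sends fail before reaching the Handler.
        self._send.discard((task, port))

    def rpc(self, task, port, method, *args):
        if (task, port) not in self._send:
            raise NoSendRight(f"{task} may not talk to {port}")
        # The kernel never inspects method or args; that is the Handler's job.
        return self._receiver[port](task, method, args)


def motor_handler(task, method, args):
    # Policy lives here: this particular Handler only ever agrees to "left".
    return "ok" if method == "left" else "refused"


k = Kernel()
k.create_port("H", motor_handler)
k.grant_send("A", "H")
```

A refused request and a missing send right fail in different places: the first is the Handler saying no, the second never leaves the kernel.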

A has no knowledge of who is "backing" H. A starts with a *name* for an object (assuming it isn't trying to *create* a yet-to-be-named object). It consults a namespace (another Object that has been created for it and, to which, it has been given access "authorizations" -- of some degree) that has been created for its use. Only things that are referenced in that namespace "exist", as far as A is concerned!

Think of it as chroot($HOME) -- /etc/passwd doesn't exist in that context unless *you* happen to have coincidentally created your own "object" and named it such.

The namespace, like any other object, is "backed" (handled) by some active entity. When you use the Handle that you have been pre-endowed with (by init?) to access (and operate on!) that namespace, you can ask the namespace to resolve a name... however "names" are defined in your namespace (e.g., names might be simple integers, or 8000 character strings, or binary numbers, or...). You obviously must have some agreed upon convention WITH THE ENTITY THAT CREATED YOUR NAMESPACE about how names are defined -- and possibly *used* -- in that namespace.

That convention may be different for some other namespace -- even if that other namespace is handled by the same active entity! All that matters is the agreed upon syntax of the API -- as evidenced in the IDL for that "method" -- and the conventions you agree to (when your code was written).

When you "lookup" a name, the namespace service (for that namespace, yada yada yada) gives you a Handle to the *object* that is paired with the name you provided. Or, "ERROR_NOT_FOUND", etc.

Again, by convention, you know the type of the object that you have just been granted a "reference" to. So, you know what methods you can *potentially* ask to be performed by that "object" on your behalf.

The Handler that backs that object (referenced in your Handle) holds the receive right (Mach-speak) for that "port". (You now hold a *send* right to it.) When that Handle is used in an IPC/RPC, the identifier of the particular IPC/RPC "method" of interest, along with any arguments involved, will be delivered to the Handler holding that receive right FOR THAT PORT (meaning the *object* associated with that port/Handle).

If, for example, "H" is the file system, then you might be asking B to "create a new file" in that filesystem. Where in the *real* filesystem it actually resides may be hidden from you. All you care is that you will subsequently be able to access it using the name "foo" -- that you provided (presumably avoiding any conflict with other names IN YOUR NAMESPACE -- because the Handler for your namespace won't let you create a "new name" that conflicts with an "old name"; that's part of the convention that you adhere to when you interact with a Namespace object!).

Presumably, you will put something in this file. Or, perhaps not. Maybe your role was just to create it, prevent its deletion and place it into a *new* namespace that you will pass onto one of your "offspring" -- so *it* can fill it with content!

Who decides that UID "don" can access ~don but not ~george?

Same thing, here.

Same sort of approach. But, the kernel has no explicit knowledge of what that "specific access" entails. It just routes messages between endpoints after ensuring that you have the "right" to use a particular endpoint!

User-space capabilities allow the kernel to get out of the loop. But, they mean that the kernel can't *do* anything to control the proliferation of copies, etc.

It doesn't. It just pushes a message down that "pipe" and... Gee, look, B is suddenly READY to execute, again! How'd that happen? :>

Convention. How do you know to ask for ~/.profile when a user logs in? Why not /foo/biguns?

Yes. Kernel cares not about *what* A is asking B to do on H. Does your UNIX box care if you push "ABCD" down a particular named pipe to some random process on the other end? All it does is make the mechanism available to you as an AUTHORIZED USER of that mechanism. The fact that ABCD causes the receiving process to erase every odd byte on /dev/rdsk is no concern of the kernel!

Handlers and Holders conspire as to what actions they want/need to support. If you want to be able to erase every odd byte on the raw disk device, then *someone* has to write the code to do that! If you want to ensure this action isn't casually initiated, then someone has to enforce some "rules" as to who can use it -- and even *how*/when (e.g., you might have authorization to do this, but the Handler only lets it happen on Fridays at midnight). Let the Handler and Holder decide what makes sense to them!

I wanted to keep the kernel out of the "policy" issues and just let it provide/enforce "mechanism".

Unfortunately, it makes the kernel a bottleneck as all IPC/RPC has to be authenticated there. But, it gives me a stranglehold on "who can do what". It also gives Handlers the ability to decide what constitutes abuse of privilege -- *their* privilege! And, provides far more refined ideas of what those privileges actually *are*.

E.g., the email example (that I seem to have become obsessed with). I can have "something" put textual representations of email addresses in the RDBMS. Something (else?) can pull them out, wrap them in a "method" and hand them to "consumers". Those consumers can invoke the method (".sendmail") on the object (address) and never anything more. If I later want to ensure they can't continue to use that object (email address), I can revoke their "authorization" to use that method on that instance of that object. (Or, I can "unwire" the Handle completely -- so, any future operation throws an error.)
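A rough Python sketch of that email example, including the "short message" and "no attachments" constraints mentioned earlier. The names (`EmailHandler`, `grant`, `unwire`, `sendmail`, the 40-character limit) are all assumptions made up for illustration; the only claim is that the Handler records per-Handle rules when it hands out the Handle and consults them on every request.

```python
# Hypothetical sketch: the Handler, not the ticket, records what a Handle may do.
class PermissionDenied(Exception):
    pass


class EmailHandler:
    def __init__(self):
        self._rules = {}      # handle -> list of predicates over the message
        self._next_id = 0

    def grant(self, *rules):
        # Constraints are recorded at the moment the Handle is given out.
        handle = self._next_id
        self._next_id += 1
        self._rules[handle] = list(rules)
        return handle

    def unwire(self, handle):
        # Disconnect the Handle entirely; every later operation fails.
        self._rules.pop(handle, None)

    def sendmail(self, handle, body, attachments=()):
        rules = self._rules.get(handle)
        if rules is None:
            raise PermissionDenied("unknown or unwired Handle")
        msg = {"body": body, "attachments": tuple(attachments)}
        if not all(rule(msg) for rule in rules):
            raise PermissionDenied("request violates this Handle's constraints")
        return "sent"   # stand-in for actually delivering the message


def short(msg):
    return len(msg["body"]) <= 40   # arbitrary limit, chosen for the example


def no_attachments(msg):
    return not msg["attachments"]


handler = EmailHandler()
h_short = handler.grant(short)
h_plain = handler.grant(no_attachments)
```

Adding a new kind of constraint means writing one more predicate in the Handler; nothing about the Handle's representation or the kernel changes.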

My approach is more like pushing untyped data through a function interface and knowing that the thing on the other end will make sense of it. The IDL lets "humans" agree on just what any particular set of data on a particular interface are LIKELY to mean!

This is the Inferno way, as well. In some aspects, it's nice. But, it's also tedious.

I don't have a filesystem. I have *namespaces*. *Multiple* namespaces. Filesystems traditionally bound names (and containers) to "magnetic domains on a medium". Then, to "drivers" for particular devices.

In my case, a namespace binds a name to a Handler. What that Handler does and how it does it can have absolutely nothing in common with any other Handler in the system.

The *namespace* "object" has operations that can be performed on it (methods defined in the IDL that can be applied to any Handle that references that particular *flavor* of namespace). E.g., resolve(), create(), delete(), etc. But, it has no sense of reading/writing *to* the Handles that it manages.
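For what it's worth, a namespace object of this flavor can be sketched in a few lines of Python. The method names come from the list above (resolve, create, delete) and "ERROR_NOT_FOUND" from the lookup description; `ERROR_EXISTS` and the use of plain strings as Handles are assumptions added for the example.

```python
# Hypothetical sketch of a per-task namespace; names are illustrative only.
ERROR_NOT_FOUND = "ERROR_NOT_FOUND"
ERROR_EXISTS = "ERROR_EXISTS"     # assumed error code, not from the post


class Namespace:
    def __init__(self):
        self._bindings = {}   # name -> Handle (any opaque value here)

    def create(self, name, handle):
        # A "new name" that conflicts with an "old name" is refused.
        if name in self._bindings:
            return ERROR_EXISTS
        self._bindings[name] = handle
        return handle

    def resolve(self, name):
        # Only names bound in *this* namespace "exist" for its holder.
        return self._bindings.get(name, ERROR_NOT_FOUND)

    def delete(self, name):
        return self._bindings.pop(name, ERROR_NOT_FOUND)


ns_a = Namespace()
ns_a.create("foo", "handle-to-foo")
```

Note that the namespace only binds names to Handles; it never reads or writes through the Handles it stores.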

I don't implement a full-fledged factory. Rather, I assume you know everything there is to know about the objects with which you are interacting. That you and their Handlers have conspired beforehand to some set of agreed upon methods (abilities? trying to avoid using the word "capabilities").

So, when you decide to revoke the "move motor left at high speed" authorization from a Handle that previously *had* that authorization, *you* and the Handler know what this means. The kernel doesn't care! If, tomorrow, you decided to implement a "reduce motor operating current until full stall" authorization, so be it. Kernel never changes. None of the other "tasks" change. Just users of that IDL (and, specifically, this new method added to it).

I'm trying to find a middle ground.

I don't want a Holder to have to "poll" to see if an authorization is still valid (or, that even the *object* to which that authorization applied still exists!).

Nor do I want to prenotify before revoking authorizations (or deleting objects or unwiring connections or...).

I figured the best compromise (noun: a situation where EVERYONE gets screwed) is to allow asynchronous revocation but provide a notification ex post facto. I.e., if they haven't *yet* tried to exercise the authorization, they get notified. If they are in the process of using it, they may or may not succeed (depends on how the race is won). And, if they don't *care*, they can ignore the notification and wait until they try to use the authorization, later!

It *seems* like the most bang for the least buck.
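Roughly, in Python (a sketch only; `Service`, `inbox`, and the method names are invented for the example, and a real system would deliver the notice through whatever messaging the kernel provides): the revocation takes effect first, and the notice is queued afterwards for the Holder to read or ignore.

```python
# Hypothetical sketch: revoke first, notify afterwards; the Holder may ignore it.
from collections import deque

FAIL_PERMISSION = "FAIL_PERMISSION"


class Service:
    def __init__(self):
        self._auth = {}   # handle -> (holder, set of authorized methods)
        self.inbox = {}   # holder -> deque of revocation notices

    def grant(self, holder, handle, methods):
        self._auth[handle] = (holder, set(methods))
        self.inbox.setdefault(holder, deque())

    def revoke(self, handle, method):
        holder, methods = self._auth[handle]
        methods.discard(method)                       # effective immediately
        self.inbox[holder].append((handle, method))   # notice after the fact

    def invoke(self, handle, method):
        entry = self._auth.get(handle)
        if entry is None or method not in entry[1]:
            return FAIL_PERMISSION
        return "done"


svc = Service()
svc.grant("A", 27, {"move_left", "move_right"})
svc.revoke(27, "move_left")
```

A Holder that never drains its inbox still learns of the revocation, just later, when `invoke` hands back `FAIL_PERMISSION`.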

No need for versioning. Handles are unique -- not "reused" (until all references to it are known to be gone). As they can't be duplicated (without the kernel's involvement), it knows when it is safe to reuse a stale Handle. (A task can *try* to hold onto it, but the kernel that serves that task *knows* it doesn't exist anymore: "File descriptor 27 is no longer attached to a file -- regardless of what you may *think*!")

The tasks running on that host (whose Handles are held *in* that host!) are dead. They can't access anything even if they wanted to!

The handles in *other* hosts that reference objects *backed* by tasks in that host are told that the other end has come unplugged. So, all of *those* Handles cease to exist (and they are notified).

If tasks on the down host referenced objects on these "up" hosts, the Handlers for each of those objects are told that the connection is broken and they need no longer expect requests on those Handles.

The problem is more one of *recovery* after the fact. How do you rebuild these connections? I currently have no notion of persistence in the system. Once it goes down, it reboots from scratch -- anything in progress is lost (unless the agents doing the work deliberately elected to create persistent objects from which they could resume operations)

Yes. "Set of keys" implies "set of locks". If keys can be freely copied, there is no way to know where every copy resides. No way to *notify* the holder that a particular key no longer works: "The lock has been changed"

Assumes you have *tried* the Handle and discovered it to be useless. Or, been notified (see above) that it has been revoked (rendered useless).

My point was that a set of 64 (or 256) bit values in memory tells you nothing about whether you should keep them -- or not. You'd have to go around "trying your keys" to see which ones are worth keeping.

Much like finding a set of keys in a desk drawer: you try them on every lock you can think of. The ones that work, you set aside. The ones that don't, you decide if they are worth discarding (Hmmm... are there any locks I have forgotten to test??)

OTOH, if you don't want to test them (now), the only "safe bet" is to hold onto them -- just in case!

The kernel doesn't care about this. It's up to the Handler for the objects in question to make his implementation choice.

E.g., two Handles (in the same or different tasks) can map onto the same object.

A Handle can map onto multiple objects -- if a proxy handling the Handle acts on your behalf. ("The phone only rings in one location. If you want to be able to call two people, you need two phone numbers and the ability to dial both/either.")

Two file descriptors in different (or same) process can reference the same file. If you want to reference *two* files, you need to have a proxy that knows how to interpret your request(s) for each file (said proxy having two file descriptors).

Or, do it yourself as two fd's.

When host comes back up, local Handle doesn't exist. Memory is empty. Local kernel has no knowledge of what happened before the lights went out.

If you are incommunicado for "too long" (whatever that means), others come to the conclusion that you are powered off. Anything "wired" into you is invalidated. Come back on-line and *claim* you've been running all this time regardless of how it looks? "Gee, that's too bad. We thought you had moved out and sold all your stuff..."

How do you decide that task A should be able to turn the motor on but not task B? You MAKE THAT DECISION and then you put it in the code. Unless the code gets rewritten (or bug), B simply never thinks about talking to the motor.

I think it is important for things like init -- to be able to go away (free up its resources AND ITS UTMOST PRIVILEGE LEVELS!)

Exactly. A on host 1 doesn't talk to the Handle for B on host 2. A, instead, talks to a proxy on host 1. The kernels have conspired to wire this proxy to another proxy (actually, a part of the remote kernel) on host 2 that, in turn connects to B.

So, when host 2 dies, the proxy on host 1 sees that (because the kernel on 1 loses contact with kernel 2) -- anything that is "wired" to that remote kernel is now notified of the failure. That, in turn, is propagated up to A, et al.

Never instantaneous. But, anything "in the works" when the host goes down fails to see a completion code so knows it has been unceremoniously aborted "in progress".
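The bookkeeping on the surviving host can be sketched like this in Python. `LocalKernel`, `wire`, `host_down`, and the host names are hypothetical; how the kernel actually detects the loss of contact (timeouts, heartbeats, whatever) is deliberately left out.

```python
# Hypothetical sketch: a local kernel invalidates Handles wired to a dead host.
class LocalKernel:
    def __init__(self):
        self._wired = {}    # handle -> (holding task, remote host)
        self.notices = []   # (task, handle) pairs, notified after the fact

    def wire(self, handle, task, remote_host):
        self._wired[handle] = (task, remote_host)

    def host_down(self, remote_host):
        # Contact with the remote kernel is lost: drop everything wired to it.
        dead = [h for h, (_, host) in self._wired.items() if host == remote_host]
        for h in dead:
            task, _ = self._wired.pop(h)
            self.notices.append((task, h))

    def is_live(self, handle):
        return handle in self._wired


k1 = LocalKernel()
k1.wire(1, "A", "host2")
k1.wire(2, "A", "host3")
k1.host_down("host2")
```

Handles wired to other hosts are untouched, and A finds out about Handle 1 either from the notice or from its next failed request.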

(see why I think async notifications ex post facto are the only realistic solutions?)

Now, to see if news server bellyaches about length of this post...

Richard Damon 12 years ago

OK, have you thought this through far enough to be able to say that for *ALL* systems that need to be able to revoke permissions, the *ONLY* valid method is asynchronous revocation without prior warning? That it is *NEVER* proper for the permission system to give prior warning and give the actor holding the permission an opportunity to clean up and indicate it is done? This is my objection: the categorical statement that only one method is right.

No, if he writes his code to be able to respond in time, then most of the time a revocation will have the likely much lower cost of an orderly shut down, and only in the rare cases that something has gone wrong suffer the higher cost of "random" revocation. If the cost differential is high enough, the holder may need to use a different, less efficient, algorithm (maybe check-pointing information often to allow resumption in case of failure) to minimize the cost of random revocation, even if it should only be rare.

Now, perhaps in the situation you are talking about, where communication is so unreliable, everything already has so much error checking that random revocation isn't an issue. But I think that is actually the rare case, and in such a system you probably can't be promising much performance anyway. In other situations the trade-offs work differently, and in some of them a cooperative process of revocation may give much benefit.

This becomes a cost trade off. A cooperative system is much better able to handle making good trade offs. The cooperative system may want the asynchronous method as a back up to handle extreme cases or fall backs for a process not meeting the requirements. If you ONLY have the extreme cases then maybe you don't need the cooperative system.

The "try" generally includes positive indication that you have received the message or not. If you don't get it because you moved, then you have broken the protocol, as you are required to give notification of the move (and if you fail, it is your fault for not doing so). For notifications with serious consequences, there often IS a requirement that some official of the court (police officer or some other trusted individual) be able to attest that they delivered the notice or performed the legally required attempts before you can be punished for not knowing about the action.

In the process model, the revoker has the obligation to make a good faith attempt at delivering the notification, and the holder the responsibility to listen for and act on the notification. If the holder doesn't keep his end up, he can't complain about the cost of not reacting to the message. Even if the communication channel isn't totally reliable, hopefully it is reliable enough

An "expired" credential is something that you can know of ahead of time that it is coming up, and if you do your job right, you know to renew it so you have the renewal in time to continue acting.

The case presented was one where you had to use a resource/capability, but you couldn't know for sure you still had it, and the cost of using it when you didn't was high. This is a possible case. Pointing out cases where the cost is low does NOT negate the cost in the cases where the cost is high.

I.e., you are admitting that there are some capabilities that can't just be asynchronously revoked.

Hopefully you document that any process that gets this resource is subject to being randomly blocked for a period of time.

Yes, the kernel is capable of doing anything it wants to your process, BUT if it is to be considered a "working" kernel it shouldn't. Every process should have a programming contract with the kernel: what the process can expect from the kernel, what it can ask for, and what can happen.

You also are assuming that there IS a kernel that is the master over the machine. In many systems there is not this sort of kernel; instead the kernel just coordinates various actors, under the assumption that each actor follows the rules and plays nice. It costs resources to place each actor in its own "jail" to make sure it behaves, and if you have control over the code in the machine, sometimes it is just better to play nice.

You seem to assume that testing every usage for revocation is easy, or that having the capability randomly removed will always be of relatively low cost.

But you have decided that it *NEVER* makes sense to have a protocol to ask the holder of a privilege to quickly finish up and release it. (Since it is always better to use asynchronous revocation).

Yes, there are cases where you know that you now have a need for a resource that is so severe and so immediate it doesn't really matter the cost imposed on the actor given the resource, you need it back. In this case an asynchronous revocation makes sense. Other times you may come across the need for something that isn't immediately urgent, but is important to get done before the actor might normally finish with what it is doing. This is where a cooperative protocol adds value.
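One way to picture the combination (cooperative request first, asynchronous revocation as the backstop) is the Python sketch below. `Grant`, `request_release`, `enforce`, and the integer "clock" are all assumptions made for illustration; a real system would use its own timers and messaging.

```python
# Hypothetical sketch: ask the holder to release, then revoke at a deadline.
class Grant:
    def __init__(self):
        self.held = True
        self.release_requested_at = None

    def request_release(self, now):
        # Cooperative step: the holder is asked to finish up and let go.
        if self.release_requested_at is None:
            self.release_requested_at = now

    def release(self):
        # Orderly shutdown by the holder (the cheap path).
        self.held = False

    def enforce(self, now, grace):
        # Fallback: revoke asynchronously once the grace period has run out.
        # Returns True only when the grant had to be taken back by force.
        expired = (self.release_requested_at is not None
                   and now - self.release_requested_at >= grace)
        if self.held and expired:
            self.held = False
            return True
        return False


polite = Grant()
polite.request_release(now=0)
polite.release()

stubborn = Grant()
stubborn.request_release(now=0)
```

With a grace period of zero this degenerates into pure asynchronous revocation, so the cooperative protocol is an addition to the forced one rather than a replacement for it.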

Don Y 12 years ago

Let's try this again... "*I* have to create the mechanisms that will ultimately be used throughout the system. Run the thought experiment(s):" I.e., *I* am the system architect. It is my responsibility to design the environment in which all actors will operate. It's not a "homework assignment" where I can talk theoreticals. It has to *work* in The Real World.

You can design systems lots of different ways. You can implicitly trust every actor and every developer and assume they will ALWAYS have the needs of others in mind. I.e., they will never "hog" a resource unless they absolutely need to (including the CPU itself!). You can assume they will be very technically competent and know all the right ways of doing things to maximize their cooperation, etc.

In such a system, you need very little mechanism. The "mechanism" is already implicit -- each actor (and developer) KNOWS that other agencies will only ask for something when they need it. *OTOH*, the other agency also knows that an actor who chooses NOT to relinquish a resource does so because *he* has decided (fully aware of your request!) that *HE* needs it more! (i.e., no need for an asynchronous revocation mechanism at all! And, arguably, no need to *request* revocation of a resource, either -- the actor holding it WILL release it immediately after he is done with it! If you need something, just WAIT FOR IT (it will turn up when it "should" -- and no sooner!)

Such systems are inherently limited in size. It's just not practical to keep track of every use and *hope* they all sort themselves out at run time.

They are also *closed*. Adding something (another actor) to one requires too much knowledge about EVERYTHING in the system in order to know what you can expect *from* it.

They also IMPLICITLY trust all actors (and developers). A hostile actor can too readily compromise the entire system. THE SAME APPLIES TO A ROGUE ACTOR (not intending to be hostile but, having a bug or operating in a failed state).

I.e., these systems are uninteresting in the real world -- except as examples of how badly things can fail.

As complex systems are becoming increasingly more "open" (bad choice of words as it suggests "free", "easy to inspect", etc.), you can neither expect to NOT encounter hostile/rogue actors -- nor can you expect to encounter "highly skilled/aware/cooperative" developers, etc. Esp if the system is exploited commercially!

Unless you interpose some active agency to "qualify" the sorts of software *allowed* (authorized) to run on/in the system.

[If you think otherwise, we are simply talking at cross purposes]

So, I have taken the approach of putting clamps on damn near everything. Who can talk to whom, which memory belongs to each, who can use what, etc. At considerable cost (resources). Because I believe this is a key part of making systems that "stay up forever" in spite of the inadequacies of the components thrown at it/into it. If a component has a defect, the defect should penalize the component, not the rest of the system.

So, how do you implement "permissions" in such a system?

One approach is as above: trust everyone to Do The Right Thing. But, that won't work in anything other than a "Perfect World". Perfect Developers creating Perfect Actors with plentiful resources.

You can also "Trust but Verify" -- hope everything happens in an amicable fashion and include a mechanism to deal with cases where it *sometimes* doesn't. (i.e., still giving actors the benefit of the doubt).

Each of these assumes a benign environment with largely cooperating actors/developers and just the occasional "fluke".

A more realistic approach may be "Trust and Enforce" -- hope folks do what they should and extract a penalty when they don't! So, lazy/inept actors suffer for their laziness/errors and those who "behave" are effectively rewarded (!punished).

This allows a rogue/hostile actor to do damage -- but, hopefully, reduces his capability to do so, over time (by allowing penalties to escalate -- possibly infinitely!).

I've opted for an approach closest to this last -- IN TERMS OF THE SYSTEM'S ABILITIES! I.e., the system doesn't *rely* on cooperation. The implementor of a resource can choose to be as cooperative as *he* wants. Or, as heavy-handed!

If you implement a "math facility", you can choose to allow actors to have exclusive control over when -- and if -- they relinquish that resource ONCE THEY POSSESS IT. Or, perhaps you will expect them to honor requests from you that it be released (because you know you have another client waiting to use it). Perhaps you allow them a certain time interval? Or, a certain number of "operations"? Or, some other criteria negotiated (and tracked BY YOU) when the resource was initially granted?
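As one illustration of criteria "negotiated (and tracked BY YOU)", here is a Python sketch of a math facility that grants each client a fixed number of operations. The class, the `multiply` method, and the budget scheme are hypothetical; a time interval or any other criterion would be tracked the same way, inside the implementor.

```python
# Hypothetical sketch: a "math facility" tracking an operation budget per client.
FAIL_PERMISSION = "FAIL_PERMISSION"


class MathFacility:
    def __init__(self):
        self._budget = {}   # client -> operations remaining under its grant

    def grant(self, client, operations):
        # The limit is agreed when the resource is initially granted.
        self._budget[client] = operations

    def multiply(self, client, a, b):
        left = self._budget.get(client, 0)
        if left <= 0:
            return FAIL_PERMISSION   # grant exhausted, or never given
        self._budget[client] = left - 1
        return a * b


fac = MathFacility()
fac.grant("A", 2)
```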

But, as The System is the only entity that can actually *effect* changes made to a "capability" (Handle), The System still needs to be able to revoke a capability unceremoniously. BECAUSE THERE WILL BE TIMES WHEN THIS IS NECESSARY. I.e., NOT having this ability means an actor can impose an unbreakable deadlock ("Sorry, we can't shut the system down yet because Task A won't release Resource X")

You want to use a resource? You adhere to the contract governing its use. If you don't want to deal with the possibility that the resource can be asynchronously revoked, then DON'T USE IT! Because you *know* it *will* be revoked, sooner or later (in a 24/7/365 application).

GIVEN THAT SYSTEM ABILITY, wanna bet most resource *implementors* opt for asynchronous revocation? No, they won't *eagerly* do this... they will try to allow you to keep a resource as long as you NEED it (but, why would you hold a resource any *longer* than that? Are you being inconsiderate??). But, when they want/need those resources back, they will simply *take* them -- and tell you that they have done so (assuming you don't discover this by trying to access the resource at that exact time and receiving a FAIL_PERMISSION error).

Remember, each implementor uses resources that may have been granted to *him* by someone else further up the food chain. So, for *you* to be kind to your clients, *he* would have to be kind to you! (the presence of any "impatient" resource implementor in your supply line means *you* have to be similarly impatient)

I don't see this as a problem. If you can't tolerate the possibility of an asynchronous revocation, don't use the resource. If you need 100% of the CPU, then you can't run. If you need more memory than is available, you can't run. Find some other environment to operate in.

Ask yourself how EVERY application running on your Linux box can *magically* tolerate a "kill -SIGKILL". Those that can't (e.g., those "flying the aircraft") can't run in that environment! (and there's nothing wrong with *that*, either!)

If a (Linux) app wants to be able to recover from something asynchronous like this, *it* bears the cost of doing so. It doesn't force the rest of the system to bear it on its behalf.


George Neuner 12 years ago

Don't know what the limit is, but I've seen messages several thousand lines long in various groups. If everyone edits judiciously, ISTM that it would be hard to get there in any reasonable discussion. The ridiculously long messages I have seen often were the result of repeated top-postings or "me too"ing with no attempts made at editing.

Yes. However, the enlarged capability was an improvement over the original because it carried information on client(s) authorized to use the capability.

Server has an addressable port per managed object? Seems like overkill.

Yes. As I noted previously, when the set of "authorizations" is arbitrary, the role of the ticket has to be demoted from self-contained capability to some kind of capability selector. But it doesn't require kernel involvement - it could be done all in user-space.

Understood.

Yes. However, capabilities can be managed in user-space by the services themselves - which IMO actually makes more sense if the set of authorizations they control are wildly different. All that is necessary at the kernel level is to validate "port send" permission.

But in any case, we're back to how it is granted 8-)


Previously you had said that your kernel was able to prevent clients from making connections on the basis of complex permissions like the right to "erase every odd byte" of the object. That's why I asked how the kernel knows what the client wants.

Now you are saying that the kernel only checks the client's "port send" authority and leaves more complex decisions to the server.

Which is it?

See, here you seem to be saying again that the kernel can make decisions based on fairly intimate knowledge of the client's intentions.

So implementing the IDL gives you capability? Or just potential? I.e. you can assume that an imposter task has implemented the IDL for the service/object it wants to hijack.

But if you are using "handles" ("indexes", "selectors", whatever) to represent arbitrary collections of authorities, you're going to run out of them pretty quickly unless the handle objects are fairly large.

I.e. 4 billion [32-bit handles] seems like a really large number until you actually start parceling it out: e.g., if "objX,read" is distinct from "objX,read,append" is distinct from "objY,read,append", etc.
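A back-of-the-envelope sketch of that parceling. It assumes (hypothetically) that every distinct subset of an object's rights consumes its own handle value, and that a service defines `rights_bits` independent rights; neither figure comes from a real system.

```python
def max_objects(handle_bits: int, rights_bits: int) -> int:
    """Objects addressable if each (object, rights-subset) pair needs its own handle value."""
    combos_per_object = 2 ** rights_bits  # every subset of the rights, including the empty one
    return (2 ** handle_bits) // combos_per_object


if __name__ == "__main__":
    for rights_bits in (3, 8, 16):
        print(rights_bits, max_objects(32, rights_bits))
```

Under those assumptions, 16 independent rights leave a 32-bit handle space with room for only 65,536 objects.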

That's part of the reason Amoeba used wide tickets. [1st version used 80 bits without the crypto-signing field, 128 bits in all. 2nd version capabilities were 256 bits].

George


Don Y 12 years ago

I think NNTP servers are free to impose their own limits. I've previously bumped up against it and found it annoying to have to edit my own reply before being allowed to send it...

So, how are surrogates handled? E.g., send capability to X and X wants to delegate some or all of it to Y. I.e., it can create a new capability from a subset of its own (which Y can then do for Z, etc.) but how do you track down all derived capabilities? (or, do you just not recycle "identifiers" so any stale copies eventually find their IDs invalid WHEN PRESENTED FOR SERVICE?)

Yes. But think about how many managed objects you are likely to have. E.g., only *open* files need to have handles...

Yup. In my case, the Handler provides "policy"... and can decide whichever authorizations make sense for this instance of this object.

Kernel provides communications and "Handle-related" (i.e., think of the Handles as objects in their own right -- not just REPRESENTATIVES of other objects) operations.

Correct. And ensure the "messages" (IDL) destined for each "object" (Handle/port) get routed to the right Handler for that object.

Initially, everything is hierarchical. So, whatever *I* create, *I* can give to others (e.g., my offspring -- directly or indirectly). But, they are free to create *their* own objects and act as Handlers for them -- and, give them to other actors that they are made aware of (e.g., via a directory service, their explicit namespaces, etc.)

Nothing talks to anything without kernel's involvement. If you don't hold a send right for a port, then you can't *send* to it! So, you can repeatedly trap to the kernel -- but never get past that point. (send rights are not forgeable).

Whatever *backs* the object (the "Handler" behind the "Handle") decides how to interpret each communication (e.g., IDL).

But, the Kernel acts as Handler for certain object types as well!

E.g., a Task is a container for resources and Threads. You may want to operate *on* a task (e.g., change its priority, scheduling algorithm, stack allocation, kill it, etc.). So, each Task has a (at least one) Handle. When someone wants to SUSPEND a task, it takes a Handle for that Task and passes it to the task_suspend() IDL. If the caller has permission to *talk* to that object (i.e., the Task), the kernel routes the IPC/RPC message to the Handler for that object -- namely, the kernel itself! If the permissions recorded *by* the Handler for that instance of that Handle include the ability to SUSPEND the task, then the Handler (i.e., the kernel) suspends the task and returns SUCCESS to the caller.
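A minimal sketch of that two-stage check, written as a toy model rather than anything resembling the real kernel. The class names, the per-task table layout and the `FAIL_NO_SEND_RIGHT` code are all assumptions made for illustration; only `FAIL_PERMISSION` and the SUSPEND example come from the discussion.

```python
class Handler:
    """Backs objects and records the authorizations granted per port (hypothetical model)."""

    def __init__(self):
        self.auth = {}  # port id -> set of permitted operation names

    def grant(self, port, ops):
        self.auth[port] = set(ops)

    def invoke(self, port, op):
        # Stage 2: the Handler checks the authorizations recorded for this Handle.
        if op not in self.auth.get(port, set()):
            return "FAIL_PERMISSION"
        return "SUCCESS"


class Kernel:
    """Holds every task's Handle table; a Handle is only a small integer in that table."""

    def __init__(self):
        self.tables = {}  # task name -> {small integer -> (handler, port)}
        self.next_port = 0

    def create_handle(self, task, handler, ops):
        port = self.next_port
        self.next_port += 1
        handler.grant(port, ops)
        table = self.tables.setdefault(task, {})
        handle = len(table)
        table[handle] = (handler, port)
        return handle

    def send(self, task, handle, op):
        # Stage 1: the kernel checks that the caller holds this Handle at all.
        entry = self.tables.get(task, {}).get(handle)
        if entry is None:
            return "FAIL_NO_SEND_RIGHT"
        handler, port = entry
        return handler.invoke(port, op)
```

In this model the same small integer presented by a different task never reaches the Handler, because the lookup happens in the caller's own table.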

If an object is a page of memory and there is some (wacky) operation supported on memory pages that allows "every odd byte" to be erased, then anyone holding a Handle to a memory page object for which that "authorization" has been granted can invoke the "erase odd bytes" method on that object.

(The kernel has some involvement with memory though not exclusive)

Kernel acts as initial gatekeeper. Implements communication and transport *mechanism* along with the "port capabilities" -- send and receive (plus others not discussed here).

To each actor, you're always talking to the kernel -- your Handle resides *in* the kernel, the communication that it represents is implemented *by* the kernel, notifications, operations on those Handles, etc.

Actor has no knowledge of who is backing the object (Handling the Handle). To it, everything LOOKS like a kernel interface.

This is different from Amoeba where actors are conscious of the fact that they are actually talking to other actors.

In Amoeba, you could pass a capability from Task A to Task B using USMail (or whatever). The kernel didn't need to be involved! Or, if it was, it could just provide a *pipe* -- no real checking going on, there.

Since my Handles are implemented *in* the kernel, the kernel has to be involved in every communication. But, this is what I want -- I don't want Task A to be able to *bother* Task B unless it has previously been authorized to do so!

And, if Task A turns out to be hostile or goes rogue, then Task B can revoke Task A's ability to "send" to it and effectively isolate it.

If Task B only notices this annoying behavior on a couple of Handles that it provides to Task A, it can disconnect those Handles (ports) without affecting other Handles that Task A may currently hold (that are backed by Task B).

I.e., I can implement fine-grained damage control instead of taking a meat cleaver to Task A.
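A rough sketch of the difference between disconnecting one Handle and isolating a whole task. The dictionary stands in for per-port state that would really live in the kernel, and every name here is hypothetical.

```python
class Server:
    """Task B: backs several Handles held by other tasks (toy model)."""

    def __init__(self):
        self.live = {}  # (holder, handle) -> object name

    def issue(self, holder, handle, obj):
        self.live[(holder, handle)] = obj

    def revoke_handle(self, holder, handle):
        # Fine-grained: disconnect a single Handle, leave the holder's others alone.
        self.live.pop((holder, handle), None)

    def revoke_holder(self, holder):
        # The meat cleaver: drop every Handle this holder has on this server.
        for key in [k for k in self.live if k[0] == holder]:
            del self.live[key]

    def request(self, holder, handle):
        return self.live.get((holder, handle), "FAIL_PERMISSION")
```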

IDL is just a collection of bytes that tell the recipient of the message (the envelope that contains the bytes) what they mean. You can "say" whatever you want -- no need to go through the stubs generated by the IDL.

I.e., you can fabricate a message that says "write to file" and push it to a Handle (port). If it happens to agree with the correct form for a "write to file" message *and* the Handle happens to be backed by a "file Handler" *and* that instance of that Handle allows write authorization, then you will cause the file to be written! The IDL stubs are just convenience to save you this trouble.

OTOH, if you send that message to a Handle that represents a *motor*, it won't make sense. *Or*, it can mean something entirely different for a motor (perhaps it means APPLY BRAKE). If you don't have APPLY BRAKE authorization for the motor behind that Handle, then the IPC/RPC will fail.

If you *do* have authorization to APPLY BRAKE, then the brake will be applied -- even though you *thought* you were fabricating a message to cause a "file write" operation!

In reality, this is minimized because you tend not to create your own messages. And, message ID's are disjoint. You would have to really work hard to create a message that works on an object of type X while thinking you were dealing with an object of type Y!

[Sorry, sloppy explanation but I think you can imagine what the machinery looks like. Bottom line is the content of the message can't change the things you are "authorized" to do nor the things on which you are authorized to act!]
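A small sketch of why disjoint message IDs make the accident unlikely. The numeric IDs, class names and the `FAIL_BAD_MESSAGE` code are invented for illustration; the point is only that the message content selects an operation, while the authorizations attached to the Handle decide whether it runs.

```python
FILE_WRITE = 0x0101          # hypothetical message IDs, chosen to be disjoint
MOTOR_APPLY_BRAKE = 0x0201


class FileHandler:
    ops = {FILE_WRITE: "write"}


class MotorHandler:
    ops = {MOTOR_APPLY_BRAKE: "apply_brake"}


def deliver(handler, granted, msg_id):
    """Interpret a raw message against the Handler behind a Handle.

    `granted` is the set of authorizations recorded for that Handle.
    """
    op = handler.ops.get(msg_id)
    if op is None:
        return "FAIL_BAD_MESSAGE"   # this Handler does not understand the message
    if op not in granted:
        return "FAIL_PERMISSION"    # understood, but this Handle does not allow it
    return op
```

A hand-built "write to file" message pushed at a motor Handle is simply not recognized here; it could only apply the brake if the two IDs collided *and* the Handle carried that authorization.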

Again, think about the sort of applications and the things that are big enough AND IN PLAY to require a Handle.

E.g., the email_addr_t example I've enjoyed playing with... you only need to represent an email_addr_t as a "live object" (i.e., a Handle backed by a Handler) when it actually *is* "live". You can have tens of thousands of email addresses in your address book (RDBMS) but only those that have been instantiated for live references/operations need Handles!

But Amoeba also allowed persistence for capabilities. So, you *could* store a capability in the RDBMS alongside each of those thousands of email addresses! Or, one for every file on the disk (bullet server).

But, you don't have thousands of file descriptors (Handles!) in your code! You don't fopen(2C) every file in the file system when your program starts -- "just in case". Instead, you create fd's as you happen to need them and the kernel (in most OS's) keeps track of what *actual* file each pertains to. When you close a file, the descriptor ceases to exist (in all practical terms) and the resources (kernel memory) that were associated with it can be reused for some *other* file reference.

Make sense? It's not a "lean" way of doing things but I think it's the only way I can get all the isolation I want between (possibly hostile and/or rogue) actors.

Gotta go finish building a machine to deliver tomorrow. Still have a few apps to install and snapshots to take before I will feel "confident" letting others screw with it! :>

--don


George Neuner 12 years ago


You don't track them down, you just invalidate them at the source.

Some refresher background:

In Amoeba, tickets are public objects, but capabilities are server held *private* objects. Tickets are cryptographically signed to prevent forging. The signing function is kernel based (for uniformity). There is a public server API (not discussed here) to ask for tickets you don't have.

For brevity here I am focusing only on how tickets are created, validated and revoked. Tickets may carry additional information beyond object access rights which I will not discuss here. [but see further below]

Definitions:

- "capability" is a tuple of { capID, rights, check#, ... } which is associated by a server/service with a managed object.

- "ticket" is a tuple of { svrID, capID, rights, signature, ... }. The svrID identifies the server/service. The capID references a particular capability offered by the service.

- "rights" are N-bit wide fields. The meanings of the bits are defined by the issuing server.

The actual sizes of these data are version dependent on the capability subsystem. Both capabilities and tickets adopted new functionality over time.

It's important to understand that Amoeba capabilities are, in fact, "versioned", though versioning is neither sequential nor readily predictable.

When an object manager [server] creates a new capability, it generates two large random numbers to be used as the capability ID and as a "check" number associated with it. The ID will be made public, the check number is private, kept secret by the manager.

The rights specified in the manager's capability tuple reflect the full set of privileges *this* capability can offer - which is not necessarily the complete set of privileges offered by the object.

The capability ID, rights, and check number all are passed into the signing function to generate a signature. An "owner" ticket then is constructed from the ID, the rights, and the signature (the check number remains private to the manager).

A "non-owner" ticket having reduced privileges is constructed by first determining a value for the ticket's rights field. "Owner" and "non-owner" tickets are distinguished by whether the rights field in the ticket _exactly_ matches the rights field in the manager's capability.

The reduced rights value then is "combined" with the capability check number to create a derived check number. [Amoeba XOR'd them but any deterministic method will work] A derived signature is generated (as above) using the ID, reduced rights and the derived check number, and the new "non-owner" ticket is created from the ID, the reduced rights and the derived signature.

[Signatures (and rights for issued non-owner tickets) can be stored to optimize server side ticket validation, but all the signatures could be recomputed if necessary using data from capabilities and tickets.]

To validate a ticket, the object manager finds the specified capability using the ID field of the ticket. If the ticket's rights exactly match those of the capability (i.e. an "owner" ticket), the manager uses the check number to compute the expected signature and compares the value to the signature field of the ticket.

If the ticket's rights don't exactly match the capability (i.e. a "non-owner" ticket), as above a derived check number and derived signature are computed, and the ticket is checked against the derived signature.
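The create/derive/validate steps above can be sketched roughly as follows. This is an illustration of the description in this post, not Amoeba's code: SHA-256 stands in for the signing function, the 64-bit widths are arbitrary, the svrID field is omitted, and all names are hypothetical.

```python
import hashlib
import secrets


def sign(cap_id: int, rights: int, check: int) -> str:
    """Stand-in one-way signing function (SHA-256 is an assumption, not Amoeba's)."""
    return hashlib.sha256(f"{cap_id}:{rights}:{check}".encode()).hexdigest()


class ObjectManager:
    def __init__(self):
        self.caps = {}  # cap_id -> (rights, check number); private to the server

    def _expected(self, cap_id: int, rights: int) -> str:
        cap_rights, check = self.caps[cap_id]
        if rights != cap_rights:
            check ^= rights  # derived check number for a "non-owner" ticket
        return sign(cap_id, rights, check)

    def create_capability(self, rights: int):
        cap_id = secrets.randbits(64)   # public
        check = secrets.randbits(64)    # secret, never leaves the manager
        self.caps[cap_id] = (rights, check)
        return (cap_id, rights, self._expected(cap_id, rights))  # "owner" ticket

    def validate(self, ticket) -> bool:
        cap_id, rights, signature = ticket
        if cap_id not in self.caps:
            return False
        return secrets.compare_digest(signature, self._expected(cap_id, rights))

    def restrict(self, ticket, reduced: int):
        """Issue a ticket whose rights are a subset of the presented ticket's rights."""
        cap_id, held, _ = ticket
        if not self.validate(ticket) or reduced & ~held:
            raise PermissionError("cannot issue that ticket")
        return (cap_id, reduced, self._expected(cap_id, reduced))
```

A holder who edits the rights field of a ticket cannot produce the matching signature, because that requires the check number that only the manager knows.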

------------

At this point, it should be clear that every issued ticket is tied to a specific "version" of a capability by the capability's secret check number. If the capability is versioned - i.e. the check number modified- or if the capability record is deleted, then every ticket issued referencing that (no longer existing) capability is immediately rendered invalid.

Of course, there is the possibility that the same pairing of ID and check# for an existing or past (deleted) capability could recur for an unrelated object. Amoeba used per-server [not global] capabilities and *large* randomly generated ID and check values to minimize the chances of that occurring.

------------

So how to handle surrogates?

The meanings of bits in the rights field of the ticket are completely defined by the issuing server: the value may be an enum or a bitmap, there may be subfields ... whatever the implementer chooses.

One bit can be defined as meaning "this is a surrogate ticket". A surrogate ticket holder would be permitted to ask the server to create a new reduced capability for the managed object.

The new capability maximally would allow only those privileges that were granted to the surrogate, allowing the surrogate independently to delegate by issuing "non-owner" tickets based on its own capability.

The surrogate capability might also permit the surrogate to name group peers, fail-over alternates, etc. by transferring its "owner" ticket if other factors allow this (see following).

Because capabilities are kept private by the issuing server, surrogate capabilities can be linked to the owner's capability, allowing the owner to void delegate and/or surrogate tickets by versioning/deleting the appropriate capability. The surrogate, of course, can void delegate tickets by versioning/deleting its own capability.
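One way such linking might look on the server side, as a sketch only. The parent-link table and the chain walk are assumptions about how a server could implement the idea, and the small sequential IDs are for readability (the real scheme uses large random values and signed tickets).

```python
class CapTable:
    """Server-private capability records, each optionally linked to a parent."""

    def __init__(self):
        self.caps = {}    # cap_id -> (rights, parent cap_id or None)
        self.next_id = 1

    def create(self, rights, parent=None):
        if parent is not None:
            parent_rights, _ = self.caps[parent]
            rights &= parent_rights  # a surrogate gets at most what its parent has
        cap_id = self.next_id
        self.next_id += 1
        self.caps[cap_id] = (rights, parent)
        return cap_id

    def delete(self, cap_id):
        self.caps.pop(cap_id, None)

    def is_valid(self, cap_id):
        # A capability is honored only while every ancestor still exists.
        while cap_id is not None:
            if cap_id not in self.caps:
                return False
            cap_id = self.caps[cap_id][1]
        return True
```

Deleting the owner's record voids everything hanging off it, while deleting a surrogate's record leaves the owner untouched.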

Further having tickets encode who is authorized to use them permits more restrictions, e.g., preventing delegates from enabling peers by copying the ticket.

All versions of Amoeba's tickets specified a server (or service) ID - the field wasn't sign protected because the ID might be a task instance, but it allowed servers to immediately reject tickets they couldn't possibly have issued.

Later versions of the capability system widened tickets to include an authorized user/group ID field protected by a 2nd crypt signature. [And also enlarged the rights field.]

Understood. IDL based RPC mechanism.

Well, servers managing the objects anyway.

Does the kernel recognize DOS attacks on itself?

Amoeba v2 effectively could do the same.

Anyone could persist a *ticket* - but the referenced capability might no longer exist when the ticket is presented for use: e.g., following a restart or after version management performed by the capability owner.

Ever see an image based OS? No files (or, at least, none the user can perceive): just a virtual space containing "program" functions and "document" data structures with a directory for finding things. All "programs" and "documents" available at all times.

Like working in a Lisp or Smalltalk system but extended to encompass all activity.

Current NV memory based systems, e.g., for tablets, appear to work similarly, but they still perceptionally are "file" oriented.

George


Don Y 12 years ago

(by source, we agree to mean the server "handling" the object)

Yes. Tickets can be freely copied and passed around. Nothing *prevents* that. The capabilities (object, authorizations) behind them are "protected".

~dgy/.profile can be *known* to many, yet inaccessible to damn near *all*!

The signing function could similarly be implemented within the "service" for a particular ticket -- or, in addition to. I.e., anything that needs to know that "secret" can perform that duty.

By contrast, "Handles" (ports) in my scheme are just "small integers" in much the same way that file descriptors are "small integers". And, while nothing prevents you from *copying* a particular "small integer", the integer itself is neither the ticket nor the capability.

Rather, *like* a file descriptor, it acts as a "name" for a particular Handle IN A PARTICULAR CONTEXT! (that of the task holding that handle!)

E.g., "23" interpreted as a file descriptor in task (process) A can refer to a particular pty. Passing "23" to some other task breaks the association with that particular pty. "23" is *just* "23" -- nothing more.

OTOH, 0xDEADBEEF010204302893740 passed from Amoeba task A to Amoeba task B carries "rights" with it. Encoded within the cryptographic envelope!

In my case, the capability is embedded in the Handle and implemented by the Handler. The Handler could conceivably *change* how it interprets a set of "capabilities" (terms are getting WAY overloaded, here!) on the fly. Doing so without the actor's awareness could be challenging :>

There is no concept of a "ticket" in my scheme. A "Handle" only exists in a specific context. Remove it from that context and it loses all meaning -- it's just a bunch of bits.

What I've called "authorizations". Except there is no visible "bit field" in my implementation. Each Handler decides how it wants to implement a set of "authorizations".

E.g., a file server could have two threads (groups of threads) that are responsible for read or write access to a file. Files opened for read access are serviced (Handled) by thread R while those that are opened for write access are handled by thread W. Read requests are never *seen* by thread W and vice versa! (because the endpoint of the eventual read/write RPC differs -- wired differently when the open() is granted!)

(I also have "communication rights" beneath the "rights" that are associated with the object being managed)

This can just be seen as extending the namespace of the capability in a manner that makes for easier management. I.e., so a Handler (server) can opt to ignore "older" rights (because it can't vacuum memory to find and remove all instances of a particular "ticket")

Correct. As I don't have to give you read *and* write access to a file. Or, could opt to only grant APPEND access. Or, any other operator that I choose to implement (remove_duplicate_lines(), compress_in_place(), etc.)

But, there is no direct *tie* to the original ticket from which this one (subset) was created! (or the one before *that*; or the one before *that*; etc.)

So, when a ticket is presented, you can't look at the ticket and decide that the ticket from which it was created was revoked and, therefore, so should this one!

(hence versioning. But, now the handler/server needs to keep track of which version is current for each outstanding capability!)

Yes. So, the Handler/server needs to effectively treat the version as the ID of the actor to which a ticket is granted IF IT WANTS TO REVOKE THOSE CAPABILITIES and only those WITHOUT AFFECTING OTHER TICKET-HOLDERS.

Of course.

Of course. This is my "do not duplicate" attribute.

But that means the handler/server has to do all this work! "Remembering".

In my case, the copying has to be done *in* the kernel. All that's exposed is the "small integer" so an actor can't do squat with it.

In my case, the Handler/server never is *presented* the communication in the first place -- unless the path has been previously created by possession of the Handle.

Just like you can't write to a file without a file descriptor having been created. The file server never sees your actions; they are blocked *in* the kernel.

Yes. And the IDL is just a convenience service provided to the developer. Sort of like the difference between using the native X API and one of the widget sets. (the latter just encapsulates the former)

Yes. In my approach, you are always "talking" to the kernel as *it* is responsible for validating (the communication portion) and implementing the actual RPC/IPC/kernel trap (*all* look the same to the actor)

It currently doesn't. Nor do I see a need to do so in the future.

Any attacks on services (Handlers) *through* the kernel (as the communication medium) just come out of the attacker's resource share. I.e., *your* timeslice is being consumed while the kernel is trying to determine if you are entitled to this action. If that's how you want to spend your time... You could just as ridiculously spend it spinning in a tight while(1) {}!

A "direct" attack (i.e., asking the kernel to perform an action that is known to be *backed* by the kernel itself) has the same net result. It's your dime, if you think this a wise way to spend it, then so be it!

Of course, the *system* ends up losing performance because it's supporting a task that is "doing nothing productive". But, how does an independent agency make that distinction?

Do I kill a task because it has tried to do something it isn't entitled to do? What if that ability has been revoked? Do I penalize the task for this? What if the only realistic recovery mechanism for an "unavailable (at this time)" resource is to "try, try again"? When do I decide the actor is attacking vs. normal behavior?

Kernel tries REALLY HARD not to implement policy. Let the services and handlers make that definition AS BEFITTING THE APPLICATION (or portion thereof)

Yes. In my case, I don't support persistence of "Handles". I.e., they can't be created -- nor recreated -- from a store. Instead, everything (with the exception of bootstrap) is built dynamically and persists until explicitly killed/revoked *or* the system shuts down.

But the actors don't hold "Handles" to all of those objects! E.g., I can have millions of files in the file store -- yet only need *dozens* of Handles to interact with those dozen objects that are "live" at the present time.

The Handler's role is to create "live" objects, "however". If that means mapping some blocks on a disk to a particular Handle, so be it. If it means wrapping one of thousands of email addresses *in* an email_addr_t, likewise.

The "problem" with my approach is that all of these things -- for the complete set of tasks executing on a host -- are contained in the kernel. Amoeba (et al.) allows the references to be moved *out* of the kernel into task-space (whether that's user-land or not). REGARDLESS OF WHETHER AN OBJECT IS LIVE OR NOT.

One of the Mach problems, IMnsHO, was their desire/goal of trying to reimplement UN*X. So, any "impedance mismatches" between their model and the one used by the UN*X implementors was a performance or conceptualization "hit" (hence my deliberate choice of "impedance mismatch"). None of these things were "deal breakers" but they conspired to make it a bad fit overall.

I'm looking at the mechanisms in a different light. To address a different class of problems FROM THE START instead of trying to back-fill to an existing implementation.

E.g., the Standard Library wasn't reentrant. Users had to take pains to preserve "static" members (thereby exposing bits that should have remained *hidden* within the library!). Or, functions had to be redefined to export these entities.

In my case, I can implement the libraries as a *service* that you "connect to" ("load library"). That service can take it upon itself to instantiate thread-specific copies of all these statics. Without exposing any of this to the application.

Of course, UNIX could do likewise! But, now the library had to be aware of the details of the process/thread model *in* UNIX. In my case, I just create a tuple binding the "connection" (handle) to its specific "statics" WITHIN the "library SERVER"!
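A toy sketch of that tuple binding, using a strtok()-like tokenizer as the function with hidden static state. The `LibraryServer` class and its method names are hypothetical, and a real service would sit behind IPC rather than a direct method call.

```python
class LibraryServer:
    """Keeps each connection's "statics" inside the server, keyed by its handle."""

    def __init__(self):
        self.statics = {}  # handle -> that connection's private state

    def connect(self):
        handle = len(self.statics)
        self.statics[handle] = {"rest": ""}
        return handle

    def strtok(self, handle, text, delims):
        """Like C strtok(): pass text to start, None to continue the same scan."""
        state = self.statics[handle]
        if text is not None:
            state["rest"] = text
        rest = state["rest"].lstrip(delims)  # skip leading delimiter characters
        if not rest:
            state["rest"] = ""
            return None
        end = 0
        while end < len(rest) and rest[end] not in delims:
            end += 1
        state["rest"] = rest[end + 1:]       # remember where this connection left off
        return rest[:end]
```

Two clients can interleave calls without disturbing each other, and neither ever sees the saved scan position.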


Don Y 12 years ago

The flip side of this is also important: how do you know when all outstanding "capabilities" (object, permission tuples) FOR A PARTICULAR "object" are "gone"? I.e., how does a Handler know that no one is interested in the "live" object any longer (if "nothing" is tracking "outstanding" Handles/tickets/capabilities)? How do you know when to free the resources set aside to implement/manage that object?

Keep "delegated" capabilities in mind, as well.

Ignoring these "niggly issues" is how smaller, simpler, faster kernels get their "performance edge". :(


George Neuner 12 years ago

On many systems, that actually is not true - the descriptor is a global identifier for the referenced object. Opening the same object in separate processes you may discover that the descriptors all have the same value.

So what? The envelope validates the contents. Encoded rights are useless if the envelope is broken or even just sent to the wrong server.

Amoeba chose a public ticket representation specifically to facilitate distributed agents. The entire point of it was that agents be able to delegate modified rights on their own.

I think "extending" is the wrong term. A particular version of a capability may be seen as defining a name space in which its referencing tickets exist, but the effect of versioning the capability is to destroy one space (and everything in it) and to open another.

Moreover, the capability itself exists within a name space which is not changed by versioning the capability.

The *tie* is through the capability. The ticket is only a reference to it.

You are misunderstanding something.

Tickets don't create tickets, servers do. You can't modify the rights of an existing ticket and copy it to someone else - it simply won't work. Creating a valid ticket requires knowledge of the capability's secret check (version) number, which only the server has.

You have to present your ticket to the server holding the capability and request issue of a new ticket having the desired rights. The server will tell you to go to hell if you don't have the privilege to create new tickets.

If you have the privilege, you may ask the server to create an *additional* capability for the original object that you can administer independently. This does not alter the original capability that created your ticket.

Only the holder of an "owner" ticket can request the server to version the capability. That invalidates *every* ticket referencing the old version ... including the owner's own ticket! When versioning a capability, a server must immediately issue a new "owner" ticket in response.

No. A server needs to track only those (versions of) capabilities for which it intends to honor tickets. It does *not* need to retain any information regarding revoked capabilities.
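A compact sketch of versioning as described here: replacing the secret check number invalidates every outstanding ticket at once, and the server keeps nothing about the old version. As before, SHA-256 and the 64-bit widths are stand-ins, and the class is illustrative rather than Amoeba's implementation.

```python
import hashlib
import secrets


def sign(cap_id: int, rights: int, check: int) -> str:
    """Stand-in signing function (an assumption, not the real one)."""
    return hashlib.sha256(f"{cap_id}:{rights}:{check}".encode()).hexdigest()


class Capability:
    def __init__(self, cap_id: int, rights: int):
        self.cap_id = cap_id
        self.rights = rights
        self.check = secrets.randbits(64)  # the current "version"; server-private

    def ticket(self, rights: int):
        # Owner tickets use the check number directly; others use a derived one.
        check = self.check if rights == self.rights else self.check ^ rights
        return (self.cap_id, rights, sign(self.cap_id, rights, check))

    def validate(self, t) -> bool:
        cap_id, rights, signature = t
        return cap_id == self.cap_id and signature == self.ticket(rights)[2]

    def version(self, owner_ticket):
        """Only a valid owner ticket may re-version; a fresh owner ticket is returned."""
        if owner_ticket[1] != self.rights or not self.validate(owner_ticket):
            raise PermissionError("owner ticket required")
        self.check = secrets.randbits(64)  # the old value is simply forgotten
        return self.ticket(self.rights)
```

Nothing is searched for or tracked down: tickets issued under the old check number just stop validating.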

To a 1st approximation, yes.

A particular version of a capability is known by the pairing of its public identifier and private check number. A different version is, in a real sense, a different capability.

However, it makes little sense to have multiple versions of a capability all of which have the same public identifier. If you want simultaneously *valid* tickets to reference unique capabilities, you create simultaneous capabilities with unique identifiers.

The multiple version issue is avoided and irrelevant.

"Remembering".

In your case the kernel does all this work for _all_ of the servers.

In Amoeba's case, the sender needs a ticket that says it can ride the communication channel. 6 of one ...

It's easy enough to create separate channels for each service with unique tickets needed to access them.

But under your system, the kernel has to track capabilities for every live object even though it isn't *responsible* for them.

That's a misconception ... I know someone who worked on the original implementation.

The designers did not *want* to reimplement Unix - rather they believed that [recompile] compatibility with Unix was necessary for Mach to gain acceptance while a critical mass of native programs was being written.

Mostly due to time pressure [Mach was developed on a grant] they took a disastrous shortcut. Realizing that they couldn't *quickly* implement a *full* Unix API translation on top of native services, they grabbed an existing BSD Unix kernel, modified it into a Mach server task and hacked the system libraries to RPC the "compatibility server".

Their problem was that too much of the BSD kernel they used was serial non-reentrant code. Fixing it would have taken as much effort as doing a real API translation - effort they weren't prepared to give at the beginning. Running as one task among many, the kernel inside the compatibility server was painfully slow.

Mach 3.0 brought out an almost complete, lightweight API translation library, and saw a complete rewrite of the compatibility service to offer only those few Unix things that had no analogue in Mach. But by then it was too late.

Creating the compatibility service in the first place turned out to be Mach's fatal mistake. Because Mach could run Unix programs, few people bothered to port anything to the native API. And because Mach ran their Unix programs slowly, people judged the OS itself to be unworthy. [Which was unfair: native Mach programs were faster than equivalent Unix programs on the same hardware.]

YMMV, George


George Neuner 12 years ago

It doesn't matter - if the object is destroyed, all of the capabilities associated with it are destroyed.

Capabilities only control access to an object, not its existence. They can be created and destroyed independently of the object.

When the owner destroys it.

The handler should not care whether or not someone will try to access a nonexistent object in the future.

Any signatory to a joint account can close it. If an object can have multiple owners, then any of them should be able to destroy the object and its associated capabilities (which should be automatic).

What's important is to be able to distinguish owners from their agents if/when necessary.

Too often I think you perceive complexity where none really exists.

George


Don Y 12 years ago

Agreed. But, you missed the point of my question.

Capabilities (Handles) are (object&, permission) tuples. Each capability references an object.

What happens when every reference to an object disappears? How does the Handler (server) BACKING the object know that all references to it (along with their particular permissions) have "disappeared"? How does it know that the resources that it has set aside to manage it can now be released for other uses?

But the object hasn't been destroyed; just the last outstanding REFERENCE to this LIVE INSTANCE of it!

Five actors hold Handles to a particular "file". Each has some set of permissions enabled by *their* particular Handle (capability). The Handler backing that file (i.e., File server) has set aside some resources to implement the live instance of that object.

E.g., read and/or write buffers and/or a mmap()-ed view of the actual file-on-disk. Synchronization primitives within the server to ensure actions by actors are serialized in some predictable manner.

How does the file server know when the last reference to this object disappears? I.e., the last actor holding a capability has terminated or died (unceremoniously). No one knows how many outstanding capabilities may still exist for that (Amoeba) object so the file handler never knows when it can forget about the object.

[Rather than digress into a discussion about the bullet server and its oddities, replace "file" with "motor", above. How does the Motor Server know when it can afford to power down the translator/motor driver for a particular motor because no one has any OUTSTANDING live references to it? The motor still exists. A "non-live" reference to it still exists from which *live* references could be created in the future (e.g., from the soap server). It hasn't been "deleted" by its "owner". Just everyone has currently lost interest in it -- for the time being!]

The email Handler instantiates a particular email_addr_t. It looks up the "human" representation of an email address in the RDBMS (i.e., that's a privilege that only *it* has). It copies this into memory somewhere inside itself and returns a Handle to some actor that allows that actor to do certain things to/with that email_addr_t.

Of course, what really happens when the actor wants to do something with that email_addr_t is the email Handler is called upon to perform that particular action under the authority granted by that particular Handle (capability).

Actor gives some subset of his permissions to another actor. One or both of them invoke actions (methods) on the email_addr_t through their respective Handles. Eventually, both "lose interest" in that particular email_addr_t ("close()" file). When the last such "open Handle" into a particular email_addr_t instance is released (closed), the Email Handler can free its resources set aside for that email_addr_t.

To "close" an email_addr_t in my scheme, all you do is forfeit your Handle. Because the Handle is implemented in the kernel, it knows every such reference to the object. It can notify the Handler when/if an actor holding a Handle *dies* -- without previously having explicitly told the server that it was no longer interested in the object backed by that particular Handle!
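A minimal sketch of that bookkeeping, assuming a toy `Kernel` class with hypothetical `grant`, `forfeit` and `actor_died` calls (none of these names come from the real system, and it simplifies to one Handle per actor per object):

```python
from collections import defaultdict


class Kernel:
    """Toy kernel that knows every live Handle (hypothetical API)."""

    def __init__(self, on_last_release):
        # obj_id -> set of actors currently holding a Handle to it
        self._holders = defaultdict(set)
        # callback into the Handler backing the objects
        self._on_last_release = on_last_release

    def grant(self, actor, obj_id):
        self._holders[obj_id].add(actor)

    def forfeit(self, actor, obj_id):
        """Actor gives up its Handle; tell the Handler if it was the last."""
        holders = self._holders.get(obj_id)
        if holders is None or actor not in holders:
            return
        holders.remove(actor)
        if not holders:
            del self._holders[obj_id]
            self._on_last_release(obj_id)

    def actor_died(self, actor):
        """Actor died without closing anything: forfeit on its behalf."""
        held = [o for o, h in self._holders.items() if actor in h]
        for obj_id in held:
            self.forfeit(actor, obj_id)


released = []   # what the Handler has been told it may free
kernel = Kernel(on_last_release=released.append)
kernel.grant("A", "email_addr_1")
kernel.grant("B", "email_addr_1")
kernel.forfeit("A", "email_addr_1")   # B still holds a Handle
after_forfeit = list(released)        # nothing freed yet
kernel.actor_died("B")                # B dies unceremoniously
```

The point is only that the count lives where the references live, so the Handler gets told when it reaches zero, however the last holder went away.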

With Amoeba's tickets, *anyone* could hold a valid ticket for an object. AND BE DOING SO LEGITIMATELY! How does the server backing a particular object referenced by those N copies of that ticket know that it is NOW safe to free the resources set aside to implement that object? It has no way of knowing what N is at any instant. Or, when it goes to 0!

It has to rely on an actor explicitly saying, "close" the object represented by this ticket -- and all other future references to that object that may come along. If the actor responsible for doing this dies, there's no one to clean up the zombie objects! You'd have to implement a keep-alive policy so the server could automatically "shut down" objects that haven't been referenced, recently.

And, actors would have to deliberately "tickle" every object for which they hold tickets just to be sure those objects didn't get closed due to inactivity!

[IIRC, this was how the bullet server dealt with the possibility. And, that was a situation where it would be relatively *easy* for a client (ticket holder) to be reasonably expected to tickle the object regularly -- as the object (file) had to be created locally before being sent to the bullet server for "commitment" to media. The same sort of GC was required of all "live" objects -- under policies defined by their services. E.g., each time the GC was invoked, any "untouched" objects (objects for which tickets had not been presented in the previous N GC cycles) were deinstantiated. If an actor happened to be too sluggish to use a ticket (or, was perhaps BLOCKED from doing so), then the object could go away! If the ticket's capabilities didn't allow him to recreate the object...]
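To make the aging policy concrete, here is a sketch under stated assumptions: `TicketServer`, `present_ticket` and `gc` are invented names, not Amoeba's API, and "untouched for N consecutive GC passes" is one reading of the policy described above:

```python
class TicketServer:
    """Toy server that ages out objects nobody has touched (hypothetical)."""

    def __init__(self, max_idle_cycles):
        self.max_idle = max_idle_cycles
        # obj_id -> GC passes since a ticket was last presented
        self._idle = {}

    def instantiate(self, obj_id):
        self._idle[obj_id] = 0

    def present_ticket(self, obj_id):
        """Any use of a ticket (the "tickle") resets the object's age."""
        if obj_id not in self._idle:
            raise KeyError(f"{obj_id} has been deinstantiated")
        self._idle[obj_id] = 0

    def gc(self):
        """Age every object; reap those idle for max_idle passes in a row."""
        reaped = []
        for obj_id in list(self._idle):
            self._idle[obj_id] += 1
            if self._idle[obj_id] >= self.max_idle:
                del self._idle[obj_id]
                reaped.append(obj_id)
        return reaped


server = TicketServer(max_idle_cycles=2)
server.instantiate("motor")
server.instantiate("file")
first = server.gc()              # both idle for one pass; nothing reaped
server.present_ticket("motor")   # a holder tickles the motor
second = server.gc()             # "file" reaches two idle passes and is reaped
third = server.gc()              # holder was blocked: "motor" is reaped too
```

Note that the server never learns how many tickets exist; it only sees whether *any* of them was presented recently, which is exactly why a sluggish or blocked holder can lose the object.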

Again, you're missing the point. How does the bank know when both account holders have DIED?? (bank is a bad example because there are undoubtedly laws governing this) I.e., they can't just *spend* the monies in that account -- cuz either account holder may show up to claim them! Do they put the monies in a box for all eternity? JUST IN CASE someone shows up with a valid credential 50 years hence?

Instead, they garbage collect. Accounts that haven't been referenced in N years are automatically closed (and monies go ???). Or, mailed statements that are returned by USPS as "undeliverable" trigger similar response.

I.e., you have to have some periodic activity that FORCES TICKET HOLDERS to show that their tickets (capabilities) are still "of interest".

In this case, you appear to have overlooked some complexity! :>

IIRC, the Hurd people went through a similar "challenge" when they looked at moving to L4. It, being one of those "smaller, simpler, faster" kernels, didn't provide the same sorts of mechanisms that Mach afforded so, trying to *emulate* those behaviors *on* L4 ended up making the L4 implementation as sluggish as the Mach approach!

You always trade away something as you move down in complexity. I can do a context switch in near zero time -- *if* I don't have to preserve any process state!! :> (while this *sounds* ridiculous, you can actually be very effective in creating applications with this model! But, you have to be very disciplined, as well -- cuz *it* doesn't do much FOR you!)

Tea time... Then, The Pork Dish! (yummmm!)
