"Up" in hierarchical namespace designators

Hi,

I support multiple intersecting, overlapping, disjoint, etc. namespaces on a distributed system. My conceptual model for the namespace consists of three different types of "things" (avoiding the term "objects"):

- namespace

- bag

- terminal

A namespace is a hierarchy of names. E.g., "/" on traditional Eunices.

A bag is an object (grrr) that holds other bags -- or terminals. E.g., a "directory" ("folder" for folks who had the misfortune to grow up in a MS world).

A terminal is ... a "leaf" -- the actual "thing" being named.

All of these are *active* objects ^H^H^H "things".

It is important *not* to think of these as "filesystems", "directories" and "files". Many of them are not persistent, etc.

Two namespaces can overlap, intersect or be completely disjoint. A name in one namespace (for a bag *or* a terminal) need not agree with the name used for THAT SAME OBJECT (grrr "thing") in some other namespace. (think of how symlinks work, for example).

I handle name resolution by passing a "description" (akin to a pathname/filename) to a name resolver for whichever "thing" the description is rooted in/at. E.g., I could pass "foo/bar/baz.txt" to the "thing" rooted at "/usr/dgy/" to give the recognized behavior of "the file located at /usr/dgy/foo/bar/baz.txt" (again, remember file is just to make this seem more familiar in this explanation). I could, similarly, pass "/usr/dgy/foo/bar/baz.txt" to the thing at the *root* of (a particular) namespace to get to that same "thing".

Now, the nitty gritty:

Each "thing" (remember, these are active entities) looks at the description passed to it and applies its own syntactic rules to the interpretation of that "description". So, when a "bag" (again, think just in terms of a bag being a "directory" in a "disk file system") gets handed "foo/bar/baz.txt", its semantics *know* that the name of anything residing in that "bag" will not contain a '/'. Said another way, it can strcspn() the string looking for '/' and take everything up to that '/' as an identifier for a "thing" residing in that "bag" (e.g., "foo").

If it discovers that "foo" does, in fact, reside in this "bag", it removes "foo" (plus the '/') from the description and passes the remaining portion of the description to "foo" (i.e., to the active object that implements the "foo" bag). Then, "foo" applies its syntactic rules (which, for sake of example, are the same as its predecessor) and strips "bar" off the head of the description and passes the remaining tail to "bar".
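A minimal sketch of the head/tail hand-off just described, assuming every bag uses the '/' rule. The class names `Terminal` and `SlashBag` and the `resolve` method are illustrative inventions, not part of the system being discussed, and the leading '/' is left out of the descriptions because how the root treats it is not specified above.

```python
class Terminal:
    """A leaf: the actual thing being named."""

    def __init__(self, label):
        self.label = label

    def resolve(self, description):
        # Nothing lives beneath a terminal, so only an empty tail is acceptable.
        if description:
            raise LookupError(f"{self.label!r} is a terminal; cannot resolve {description!r}")
        return self


class SlashBag:
    """A bag whose member identifiers never contain '/'."""

    def __init__(self, members=None):
        self.members = dict(members or {})

    def resolve(self, description):
        if not description:
            return self
        # Same idea as strcspn(description, "/"): take everything up to the first '/'.
        head, _sep, tail = description.partition("/")
        if head not in self.members:
            raise LookupError(f"no member named {head!r}")
        # Strip the head (and the '/') and let the member interpret the rest.
        return self.members[head].resolve(tail)


baz = Terminal("baz.txt")
dgy = SlashBag({"foo": SlashBag({"bar": SlashBag({"baz.txt": baz})})})
root = SlashBag({"usr": SlashBag({"dgy": dgy})})

# The same thing is reachable from two different starting points.
assert dgy.resolve("foo/bar/baz.txt") is baz
assert root.resolve("usr/dgy/foo/bar/baz.txt") is baz
```

Note that no bag ever sees the whole name after the first hop; each one only sees the tail it was handed.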

I.e., each "bag" can apply whatever syntactic rules it wants to the "descriptions" it processes. For example, the "foo" object might force all identifiers for objects in its "bag" to be exactly 5 characters long (!). As such, it looks for "bar/b" as an identifier for one of the items (presumably) contained in its bag. Finding it, it passes "az.txt" to "bar/b" as the balance of the description.

The important thing, here, is that the syntax of items within a "bag" is defined by the bag itself (i.e., there are various flavors of "bags"). And, the manner in which a particular flavor of bag processes the description that is presented to it is left up to that bag to decide.
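To make the "flavors of bags" point concrete, here is a rough sketch in which each flavor only supplies its own way of splitting a description. The names `Bag`, `SlashBag`, `FixedWidthBag` and `split` are hypothetical; the example reproduces the "foo" / "bar/b" / "az.txt" walk from above.

```python
class Terminal:
    """A leaf: the actual thing being named."""

    def __init__(self, label):
        self.label = label

    def resolve(self, description):
        if description:
            raise LookupError(f"{self.label!r} is a terminal; cannot resolve {description!r}")
        return self


class Bag:
    """Common lookup logic; each flavor decides how to cut a description."""

    def __init__(self, members):
        self.members = dict(members)

    def split(self, description):
        raise NotImplementedError

    def resolve(self, description):
        if not description:
            return self
        head, tail = self.split(description)
        if head not in self.members:
            raise LookupError(f"no member named {head!r}")
        return self.members[head].resolve(tail)


class SlashBag(Bag):
    """Identifiers run up to the first '/'."""

    def split(self, description):
        head, _sep, tail = description.partition("/")
        return head, tail


class FixedWidthBag(Bag):
    """Identifiers are exactly WIDTH characters, whatever those characters are."""

    WIDTH = 5

    def split(self, description):
        return description[:self.WIDTH], description[self.WIDTH:]


leaf = Terminal("az.txt")
bar_b = SlashBag({"az.txt": leaf})      # the thing named "bar/b"
foo = FixedWidthBag({"bar/b": bar_b})   # "foo" insists on 5-character names
top = SlashBag({"foo": foo})

assert top.resolve("foo/bar/baz.txt") is leaf
```

The caller of `top.resolve` cannot tell from the string alone where one identifier ends and the next begins; only the bags along the way know.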

It should be obvious that traditional filesystem names are just "special cases" of this algorithm. I.e., in UN*X, "look for everything up to the first '/'"; in DOS, "look for everything up to the first '\'"; etc. (obviously, there are special cases -- especially in the MSbraindamage world!)

The issue that I am trying to address is how to handle "go up a layer" in the namespace.

Note that specifying a valid name "in my world" requires knowledge of the syntax imposed by each "bag" encountered on your way to a particular terminal. Context is very significant (it is intended to be, as each bag can be a different "thing" in functionality, etc.).

And, there is *nothing* that is "special" in any of my "descriptions". There isn't even a notion of '/' (root)!

So... the idea of adding *a* special designator *just* to reference "up" really is grating. First, the typical notation ("..") is expensive -- two (actually three) characters just to represent one simple concept. (granted, I could opt for the '

D Yuniskis

My thought is that if "go up" is useful to a significant subset of bags, add it to the base class. You therefore have a uniform way of using it and a bias and template to implement it where appropriate. You probably want to handle the "not implemented" case in a standard way, too.
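A rough sketch of that suggestion, assuming a class-based design: the base class offers "up" but refuses by default, flavors opt in, and "not implemented" is reported through one standard exception. `UpNotSupported`, `ParentAwareBag` and `JailBag` are made-up names for illustration.

```python
class UpNotSupported(LookupError):
    """Standard signal (hypothetical name) that a bag has no usable "up"."""


class Bag:
    """Base class: "up" exists everywhere, but refuses by default."""

    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent

    def up(self):
        raise UpNotSupported(f"{self.label!r} does not support going up")


class ParentAwareBag(Bag):
    """A flavor that opts in by remembering the bag it was reached through."""

    def up(self):
        if self.parent is None:
            raise UpNotSupported(f"{self.label!r} has nothing above it")
        return self.parent


class JailBag(Bag):
    """A chroot-style flavor: inherits the default refusal unchanged."""


top = ParentAwareBag("top")
inner = ParentAwareBag("inner", parent=top)
jail = JailBag("jail", parent=top)

assert inner.up() is top
```

Callers then need only one `except UpNotSupported` clause, whichever flavor of bag they happen to be talking to.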

--
Thad
Thad Smith

Well, it's not really useful to the *bags* as much as to the guy creating that reference "description". E.g., unless a particular flavor of bag simply can't support an "up notation" (i.e., syntactically), the only reason I can see for *not* supporting it is for chroot() jails and their ilk.

OTOH, it is *incredibly* expensive to implement as it ties up a lot of resources for the duration (which can be considerable) of the name resolution. (unless I can come up with a cleverer way of implementing it :< ).

I.e., the "easy" solution is to just get rid of it! In most APIs, I can't see it having much use (??) -- I think of how I have typically referenced things (e.g., files) in hierarchical namespaces and can't remember any case where my *code* deliberately said "go up from here".

Yet, each time I build an executable, etc. my #include's, makefiles, etc. rely heavily on this ability (but those are *tools*, not *applications*).

I think I need to sit on a mountaintop and contemplate my navel for a while and hope for enlightenment :-/

D Yuniskis

I do see a conceptual problem with your proposed arrangement. To re-express it succinctly while keeping it general in nature, you have two types of object - the atomic ones (in the indivisible sense of the term, i.e. files), and containers that can reference other objects, either atoms or more containers.

However, as you wrote your description, the references (filenames) have a many-to-one relationship with the objects they reference. Consider two containers /a and /b that each contain references to c. Therefore /a/c and /b/c are equivalent. However, if you are in c, how do you define "up" even on a conceptual level? Which parent does it point to? It is not for no reason that Unix does not generally allow hard links to directories.

At this point I see an obvious comparison to logical paths in Unix's Korn shell and its successors. If /b contains a symlink to c then when you navigate to /b/c you actually arrive at /a/c. However, Korn shell keeps track of the path you used to navigate to the directory in question. If you do a "cd .." it goes back one step along the path you took (to /b) rather than to the object's real parent (/a).

In Korn shell this is slightly messy because of mixed metaphors, but you use this kind of "back" functionality every day in your web browser. Ignore the address bar - you start at your home page and click on a link, and then on another link from the object that link referenced. Clicking the back button a couple of times returns you to where you started.

This kind of "back" feature removes the problem from the namespace altogether - it becomes a feature of the application instead. It is not perfect of course - even with multiple starting points (roots) you are not guaranteed access to the entire namespace, although a chroot environment would simply have a non-root starting point. There is also the issue of what is backwards of the starting point. However, I can see it could satisfy some situations where the filesystem-like metaphor breaks down.
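As a sketch of "back" living in the application rather than in the namespace, the navigator below simply remembers the trail it walked. `Node` and `Navigator` are invented for illustration; the point is only that c needs no parent pointer even though two containers reference it.

```python
class Node:
    """A container or atom; it knows its members but not who references it."""

    def __init__(self, label):
        self.label = label
        self.members = {}


class Navigator:
    """Client-side trail of visited nodes, like a browser history."""

    def __init__(self, start):
        self.trail = [start]

    @property
    def here(self):
        return self.trail[-1]

    def down(self, name):
        self.trail.append(self.here.members[name])

    def back(self):
        # What lies "behind" the starting point is undefined, so refuse.
        if len(self.trail) == 1:
            raise LookupError("nothing behind the starting point")
        self.trail.pop()


a, b, c = Node("a"), Node("b"), Node("c")
a.members["c"] = c
b.members["c"] = c   # the same object, referenced from two containers

via_b = Navigator(b)
via_b.down("c")
via_b.back()
assert via_b.here is b   # back follows the path taken, not a "real" parent
```

Starting a second `Navigator` at `a` and doing the same steps ends at `a` instead, which is exactly the ambiguity that a single "parent of c" could not express.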

--
Andrew Smallshaw
andrews@sdf.lonestar.org
Andrew Smallshaw

Yes -- though don't cling to the "file" concept as it will lead you down the wrong path :-/

Again, "names" or "descriptions" -- not "file names"

Correct. But, /a/c and /b/c might exist in different namespaces.

I.e., consider two "filesystems" (again, I only use this as an example that maps into a "real life" structure that is recognizable to most):

/a /C /D /E /G /H /q /t

and

/b /C /p /z /y /U

where uppercase letters indicate terminals (note that /q/t is an empty container!)

Assume C is the same object in each case.

The first namespace references C at /a/C while the second references it at /b/C. In the first namespace, /b/C doesn't exist (i.e., there is no "/b") while in the second, /a/C doesn't exist.

[n.b., one could create something having the same *name* "/b" in the first namespace that could be an entirely different container than the "/b" in the second namespace]

This is where the filesystem analogy leads you down the wrong path. :<

Imagine the filesystem *structure* is stored on a separate volume (disk) than the actual "files". Now, imagine having one such volume for each *namespace*. So, the containers, their names and their contents *differ* from one namespace to the next -- yet they can still (ultimately) resolve to the same "object" (file).
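A small sketch of that picture, using nested dicts as the per-namespace "structure volume" and a single shared object as the "file". It assumes plain '/'-separated names throughout purely to keep the example short, and only the /a/C and /b/C entries from the example above are modeled; `Thing` and `resolve` are illustrative names.

```python
class Thing:
    """Stand-in for the one real object that both namespaces name."""

    def __init__(self, label):
        self.label = label


shared_c = Thing("C")

# Each namespace keeps its own containers and names...
first_namespace = {"a": {"C": shared_c}}
second_namespace = {"b": {"C": shared_c}}   # ...yet ends at the same object.


def resolve(namespace, description):
    """Walk a '/'-separated description through one namespace's structure."""
    current = namespace
    for part in description.strip("/").split("/"):
        if not isinstance(current, dict) or part not in current:
            raise LookupError(f"{description!r} does not exist in this namespace")
        current = current[part]
    return current


assert resolve(first_namespace, "/a/C") is resolve(second_namespace, "/b/C")
```

Asking the first namespace for "/b/C" raises `LookupError`, since there is no "/b" there at all.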

But that's because you have just a single namespace and everything resides in it. You can't get the behavior I am describing with a traditional UN*X filesystem :-/

"Up" (avoiding your "back" terminology) only makes sense in a particular context. That context is a namespace and a place *in* that namespace.

To keep things simple, I am only describing "up's" role in a "full description (name)" where the context is visible immediately to the left of the "up" notation, etc.

I question whether or not up actually has any real role in an "application". I.e., its role is to exploit relationships in the filesystem (namespace) that the application *assumes* exist. E.g., "../include" vs. "../obj", etc.

But, I wonder if the same can't also be achieved by telling an application where its "root" (not to be confused with *the* root) is and having it make all name references wrt that "name" (in the namespace applicable to it). (?)

D Yuniskis

I coded a critter very much like what you are describing as part of a real-time operating system. Right down to the multiple separate namespaces.

Basically, we punted the notion of "up" completely, because it only really arises when you have a notion of "current working bag" (to use your term).

In the absence of $cwd, there's no reason for .. - which is why you see it so seldom in application code, but rather frequently in symlinks and Makefiles.

Note that if you add the notion of an application's root, you are creating the need for an "up", if there's ever a situation where the application might want to go "up" from its root. If not (for example, isolation / security reasons), then there's again no reason to implement the concept.

--
Steve Watt KD6GGD  PP-ASEL-IA          ICBM: 121W 56' 57.5" / 37N 20' 15.3"
 Internet: steve @ Watt.COM                      Whois: SW32-ARIN
   Free time?  There's no such thing.  It just comes in varying prices...
Steve Watt

Hmmm... I'm not sure one implies the other.

I.e., (using a more conventional filesystem naming) there's nothing to prevent you (an application) from fabricating a pathname FROM THE ROOT that contains notions of "up":

basepath = "/home/dgy"
path = basepath + "/../common/spooldir"

That's the argument I am trying to make (to myself). It is "wrong" for an application to expect a certain structure to the namespace *beyond* its concept of some "base" portion (avoiding the use of the word "root") of that namespace.

Note that this "base portion" need have no relationship to the application's home.

However, doing so means some applications might now need an extra argument -- namely, the "name" of that "base" -- and then they can act relative DOWNWARD from that base.

So, for a programmer's example, if you keep your sources in ./src, includes in ./inc and want your objects in ./obj, then you don't say:

cd src
cc foo.c

but, rather:

cc base_of_namespace foo.c

or

cc src/foo.c

and assume cc will ask the resolver to trim the prefix (which *it* doesn't know to be "src/"!) from "src/foo.c"

What you need is a notion of the application's "base" -- not its current *name*. I.e., if bin/cc expects to find its .so's in bin/../lib, then you tell cc that its *base* is whatever_is_above_bin_and_lib so that it can create a downward path from that base to ./lib
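For comparison, a sketch of the two styles using conventional '/' pathnames and Python's `posixpath` module. The base "/opt/toolchain" and the helper `below` are made up for the example, and the lexical `normpath` step only works because ordinary pathnames share one syntax, which the bags described in this thread do not.

```python
import posixpath


def below(base, *parts):
    """Build a name strictly beneath `base`; refuse anything that is not a plain step down."""
    for part in parts:
        if part in ("", ".", "..") or "/" in part:
            raise ValueError(f"{part!r} is not a plain downward component")
    return posixpath.join(base, *parts)


base = "/opt/toolchain"   # hypothetical whatever_is_above_bin_and_lib

# "Up" style: start from where the tool lives and climb.
upward = posixpath.normpath(posixpath.join(base, "bin", "..", "lib"))

# "Base" style: the tool is told its base and only ever goes down.
downward = below(base, "lib")

assert upward == downward == "/opt/toolchain/lib"
```

The cost is the one noted above: the tool needs the base handed to it as an extra argument instead of inferring it from its own name.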

I'm unhappy with this model, though. :<

OTOH, supporting "up" is a *huge* runtime burden as it forces all resources to be tied up for the duration of the name resolution (which, given that component resolvers will often be *true* RPC's, can tie things up for a VERY long time -- especially in light of network timeouts and retries)

I.e., I am trying to figure out why up would ever be "necessary and essential" -- then, I can figure out how to make *that* go away! :>

D Yuniskis
