You have an amusing concept of "apology"! Given that *I* didn't write the NFS RFCs; *I* didn't select the OS you're using (client/server); *I* didn't write the NFS client or server side implementations; and *I* didn't write *your* code, I am amused that you think *I* would be "apologizing"!

Think carefully about what you are claiming NFS is LEGALLY doing (because you are claiming that *it* -- not YOUR PARTICULAR CLIENT OR SERVER -- is "horribly broken"; i.e., that it *can* behave as you are stating it *is* behaving).
You are saying that it is deferring the actual deletion of the "physical file" until a point AFTER a file with the same name has successfully been created AND accessed! (or, is arbitrarily deciding to delete a file before being asked to do so)
[Are you sure the file isn't just NOT BEING DISPLAYED? I.e., are you looking at the directory's contents via shells on the client AND server machines?? Are you sure the share isn't managed by the automounter (which may have decided to umount it)?]

The "physical file" may, in fact, be unlinked some time after you actually ask for it to be (you've stated there is just one link to this file so unlinking it should result in it "disappearing"). But, the file system semantics *must* still preserve "the name associated with a deleted object is freed for reuse for another object".
E.g., for a single client accessing a particular "filename":

    unlink(filename); fd = fopen(filename, "w+x");

MUST always work (barring out of space, etc. errors). Otherwise, you could encounter cases where a name was used some INDEFINITE TIME prior, unlink()ed... yet NOW you were being prevented from reusing the name (it would be impossible to create a file called "ReadMe.txt" because *someone* probably used that name at some time in the past!).

[granted, your stdlib might not support 'x' but the principle being illustrated remains]

So, unlink() *must* be synchronous -- FROM THE VIEWPOINT OF YOUR NFS CLIENT! (I'm ignoring the issues of multiple competing clients because you are claiming this behavior affects *you* as a singular client!)
Indeed, the NFS "REMOVE" RPC *is* synchronous. When it returns, the server has indicated that the filename *has* been "unlinked" (or, that the server can *guarantee* that it *will* be unlinked regardless of any "anomalous situations" that it may later encounter)
Of course, the client-side NFS implementation can choose to cache operations (for performance). But, your call to unlink(2) can't say "everything is fine and dandy" if, in fact, the NFS system hadn't already told it that the NFS REMOVE() RPC *also* said "everything is fine and dandy" (remember, it's a SYNCHRONOUS interface!).

Likewise, the server-side NFS implementation can choose to cache operations. But, the same caveat applies: if it responds to a REMOVE() RPC with "no error", then the client-side NFS implementation will pass this indication on to the "unlink(2)" API which, in turn, will pass it on to your application.
I.e., when unlink(2) returns success, YOU CAN COUNT ON THE NAME AS BEING AVAILABLE FOR REUSE! (from *your* client's perspective)
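A minimal sketch of how one might check that guarantee from a single client, assuming a POSIX system. The function name name_reusable_after_unlink() and the path you hand it are hypothetical; O_CREAT|O_EXCL stands in for the 'x' mode flag in case your stdlib lacks it (very old clients and NFSv2 may not honor O_EXCL reliably, so treat a failure there with some suspicion).

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if `path` could be reused immediately
 * after unlink() reported success, -1 on any surprise. */
int name_reusable_after_unlink(const char *path)
{
    static const char data[] = "old contents\n";
    struct stat st;
    int fd;

    /* Create the original object and put some data in it. */
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("create"); return -1; }
    if (write(fd, data, sizeof data - 1) != (ssize_t)(sizeof data - 1)) {
        perror("write");
        close(fd);
        return -1;
    }
    if (close(fd) != 0) { perror("close"); return -1; }

    if (unlink(path) != 0) { perror("unlink"); return -1; }

    /* From this client's viewpoint the name must now be gone... */
    if (stat(path, &st) == 0 || errno != ENOENT) {
        fprintf(stderr, "name still visible after unlink()\n");
        return -1;
    }

    /* ...so an exclusive create of the same name must succeed. */
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0) { perror("exclusive create"); return -1; }

    /* The new object must not have inherited the old contents. */
    if (fstat(fd, &st) != 0 || st.st_size != 0) {
        fprintf(stderr, "new file is not empty\n");
        close(fd);
        unlink(path);
        return -1;
    }

    close(fd);
    unlink(path);   /* tidy up after ourselves */
    return 0;
}
```

Point it at a file on the NFS mount in question; a return of -1 on *your* client (and 0 on a local disk) would be worth a much closer look.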
Furthermore, the NFS implementation now knows the file handle is no longer associated with a valid object (link count was 1 prior to the REMOVE). So, further attempts to read or write on *that* file handle should signal errors ("stale file handle").
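For what it's worth, here is a sketch of how an application could tell "stale file handle" apart from other write failures, assuming a POSIX system where that condition is reported as errno == ESTALE. The names checked_write() and enum write_result are mine, purely for illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

enum write_result { WRITE_OK = 0, WRITE_STALE = 1, WRITE_OTHER_ERROR = 2 };

/* Hypothetical wrapper: write `len` bytes and classify the outcome. */
enum write_result checked_write(int fd, const void *buf, size_t len)
{
    ssize_t n = write(fd, buf, len);

    if (n < 0) {
        int err = errno;   /* save before fprintf() can disturb it */
        fprintf(stderr, "write failed: %s (errno = %d)\n", strerror(err), err);
        return (err == ESTALE) ? WRITE_STALE : WRITE_OTHER_ERROR;
    }
    if ((size_t)n != len) {
        /* Short write: not an errno condition, but not success either. */
        fprintf(stderr, "short write: %zd of %zu bytes\n", n, len);
        return WRITE_OTHER_ERROR;
    }
    return WRITE_OK;
}
```

A WRITE_STALE result is the "expected" way for the client to tell you the handle no longer refers to a valid object; silently succeeding instead is the confused behavior.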
[If a file is "deleted" while still "open", NFS creates bogus little "silly files" to allow the accesses to continue while the name is removed]

What is most likely happening in *your* NFS implementation (client-side, being the most probable culprit) is the "smarts" in the client are confused. It is forgetting that the file handle is no longer valid and is allowing you to reuse it. I.e., it is not accurately modeling its own concept of the state of the file store by marking the handle as "invalid" once the associated file was closed and unlinked (while it's a stateless protocol, the client *does* maintain a model of *its* concept of state!) [Are you remembering to close it?]

Simple test that requires you to write *no* code (you *did* stuff like this BEFORE coming to your conclusion, right??):
mount_nfs remote:/name/of/exported/filesystem local/mount/point
cd local/mount/point
touch filename
; look at /name/of/exported/filesystem on the remote machine -- e.g.,
; via a telnet/ssh client -- and verify the existence of "filename"
dd if=/dev/zero of=filename bs=1024 count=1000000
; big enough that *some* data will probably actually make it through
; to the physical medium "reasonably quickly"
ls -al filename
; verify the *locally* visible size is consistent with what you "wrote"
rm filename
; see if filename is still present locally (it can't be!)
; watch to see when it disappears "remotely"
cat filename      ; should fail
chmod 0 filename  ; should fail
; it has to be "gone" so this should fail! Otherwise, what's the
; point in "deleting" a file (name)?
; Just in case, make it inaccessible to confound the following actions...
ls -al > filename
; the above should be a *new* file on the remote device -- the old
; should have been successfully removed prior to it being created
I did this on three hosts last night while simultaneously "watching" the results on the client and server sides. (Hey, if NFS *is* so "horribly broken", I should know about that!! Gee, probably merits a *paper*!! :> )
All behaved as I expected.
"OK, maybe there's something more subtle involved. Let's knock together a simple app to hammer on the implementation a bit more..." (should be reasonably portable -- note I have no idea what *your* "simulate" entails so I just do a variable number of variably sized writes to the nfs mounted file)
[Note the "-99" return code is intended to note events that I should be "very interested in".]

----8<----
[...] 0) {
      count = fudge(10,64);
      if ( count != fwrite(&buffer[0], sizeof(buffer[0]), count, fd) ) {
        fprintf(stderr,"Couldn't write to file during simulation run!\n");
        fprintf(stderr,"errno = %d\n", errno);
        exit(-99);    // complainant's alleged issue!
      }
    }

    // "At the end of each simulation, I deleted the file..."
    if (0 != fclose(fd)) {
      printf("I haven't handled this case!\n");
      exit(-99);
    }
    if (0 != unlink(filename)) {
      printf("I haven't handled this case!\n");
      exit(-99);
    }

    // "... and created another one with the same name for the next run."
    fd = fopen(filename, "w+x");
    if (NULL == fd) {
      // (Pedantic: make *sure* the file was successfully unlinked!)
      fprintf(stderr,"Unable to open '%s'\n", filename);
      exit(-99);
    }
  }
----8<----
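If you want something you can compile as-is, here is a self-contained sketch of the same write/close/unlink/recreate loop. It is an approximation, not the exact program: the fudge() definition, the number of writes per run, the buffer size, and the run_simulations() wrapper (which returns -99 rather than calling exit) are all assumptions on my part. It also relies on the C11 'x' mode flag for fopen().

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Assumed stand-in: pseudo-random integer in [lo, hi]. */
static size_t fudge(size_t lo, size_t hi)
{
    return lo + (size_t)rand() % (hi - lo + 1);
}

/* Returns 0 if every run completed, -99 at the first "interesting" event. */
int run_simulations(const char *filename, int runs)
{
    char buffer[64] = {0};
    FILE *fp = fopen(filename, "w+");

    if (fp == NULL) {
        fprintf(stderr, "Unable to create '%s'\n", filename);
        return -99;
    }

    for (int run = 0; run < runs; run++) {
        /* "Simulate": a variable number of variably sized writes. */
        size_t writes = fudge(1, 100);
        while (writes-- > 0) {
            size_t count = fudge(10, 64);
            if (fwrite(buffer, sizeof buffer[0], count, fp) != count) {
                fprintf(stderr, "Write failed in run %d, errno = %d\n",
                        run, errno);
                fclose(fp);
                return -99;   /* complainant's alleged issue */
            }
        }

        /* Delete the file at the end of the run... */
        if (fclose(fp) != 0) {
            fprintf(stderr, "fclose failed, errno = %d\n", errno);
            return -99;
        }
        if (unlink(filename) != 0) {
            fprintf(stderr, "unlink failed, errno = %d\n", errno);
            return -99;
        }

        /* ...and recreate it; 'x' fails if the old name still exists. */
        fp = fopen(filename, "w+x");
        if (fp == NULL) {
            fprintf(stderr, "Unable to open '%s'\n", filename);
            return -99;
        }
    }

    fclose(fp);
    unlink(filename);
    return 0;
}
```

Call it from main() with a path on the NFS mount and a large run count, e.g. run_simulations("local/mount/point/filename", 100000), and watch for anything other than 0.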