Filesystem performance overheads?

Hi all,

Is anyone aware of studies that have measured the performance overhead
of using a filesystem? We want to know the performance loss we would
suffer by using a filesystem like FAT16 on a ramdisk, versus using the
memory directly (raw reads and writes to memory, without treating it as
a ramdisk with a file system).

As far as I understand, the performance loss would be minuscule,
especially compared to what the filesystem buys us in convenience. But
I am still looking for some hard facts to back up this claim.

Thanks & regards,
Sachin


Re: Filesystem performance overheads?
    Measure it yourself, by using, say, UNIX shell scripts.

--
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
Re: Filesystem performance overheads?


Correct. No "generic" performance test will give you
the *exact* stats that *you* need.

Alexander Skwar
--
One is not born a woman, one becomes one.
                -- Simone de Beauvoir


Re: Filesystem performance overheads?


To which I would add, you should also look closely at what you
specifically need from your data store. You might find that a simpler
data organisation than an off-the-shelf filesystem suffices; if not,
there are a large number of filesystems to choose from, with different
characteristics, features and tradeoffs. Don't just test one.

--T

Re: Filesystem performance overheads?

You don't need a 'study', you just need to have some familiarity with
what happens in a file system disk access.


Reads and writes to memory take fractions of a microsecond these days.
Reads and writes to a RAM disk can easily take over 100 microseconds,
after they've made it from the application into the operating system,
through both the file system and disk-driver layers, and back again.


No, in fact the performance difference between direct memory access and
RAM disk access is more than a couple of orders of magnitude.  Of
course, the performance difference between a RAM disk and a real disk
can also be something close to a couple of orders of magnitude, so if
you were instead asking how much *additional* performance improvement
you could get over using a conventional disk by using direct memory
references rather than a RAM disk the answer would be, "Not too much."

As usual, the way you frame the question has a lot of influence on what
the answer to it is.

- bill

Re: Filesystem performance overheads?


This is definitely true for seek performance, but if several
contiguous blocks are to be transferred, the RAM disk might even be
slower if copied under program control, while a real disk might
perform the transfer using DMA with minimal program intervention.

Paul
 

Re: Filesystem performance overheads?

Well, it depends on how you define "minuscule". Compared to using memory
directly, the overhead is anything but minuscule.

A read or write through any filesystem requires a system call. The
computer has to trap to the OS (involves a context switch), process the
system call (Linux is VERY good at this, but it's still hundreds of
instructions), and then do the operation which may involve editing
several regions of the ramdisk filesystem. You might ask the question
"Can't I mmap(2) the file in?", to which I would reply "Yes, but then
what would be the point of using a file on a ramdisk?"

The best case for a read or write to a ramdisk is the minimal system
call overhead, which is still hundreds of clock cycles even assuming all
the relevant code is in the CPU cache. The best case for a read or write
to memory is an L1 cache hit, which takes 1-2 clock cycles, i.e. less
than a nanosecond.

If the filesystem buys you convenience, just use the filesystem. Linux
does very good caching with unused system memory, which is similar to
the overhead of using a ramdisk but it's less of a PITA. Without knowing
why a file is convenient for you it's hard to say, but if it's something
like "it's easy to append to" then you can use a linked list or
something to do the same thing in memory, maybe have a thread to spew
the records to disk asynchronously.

In summary: ramdisk takes a big hit relative to directly using memory.

Re: Filesystem performance overheads?
You will probably have to measure this yourself, on your own machine,
under the conditions you actually run, with the programs that actually
concern you.

Imagine that the file system is 1000x slower in some sense than the RAM, but
that you do little IO between the compute-limited processing of whatever it
is you are reading or writing. Then the IO performance overhead would really
be of no interest, and if speed is a problem, the processing algorithms
would need improvement, not the IO.

At the opposite extreme, imagine your process were completely IO limited.
Then the file system overhead would matter a lot. But if you run a 5400 rpm
hard drive with a tiny buffer on an IDE interface, transferring a byte at a
time, this could be serious, whereas if you run a 15,000 rpm Ultra/320 SCSI
hard drive with an 8 megabyte buffer and your OS chains the SCSI commands,
the file system overhead can be considerably less. Also, if you run Linux,
you can have an enormous amount of stuff cached in RAM even with the file
system in use, and avoid doing a lot of IO if you have enough memory for the
cache.

Only after you have considered all these issues does the question of the
file system used begin to be of interest.

But with all these (and other) variables, you can see why no one would
bother doing such a study: it would take too much time to run, and too
much space to print results that would be of very limited interest.

--
  .~.  Jean-David Beyer          Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A         Registered Machine   241939.
Re: Filesystem performance overheads?

It may be "small" but it is not "minuscule".  Every access requires a
context switch, and the kernel has to run just to read the data, as
opposed to reading it with a single instruction and no context switch.

Jon
----
Learn to program using Linux assembly language
http://www.cafeshops.com/bartlettpublish.8640017

Re: Filesystem performance overheads?

FAT16 (and variants) require walking the FAT chain whenever it is
necessary to append to, truncate, or seek within a file.  Depending on
cluster size and file size, that can be quite significant and result in
quite noticeable delays.  Additionally, depending on the implementation,
it may be necessary to allow only one process access to the FAT at a
time, in order to prevent corruption.

--Gene

Re: Filesystem performance overheads?
For another take on this topic, you might have a look at this article that
compares a database in a ramdisk versus an in-memory database, both on Linux
systems.

In-Memory Database Systems

Linux Journal, September 1, 2002

http://www.linuxjournal.com/article.php?sid61%33
Or the proprietary version:
http://www.mcobject.com/downloads/memorybenchmark.pdf
Re: Filesystem performance overheads?

There is certainly going to be a high degree of variation in the
answer.  A filesystem is going to have some level of blocking, a
directory structure, and a mechanism for expansion and deletion, along
with fragmentation.

Raw reads and writes don't have any of this overhead.  Therefore, your
answer is going to vary depending on your filesystem settings and the
type of data you are using.


