ext3 kjournald causing periodic pauses in application

Hi,

We are using kernel 2.4.19 on an embedded system based on an x386. We have a single ext3 filesystem, the root filesystem, on a DiskOnChip. Every 15-20 seconds kjournald consumes a large percentage of the CPU, locking out the application, which is highly visible to the user. I have found references to this problem and a suggested workaround. However, I have found that the workaround does not work under 2.4.19 (others have found this as well).

I'd like to know if anyone has been successful trying something like the following:

Mount the (root) filesystem as ext3

echo 40 0 0 0 60 300 0 0 > /proc/sys/vm/bdflush

This should cause kupdated to run every 0.6 sec and kjournald to flush a dirty buffer after 3 seconds, but this doesn't work under 2.4.19. I believe if I could get this to work, the duration of the pauses would be short enough to go unnoticed.
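
As a rough check of those timings, here is a minimal sketch of the arithmetic. It assumes HZ=100 (10 ms per jiffy, the usual value for 2.4 on x86) and that the fifth and sixth bdflush fields are the kupdated interval and the buffer age, both in jiffies. The helper name jiffies_to_ms is made up for illustration.

```shell
#!/bin/sh
# Assumption: HZ=100, so one jiffy is 10 ms. Check your kernel config.
HZ=100

# Hypothetical helper: convert a jiffy count to milliseconds.
jiffies_to_ms() {
    echo $(( $1 * 1000 / HZ ))
}

# Fifth and sixth values from the echo line above (assumed meaning:
# kupdated wake-up interval and dirty-buffer age).
interval_ms=$(jiffies_to_ms 60)
age_ms=$(jiffies_to_ms 300)

echo "kupdated interval: ${interval_ms} ms"   # 600 ms
echo "buffer age:        ${age_ms} ms"        # 3000 ms
```

If the target kernel was built with a different HZ, the same numbers written to bdflush would mean different wall-clock times.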

If you've been able to tune kjournald, which version of the kernel (complete with patch level) did you use, and what commands did you use?

We switched from ext2 to ext3 because under ext2 we were experiencing file corruption after cycling power, which ext3 fixed.

If this is not the appropriate place to post this question, where would you suggest I post it?

Thanks, Randy.

Randy Cooper

We use something very similar which results in more frequent, but less intense, 'disk thrashing'.

echo 40 500 0 0 60 300 60 0 0 > /proc/sys/vm/bdflush

The above certainly works with 2.4.[20,21,22-ac[x] kernels.

Perhaps setting via /etc/sysctl.conf might help. I know it shouldn't make a difference, but it may be worth a try.
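
A minimal sketch of the sysctl.conf route, assuming the same nine values as the echo line above. The scratch file path is hypothetical; on the target the line would go into /etc/sysctl.conf, and the apply step is left commented out because it needs root.

```shell
#!/bin/sh
# Hypothetical scratch file; on the target this would be /etc/sysctl.conf.
CONF="${CONF:-/tmp/sysctl.conf.sample}"

# Same nine values as the echo line above, in sysctl.conf syntax.
BDFLUSH="40 500 0 0 60 300 60 0 0"
printf 'vm.bdflush = %s\n' "$BDFLUSH" > "$CONF"

cat "$CONF"

# To apply on the target (as root):
#   sysctl -p "$CONF"
```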

--
Keep Concorde flying - sign the petition.

http://www.saveconcorde.co.uk/sign
Chiefy

I do it like this on the 2.4.22 kernel and reiserfs in /etc/sysctl.conf but the system has a lot of memory too.

vm.bdflush = 100 1200 128 512 500 6000 500 0 0

--
Confucius:  He who play in root, eventually kill tree.
Registered with The Linux Counter.  http://counter.li.org/
David

Oooops! Sorry I hadn't seen that you were working on an embedded system.

David

EXT3 is perfectly entitled to do this kind of thing, causing huge latency for user tasks.

Linux is _not_ a real time OS.

If you need hard real-time you need to add RTAI or something similar.

You say "...users see...", which suggests a multimedia type of application. That kind usually asks for soft real-time performance. Kernel 2.6 is said to be a much better soft real-time OS than 2.4. The IDE driver is known to be the source of huge latency problems in kernel 2.4. It has been completely rewritten in 2.6.

-Michael

Michael Schnell

I disagree. A system used for any interactive operation should not have long latencies in any operation. If you are running a batch or similar application, I would agree, but not for interactive (or real-time) operation.

But it is "good enough" for a pretty wide range of applications.

I agree. However, with care you can run a patched 2.4 Linux at up to 80 Hz or so with acceptable performance.

Actually, my measurements do not support such a statement. On the kpreempt-tech mailing list I posted results showing that 2.6.0-test7 with an ext3 file system performed far worse than 2.4.22. I believe the maximum latency on my system ended up being about 150 msec, whereas my 2.4 worst case was less than 5 msec (with the low-latency and preempt patches).

Robert Love indicated he would get with the IBM guys who worked on the RCU code in ext3 to see what they can do to reduce the latency, but I have yet to see any results.

--Mark Johnson

Mark H Johnson

In my test I found more than 100 ms maximum latency (imposed on a high real-time-priority process) while a low, standard-priority process was doing a file copy on an IDE device. I don't remember whether that was with EXT2 or EXT3 (supposedly EXT2). Robert's patches did not help with the maximum latency (while they helped a lot with the average latency). I found research claiming that with kernel 2.6 this latency (maximum and average) is reduced dramatically (by a factor of 20 or so). I'll do my own tests early next year.

-Michael

Michael Schnell

Hi,

I applied the ext3 patch to 2.4.19 so that I could use the commit option for the ext3 filesystem. Although this reduced the frequency of the problem, it did not reduce the severity when the problem did occur.
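
For anyone wanting to try the same thing, a minimal sketch of how the commit option might be passed. It only prints the command and the fstab line rather than running anything. The device name and the 30-second value are hypothetical, and whether a remount honours commit= on a given patched 2.4 kernel is an assumption to verify.

```shell
#!/bin/sh
# Hypothetical values for illustration only.
ROOT_DEV="/dev/ROOTDEV"   # placeholder, not a real device name
COMMIT_SECS=30            # journal commit interval in seconds

# Dry run: build the strings, do not execute them.
remount_cmd="mount -o remount,commit=${COMMIT_SECS} /"
fstab_line="${ROOT_DEV} / ext3 defaults,commit=${COMMIT_SECS} 1 1"

echo "$remount_cmd"
echo "$fstab_line"
```

A longer interval means fewer journal commits, each with more to write, which fits the observation above that the pauses became less frequent but no less severe.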

Next I upgraded to 2.4.20 so that I could try

echo 40 0 0 0 60 300 0 0 > /proc/sys/vm/bdflush

This didn't fix the problem either.

However, using lsof I was able to determine that several files in /tmp were open for writing. Mounting /tmp using tmpfs resolved the problem.
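
Roughly, the check and the fix could look like the sketch below. The writers filter is a made-up helper that keeps lsof rows whose FD column shows write or read/write access. The tmpfs size is a hypothetical value, and the commands that need root are left commented out.

```shell
#!/bin/sh
# Hypothetical helper: keep lsof rows opened for writing (FD like "3w" or "5u")
# and print the command name and the file name.
writers() {
    awk 'NR > 1 && $4 ~ /^[0-9]+[wu]/ { print $1, $NF }'
}

# On the target (as root), list processes writing under /tmp:
#   lsof +D /tmp | writers
#
# Then move /tmp to RAM so those writes never reach the journal:
#   mount -t tmpfs -o size=4m tmpfs /tmp
```

File names containing spaces would be cut short by this filter, and anything in a tmpfs /tmp is lost on power cycle, which is usually acceptable for scratch files.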

Thanks to everyone who posted suggestions.

I will try 2.6.? on my current project, although I'm planning on using JFFS2 and CFI-compliant NOR flash, so I don't expect to encounter a similar problem.

Randy

BTW: The high visibility referred to flickering on a sign viewable by the public.

Randy Cooper

The 2.6 kernel has introduced "htree" to the directory structure of the ext* filesystems. This is incredibly useful for heavily populated directories, such as web caches with many thousands of images or video image directories with every frame of a video stream separated out. I'm looking forward to its use quite a lot.

Nico Kadel-Garcia
