how to figure out where a process hung
- 08-01-2006
- Captain Dondo
August 1, 2006, 3:32 am

I've got a problem...
I have a process run from cron that hangs in an uninterruptible sleep.
It reads data from a webcam, and writes to a tmpfs partition.
This process runs every 15 minutes, and most of the time it will run just
fine, but once in a while, it hangs. Then cron runs it again, and it
hangs again. Pretty soon I have a bunch of hung processes that consume
all resources, and for all practical purposes my little system is dead.
The frustrating thing is that it happens rarely; the process runs every 15
minutes and sometimes it will run for days just fine, and then it will
start hanging.
Is there some way to find out where the process is stuck once it has
hung?
The program is spcacat, a very simple snapshot util for webcams using the
spca driver: <http://mxhaard.free.fr/>. Anyone have any suggestions? I
have 3 weeks to get this up and running, and that doesn't give me much
time....
--Yan

Re: how to figure out where a process hung

What does this mean? A sleep() call needs to specify a time, so it
can't "hang".
Moreover, AFAIK, a userland process can only do an uninterruptible sleep
(a very short nanosleep()) if it is assigned very special attributes.
Is it possible that the process is waiting for some hardware event that
never occurs because of defective hardware?
-Michael

Re: how to figure out where a process hung

From 'man ps':
PROCESS STATE CODES
Here are the different values that the s, stat and state output
specifiers (header "STAT" or "S") will display to describe the state of
a process.
D Uninterruptible sleep (usually IO)
These processes show up as 'D', which means they cannot be killed.
I am guessing that these processes are waiting for some camera event that
never occurs, but I have not figured out why it happens only sometimes....
The camera shares the USB bus with a GPS, which is being polled almost
continuously. I suspect there is some bus contention which triggers this,
but I have no idea where to start looking; all of the code I've looked at
looks OK so far.
--Yan
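
For processes already stuck in state 'D', the kernel exposes a hint about where each one is sleeping in /proc/<pid>/wchan (the same value ps prints with the wchan output specifier). The following is only a minimal sketch, assuming a Linux /proc filesystem and a kernel that exports symbol names for wchan; the function names are made up for illustration. It lists every process in state 'D' together with its wait channel.

```python
import os


def read_text(path):
    """Return the stripped contents of a /proc file, or '' if unreadable."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return ""


def parse_state(stat_text):
    """Extract the one-letter state from the contents of /proc/<pid>/stat.

    The format is 'pid (comm) state ...'. The command name may itself
    contain spaces or parentheses, so split after the LAST ')'.
    """
    return stat_text.rsplit(")", 1)[1].split()[0]


def find_blocked(proc_root="/proc"):
    """Return (pid, command, wchan) for every process in state 'D'."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        stat = read_text(os.path.join(proc_root, entry, "stat"))
        if "(" not in stat or ")" not in stat:
            continue  # process vanished or file unreadable
        if parse_state(stat) != "D":
            continue
        comm = stat[stat.index("(") + 1:stat.rindex(")")]
        # wchan names the kernel function the process is sleeping in;
        # it may be empty or '0' if the kernel does not export it.
        wchan = read_text(os.path.join(proc_root, entry, "wchan")) or "?"
        blocked.append((int(entry), comm, wchan))
    return blocked


if __name__ == "__main__":
    for pid, comm, wchan in find_blocked():
        print("%d\t%s\t%s" % (pid, comm, wchan))
```

If the reported wait channel points into the USB or webcam driver code, that would support the guess that the process is waiting on the camera rather than on the tmpfs write.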

Re: how to figure out where a process hung
Hello,

Your cronjob could kill all running instances before starting a new one.
That way the resources would stay free and the system won't run into problems.
It is not a clean solution, but at least it can keep your resources free.
You could log if any instances are killed (instead of a clean exit) and
maybe find some event which causes the program to hang.

That's an idea, but it will only help against the resource leak of hung
processes, not against the problem.
I'd guess the program waits for something (a camera's event?), but never
gets it.
Regards,
Sebastian
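
Since a process in state 'D' cannot be killed until it leaves that state, one variant of the suggestion above is a wrapper that refuses to start a new instance while an old one is still around, and logs whenever it has to kill one. This is only a sketch under assumptions: the function name run_once, the lock-file path, and the timeout are all hypothetical, and it relies on flock() locks being shared with a child that inherits the descriptor.

```python
import fcntl
import subprocess


def run_once(cmd, lock_path, timeout_s, log=print):
    """Run cmd unless an earlier instance still holds the lock file.

    Returns 'skipped', 'exited', 'killed', or 'stuck'.
    """
    lock = open(lock_path, "w")
    try:
        try:
            # Non-blocking exclusive lock: fails at once if already held.
            fcntl.flock(lock, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            log("previous instance still holds the lock; skipping this run")
            return "skipped"
        # pass_fds lets the child inherit the locked descriptor, so a child
        # stuck in the kernel keeps the lock even after this wrapper exits.
        proc = subprocess.Popen(cmd, pass_fds=(lock.fileno(),))
        try:
            status = proc.wait(timeout=timeout_s)
            log("exited with status %d" % status)
            return "exited"
        except subprocess.TimeoutExpired:
            log("no exit after %s s; sending SIGKILL" % timeout_s)
            proc.kill()
        try:
            proc.wait(timeout=5)
            return "killed"
        except subprocess.TimeoutExpired:
            # SIGKILL is only delivered once the process leaves state 'D'.
            log("still present after SIGKILL (probably state 'D')")
            return "stuck"
    finally:
        lock.close()
```

A cron job could then call something like run_once(["spcacat"], "/tmp/spcacat.lock", 60), with the real spcacat arguments filled in. This does not fix the hang itself, but it should keep hung instances from piling up, and the log shows when the hangs occur.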
