Program crashes - debugging suggestions?

Hi, I was hoping to get some suggestions on debugging a multi-threaded Linux
program that crashes about every 10-12 hours.  The program coordinates the
behaviour between several (about 4) attached devices (serial and Ethernet).
There is generally one thread for each attached device. Unfortunately, when
it crashes the threads stop responding one by one, with no seg fault or other
obvious error, which makes it very hard to pin down.  What I suspect is
happening is that one thread is gradually overwriting memory, and it crashes
as soon as the memory being overwritten is in use by another thread.  It
currently has a 4k guard between threads.

Does anyone have any suggestions for how to figure out which code is the
source of the problem?  I've inspected the most likely areas but haven't
been successful in fixing it.  Any techniques using gdb/ddd, or other tools?
If it generated a seg fault it would be easy...

Thanks in advance for any help, this is really driving me nuts!

Mark
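
One cheap experiment with the 4k guard mentioned above is to make it larger, so
that a stack overrun is more likely to land in the guard area and raise the seg
fault that is currently missing. This is only a minimal sketch, assuming the
threads are created with pthread_create and that the library allocates the
stacks itself; start_thread_with_guard and device_thread are hypothetical names,
not taken from the original program.

```c
#include <pthread.h>
#include <stddef.h>

/* Placeholder thread body (hypothetical); the real per-device loop goes here. */
static void *device_thread(void *arg)
{
    return arg;
}

/* Create a thread whose stack gets a guard area of at least guard_bytes.
 * Returns 0 on success or a pthread error number. */
int start_thread_with_guard(pthread_t *tid, size_t guard_bytes,
                            void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    /* The implementation rounds the guard size up to a multiple of the page size. */
    rc = pthread_attr_setguardsize(&attr, guard_bytes);
    if (rc == 0)
        rc = pthread_create(tid, &attr, fn, arg);

    pthread_attr_destroy(&attr);
    return rc;
}
```

A call such as start_thread_with_guard(&tid, 64 * 1024, device_thread, NULL)
would request a 64k guard (an arbitrary value). A guard only catches writes
that run off the end of a stack; heap corruption, or a write that jumps past the
guard, will still go unnoticed.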



Re: Program crashes - debugging suggestions?


Sounds like a deadlock or a race condition (e.g. losing condvar signals).  Try
pstack with unstripped binaries to see where the threads are stopped.  See
man pstack for more details.

Joe Seigh
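
For reference, a "lost condvar signal" happens when pthread_cond_signal is
called before the other thread has started waiting, so the wakeup is never
seen. The usual guard is to keep a predicate under the mutex and wait in a
loop. The following is only a sketch of that pattern; struct event, event_post
and event_wait are hypothetical names, and nothing here is known about how the
original program actually synchronises its threads.

```c
#include <pthread.h>
#include <stdbool.h>

struct event {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            ready;  /* predicate: stays set even if nobody is waiting yet */
};

#define EVENT_INITIALIZER \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false }

/* Record the event under the lock, then wake one waiter (if any). */
void event_post(struct event *ev)
{
    pthread_mutex_lock(&ev->lock);
    ev->ready = true;
    pthread_cond_signal(&ev->cond);
    pthread_mutex_unlock(&ev->lock);
}

/* Block until the event has been posted, then consume it. */
void event_wait(struct event *ev)
{
    pthread_mutex_lock(&ev->lock);
    while (!ev->ready)                      /* loop also covers spurious wakeups */
        pthread_cond_wait(&ev->cond, &ev->lock);
    ev->ready = false;
    pthread_mutex_unlock(&ev->lock);
}
```

Because the flag is checked before waiting, a post that arrives first is not
lost: the later event_wait sees ready set and returns without blocking.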

Re: Program crashes - debugging suggestions?
jseigh snipped-for-privacy@xemaps.com wrote...

Since he's running Linux, not Solaris:

http://sourceforge.net/projects/lsstack/

Re: Program crashes - debugging suggestions?
snipped-for-privacy@domain.invalid wrote...

Oops.  I shouldn't have spoken so soon.

Or this one is clever--wraps around GDB:

http://oss.oracle.com/projects/pstack-gdb/

Re: Program crashes - debugging suggestions?

Have you tried valgrind?

From its homepage at http://valgrind.kde.org/
Valgrind is a GPL'd system for debugging and profiling x86-Linux
programs. With the tools that come with Valgrind, you can automatically
detect many memory management and threading bugs, avoiding hours of
frustrating bug-hunting, making your programs more stable. You can also
perform detailed profiling, to speed up and reduce memory use of your
programs.

The Valgrind distribution includes five tools: two memory error
detectors, a thread error detector, a cache profiler and a heap
profiler. Several other tools have been built with Valgrind.


HTH
boa@home

Re: Program crashes - debugging suggestions?

I haven't heard of either of these tools before, I will try them out and
hopefully get closer to solving it.

Thanks for the tips!

Mark



Re: Program crashes - debugging suggestions?

Which version and flavour of Linux are you using?

Re: Program crashes - debugging suggestions?
mark snipped-for-privacy@excite.com wrote...

Why don't you run only one thread at a time, and use Valgrind to
examine each thread in turn?

You'll probably never figure it out if all the threads are running.
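
One way to run a single thread at a time without rebuilding for each test is to
gate thread start-up on an environment variable. This is a minimal sketch under
that assumption; ONLY_THREAD, thread_enabled and the thread names are all
hypothetical, and it only helps if each device thread can run usefully without
the others.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Returns true if the device thread called `name` should be started.
 * ONLY_THREAD is a hypothetical environment variable: when it is unset or
 * empty, every thread runs; otherwise only the thread it names runs. */
bool thread_enabled(const char *name)
{
    const char *only = getenv("ONLY_THREAD");

    if (only == NULL || only[0] == '\0')
        return true;
    return strcmp(only, name) == 0;
}
```

The start-up code would then check thread_enabled("serial0") (or whatever name
is given to each device thread) before calling pthread_create, and the program
could be launched under Valgrind with ONLY_THREAD set to one name per run.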

