Program crashes - debugging suggestions?

Hi, I was hoping to get some suggestions on debugging a multi-threaded Linux program that crashes about every 10-12 hours. The program coordinates the behaviour of several (about 4) attached devices (serial and Ethernet). There is generally one thread for each attached device. Unfortunately, when it crashes the threads stop responding one by one, and no seg fault or other obvious error occurs, which makes it very hard to pin down. What I suspect is happening is that one thread is gradually overwriting memory, and the program crashes as soon as the memory being overwritten is in use by another thread. It currently has a 4k guard between threads.

Does anyone have any suggestions for how to figure out which code is the source of the problem? I've inspected the most likely areas but haven't been successful in fixing it. Are there any techniques using gdb/ddd, or other tools? If it generated a seg fault it would be easy...

Thanks in advance for any help, this is really driving me nuts!

Mark

Reply to
Mark

Sounds like a deadlock or a race condition (e.g. lost condvar signals). Try pstack with unstripped binaries to see where the threads are stopped. See man pstack for more details.
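To illustrate the lost-signal case, here is a minimal C sketch. It assumes a one-slot hand-off between two threads; the `mailbox` struct and its function names are hypothetical and not taken from the program in question. The waiter re-checks a flag under the mutex in a loop, so a signal sent before the wait begins is not lost.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical one-slot hand-off between two threads (illustrative names). */
struct mailbox {
    pthread_mutex_t lock;
    pthread_cond_t  ready;
    bool            has_data;  /* predicate, only touched with lock held */
    int             value;
};

void mailbox_init(struct mailbox *m)
{
    int rc = pthread_mutex_init(&m->lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&m->ready, NULL);
    assert(rc == 0);
    (void)rc;                  /* silence unused warning if NDEBUG is set */
    m->has_data = false;
    m->value = 0;
}

/* Producer: update the predicate under the lock, then signal. */
void mailbox_put(struct mailbox *m, int v)
{
    pthread_mutex_lock(&m->lock);
    m->value = v;
    m->has_data = true;
    pthread_cond_signal(&m->ready);
    pthread_mutex_unlock(&m->lock);
}

/* Consumer: wait on the predicate in a loop, never on the signal alone.
 * If the producer signalled before we got here, has_data is already true
 * and we skip the wait instead of blocking forever. The loop also copes
 * with spurious wakeups. */
int mailbox_get(struct mailbox *m)
{
    pthread_mutex_lock(&m->lock);
    while (!m->has_data)
        pthread_cond_wait(&m->ready, &m->lock);
    m->has_data = false;
    int v = m->value;
    pthread_mutex_unlock(&m->lock);
    return v;
}
```

If the code instead calls pthread_cond_wait without checking a flag first, a signal that arrives early is simply dropped and the waiting thread hangs, which would look a lot like threads that stop responding one by one. Note that this sketch overwrites an unread value if put is called twice in a row; a real queue would need more state.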

Joe Seigh

Reply to
Joe Seigh

Have you tried valgrind?

From its homepage:
Valgrind is a GPL'd system for debugging and profiling x86-Linux programs. With the tools that come with Valgrind, you can automatically detect many memory management and threading bugs, avoiding hours of frustrating bug-hunting, making your programs more stable. You can also perform detailed profiling, to speed up and reduce memory use of your programs.

The Valgrind distribution includes five tools: two memory error detectors, a thread error detector, a cache profiler and a heap profiler. Several other tools have been built with Valgrind.

HTH boa@home

Reply to
boa

I haven't heard of either of these tools before, I will try them out and hopefully get closer to solving it.

Thanks for the tips!

Mark

Reply to
Mark

Which version and flavour of Linux are you using?

Reply to
Vikram

Why don't you run only one thread at a time, and use Valgrind to examine each thread in turn?

You'll probably never figure it out if all the threads are running.

Reply to
Vendor Neutral

Since he's running Linux, not Solaris:

formatting link

Reply to
Vendor Neutral

Oops. I shouldn't have spoken so soon.

Or this one is clever--wraps around GDB:

formatting link

Reply to
Vendor Neutral
