I am an academic researcher who has lately been interested in testing, debugging, and static analysis of concurrent software. Two of the directions I am pursuing are: (1) developing static techniques to prove that a given piece of software is free of, say, deadlocks and data races, and (2) developing better testing tools for concurrent code, e.g., understanding what the right coverage criteria should be. It is easy to lose sight of the forest for the trees when defining research problems, so I thought I would talk to some people who are really on the front lines of concurrent code and ask what real problems you would like to see solved.
The first question is rather specific: can you point me to a piece of software that is: (a) written by fairly competent people, i.e., not just a case of bad programming; (b) highly concurrent and uses complicated concurrency constructs (this is the most important requirement); (c) buggy, or makes non-standard assumptions about the platform (of course, some bugs may be unknown); (d) not very large, as I would like to start by trying to prove properties of the code by hand; and (e) ideally, though not necessarily, has recursive control flow in some parts.
I have looked at some embedded software and some web servers, but have not yet found anything that comes even close to fitting these criteria. Any thoughts?
The second question is: what features would you like to see in a testing tool for concurrent software? My understanding is that concurrency testing in real-world settings is essentially stress testing, and that there are no good notions of coverage. Would you agree with that, and do you think it needs to change? Regarding bugs or security flaws you have actually found, which categories would you say are the hardest to find or most in need of tool support? Timing errors? Data races? Deadlocks? Is there a repository of concurrency error reports somewhere that I could study to understand the issues?
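(So that we mean the same thing by these categories: by "data race" I have in mind the classic lost-update pattern. Here is a minimal, purely illustrative Python sketch; the names and counts are my own, not taken from any real system. Two threads increment a shared counter; the unsynchronized version's read-modify-write can interleave and lose updates, while the locked version is deterministic.)

```python
import threading

counter = 0
lock = threading.Lock()

def unsafe_increment(n):
    # Racy: "counter += 1" is a read-modify-write, not atomic,
    # so concurrent increments can lose updates.
    global counter
    for _ in range(n):
        counter += 1

def safe_increment(n):
    # The lock serializes the read-modify-write, eliminating the race.
    global counter
    for _ in range(n):
        with lock:
            counter += 1

def run(worker, n=100_000):
    # Run two threads of `worker` and return the final counter value.
    global counter
    counter = 0
    threads = [threading.Thread(target=worker, args=(n,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

# With the lock, the result is deterministic: 2 * n.
print(run(safe_increment))  # 200000
```

Deadlocks, by contrast, are what I would call lock-ordering errors (thread A holds L1 and waits for L2 while thread B holds L2 and waits for L1), and by timing errors I mean bugs that only manifest under particular schedules or delays.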
Best regards, Swar