Good point. We have done it with a 16-bit compiler, plus some ASM, plus remote debugging via monitors. The n processors run the same code (one task) and communicate with each other via common memory and dedicated bus signals. Debugging was possible because we had the source code of the monitor: we had to change it in order to start and stop the n programs 'simultaneously', etc. It was a fun experience. And it worked. By the way, we developed the hardware ourselves.
Of course, we followed a strict process (CENELEC EN 50126 / 50128 / 50129) which included inspections (of documents and code), rigorous C coding procedures ('safe' subset without dynamic memory usage, very restricted use of pointers, etc.), unit testing with 100% coverage (this was hard), incremental integration, strict validation procedures, etc.
We have all of that. What we don't have right now is a tool-chain to do the same job for 32-bit platforms. We know our 16-bit tools pretty well (we've been working with them for 15 years, and all their peculiarities, i.e., bugs, are known and well documented; that's what I call 'proven in use'). I wish we could get a reliable tool-chain for 32-bit, just to reduce the cost of debugging the tools: our process will (hopefully) detect those bugs, but the fewer the bugs, the lower the cost.
-- Ignacio G.T.