For me to answer your question, let me tell you a little bit about myself. In the fall of 1987 I entered college at RIT, where I was exposed to a lot of new computer hardware. Growing up, I had been exposed to computers designed for data processing. I bought a Commodore Amiga to do my school work on, and it turned out to be an excellent choice because it allowed me to work with files from IBM PC and Apple Macintosh environments. Remember, at this time IBMs were still primarily CGA (4 colors: cyan, white, magenta and black) and Macintoshes were black and white. The Commodore Amiga had a quasi-12-bit color mode called HAM. For recreation, one of the first kinds of freeware applications I discovered was raytracers. The Commodore Amiga was a 16/32-bit MC68000 at about 14 MHz (IBMs were 16, 20 and 25 MHz, and the Mac was 8 MHz).
In some of my free time between classes I spent time at the library researching different ways to accelerate raytracing. The first and most obvious way was to buy an accelerator or co-processor card with a faster processor and a floating point co-processor. I think it was in Byte magazine that I saw an article on transputers, and I had read articles on transputer products being developed for the Amiga. I saved my money while I waited for the products to be completed, but eventually the projects were canceled. Late one winter, with the money saved, I bought a CSA Education Kit. I could compile and run transputer applications on an IBM bridge card and then copy them to the Amiga file system and view them from the Workbench desktop.
I also made a habit of visiting Rochester's surplus shops, and through dumb luck I found a factory tray of eight T800s. The guy who ran the shop didn't know what they were but, seeing that they were gold, told me he would have to charge me a premium for them: $10. Using a Vector prototyping board, I connected the eight processors to the CSA card. I just wired them up so that they could properly reset. I didn't have the money to buy any memory, so I just used the on-chip RAM.
I could implement a very small raytracer, and when I outgrew the memory of one processor I would pair them up. Eventually I had a tightly coupled processor made up of 8 transputers arranged in a cube topology.
I think it was about a year later, at a ham radio flea market, that I found my next upgrade. This guy and his son had brought a real truckload of junk. I remember him having bar code scanners, data entry pads, and parts of an old telephone system. One of the things I found was a black PC expansion case. The front was ripped off; on the back I could see rows of 37-pin connectors, and through the vents I could see the tops of gold chips. I asked him how much it was. He told me it was marked, and came over and found the price for me. He charged me $20 for it. The friend I was with asked me what I had bought, and I told him, "I'm not sure, but I'll show you." We took it back to the car, where I removed the top. Inside were 5 CSA 4-transputer boards, a crossbar board, and an INMOS B008 with the graphics TRAM; whoever had owned it had tucked the cable for the graphics TRAM inside.
My transputer setup moved from the Amiga to a dedicated Everex Step 386/33 MHz. My raytracer evolved into a hypercube, and I was able to let the main rendering routine recurse more or to add more features. As time went on, the topology evolved into a sophisticated pipeline. A few years after graduating from college I started buying transputer hardware through eBay. My system is now split between an industrial PC, the old black PC expansion case and a VME cabinet. The last time I spent any time doing anything with it, I was having problems with the worm program that maps the network. I could not determine whether the network had gotten so big that it was timing out before it had finished discovering the network, or whether there was a hardware failure. I do follow the other newsgroup (comp.sys.transputer). I haven't compared it to a modern PC; I currently have a PIII 500 MHz laptop and a dual 733 MHz desktop.
But it would require a rewrite to take advantage of the PC threading architecture.
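As an illustration of the cube topology and the network-mapping worm mentioned above, here is a minimal sketch. It is written in Python purely for readability and is not the original transputer code or the actual worm program; the function names are hypothetical. It assumes the usual hypercube numbering, where two nodes are linked when their numbers differ in exactly one bit, so an 8-node cube uses three of each transputer's four links.

```python
from collections import deque


def hypercube_neighbors(node, dimensions):
    """Return the nodes directly linked to `node` in a hypercube.

    Nodes are numbered 0 .. 2**dimensions - 1; two nodes are linked
    when their numbers differ in exactly one bit.
    """
    if not 0 <= node < 2 ** dimensions:
        raise ValueError("node number out of range")
    # Flipping bit d gives the neighbor along dimension d.
    return [node ^ (1 << d) for d in range(dimensions)]


def map_network(root, dimensions):
    """Breadth-first walk from `root`, returning {node: hops from root}.

    This only mimics the idea of a worm exploring link by link; a real
    worm would probe physical links and handle timeouts.
    """
    hops = {root: 0}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for neighbor in hypercube_neighbors(current, dimensions):
            if neighbor not in hops:
                hops[neighbor] = hops[current] + 1
                queue.append(neighbor)
    return hops


if __name__ == "__main__":
    # List the three links of each node in an 8-transputer cube.
    for n in range(8):
        print(n, hypercube_neighbors(n, 3))
```

In this numbering the farthest node is never more than `dimensions` hops from the root, which is one reason a hypercube keeps discovery time short as the node count grows.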
I bought the NIOS II Development kit because I liked the development tools and I can see the potential for doing the same kind of things that I have done with transputers. I bought the kit and a Lancelot video adaptor. I plan on developing a 3D graphics core for it with an API similar to OpenGL, with the intention of making it into a commercial product. With the Stratix II development board, I see the SDRAM as the biggest bottleneck. I have sketched out an elaborate buffering system that should alleviate this. I would also like to be able to configure the resolution and color depth from software. When I roll it out as a core, the wizard would give the engineer the option of either letting the settings be programmable, with default values, or hard-coding them.
I have been poking around for the past couple of days and have found a couple of posts about engineers implementing multiprocessor systems. I would say half of them sounded like student projects. If anybody has implemented multiprocessor systems, I would like to hear about their experiences and any afterthoughts. Since a lot of this is still new to me and I'm still at the steep part of the learning curve, I would appreciate it if anybody has any projects that they can share with me.
Derek