I have been updating a CPU design I did a few years ago, and I am a bit disappointed in the results. The CPU was originally targeted to an Altera ACEX part, which is 5 volt compatible (to give you an idea of its age). I did my own CPU because Altera does not support their NIOS on that family. I spent a fair amount of time optimizing the architecture so it would be easy to implement in 4-input LUTs and the other basic elements found in FPGAs. I coded it up for the ACEX async memories and got it running. If memory serves, it clocked in at 55 MHz max and I used it at 40 MHz.
Recently I wanted to see how fast it might run if I redid it for a current FPGA architecture using synchronous memories. I compiled it for a Spartan 3 and got the speed up to 77 MHz using less than 10% of an XC3S400 (315 slices). I am not impressed with the speed. I expected a much larger increase and had hoped for operation at over 100 MHz. I checked the timing analyzer output and the signal paths are pretty much what I expected: no oddball logic generation, and I got carry chains where I wanted them. The slow paths have a few long route times, so although it might approach 100 MHz with careful floorplanning, I don't think that is worth the effort compared to the >> 100 MHz CPU cores you can get from the FPGA vendors.
Is this small a speedup typical of the improvement from a one or two generation difference in FPGAs? The ACEX parts are designed for economy, not for speed, just like the Spartans. When I did the initial design 3 or 4 years ago, the ACEX parts were already old news! Given that nothing in the design is tailored to one FPGA family over another, I guess I expected more like a 2X speedup in the current technology chip. Isn't that reasonable given the vast difference in the timing specs in the data sheets?
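To put numbers on the gap, here is a quick back-of-the-envelope sketch in Python. It uses only the figures quoted above (55 MHz on the ACEX, 77 MHz on the Spartan 3, the 100 MHz I had hoped for); the function names are just ones I made up for illustration, and the 2X figure is my expectation, not anything from a data sheet.

```python
# Rough clock-period arithmetic for the figures quoted in this post.
# Function names are hypothetical; the MHz values are the ones reported above.

def period_ns(f_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / f_mhz

def speedup(old_mhz, new_mhz):
    """Ratio of the new maximum clock rate to the old one."""
    return new_mhz / old_mhz

ACEX_MHZ = 55.0       # max clock reported for the ACEX build
SPARTAN3_MHZ = 77.0   # max clock reported for the Spartan 3 build
TARGET_MHZ = 100.0    # the speed I had hoped for
EXPECTED_MHZ = 2 * ACEX_MHZ  # assumption: the "2X" I guessed at

# Time that would have to come out of the critical path to reach the target.
shortfall_ns = period_ns(SPARTAN3_MHZ) - period_ns(TARGET_MHZ)

print(f"achieved speedup: {speedup(ACEX_MHZ, SPARTAN3_MHZ):.2f}x")
print(f"critical path now: {period_ns(SPARTAN3_MHZ):.2f} ns")
print(f"needed for target: {period_ns(TARGET_MHZ):.2f} ns")
print(f"to trim from path: {shortfall_ns:.2f} ns")
print(f"2X would be:       {EXPECTED_MHZ:.0f} MHz")
```

So the rebuild is a 1.4X speedup, and reaching 100 MHz would mean trimming roughly 3 ns from a critical path of about 13 ns, which is the part I doubt floorplanning alone would recover.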