Hi,
I am currently designing (as an amateur project) a 32-bit stack-oriented CPU with two stack pointers (data stack and return stack) and some additional registers. Some of these registers are purely auxiliary, others are dedicated to the CPU's intended purpose as a specialized Lisp processor. The control is microcoded, and the greater part of the microcode is already written and successfully tested (in simulation with Icarus). Still missing at the moment are parts of the ALU functions and the complete interrupt/exception logic. Nevertheless the design (done in Verilog), when synthesized, already occupies about 1100 slices in a Spartan 3 FPGA, which I feel is a bit heavy for what seems to me a very simple design.
Below is the output of the Xilinx ISE WebPACK synthesis tool:
Logic Utilization                                 Used   Available   Utilization   Note(s)
Number of Slice Flip Flops                         621       3,840           16%
Number of 4 input LUTs                           2,561       3,840           66%

Logic Distribution
Number of occupied Slices                        1,517       1,920           79%
Number of Slices containing only related logic   1,517       1,517          100%
Number of Slices containing unrelated logic          0       1,517            0%
Total Number 4 input LUTs                        2,751       3,840           71%
(About 400 to 500 slices can be subtracted from the above figures, as they belong to accompanying structures such as the VGA driver and the like.)
What catches my eye is how low the utilization of slice flip-flops is compared to the utilization of slices. Could this reflect the fact that the design contains a lot of combinational logic (adders, multiplexers) and, relative to that, few registers/state elements? Are adders in particular, which I used quite generously to speed up the instructions, a major consumer of slices? Or are multiplexers with many alternative inputs the more likely culprits?
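To put rough numbers on that imbalance, here is a small Python sketch that works only from the figures in the report above. It assumes the usual Spartan 3 slice layout of two 4-input LUTs and two flip-flops per slice, which matches the report (3,840 LUTs for 1,920 slices). The function and variable names are made up for this illustration, and the figures cover the whole design, including the VGA driver and similar parts.

```python
# Figures copied from the ISE report above (whole design, VGA driver included).
SLICE_FFS_USED = 621       # "Number of Slice Flip Flops"
LOGIC_LUTS_USED = 2561     # "Number of 4 input LUTs"
TOTAL_LUTS_USED = 2751     # "Total Number 4 input LUTs"
SLICES_OCCUPIED = 1517     # "Number of occupied Slices"

# Assumption: a Spartan 3 slice holds two 4-input LUTs and two flip-flops.
LUTS_PER_SLICE = 2
FFS_PER_SLICE = 2


def utilization_summary():
    """Return the LUT-to-FF ratio and the fill level of the occupied slices."""
    luts_per_ff = LOGIC_LUTS_USED / SLICE_FFS_USED
    # Share of LUT positions in the occupied slices that are actually used.
    lut_fill = TOTAL_LUTS_USED / (SLICES_OCCUPIED * LUTS_PER_SLICE)
    # Share of flip-flop positions in the occupied slices that are actually used.
    ff_fill = SLICE_FFS_USED / (SLICES_OCCUPIED * FFS_PER_SLICE)
    return luts_per_ff, lut_fill, ff_fill


if __name__ == "__main__":
    luts_per_ff, lut_fill, ff_fill = utilization_summary()
    print(f"LUTs per flip-flop:             {luts_per_ff:.2f}")
    print(f"LUT fill of occupied slices:    {lut_fill:.0%}")
    print(f"FF fill of occupied slices:     {ff_fill:.0%}")
```

With the figures above this comes out at roughly four LUTs per flip-flop: the LUT positions in the occupied slices are about 91% used, the flip-flop positions only about 20%. So the slice count seems to be driven almost entirely by LUTs rather than by registers.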
I would be very happy if someone with more experience than me (I am just a hobbyist) could look at the Verilog source of the CPU and give me some hints on how to lower the amount of resources the design needs.
Greetings,
Jürgen