I am speeding up a data processing design in which many simple steps are performed, causing a lot of overhead. To increase the clock speed, I tried, for example, inserting some FFs into the critical paths, but ran into fitting problems with the multipliers.
My solution was to parallelize some large (40x40) multiplications and use multicycle constraints (two clocks) to make them meet timing. Quartus reports that this is fine. With the constraints in place, I obtain speeds above 150 MHz.
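For reference, a two-cycle multicycle constraint of this kind would look roughly like the SDC sketch below. The register name patterns (`mult_op_*`, `mult_result*`) are hypothetical placeholders, not the names from my design, and would need to be adapted to the real hierarchy.

```tcl
# Sketch only: register name patterns are hypothetical placeholders.
# Give the multiplier paths two clock cycles for the setup check.
set_multicycle_path -setup -end 2 \
    -from [get_registers {mult_op_*}] \
    -to   [get_registers {mult_result*}]

# Relax the hold check by one cycle so it stays at the launch edge.
set_multicycle_path -hold -end 1 \
    -from [get_registers {mult_op_*}] \
    -to   [get_registers {mult_result*}]
```

These constraints only tell the timing analyzer that the path may take two clocks. They do not change the RTL, so the logic around the multiplier still has to wait two cycles before using the result.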
Problem: I cannot check this in ModelSim, because the results of the multiplications show up immediately after one clock, which is not the case in reality.