Optimizing a State Machine

I designed a SM which can run at 30 MHz below my target frequency (that's what XST shows after synthesis, no PAR).

It is a pretty big SM (a lot of states, inputs, outputs and internal signals) and I have to optimize it somehow. My question is what influences the timing, so I can concentrate on it. I have a Mealy type SM with 2 processes:

1st - synchronous to CLK. The internal signals and outputs are assigned here.
2nd - sensitive to all inputs, internal signals (from the 1st process) and the current state.

I am using one-hot encoding for the SM. Ideas? ;)
Have you ever looked at implementing the state machine in a ROM = BlockRAM? You can have hundreds of states and still run at several hundred MHz.

Pick a ROM size, feed part of the outputs directly back as inputs (remember, the BlockRAM is synchronous!) and add the condition inputs to the addresses.

For example: 1K x 16 with 7 outputs fed back gives you 128 states with 3 additional inputs that can encode the jump condition at each state. You can conditionally (8-way) jump from anywhere to anywhere, without any restrictions. And you still have 9 freely assignable outputs for each state. Runs at 200 MHz+.

Peter Alfke, Xilinx Applications

Peter Alfke
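
The ROM approach described above could be sketched in VHDL roughly as follows. This is only a minimal sketch under assumptions: the entity name rom_fsm, the port names cond and outs, and the all-zero placeholder ROM contents are hypothetical, and the bit ordering (condition inputs in the upper address bits, next state in the lower data bits) is one arbitrary choice. Whether a given synthesis tool actually maps the array to a BlockRAM depends on the tool and its settings.

```vhdl
-- Sketch of a ROM-based state machine (all names are hypothetical).
-- 1K x 16 ROM: address = 3 condition inputs & 7-bit current state,
--              data    = 9 outputs & 7-bit next state.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity rom_fsm is
  port (
    clk  : in  std_logic;
    cond : in  std_logic_vector(2 downto 0);  -- jump-condition inputs
    outs : out std_logic_vector(8 downto 0)   -- freely assignable outputs
  );
end entity rom_fsm;

architecture rtl of rom_fsm is
  type rom_t is array (0 to 1023) of std_logic_vector(15 downto 0);

  -- Placeholder contents: every entry jumps to state 0 with all outputs low.
  -- A real design would fill this table from the state diagram.
  constant ROM : rom_t := (others => (others => '0'));

  signal data : std_logic_vector(15 downto 0) := (others => '0');
  signal addr : std_logic_vector(9 downto 0);
begin
  -- The 7 next-state bits are fed straight back as the low address bits;
  -- the 3 condition inputs select one of 8 jump targets per state.
  addr <= cond & data(6 downto 0);

  -- Synchronous read: the registered ROM output doubles as the state register.
  process (clk)
  begin
    if rising_edge(clk) then
      data <= ROM(to_integer(unsigned(addr)));
    end if;
  end process;

  -- The remaining 9 data bits are the per-state outputs.
  outs <= data(15 downto 7);
end architecture rtl;
```

In this arrangement the outputs are registered and depend only on the ROM word just read, and the condition inputs have to meet setup time at the ROM address, so they would normally be synchronized to clk first.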

Hi, besides the hints of Mr. Alfke there's something more you can do: check your synthesis report for the number of inferred adders and/or counters. If you find high numbers of these elements, your FSM description probably contains assignments like:

    when thisstate =>
      if something then someoutput
      if something then counterenable

I have seen no evidence of this using either Quartus or Mentor Synthesis. A clock enable is properly inferred.

-- Mike Treseler

Mike Treseler

I've had this happen to me using Altera's MaxPlusII Verilog (I remember because I got yelled at). This was a while back, and the Quartus version may have improved since then. I think that a separate counter with an explicit counter enable makes more intuitive sense to me, with the possibility of better results across different synthesis tools. Just my two cents :)
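
A separate counter with an explicit enable, as suggested above, might look something like the following VHDL sketch. The entity name, the two states, the 8-bit counter width, and the start/busy ports are all hypothetical choices made for illustration; the point is only that the state machine drives enable and clear signals while a single counter process holds the one adder.

```vhdl
-- Sketch: state machine driving a separate counter (all names hypothetical).
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity fsm_with_counter is
  port (
    clk   : in  std_logic;
    rst   : in  std_logic;   -- synchronous reset
    start : in  std_logic;
    busy  : out std_logic
  );
end entity fsm_with_counter;

architecture rtl of fsm_with_counter is
  type state_t is (IDLE, COUNTING);
  signal state     : state_t := IDLE;
  signal count     : unsigned(7 downto 0) := (others => '0');
  signal count_en  : std_logic;
  signal count_clr : std_logic;
begin
  -- The state machine only decodes control signals for the counter.
  count_en  <= '1' when state = COUNTING else '0';
  count_clr <= '1' when state = IDLE else '0';

  fsm : process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        state <= IDLE;
      else
        case state is
          when IDLE =>
            if start = '1' then
              state <= COUNTING;
            end if;
          when COUNTING =>
            if count = 255 then
              state <= IDLE;
            end if;
        end case;
      end if;
    end if;
  end process fsm;

  -- One counter with explicit clear and enable, outside the state machine.
  counter : process (clk)
  begin
    if rising_edge(clk) then
      if count_clr = '1' then
        count <= (others => '0');
      elsif count_en = '1' then
        count <= count + 1;
      end if;
    end if;
  end process counter;

  busy <= '1' when state = COUNTING else '0';
end architecture rtl;
```

Written this way, the increment appears exactly once in the source, so the synthesis report should list a single counter regardless of how many states use it.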


- As for the Block RAM: the SM cannot fit in it, as the device I am using does not have enough of it. Synthesis report from XST: (not enough BRAM in this device). But even then the SM is not fast enough.

- I have only 2 inferred Adders/Subtractors. I will see how I can change them.

- Another thing: I have the following concurrent statement


Can you send me an e-mail, so we can discuss this privately? Peter Alfke ( snipped-for-privacy@xilinx.com )

Peter Alfke

Hi Newman,

You bet it has. Results are very close to Precision and Synplify. It now also has just as good a language coverage as the other two.

Best regards,


Ben Twijnstra

