You first have to decide how many different states you need, how many outputs you need, and how many control inputs you need to cause conditional jumps. Here is a very simple design that folds everything into one BlockRAM. The BlockRAM (really a ROM) is configured as 1K addresses by 18 output bits. Connect 6 of the outputs back to 6 of the address inputs. They perform the counting (stepping) function for 64 states. That leaves you the other 4 address inputs as control inputs, and gives you 12 freely assignable outputs. So you have a 64-state state machine with four control inputs and 12 outputs beyond the 6 counter outputs. And it runs at BlockRAM speed, up to 500 MHz.
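To make the address and data split concrete, here is a rough Python sketch that generates the 1K x 18 ROM contents and steps through them in software. It is only a model under stated assumptions: the bit placement (state in the low 6 address and data bits), the function names, and the example transition rules are all made up for illustration, and a real design would load the resulting table into the BlockRAM initialization values.

```python
STATE_BITS = 6     # data outputs fed back to address inputs (64 states)
INPUT_BITS = 4     # remaining address lines, used as control inputs
OUTPUT_BITS = 12   # freely assignable data outputs
DEPTH = 1 << (STATE_BITS + INPUT_BITS)   # 1024 addresses
STATE_MASK = (1 << STATE_BITS) - 1
OUTPUT_MASK = (1 << OUTPUT_BITS) - 1


def build_rom(next_state, outputs):
    """Return a list of 1024 words, each 18 bits: outputs above next state."""
    rom = []
    for addr in range(DEPTH):
        state = addr & STATE_MASK        # assumed: state on low address bits
        inputs = addr >> STATE_BITS      # assumed: controls on high address bits
        ns = next_state(state, inputs) & STATE_MASK
        out = outputs(state, inputs) & OUTPUT_MASK
        rom.append((out << STATE_BITS) | ns)
    return rom


def step(rom, state, inputs):
    """Model one clock: read one word, split it into (next_state, outputs)."""
    word = rom[(inputs << STATE_BITS) | state]
    return word & STATE_MASK, word >> STATE_BITS


# Hypothetical example rules, not from the original post:
def example_next_state(state, inputs):
    if state == 0 and not (inputs & 1):
        return 0                         # wait in state 0 until control bit 0 is set
    if state == 5 and (inputs & 2):
        return 0                         # conditional jump back to state 0
    return (state + 1) & STATE_MASK      # otherwise just count to the next state


def example_outputs(state, inputs):
    return 1 << (state % OUTPUT_BITS)    # arbitrary decode onto the 12 outputs


rom = build_rom(example_next_state, example_outputs)
```

Calling `step(rom, state, inputs)` repeatedly, feeding the returned state back in, mimics what the feedback wiring does on every clock edge. Trading control inputs for states only means changing `STATE_BITS` and `INPUT_BITS` so that they still add up to the 10 address bits.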
Many variations are possible, like more states with fewer control inputs, or vice versa.
Peter Alfke, Xilinx Applications