Hello Group,
I'm having difficulty convincing Xilinx ISE (8.1 SP1 WebPack) to pack some of my output registers into the IOBs of an XC4VLX25-FF668. A simple demonstration case is the MIG 1.4 dimm72 VHDL design for the ML461 memory eval board. FPGA Editor shows that most of the DDR command bus outputs correctly have their registers pushed into the IOBs; however, the DDR_CS and DDR_ODT outputs get registered in the fabric instead. The resulting skew between DDR_CS and the rest of the command bus outputs becomes fatal when generating cores for DDR systems with multiple ranks (chip-selects). The skew of 2 ns or so is clearly visible in timing simulations, and my DDR2 model throws a tantrum about a setup time violation on DDR_CS.
I have tried seducing ISE into pushing the DDR_CS registers into the IOBs by setting the XST option "Pack I/O Registers into IOBs" to both "Auto" and "Yes". I have tried XST register balancing set to "No", and set to "Yes" with Move First/Last Flip-Flop Stage disabled. I am also enabling the MAP option "Pack I/O Registers/Latches into IOBs". I am new to Xilinx FPGAs. With Altera I would enable Fast Inputs/Outputs in Quartus, and all the registered I/O would get pushed to the pads. Am I missing something obvious?
One thing I did notice: DDR_CS (and DDR_ODT) differ from the rest of the command bus in that their output registers are fed directly by a combinatorial term, whereas the rest of the command outputs are just re-latches of signals registered on previous clocks. Should this make a difference? It seems to me that a combinatorial term could be routed from a slice to an I/O register just as easily as a registered signal could...
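To make the difference concrete, here is a minimal sketch of the two cases as I understand them. This is not the MIG source: the entity, port, and signal names are made up for illustration, and the CS term is a stand-in for the real logic. The IOB attribute is the per-register packing request that XST accepts; I have not verified that adding it changes where DDR_CS ends up.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical reduction of the command-bus output stage.
-- All names here are illustrative; none are taken from the MIG design.
entity cmd_out_sketch is
  port (
    clk       : in  std_logic;
    ras_n_d   : in  std_logic;  -- already registered on an earlier clock
    rank_sel  : in  std_logic;  -- inputs to the combinatorial CS term
    cmd_valid : in  std_logic;
    ddr_ras_n : out std_logic;
    ddr_cs_n  : out std_logic
  );
end entity cmd_out_sketch;

architecture rtl of cmd_out_sketch is
  signal ras_n_q : std_logic := '1';
  signal cs_n_q  : std_logic := '1';

  -- Per-register request to place the flip-flop in the IOB.
  -- Assumption: this is untested against the actual MIG netlist.
  attribute iob : string;
  attribute iob of ras_n_q : signal is "TRUE";
  attribute iob of cs_n_q  : signal is "TRUE";
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Case 1: plain re-latch of an already registered signal
      ras_n_q <= ras_n_d;
      -- Case 2: output register fed directly by a combinatorial term
      cs_n_q  <= not (rank_sel and cmd_valid);
    end if;
  end process;

  -- Each register drives only its pad, with no feedback into the fabric
  ddr_ras_n <= ras_n_q;
  ddr_cs_n  <= cs_n_q;
end architecture rtl;
```

In this reduced form both flip-flops drive nothing but their pads, so I would expect both to be equally eligible for the IOB; the only structural difference is what feeds the D input.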
Any thoughts or meditations are appreciated,
-Peter