Xilinx FPGA, OFFSET OUT AFTER


uvbaz 19 years ago

Hi,

I want to operate the Xilinx Virtex-4 XC4VLX60 (speed grade -12) at a frequency of 130 MHz, so I wrote the following constraints in my UCF (user constraints file).

NET "i_clk_adc" PERIOD = 6 ns HIGH 50 %; # clock period, 6 ns (about 167 MHz)
OFFSET = IN 3 ns BEFORE "i_clk_adc" HIGH; # Input signal must be ready 3 ns before the rising edge
OFFSET = OUT 5 ns AFTER "i_clk_adc" HIGH; # Output signal must be at the pad 5 ns after the rising edge
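The PERIOD constraint above is 6 ns, while the stated target is 130 MHz. As a quick sanity check of that arithmetic, here is a minimal Python sketch; the helper names `freq_mhz` and `period_ns` are made up for illustration and are not part of any Xilinx tool.

```python
def freq_mhz(period_ns_value: float) -> float:
    """Convert a clock period in ns to a frequency in MHz."""
    return 1000.0 / period_ns_value


def period_ns(freq_mhz_value: float) -> float:
    """Convert a clock frequency in MHz to a period in ns."""
    return 1000.0 / freq_mhz_value


# The 6 ns PERIOD constraint corresponds to roughly 166.7 MHz.
print(round(freq_mhz(6.0), 1))

# The 130 MHz target corresponds to a period of roughly 7.69 ns.
print(round(period_ns(130.0), 2))
```

So the 6 ns constraint is tighter than a 130 MHz clock strictly requires; whether that margin is intended is not stated in the post.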

However, the constraint OFFSET = OUT 5 ns AFTER "i_clk_adc" HIGH; cannot be met. The actual timing is about 7.8 ns.

The Xilinx specification says that this device can be operated at about 300 MHz. Why can't I achieve this timing?

This is the code I use to test the device:

output: PROCESS(rst_n, i_clk) BEGIN IF(rst_n = '0') THEN o_data


Gabor 19 years ago

There are a number of factors that can cause long clock to output delays.

1) Output I/O standard. If you use LVTTL or LVCMOS standards, the default mode is slew-rate limited. This can add nanoseconds to your output delay. You can disable the slew-rate limits by attaching the FAST attribute to the port in your .ucf file.

2) Placement of output register. If the output register is not placed in the IOB, you can have additional routing delays. Make sure you allow the mapper to push flip-flops into IOBs for outputs. In XST there is also an option for the synthesis stage to place flip-flops in the IOB.

3) Additional timing paths. Sometimes the timing report is not showing your normal clock-to-output delay but rather the reset to output delay. This occurs if your reset term is synchronous to the clock. Thus for example if rst_n was created on a rising edge of i_clk, the timing report will see a path from i_clk to rst_n through the reset of the output flip-flop and finally to the output pad. Usually this is not a path that needs to meet your OUT AFTER specification. You can use TIG on your reset nets to avoid this. You can see this in the timing report if you look at the path that fails and see the reset net somewhere in the chain.

HTH, Gabor


Frai 19 years ago

However, if you take the TIG approach on synchronous reset nets, you might find out that the post-place-and-route simulation does not work. This might happen if the path between the reset flip-flop and the reset input of any other flip-flop is about the length of your clock period. In this case there will be a violation on reset input with respect to clock input, throwing an unknown at the output that will never be recovered, trashing all your simulation.

There might be approaches to avoid this, but I don't know them. I only use TIG for asynchronous resets where you cannot predict when the reset signal will actually happen. Later I play with the reset timing in the testbench to make sure it does not interfere with the operation of my circuit during simulation.

In general, I think it's good to add the following lines to your UCF file:

ENABLE = reg_sr_q;
NET "nreset" TIG;

where "nreset" is an asynchronous signal. The "reg_sr_q" option enables the analysis of paths through the Reset and Set pins of your synchronous elements. The "nreset" signal will not be analysed since you TIG'ed it. However, any other synchronous resets that you might generate inside your circuit will be analysed to make sure things work as expected.

Please correct me if I'm wrong. Regards.


johnp 19 years ago

You didn't mention whether you're using a DCM to buffer/deskew the input clock. If you aren't using one, you have multiple sources for your output delay:
a) the clock delay caused by going through the input buffer and BUFG onto the global clock line
b) the clock-to-Q delay of the flop
c) the delay from the flop to the output pad.

As a previous poster mentioned, make sure you have the tools set to map the flop into the IOB to eliminate routing delays (item (c) above).

Item (a) can be fairly large.

John Providenza


uvbaz 19 years ago

Thanks,

I've tried the tips from Gabor:

Packing the output register into the IOBs helps to improve the timing performance.
About the reset signal, I don't understand. But as Frai said, I should pay more attention to it.
Playing with different IOSTANDARD settings is another way to improve the clock-to-output time. The Xilinx data sheet describes the different IOSTANDARD settings, including slow/fast slew rate.

However, it seems that I have reached the limit of this device, the Xilinx Virtex-4 XC4VLX60 (it is speed grade -10, not -12). The following is the post-PAR static timing report from Xilinx ISE.

==========================================
Slack: -3.039 ns (requirement - (clock arrival + clock path + data path + uncertainty))
------------------------------------------
Clock delay:
  Tiopi              0.938   // Input buffer
  net (fanout=1)     0.928
  Tbgcko_O           0.900
  net (fanout=534)   3.009   // Because I have a lot of registers clocked by this clock
  Total              5.775ns (1.838ns logic, 3.937ns route) (31.8% logic, 68.2% route)
------------------------------------------
Data delay:
  Tockq              0.584
  net (fanout=1)     0.050
  Tioop              3.630   // Output buffer
  Total              4.264ns (4.214ns logic, 0.050ns route) (98.8% logic, 1.2% route)
==========================================
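To see where the negative slack comes from, the report's own formula can be replayed by hand. The Python sketch below copies the delay values from the timing report above; the dictionary keys and the `slack` helper are illustrative names only. The clock arrival and uncertainty terms are not shown in the report excerpt, so they are assumed to be zero here, and under that assumption the numbers are consistent with a 7 ns requirement.

```python
# Delay values in ns, copied from the timing report above.
clock_path = {
    "Tiopi (input buffer)": 0.938,
    "net (fanout=1)": 0.928,
    "Tbgcko_O": 0.900,
    "net (fanout=534)": 3.009,
}
data_path = {
    "Tockq": 0.584,
    "net (fanout=1)": 0.050,
    "Tioop (output buffer)": 3.630,
}

clock_delay = sum(clock_path.values())  # clock pad to flip-flop clock pin
data_delay = sum(data_path.values())    # flip-flop clock pin to output pad
clock_to_pad = clock_delay + data_delay


def slack(requirement_ns, clock_arrival_ns=0.0, uncertainty_ns=0.0):
    """Slack as defined in the report header.

    clock_arrival_ns and uncertainty_ns default to 0.0, which is an
    assumption; the report excerpt does not list them.
    """
    return requirement_ns - (
        clock_arrival_ns + clock_delay + data_delay + uncertainty_ns
    )


print(round(clock_delay, 3))   # clock path total
print(round(data_delay, 3))    # data path total
print(round(clock_to_pad, 3))  # total clock-to-pad delay
print(round(slack(7.0), 3))    # slack against an assumed 7 ns requirement
```

The clock path accounts for more than half of the total, and most of that is the route to 534 loads, which is the part a DCM-based deskew would target.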

I think I have to redesign the code. What do you think?

Thanks, Cheng


jean-baptiste.nouvel 19 years ago

Hello Cheng,

I have just tried your design here.

6ns from inside clock to outside of the chip is not huge.

If you want to go high up in frequency, you have two choices:

- the FPGA (when writing to the outside) provides the clock alongside the data. This is done, for instance, by running the FPGA at twice the frequency, then dividing this clock by two in the IOB; the divided clock is the one you provide to the outside world. But in your case, running at twice the frequency might be a pain. Such a system is called 'source synchronous'.

- if your system is 'system synchronous', i.e. an external clock is fed to your FPGA and to the other chips it interfaces with, then you can use a DCM inside the FPGA and use clk270 to 'advance' the clock by 1/4 Tcycle. This will help the FPGA-writes case, but make sure it doesn't break the FPGA-reads case (you can do the math: 1/4 Tcycle in your case is about 2 ns).
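The quarter-cycle figure can be checked with a few lines of Python. This is only a sketch of the arithmetic: a 270-degree output lags the reference by 3/4 of a period, which for a periodic clock looks the same as leading it by 1/4 of a period. The function name `clk270_advance_ns` is hypothetical.

```python
def clk270_advance_ns(period_ns: float) -> float:
    """Apparent clock advance when using a 270-degree phase-shifted clock.

    A 270-degree shift delays the clock by 3/4 of a period, which is
    equivalent to an advance of the remaining 1/4 period.
    """
    lag_ns = period_ns * 270.0 / 360.0
    return period_ns - lag_ns


# At the 130 MHz target (period of about 7.69 ns): roughly 1.92 ns.
print(round(clk270_advance_ns(1000.0 / 130.0), 2))

# At the 6 ns PERIOD used in the UCF constraint: 1.5 ns.
print(round(clk270_advance_ns(6.0), 2))
```

The same amount is taken away from the input side, so the OFFSET IN budget has to be rechecked with the shifted clock.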

I hope this helps, jb


johnp 19 years ago

Have you considered using a DCM to remove the clock skew? You have 5.7 ns of clock delay; using a DCM would cut this way back.

John Providenza
