Xilinx timing constraints incorrect?

Hello all,

I am working on a design for a Xilinx V2P50 and I am trying to diagnose possible timing issues because the hardware performance of my design does not appear to match simulation.

I have run the design through Timing Analyzer (ISE 8.2) and noticed a huge number of unconstrained paths of the following form:

"FROM *clk_pad* TO *register*"

where *clk_pad* is a clock pad in the design, and *register* is a register in the design that is clocked. Am I missing a necessary timing constraint to eliminate these unconstrained paths? I have period constraints for all of the clocks in the design. If these paths are "correctly" unconstrained, is there any way to eliminate them from Timing Analyzer easily (preferably via command line, not in the GUI) so that only valuable unconstrained paths appear?

Thanks for all of your help.

Reply to
paragon.john

I'm sure there is, but I don't have a quick answer to that question. I can tell you what I would do.

  1. Reduce the number of clocks to a minimum.
  2. Synthesize a separate module for each clock and work out Fmax for each domain.
  3. Instance the modules with known-good synchronization as needed.

-- Mike Treseler

Reply to
Mike Treseler

I think those only occur when a local net is used somewhere in the clock path. Are you using a gated clock? Is the clock pad using a pin other than the one with a direct connection to a global clock buffer?

Why don't you post one of the unconstrained paths? That is, the part that should look something like:

    ================================================================================
    Timing constraint: Unconstrained path analysis

     868 items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
     Maximum delay is 6.513ns.
    --------------------------------------------------------------------------------
    Delay:                  3.654ns (data path)
      Source:               GIGE_CLK_N (PAD)
      Destination:          aurora_rt2_e/PROTOCOL_ENG1/lane_0_mgt_i (HSIO)
      Data Path Delay:      3.654ns (Levels of Logic = 3)

      Data Path: GIGE_CLK_N to aurora_rt2_e/PROTOCOL_ENG1/lane_0_mgt_i
        Location             Delay type         Delay(ns)  Physical Resource
                                                           Logical Resource(s)
        -------------------------------------------------  -------------------
        B15.PADOUT           Tiopp                 0.312   GIGE_CLK_N
                                                           GIGE_CLK_N
        C15.DIFFI_IN         net (fanout=1)        0.000   TOP_MGT_CLK_BUF/SLAVEBUF.DIFFIN
        C15.I                Tdiffin               0.843   GIGE_CLK_P
                                                           TOP_MGT_CLK_BUF/IBUFDS
        BUFGMUX3P.I0         net (fanout=3)        0.435   TOP_REF_CLK2
        BUFGMUX3P.O          Tgi0o                 0.057   TOP_USER_CLK_BUF
                                                           TOP_USER_CLK_BUF
        GT_X3Y1.RXUSRCLK     net (fanout=527)      2.007   TOP_USER_CLK
        -------------------------------------------------  ---------------------------
        Total                                      3.654ns (1.212ns logic, 2.442ns route)
                                                           (33.2% logic, 66.8% route)
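
As a quick sanity check on the totals in a path like this one, the per-element delays can be summed with a few lines of Python. This is only a sketch: the numbers are copied from the excerpt above, and treating every "net" entry as routing and everything else as logic is an assumption about how the report groups delays.

```python
# Delay entries copied from the data path in the report excerpt.
# Each tuple: (location, delay type, delay in ns).
PATH = [
    ("B15.PADOUT", "Tiopp", 0.312),
    ("C15.DIFFI_IN", "net", 0.000),
    ("C15.I", "Tdiffin", 0.843),
    ("BUFGMUX3P.I0", "net", 0.435),
    ("BUFGMUX3P.O", "Tgi0o", 0.057),
    ("GT_X3Y1.RXUSRCLK", "net", 2.007),
]


def breakdown(path):
    """Return (logic, route, total) delay in ns, rounded to 3 decimals."""
    # Assumption: "net" rows are routing delay, all other rows are logic delay.
    route = sum(delay for _, kind, delay in path if kind == "net")
    logic = sum(delay for _, kind, delay in path if kind != "net")
    return round(logic, 3), round(route, 3), round(logic + route, 3)


logic, route, total = breakdown(PATH)
print(f"Total {total:.3f}ns ({logic:.3f}ns logic, {route:.3f}ns route)")
print(f"({100 * logic / total:.1f}% logic, {100 * route / total:.1f}% route)")
```

Most of the delay sits on the high-fanout clock net, which is what one would expect for a clock distribution path rather than a data path.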
Reply to
Duane Clark

Thanks for your response.

I do have a gated clock of the following form: GATED_CLOCK

Reply to
paragon.john

--------------------------------------------------------------------------------

Correction to the above post: The gated clock should be GATED_CLOCK

Reply to
paragon.john

As long as it is not used internally, I wouldn't worry about it.

--------------------------------------------------------------------------------

I guess it reports paths that go through the DCM, perhaps because a local net is used. It doesn't look to me like it matters, and that should not be affecting the design operation. Your problem is almost certainly elsewhere. Is your period constraint on the input signal, clk_n?

I recognize that if these paths are uninteresting, you want to eliminate them from the report to see if there are any interesting paths left (I doubt you will find your problem that way, but it might help). I suppose you could eliminate the unconstrained paths with a FROM/TO constraint. You could perhaps do a global

    TIMESPEC "TSPADS2CLK" = FROM "PADS" TO "FFS" 2 ns;

Or perhaps try to make it a bit more restricted, though I would have to experiment a bit to find the correct syntax. Perhaps something like:

    NET "CLK_n" TNM_NET = "CLK_N";
    TIMESPEC "TSPADS2CLK" = FROM "CLK_N" TO "FFS" 2 ns;
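
Another option, if the goal is only to declutter the report from the command line, is to post-process the report text instead of touching the constraints. The following is a minimal sketch, not a tested tool: it assumes each path entry is preceded by a line of dashes and contains a "Source: <name> (PAD)" line, as in the excerpt posted earlier in this thread. The function name and the set of clock pad names are hypothetical.

```python
import re

# Assumption: path entries are separated by lines made of 20 or more dashes.
SEPARATOR = re.compile(r"^-{20,}[ \t]*$", re.MULTILINE)
# Assumption: a pad-sourced entry has a line like "Source: GIGE_CLK_N (PAD)".
SOURCE = re.compile(r"^\s*Source:\s+(\S+)\s+\(PAD\)", re.MULTILINE)


def drop_clock_pad_paths(report, clock_pads):
    """Return the report text without entries whose source is a clock pad."""
    kept = []
    for block in SEPARATOR.split(report):
        match = SOURCE.search(block)
        if match and match.group(1) in clock_pads:
            continue  # clock pad to clocked element: not interesting here
        kept.append(block)
    # Re-insert a separator line between the surviving blocks.
    return ("-" * 80).join(kept)
```

A small wrapper could read the report file, call drop_clock_pad_paths(text, {"GIGE_CLK_N", "GIGE_CLK_P"}), and print the result, leaving only the unconstrained paths that might deserve a closer look.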

Reply to
Duane Clark

--------------------------------------------------------------------------------

The period constraint is on clk_n's differential counterpart clk_p. Similar unconstrained paths appear for clk_p, I just happened to pick one that used clk_n.

I will try a FROM/TO constraint to remove them. Since you doubt that going down this path will help me out, do you have any insight into what may be a better way to diagnose possible timing issues that may result in a mismatch? I have looked through the warnings that the various stages of ISE spit out and haven't seen any that would lend themselves to a simulation mismatch.

Thanks.

Reply to
paragon.john

The answer is usually a missing handshake, synchronizer, transition detector, clock enable or fifo. These problems are hard to see at ground level.

It is easier for me to guarantee synchronization from the top down than it is to decode cryptic timing warnings after the fact. This is the only way I know of to make good use of simulation.

-- Mike Treseler

Reply to
Mike Treseler

You really have not described what the circuit is like. Are there multiple clocks? Are signals crossing between clock domains? Does it actually work correctly at lower clock rates (you said that "the hardware performance of my design does not appear to match simulation")?

Mike described the most common problems, and those match my experience too. For difficult problems, if you have not tried ChipScope before, give it a try. It is quite handy (hopefully you have an accessible JTAG port). If you don't already have it, you can get a trial license for it and start using it immediately. I will sometimes insert a simple error detection circuit in the FPGA that can generate a trigger for ChipScope.

Reply to
Duane Clark

This is a digital signal processing application done mainly in System Generator. There is one "main" clock domain where all of the critical processing is done. There are domains synchronous to this clock made using clock enables (in System Generator) and also divided clocks that are output by a DCM (in the VHDL that I wrap the sysgen design with). I believe that all of this is constrained and controlled properly.

There is an asynchronous second clock domain that is used purely for command and control of the design via setting and reading registers. The signals crossing this domain would not affect the performance of the design. The type of simulation mismatch I am seeing is in the "performance" of the algorithm I am implementing. The algorithm works, just not as well as I see when I run simulations.

I can't use ChipScope, unfortunately. I can get at some test signals via a logic analyzer, but obviously not with the flexibility and ease of chipscope.

Reply to
paragon.john

Why not do everything with clock enables and eliminate the divided clocks and their constraints?

Unless one of the async registers tries to tell the design to start, stop, or turn left, sending the machine into an "impossible" state that simulation can't see.

If it doesn't match the simulation exactly something is very wrong.

A test point is much better than nothing. Good luck.

-- Mike Treseler

Reply to
Mike Treseler

I have not used System Generator, and I don't know what you mean when you say there is a performance mismatch between the simulation and the hardware. I assume you are simulating the VHDL/Verilog code of the complete system, in which case there should be an exact one-to-one match between the sim and the system. You are saying these don't match exactly?
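
One way to check for that exact match, assuming output samples can be captured from the hardware (for example with a logic analyzer) and dumped from the simulation, is a sample-by-sample comparison. This is a rough sketch: the function name is made up for illustration, and it assumes both sequences are already aligned to the same starting sample and hold plain integer values.

```python
def first_mismatch(sim, hw):
    """Return (index, sim_value, hw_value) at the first difference, or None.

    A missing value (one sequence shorter than the other) is reported as None.
    """
    for index, (sim_value, hw_value) in enumerate(zip(sim, hw)):
        if sim_value != hw_value:
            return index, sim_value, hw_value
    if len(sim) != len(hw):
        index = min(len(sim), len(hw))
        sim_value = sim[index] if index < len(sim) else None
        hw_value = hw[index] if index < len(hw) else None
        return index, sim_value, hw_value
    return None


# Example with made-up sample values.
print(first_mismatch([10, 20, 30, 40], [10, 20, 31, 40]))  # (2, 30, 31)
```

Knowing where the first difference occurs, and whether it is a single bad sample or a persistent offset, can hint at whether the cause is a timing or synchronization problem or something systematic.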

What I generally do in signal processing applications is to have commandable bits that steer different intermediate results to the output. I typically leave this even in the final circuit, since it adds very little complexity. One bit will send the raw input data to the output, another might send the output of the first FFT, another the output of intermediate processed FFT data, and no bits means the final processed data. That would allow narrowing down the location of the problem, and that might give a clue to what kind of problem there is. Sort of a poor man's substitute for chipscope, though.
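
For use in a testbench or reference model, the steering logic described above can be mirrored in a few lines of Python. This is only a behavioral sketch: the bit assignments and names are hypothetical, and the priority order when more than one bit is set is an assumption.

```python
# Hypothetical control bit assignments for the debug output selection.
SEL_RAW = 0b001           # send raw input data to the output
SEL_FIRST_FFT = 0b010     # send the output of the first FFT
SEL_INTERMEDIATE = 0b100  # send intermediate processed FFT data


def select_output(control, raw, first_fft, intermediate, final):
    """Model of the output mux: no bits set selects the final processed data."""
    # Assumption: lower-numbered bits take priority if several are set.
    if control & SEL_RAW:
        return raw
    if control & SEL_FIRST_FFT:
        return first_fft
    if control & SEL_INTERMEDIATE:
        return intermediate
    return final
```

Comparing each tap against the matching stage of the simulation, starting from the raw input, would show the first stage at which hardware and simulation diverge.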

Reply to
Duane Clark

I am using clock enables as much as possible. However, since the output of the design is connected to a piece of test equipment that requires a clock, I have to use the divided clock at the very end of the design before outputting the data (outside of any processing loops).

I don't have any way of removing this clock domain crossing. I double register the signals after they cross the domain boundary (data only crosses in one direction). Maybe there is something I am forgetting, but I thought this would be sufficient for this type of design.

Yes, yes it is.

Reply to
paragon.john
