XST net splitting blocks placement

I have a Spartan 3 interfaced to a TigerSHARC via its "link port", which is a 4-bit wide DDR communications interface. The receive side of this interface on the FPGA consists of two 16x4 bit FIFOs implemented as LUTRAM, and the write address of each one is driven by a 4-bit register, explicitly implemented as FD or FDC.

I added "LOC" constraints to my VHDL source for the four LUTRAMs and four write-address flip-flops, in order to control skew among the data lines and to optimize potential performance. I'm currently running this interface at 125 MHz, but hope to be able to support 250 MHz eventually. In particular, the flip-flops are packed in pairs, two to a given location.

Anyway, my problem is this: All of this has been working fine for the last two years while the rest of the FPGA was being developed. However, on the last iteration of adding some more logic to the FPGA (unrelated to the DSP link port), I suddenly started getting an error from the "Directed Packing" step of the Map process saying that it couldn't pack one of my flip-flops with the other, because the set/resets were not identical. It turns out that XST had assigned the reset of one of the flip-flops to a different copy of the global reset signal, which presumably had been created because the fanout of the reset signal had reached some threshold. Apparently, the assignment of loads to specific copies of replicated nets occurs at an earlier step, and doesn't take into account directed packing constraints.

So, my question is this: Is there an easy way, in my VHDL source file, to ensure that all of my flip-flops are connected to the *same* copy of the reset signal, without introducing additional logic into the path? The file is attached below for reference.

Thanks in advance for any suggestions!

-- Dave Tweed

============================================================================

-- dsp_link_rx.vhd

-- Receive side of a 4-bit wide ADSP-TS201 link port.

-- This is based on the design found in XAPP635, but without the block RAMs

-- used to convert the internal data path to 128 bits width. It is designed

-- to accept one quadword at a time into its LUTRAM 16x8 FIFO. The FIFO can

-- be unloaded at the system clock rate, and then another quadword can be

-- accepted from the DSP transmitter.

-- To do:

-- For now, we're not interested in the BCOMP signal, so it is ignored.

-- History:

-- 2007/06/05 DT Fix sensitivity list.

-- 2005/08/19 DT Fix problem with duplicated last byte.

-- 2005/08/17 DT Fix problem with duplicated first byte.

-- 2005/08/10 DT Eliminate generate statements so that we can label

-- primitives for placement.

-- 2005/08/08 DT Fix FIFO state machine.

-- 2005/08/02 DT Take over from HDLmaker; fix trigger mechanism.

-- 2005/06/07 DT Move the I/O buffers to the pad ring.

-- 2005/05/22 DT Start.

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
use IEEE.std_logic_unsigned.all;

-- for correct simulation of RAM16X1D
library unisim;
use unisim.vcomponents.all;

entity dsp_link_rx is
  port (
    clock    : in  std_logic;
    rst      : in  std_logic;

    -- external interface from DSP
    lxacko   : out std_logic;
    lxbcompi : in  std_logic;
    lxclk    : in  std_logic;
    lxdata   : in  std_logic_vector (3 downto 0);

    -- internal interface (to bytestream decoder, fast version)
    dataout  : out std_logic_vector (7 downto 0);
    data_en  : out std_logic;
    ready    : in  std_logic
  );
end dsp_link_rx;

architecture BEHAVIOR of dsp_link_rx is

-- --------------------------------------------------------------------

signal high        : std_logic;
signal notlxclk    : std_logic;
signal int         : std_logic_vector(7 downto 0);
signal rd_addr     : std_logic_vector(3 downto 0);
signal wr_addrp_d  : std_logic_vector(3 downto 0);
signal wr_addrp    : std_logic_vector(3 downto 0);
signal wr_addrn_d  : std_logic_vector(3 downto 0);
signal wr_addrn    : std_logic_vector(3 downto 0);

signal trigger_d   : std_logic;
signal data_en_int : std_logic;
signal set_lxacko  : std_logic;
signal lxacko_int  : std_logic;

type fifo_state_t is (fifo_empty, fifo_active, fifo_final);
signal fifo_state : fifo_state_t;

attribute loc      : string;
attribute syn_keep : boolean;

attribute syn_keep of lxclk : signal is TRUE;

attribute loc of rp0 : label is "slice_x40y8";
attribute loc of rp1 : label is "slice_x38y10";
attribute loc of rp2 : label is "slice_x40y10";
attribute loc of rp3 : label is "slice_x38y8";
attribute loc of rn0 : label is "slice_x40y9";
attribute loc of rn1 : label is "slice_x38y11";
attribute loc of rn2 : label is "slice_x40y11";
attribute loc of rn3 : label is "slice_x38y9";

attribute loc of ff_wr_addrp0 : label is "slice_x39y9";
-- attribute loc of ff_wr_addrp1 : label is "slice_x39y9";
attribute loc of ff_wr_addrp2 : label is "slice_x39y8";
attribute loc of ff_wr_addrp3 : label is "slice_x39y8";
attribute loc of ff_wr_addrn0 : label is "slice_x41y9";
attribute loc of ff_wr_addrn1 : label is "slice_x41y9";
attribute loc of ff_wr_addrn2 : label is "slice_x41y8";
attribute loc of ff_wr_addrn3 : label is "slice_x41y8";

-- --------------------------------------------------------------------

begin

-- --------------------------------------------------------------------

high [...] wr_addrp(0));
ff_wr_addrp1 : fdc port map (d => wr_addrp_d(1), c => lxclk, clr => rst, q => wr_addrp(1));
ff_wr_addrp2 : fdc port map (d => wr_addrp_d(2), c => lxclk, clr => rst, q => wr_addrp(2));
ff_wr_addrp3 : fdc port map (d => wr_addrp_d(3), c => lxclk, clr => rst, q => wr_addrp(3));

ff_wr_addrn0 : fd port map (d => wr_addrn_d(0), c => notlxclk, q => wr_addrn(0));
ff_wr_addrn1 : fd port map (d => wr_addrn_d(1), c => notlxclk, q => wr_addrn(1));
ff_wr_addrn2 : fd port map (d => wr_addrn_d(2), c => notlxclk, q => wr_addrn(2));
ff_wr_addrn3 : fd port map (d => wr_addrn_d(3), c => notlxclk, q => wr_addrn(3));

-- --------------------------------------------------------------------
-- The FIFO memories (LUTRAM)

rp0 : ram16x1d port map (d => lxdata(0), we => high, wclk => lxclk, a0 => wr_addrp(0), a1 => wr_addrp(1), a2 => wr_addrp(2), a3 => wr_addrp(3), dpra0 => rd_addr(0), dpra1 => rd_addr(1), dpra2 => rd_addr(2), dpra3 => rd_addr(3), spo => open, dpo => int(0) );

rp1 : ram16x1d port map (d => lxdata(1), we => high, wclk => lxclk, a0 => wr_addrp(0), a1 => wr_addrp(1), a2 => wr_addrp(2), a3 => wr_addrp(3), dpra0 => rd_addr(0), dpra1 => rd_addr(1), dpra2 => rd_addr(2), dpra3 => rd_addr(3), spo => open, dpo => int(1) );

rp2 : ram16x1d port map (d => lxdata(2), we => high, wclk => lxclk, a0 => wr_addrp(0), a1 => wr_addrp(1), a2 => wr_addrp(2), a3 => wr_addrp(3), dpra0 => rd_addr(0), dpra1 => rd_addr(1), dpra2 => rd_addr(2), dpra3 => rd_addr(3), spo => open, dpo => int(2) );

rp3 : ram16x1d port map (d => lxdata(3), we => high, wclk => lxclk, a0 => wr_addrp(0), a1 => wr_addrp(1), a2 => wr_addrp(2), a3 => wr_addrp(3), dpra0 => rd_addr(0), dpra1 => rd_addr(1), dpra2 => rd_addr(2), dpra3 => rd_addr(3), spo => open, dpo => int(3) );

rn0 : ram16x1d port map (d => lxdata(0), we => high, wclk => notlxclk, a0 => wr_addrn(0), a1 => wr_addrn(1), a2 => wr_addrn(2), a3 => wr_addrn(3), dpra0 => rd_addr(0), dpra1 => rd_addr(1), dpra2 => rd_addr(2), dpra3 => rd_addr(3), spo => open, dpo => int(4) );

rn1 : ram16x1d port map (d => lxdata(1), we => high, wclk => notlxclk, a0 => wr_addrn(0), a1 => wr_addrn(1), a2 => wr_addrn(2), a3 => wr_addrn(3), dpra0 => rd_addr(0), dpra1 => rd_addr(1), dpra2 => rd_addr(2), dpra3 => rd_addr(3), spo => open, dpo => int(5) );

rn2 : ram16x1d port map (d => lxdata(2), we => high, wclk => notlxclk, a0 => wr_addrn(0), a1 => wr_addrn(1), a2 => wr_addrn(2), a3 => wr_addrn(3), dpra0 => rd_addr(0), dpra1 => rd_addr(1), dpra2 => rd_addr(2), dpra3 => rd_addr(3), spo => open, dpo => int(6) );

rn3 : ram16x1d port map (d => lxdata(3), we => high, wclk => notlxclk, a0 => wr_addrn(0), a1 => wr_addrn(1), a2 => wr_addrn(2), a3 => wr_addrn(3), dpra0 => rd_addr(0), dpra1 => rd_addr(1), dpra2 => rd_addr(2), dpra3 => rd_addr(3), spo => open, dpo => int(7) );

-- --------------------------------------------------------------------
-- The FIFO readout state machine

-- This assumes that the readout clock is no faster than the input
-- clock.

-- When the FIFO is not empty, a trigger is generated that enables
-- the reading process. Once the FIFO is full, further transfers from
-- the transmitting DSP are inhibited until it is completely empty
-- again.

-- Data is transferred out of the FIFO if the ready signal is asserted.
-- data_en is driven high when data is available from the FIFO.
-- If ready is negated on a particular clock edge, that means that
-- the data for the previous clock period was not accepted by the
-- downstream device.

process (clock)
begin
  if clock'event and clock = '1' then

trigger_d

David Tweed

Split the resets up yourself, before XST gets to them?

Assign the main reset to a second signal, and apply that to the critical ones ONLY (or all within this localised block) - you may have to apply a KEEP attribute to the second reset signal.
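
A minimal sketch of that idea, assuming XST's KEEP attribute (declared as a string) and a hypothetical local signal name rst_local. The wrapper entity wr_addr_pair and the labels ff0/ff1 are made up just to keep the fragment self-contained. Whether KEEP alone is enough to stop XST from re-splitting the net may depend on the XST version, so this is something to try rather than a guaranteed fix:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

library unisim;
use unisim.vcomponents.all;

-- Hypothetical wrapper entity, for illustration only.
entity wr_addr_pair is
  port (
    lxclk : in  std_logic;
    rst   : in  std_logic;
    d     : in  std_logic_vector(1 downto 0);
    q     : out std_logic_vector(1 downto 0)
  );
end wr_addr_pair;

architecture sketch of wr_addr_pair is
  -- Local copy of the reset (the name is an assumption), used only by
  -- the flip-flops that must be packed into the same slice.
  signal rst_local : std_logic;

  attribute keep : string;
  attribute keep of rst_local : signal is "true";
begin
  rst_local <= rst;

  -- Both flip-flops take their clear from the same local net.
  ff0 : fdc port map (d => d(0), c => lxclk, clr => rst_local, q => q(0));
  ff1 : fdc port map (d => d(1), c => lxclk, clr => rst_local, q => q(1));
end sketch;
```

In the posted file, the equivalent change would be to drive clr on the ff_wr_addrp instances from the local signal instead of rst.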

- Brian

Brian Drummond

I had exactly this problem with XST. Because of the high fanout of the reset net, XST duplicates it and then cannot pack the flops.

Add these attributes to the top level:

attribute MAX_FANOUT : string;
attribute MAX_FANOUT of por_whatever : signal is "10000";
attribute MAX_FANOUT of por_whatever_else : signal is "10000";

I try not to use a global reset anymore; instead, each module takes a re-clocked synchronous reset and uses that on the control path where necessary. Don't reset something if you don't need to (the data path, usually), as it takes routing resources.

Regards,
/Mike
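
One way the per-module re-clocked reset could look, as a rough sketch only. The entity name reset_sync, the port names, and the two-stage depth are all assumptions, not something taken from the posted design:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper: re-clocks an incoming reset into the local clock
-- domain so that each module gets its own low-fanout synchronous reset.
entity reset_sync is
  port (
    clock     : in  std_logic;
    rst_in    : in  std_logic;  -- incoming reset, possibly asynchronous
    rst_local : out std_logic   -- re-clocked reset for this module only
  );
end reset_sync;

architecture sketch of reset_sync is
  -- Two register stages; initialised to '1' so the module starts in reset.
  signal sync : std_logic_vector(1 downto 0) := (others => '1');
begin
  process (clock)
  begin
    if rising_edge(clock) then
      sync <= sync(0) & rst_in;
    end if;
  end process;

  rst_local <= sync(1);
end sketch;
```

Each module would then use rst_local as a synchronous reset on its control-path registers only.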

p.s. you could use a generate loop on your instantiations to save some typing.

something like

flops : for i in 0 to 3 generate
begin
  ff_wr_addrp : fdc port map (d => wr_addrp_d(i), c => lxclk, clr => rst, q => wr_addrp(i));
end generate;

you will have to change your loc constraints of course.

MikeJ

As a p.p.s:

XST allows indexing into arrays of constant strings for the LOC constraints, so the LOCs can be defined up top as a constant array, then referenced by an attribute just before the 'begin' of the generate, i.e.:

attribute LOC of ff_wr_addrp : label is my_loc_array(i);

( pad the strings with spaces to make them fixed length when defining the constant array )
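
Put together, the whole thing might look roughly like the sketch below. The wrapper entity wr_addrp_regs and the type name loc_array_t are invented for illustration, and the slice names are the ones from the ff_wr_addrp attributes in the posted file (they all happen to be 11 characters long, so no padding is needed here). It relies on XST accepting an indexed constant in the attribute specification, as described above:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

library unisim;
use unisim.vcomponents.all;

-- Hypothetical wrapper entity, for illustration only.
entity wr_addrp_regs is
  port (
    lxclk      : in  std_logic;
    rst        : in  std_logic;
    wr_addrp_d : in  std_logic_vector(3 downto 0);
    wr_addrp   : out std_logic_vector(3 downto 0)
  );
end wr_addrp_regs;

architecture sketch of wr_addrp_regs is
  attribute loc : string;

  -- All entries must have the same length; pad shorter ones with spaces.
  type loc_array_t is array (0 to 3) of string(1 to 11);
  constant my_loc_array : loc_array_t := (
    "slice_x39y9", "slice_x39y9", "slice_x39y8", "slice_x39y8");
begin
  flops : for i in 0 to 3 generate
    -- Attribute specification in the generate's declarative part, so
    -- each iteration picks up its own LOC string.
    attribute loc of ff_wr_addrp : label is my_loc_array(i);
  begin
    ff_wr_addrp : fdc port map (d => wr_addrp_d(i), c => lxclk, clr => rst, q => wr_addrp(i));
  end generate;
end sketch;
```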

Brian

Brian Davis

OK, that's the sort of clue I was looking for. Although it would seem that this would prevent the net splitting in the first place, rather than allowing it to happen and restricting the assignment of loads to common sub-nets.

Good point. I usually try to follow this rule, but since these flip-flops are address counters for a FIFO, there's no other way to get them into a known state.

Well, the typing is already done :-)

Seriously, the file had generate loops to start with, but I couldn't find a way to control the assignment of instance names so that I could use the LOC constraints in a deterministic way. Is there a trick, or a convention as to how the loop index gets added to the name?

Thanks for all your help!

-- Dave

David Tweed

Earlier XST versions were a bit broken in this regard, but 9.1 is more consistent. Have a read of the XST User Guide; it explains the naming.

I usually leave the LOCs off to start with, build it and then find them in floorplanner to work out the path. Floorplanner can write out a UCF file, so place one flop, write out to a temp.ucf file then steal the necessary line.

/Mike

MikeJ

Many thanks for the help! I'll come back to this once I make the shift to 9.1 -- I've been holding off because I'm in the middle of a project.

-- Dave

David Tweed
