Xilinx Platform Studio, build up System: "block-RAM components require the adjacent multiplier"

Hi,

After connecting my custom peripheral to the OPB bus, I tried to download my design to the FPGA. The custom design, including the bus, needs 87% of my FPGA (Virtex2Pro30 896-7). The device summary is as follows:

===========================================================

Device utilization summary:

---------------------------

Selected Device : 2vp30ff896-7

Number of Slices:             12131 out of 13696   88%
Number of Slice Flip Flops:   14472 out of 27392   52%
Number of 4 input LUTs:       15191 out of 27392   55%
Number of IOs:                  109
Number of bonded IOBs:           96 out of   556   17%
Number of BRAMs:                 78 out of   136   57%
Number of MULT18X18s:           136 out of   136  100%
Number of GCLKs:                  1 out of    16    6%

===========================================================

In the next step, I tried to download the design.

In my first try, the system consisted of the following parts:

In my last try, it consists of the following parts:


The synthesis of the design aborts with the following message:

ERROR:Place:665 - The design has 106 block-RAM components of which 4 block-RAM components require the adjacent multiplier site to remain empty. This is because certain input pins of adjacent block-RAM and multiplier sites share routing ressources. In addition, the design has 136 multiplier components. Therefore, the design would require a total of 140 multiplier sites on the device. The current device has only 136 multiplier sites.

After that, I removed the RS232 from the design and tried again.

Finally, I moved all components from the PLB to the OPB Bus, where possible, that gives:


And the following error:

ERROR:Place:665 - The design has 84 block-RAM components of which 2 block-RAM components require the adjacent multiplier site to remain empty. This is because certain input pins of adjacent block-RAM and multiplier sites share routing ressources. In addition, the design has 136 multiplier components. Therefore, the design would require a total of 138 multiplier sites on the device. The current device has only 136 multiplier sites.

Has anybody experienced the same problem? Does anyone have a solution for it that does not involve building a smaller design? The FPGA has 136 multipliers and 136 block RAMs. Does that mean you cannot use all the multipliers when you design a complete system with PowerPCs etc.?

Peter Kampmann

Peter Kampmann wrote:

If your design uses 100% of the multipliers and some other IP requires a BRAM placement that needs the adjacent multiplier to stay empty, then it won't fit.

Maybe there is a way to relax the placement with some trick. Try to create a design that uses 0 BRAMs and 100% of the multipliers, and see if that gets mapped without problems.

I think if you are not using OCM BRAMs, the PPC design should not require any BRAMs at all, so all multipliers should be usable. Of course, if that is not the case and the use of the PPC instantly reduces the number of usable multipliers, then this should be documented by Xilinx somehow.

Antti

Antti

The hint is here... only 4 blockRAMs require the multiplier site empty.

I have seen this for Spartan-3, and didn't realise it also applied to V2Pro. There are _some_ shared connections between a BRAM and a multiplier. Not all uses of the BRAM require those shared connections; indeed, most don't. Specifically, using both ports at the fullest width (32 or 36 bits per port) is a problem.

If you can identify a way to use deeper but narrower BRAM blocks (only 18 bits wide, for example) for four of your BRAMs, you are OK.

If not, I notice your LUT and FF usage are both below 60%. Therefore, if you can move 4 of your multipliers into LUT fabric, you are also OK. (For example, if you are using a few of the 18*18 mults as 8*8 mults, these are ideal candidates.)

ATTRIBUTE mult_style : STRING;
ATTRIBUTE mult_style of is "block";
ATTRIBUTE mult_style of is "lut";

might be useful...
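The attribute lines above lost their target names somewhere along the way. As a minimal sketch of how the XST mult_style constraint can be attached to one multiplier, the fragment below pushes a small 8x8 multiply into LUT fabric. The entity small_mult and the signal name prod are hypothetical and not taken from the original design; only the attribute name and the values "lut" and "block" come from the post.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity: an 8x8 multiply that is steered into LUT
-- fabric so that it does not occupy a MULT18X18 site.
entity small_mult is
  port (
    clk : in  std_logic;
    a   : in  unsigned(7 downto 0);
    b   : in  unsigned(7 downto 0);
    p   : out unsigned(15 downto 0)
  );
end entity small_mult;

architecture rtl of small_mult is
  -- Combinational product; 8 bits times 8 bits yields 16 bits.
  signal prod : unsigned(15 downto 0);

  -- XST synthesis constraint: "lut" requests a fabric multiplier,
  -- "block" would request an embedded MULT18X18 instead.
  attribute mult_style : string;
  attribute mult_style of prod : signal is "lut";
begin
  prod <= a * b;

  -- Register the result to ease timing of the LUT-based multiplier.
  process (clk)
  begin
    if rising_edge(clk) then
      p <= prod;
    end if;
  end process;
end architecture rtl;
```

Whether the constraint was honoured should show up in the synthesis report, where the MULT18X18 count ought to drop by one for each multiplier moved this way.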

- Brian

Brian Drummond

A further comment on my previous post...

Putting this...

together with this...

suggests you can parallel two 18-wide BRAMs to make a 36-wide BRAM for the 4 problem BRAMs, without running out of resources.
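A rough sketch of that idea, assuming the RAMs are inferred from VHDL rather than instantiated as primitives: two 18-bit-wide RAMs share the address and write enable, and each holds one half of the 36-bit word. The entity names ram_1kx18 and ram_1kx36 are hypothetical, and the sketch is single-port for brevity, whereas the problem case described above involves both ports at full width.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 1K x 18 single-port RAM, written so that synthesis can
-- infer one block RAM in its 18-bit-wide aspect ratio.
entity ram_1kx18 is
  port (
    clk  : in  std_logic;
    we   : in  std_logic;
    addr : in  unsigned(9 downto 0);
    din  : in  std_logic_vector(17 downto 0);
    dout : out std_logic_vector(17 downto 0)
  );
end entity ram_1kx18;

architecture rtl of ram_1kx18 is
  type ram_t is array (0 to 1023) of std_logic_vector(17 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        ram(to_integer(addr)) <= din;
      end if;
      dout <= ram(to_integer(addr));  -- registered (synchronous) read
    end if;
  end process;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 36-bit wrapper: two 18-bit RAMs in parallel, each storing
-- one half of the data word at the same address.
entity ram_1kx36 is
  port (
    clk  : in  std_logic;
    we   : in  std_logic;
    addr : in  unsigned(9 downto 0);
    din  : in  std_logic_vector(35 downto 0);
    dout : out std_logic_vector(35 downto 0)
  );
end entity ram_1kx36;

architecture rtl of ram_1kx36 is
begin
  lo_half : entity work.ram_1kx18
    port map (clk => clk, we => we, addr => addr,
              din => din(17 downto 0), dout => dout(17 downto 0));

  hi_half : entity work.ram_1kx18
    port map (clk => clk, we => we, addr => addr,
              din => din(35 downto 18), dout => dout(35 downto 18));
end architecture rtl;
```

This costs one extra BRAM per replaced 36-bit RAM, which the utilization figures leave room for. It is worth checking the map report to confirm the tools really kept two 18-bit primitives.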

- Brian

Brian Drummond

The issue is that when the block RAM is used in the 36-bit-wide mode, it shares pins with the multiplier (IIRC the high 18 bits of the BRAM DO are connected to one of the multiplier inputs), so if the multiplier is used it must take its inputs from those block RAM bits. You can avoid the issue by using a narrower aspect ratio for the BRAM (18, 9, 4, 2, or 1 bit wide), or by not using all the multiplier sites (each multiplier is co-located with a BRAM). Since you require all the multipliers on the chip, you don't have the luxury of using the BRAM in the 36-bit-wide mode. You have two choices: 1) change your BRAM usage to avoid the 36-bit-wide mode of the BRAMs, or 2) eliminate some of the multipliers from the design, either by time-sharing other multipliers or by performing some of the multiplies in the fabric rather than with the embedded multipliers.

Ray Andraka

To clarify: if both the block RAM and the multiplier are used at a site, and the block RAM has 36-bit outputs, the multiplier must use those outputs as its input.

Ray Andraka

Thanks a lot for your replies. I was able to solve this problem by implementing some multipliers as LUTs, as Brian proposed. The error no longer occurs, but routing has been running for 16 hours so far :)

Regards, Peter

Ray Andraka wrote:

Peter Kampmann
