Question about multi-write-port RAM in FPGA?

- J
- JJ
  

Wed, Mar 15, 2006 10:17 AM

You only need one fast clock, at whatever frequency gets you near fmax, and use clock enables to move the datapath pipeline forward once every three fast clocks. The only thing I don't like about that is the energy needed to clock all the logic at 3x the enabled rate, but it is safe. A more elaborate clock system could also produce 3x and 1x clocks without skew.
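As a rough behavioural sketch of the clock-enable idea (a software model, not HDL), the loop below plays the role of the single fast clock, and a phase counter asserts an enable once every `ratio` cycles so the datapath registers only advance on those cycles. The function name `run_pipeline`, the two-stage pipeline, and the parameter names are all hypothetical, chosen just for illustration.

```python
def run_pipeline(samples, fast_cycles, ratio=3):
    """Model one fast clock with a clock enable asserted once every `ratio` cycles.

    The datapath here is an assumed two-register pipeline; real designs differ.
    """
    phase = 0                # counts fast-clock cycles 0 .. ratio-1
    stage1 = stage2 = None   # two datapath pipeline registers
    pending = list(samples)  # input data waiting to enter the pipeline
    outputs = []

    for _ in range(fast_cycles):          # each iteration is one fast-clock edge
        enable = (phase == ratio - 1)     # clock enable: high one cycle in `ratio`
        if enable:
            # All registers update together from their old values.
            if stage2 is not None:
                outputs.append(stage2)
            stage2 = stage1
            stage1 = pending.pop(0) if pending else None
        # The phase counter itself runs on every fast clock.
        phase = (phase + 1) % ratio
    return outputs
```

With `ratio=3`, eighteen fast cycles give six enabled cycles, which is enough to push three samples through the two assumed stages.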

Synthesis will then tell you your new limit is your datapath or control logic, while your BRAMs have good slack. You must have some deep logic paths, which is to be expected for an early design. Perhaps you can now think about using the 3x or full clock to redesign that 100 MHz logic to get it to run a bit faster using less hardware and/or more pipeline stages.

Perhaps you have wide adders or multipliers; try pipelining these. Other than that I can't say without knowing your architecture. I assume you can synthesize each block separately to get a feel for what each block's limit is. When you put them all together they will usually run slower.
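To make "pipelining a wide adder" concrete, here is a minimal software model, assuming a simple two-stage split: stage 1 adds the low halves and registers the carry, stage 2 adds the high halves plus that registered carry one cycle later. The name `pipelined_add`, the 64-bit default width, and the even split are assumptions for illustration only.

```python
def pipelined_add(pairs, width=64):
    """Model a two-stage pipelined adder; returns sums modulo 2**width."""
    half = width // 2
    mask = (1 << half) - 1
    reg = None       # pipeline register: (low_sum, carry, a_hi, b_hi)
    results = []

    # One loop iteration per clock; the trailing None is a flush cycle.
    for item in list(pairs) + [None]:
        # Stage 2: consumes what stage 1 registered on the previous cycle.
        if reg is not None:
            low, carry, a_hi, b_hi = reg
            high = (a_hi + b_hi + carry) & mask
            results.append((high << half) | low)

        # Stage 1: add the low halves and register the carry-out.
        if item is not None:
            a, b = item
            s = (a & mask) + (b & mask)
            reg = (s & mask, s >> half, (a >> half) & mask, (b >> half) & mask)
        else:
            reg = None
    return results
```

Each stage now only has a carry chain of half the width, at the cost of one extra cycle of latency.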

Remember, in an FPGA the BRAMs are about as good as in an ASIC, but the logic is proportionately, say, 3x slower, so the optimal FPGA architecture can never be the same as for an ASIC. The adders are worse since you are stuck with ripple designs, while real ASIC designs can use all sorts of neat lookahead, carry-save, or carry-select schemes, but don't waste time in that direction in an FPGA.

So what is your widest adder or deepest logic path?

John

- J
- JJ
  

Wed, Mar 15, 2006 10:46 AM

There is a similar trick which used to be taught in CS for how to exchange 2 registers without using a temp reg. It costs little, since xor and mov usually both take 1 cycle, but they may not have the same effect on the CC flags.
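For reference, the register-exchange trick is the classic three-XOR swap. A minimal sketch, with `xor_swap` as a hypothetical name and Python integers standing in for registers:

```python
def xor_swap(a, b):
    """Exchange two integer values using three XORs and no temporary."""
    a ^= b   # a now holds a0 ^ b0
    b ^= a   # b = b0 ^ (a0 ^ b0) = a0
    a ^= b   # a = (a0 ^ b0) ^ a0 = b0
    return a, b
```

In real registers the trick fails if both operands are the same location, since the first XOR would zero it.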

A

- J
- John_H
  

Wed, Mar 15, 2006 2:26 PM

Repeating the response to a previous thread which included code:

Each write port has its own memory associated with it. Since you can't update the other memories when you're writing through one port, you have to make the data "right" independent of which write port was last used. The XOR lets your latest input data store into your memory

MemA[AddrA]
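The code from the earlier thread is not reproduced here, so the following is only a sketch of one common way the XOR scheme is realized, assuming two write ports: a write through one port stores the new data XORed with the other bank's current word, and a read XORs both banks together. The class name `XorTwoWriteRam` and its method names are hypothetical, and simultaneous writes to the same address are not modeled.

```python
class XorTwoWriteRam:
    """Behavioural model of a two-write-port RAM built from two banks and XOR."""

    def __init__(self, depth):
        self.mem_a = [0] * depth   # bank written only by port A
        self.mem_b = [0] * depth   # bank written only by port B

    def write_a(self, addr, data):
        # Store data so that mem_a ^ mem_b at this address equals `data`.
        self.mem_a[addr] = data ^ self.mem_b[addr]

    def write_b(self, addr, data):
        self.mem_b[addr] = data ^ self.mem_a[addr]

    def read(self, addr):
        # The most recent write from either port is recovered by XORing the banks.
        return self.mem_a[addr] ^ self.mem_b[addr]
```

Whichever port wrote last, the read returns that port's data, because the older bank's contribution cancels out in the XOR. In hardware each write also needs a read of the other bank, which is where the extra memory ports or copies get spent.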

- F
- fpga
  

Wed, Mar 15, 2006 3:31 PM

I got the listed numbers, say, 287.9 MHz and 72 MHz, without putting the multi-RAM block into the data pipeline. That's why I don't understand why the maximum frequency of clk isn't 1/3 of that of clkx3, because in the whole synthesis the only use of clk is to provide the input to the DCM, which generates clkx3. Will it bother you too much if I put my code here? Thanks a lot.

- F
- fpga
  

Thu, Mar 16, 2006 4:42 PM

Sorry for the trouble. I have solved the problem: I hadn't instantiated the DCM correctly. Now the maximum frequency of clk is 1/3 of that of clkx3.