FIR filter running out of FPGA memory in stratix ep1s60

I've got an FIR design that runs out of FPGA memory in an EP1S60 when I set the data width to 24 bits (the design fits with a data width of 16 bits). However, only 13% of the total memory is used. I assume the problem is that I have lots of small memories that cannot share the same memory blocks (M512, M4K, M-RAM). Can anyone who has experienced this problem share their strategies for dealing with it?
Reply to
Wilhelm Klink

After fitting, look at the Fitter report (Resource Section --> Fitter Resource Usage Summary). There you can see the total memory bits used and how many complete M4K memory blocks are used.

Rgds

Reply to
ALuPin

You'll need to provide more details as to how you set up the memory as well as the filter. If the sample rate is one clock per sample, then it is not really appropriate to use the memory because you are using only one location per memory (and wasting the rest).

What is the ratio of your data rate to the clock? How many taps is your filter?
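To put rough numbers on the "one location per memory" point, here is a minimal sketch of how little of a block holds data when only a single full-width word is stored in it. The block capacities (576, 4,608 and 589,824 bits, parity included) are assumptions recalled from the Stratix device handbook, not figures given in this thread; the port widths are the ones quoted later in the thread. The function name is made up for illustration.

```python
# Bits per Stratix memory block, parity bits included.
# ASSUMPTION: recalled from the Stratix device handbook, not stated in the thread.
BLOCK_BITS = {"M512": 576, "M4K": 4608, "M-RAM": 589824}

# Widest data port per block type, as quoted in the thread.
MAX_WIDTH = {"M512": 18, "M4K": 36, "M-RAM": 144}


def single_word_utilisation(block: str) -> float:
    """Fraction of a block's bits in use when it stores one full-width word."""
    return MAX_WIDTH[block] / BLOCK_BITS[block]


if __name__ == "__main__":
    for name in BLOCK_BITS:
        frac = single_word_utilisation(name)
        print(f"{name}: {frac:.4%} of the block holds data")
```

Under these assumptions every block that holds a depth-1 memory is almost entirely empty, which would be consistent with the fitter running out of blocks while reporting a low percentage of memory bits in use.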


--

--Ray Andraka, P.E. President, the Andraka Consulting Group, Inc.

401/884-7930 Fax 401/884-7950 email snipped-for-privacy@andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759

Reply to
Ray Andraka

After viewing the fitter RAM summary details I can say the following:

The filter is a cascaded polyphase FIR. There are three stages, and the first stage is a decimate-by-20, so it has 20 polyphase arms. Each polyphase arm consists of a distributed arithmetic unit. The samples are distributed as parallel data to each polyphase arm and then serialised. My current interface at the input of each polyphase arm uses 3 large registers of size nis*data_width, where nis = number of interleaved streams = 8, and data_width = 32. That is 3 x 256 = 768 bits per arm, and multiplying this by the number of polyphase arms across all cascaded stages, the number gets very large. Because 1 FF = 1 LE, this takes up heaps of LEs, so I decided to implement these registers in memory. In my data_width = 16 implementation (which fits in the device) these registers are 128 bits in size, and each constitutes a depth-1, width-128 memory. Clearly depth-1 memories will result in poor use of memory resources. We have M512 = width 18, M4K = width 36, M-RAM = width 144, so I'd expect each register to require 4 x M4Ks or 8 x M512s. Surprisingly, according to the fitter RAM summary, one of the worst-offending 128-bit registers used 54 x M4Ks and 8 x M512s.
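The block-count estimate above can be reproduced with a short sketch. It assumes a depth-1 memory must be split across as many blocks as it takes to cover its width at the widest port setting (18, 36 and 144 bits, as quoted above); the function names are made up for illustration, and the real fitter may pack memories differently.

```python
import math

# Widest data port per block type, as quoted in the post above.
MAX_WIDTH = {"M512": 18, "M4K": 36, "M-RAM": 144}


def register_width(nis: int, data_width: int) -> int:
    """Width in bits of one interface register: nis * data_width."""
    return nis * data_width


def blocks_for_register(width_bits: int, block: str) -> int:
    """Blocks needed to cover a depth-1 memory of the given width.

    ASSUMPTION: one block per slice of MAX_WIDTH bits, no sharing
    between different memories.
    """
    return math.ceil(width_bits / MAX_WIDTH[block])


if __name__ == "__main__":
    for data_width in (16, 32):
        width = register_width(8, data_width)  # nis = 8 interleaved streams
        counts = {b: blocks_for_register(width, b) for b in MAX_WIDTH}
        print(f"data_width={data_width}: {width}-bit register -> {counts}")
```

Under this assumption, going from 128-bit to 256-bit registers roughly doubles the number of M4K or M512 blocks each register occupies, even though the number of bits stored stays tiny compared with the block capacity.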

Regardless of this problem I see that it was a BAD idea to fully parallelise the data in the input interface of each polyphase arm (seemed the easiest way at the time though).

Reply to
Wilhelm Klink

Correction: I realise that the 54 x M4Ks and 8 x M512s memory usage must be due to sharing of memories.

Reply to
Wilhelm Klink
