Xcell Article on 1.2Gsamples/sec FFT

Hi all, I just read an interesting article in Xilinx's Xcell publication. Lots of technical detail, and no "marketing" to speak of.


After reading this I had a couple of burning questions; I'm wondering if anyone, or Ray himself, can shed some light on them:

1) 1.2 Gsamples/s seems like a pretty high input data rate - no doubt there are a few applications around that need it. But what about the 1.2 Gsamples/s data output rate? What systems can take the FFT outputs at this rate and do something sensible with the data? Although the FFT engine has done a bunch of processing, it hasn't really reduced the amount of data in any way. I mean, you can't hook 1.2 Gsps up to a PC-based platform. Even 10 Gigabit Ethernet cannot transport this amount of data, let alone a CPU do much processing with it.
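For a rough sense of the numbers (assuming single-precision complex outputs; the output width is my assumption, not a figure from the article), a quick back-of-the-envelope check:

```python
# Rough link-budget check for streaming the FFT output off-chip.
# Assumes single-precision complex samples (2 x 32 bits each);
# the core's actual output format is an assumption here.
sample_rate = 1.2e9            # samples per second
bits_per_sample = 2 * 32       # real + imaginary, 32-bit float each

print(f"float output:   {sample_rate * bits_per_sample / 1e9:.1f} Gb/s")  # ~76.8 Gb/s

# Even narrow 16-bit fixed-point complex samples give ~38.4 Gb/s,
# well beyond a single 10 Gb Ethernet link.
print(f"16-bit complex: {sample_rate * 2 * 16 / 1e9:.1f} Gb/s")
```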

2) I didn't understand the comparison between the 66 GFLOPS FPGA FFT core and the 48 GFLOPS Cell processor implementation. Was the Cell processor implementation processing samples at 1.2 Gsps? Was it also performing transforms from 32 to 2048 points?

Cheers Andrew

Reply to
Andrew FPGA


Actually, I would say there are far more fixed-point FFT cores in use than floating-point ones like this, because the fixed-point cores can achieve even higher throughput.

If you look at the Andraka Consulting web page you will see explanations of where it is used. In general an FFT core is not a stand-alone block, but is usually used in connection with other functionality. So the core is embedded in an application, and from the outside you don't see that data rate anymore.

Well, in many cases the FFT will actually increase the amount of data. If you come from real-world applications, you usually have real-valued data and can set the imaginary part to 0. The output of the FFT, however, is complex, so the raw data volume roughly doubles (unless you exploit the conjugate symmetry of a real signal's spectrum).
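To make that concrete, here's a minimal numpy sketch (an illustration of the math, not the core's actual interface): N real input samples produce N complex output bins.

```python
import numpy as np

# Minimal illustration of the point above: N real samples in,
# N complex bins out, so raw storage roughly doubles.
N = 2048
x = np.random.randn(N)            # real input, N values

X = np.fft.fft(x)                 # complex output, N bins (real + imag each)
print(X.shape, X.dtype)           # (2048,) complex128

# For real input the spectrum is conjugate-symmetric, so only N/2 + 1
# bins are unique; rfft returns just those if you want to avoid the growth.
print(np.fft.rfft(x).shape)       # (1025,)
```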

Also, if you want to use that core in connection with a PC, you probably will not hook it up over an Ethernet connection, but rather use the FPGA on a PCI or PCI-E plug-in card.

Cheers,

Guenter

Reply to
Guenter Dannoritzer


That particular application was for image processing; the FFT was used in two passes to perform a 2D FFT of various sizes. Fast FFTs are also commonly used in communications, digital radio, and SIGINT applications, all of which need to do the FFT on incoming data streams sampled at high rates. The 1.2 GS/s is the upper bound for this architecture in this device; the application in question needed a sustained 1.0 GS/s to keep up with the frame data. The FFT is surrounded by other hardware, not connected (at least on the data path) to a computer.
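For readers unfamiliar with the two-pass approach, a small numpy sketch of the idea (just the math, not Ray's implementation): a 2D FFT is separable into 1D FFTs along the rows followed by 1D FFTs down the columns.

```python
import numpy as np

# Two-pass 2D FFT: 1D FFTs along each row, then along each column.
rows, cols = 256, 256
img = np.random.randn(rows, cols) + 1j * np.random.randn(rows, cols)

pass1 = np.fft.fft(img, axis=1)       # first pass: FFT each row
pass2 = np.fft.fft(pass1, axis=0)     # second pass: FFT each column

# Matches the direct 2D transform.
print(np.allclose(pass2, np.fft.fft2(img)))   # True
```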

The Cell processor was not working at 1.2 Gsps; in fact, it would not be able to achieve that data rate. The comparison was to show that the FPGA design could substantially outperform the Cell processor. The Cell application was actually a large FFT, 512K points as I recall. The large FFT is essentially the same process as a 2D FFT, except that there is a phase rotation between passes for the large FFT that is not there for the 2D FFT. While the comparison is not exactly 1:1, it is similar enough to draw a valid conclusion. I have used the same floating-point core to perform large FFTs instead of 2D ones.
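A quick sketch of that equivalence (the classic "four-step" decomposition, shown here purely for illustration): an N = R*C point FFT can be done as column FFTs, an element-wise twiddle (phase) rotation, row FFTs, and a transpose. Drop the twiddle step and you have exactly the two-pass 2D FFT above.

```python
import numpy as np

# "Four-step" large FFT: an N = R*C point FFT done as two passes of
# smaller FFTs with a twiddle (phase) rotation in between -- the same
# structure as a 2D FFT plus the extra rotation described above.
R, C = 64, 32                                   # N = 2048 points
N = R * C
x = np.random.randn(N) + 1j * np.random.randn(N)

A = x.reshape(R, C)                             # view the input as an R x C matrix
pass1 = np.fft.fft(A, axis=0)                   # first pass: FFT down each column

k = np.arange(R).reshape(R, 1)                  # output index of the first pass
n = np.arange(C).reshape(1, C)                  # position along each row
twiddle = np.exp(-2j * np.pi * k * n / N)       # phase rotation absent in a plain 2D FFT
pass2 = np.fft.fft(pass1 * twiddle, axis=1)     # second pass: FFT across each row

X = pass2.T.reshape(N)                          # transpose + flatten gives the 1D result
print(np.allclose(X, np.fft.fft(x)))            # True
```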

Reply to
Ray Andraka

In this case, the fixed-point version has about the same speed as the floating-point one, but with considerably less latency. A single instance of the core runs at up to 400 MHz in a -10 V4SX55 for both the floating-point and fixed-point versions; that speed is limited by the maximum clock of the DSP48 and BRAM elements. The fixed-point core is smaller, which means more instances can fit into a device for a higher overall throughput.
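Just to make the scaling explicit (a hypothetical illustration: only the 400 MHz clock comes from the post, the samples-per-clock figure is an assumption), aggregate throughput is simply instances times per-instance rate, so a smaller core that lets you fit twice as many instances doubles the device's total throughput.

```python
# Hypothetical throughput scaling; only the 400 MHz clock is from the post.
clock_hz = 400e6
samples_per_clock = 3          # assumed per-instance rate, not from the article

for instances in (1, 2, 4):
    total = instances * clock_hz * samples_per_clock
    print(f"{instances} instance(s): {total / 1e9:.1f} GS/s aggregate")
```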

Reply to
Ray Andraka
