I am working with FFT v.3.2 from Xilinx. Some questions relate directly to that core, some do not.

Why is block floating point unavailable for Pipelined/streaming FFT? Is it because the dynamic range is too hard to predict?

a) Is the output of A/D converter fixed point (generally)? b) Then only division can introduce floating point, right?

For the output ordering, the output data can be presented either in natural order or in bit/digit reverse order. Presenting data in natural order requires additional resources. a) Does it mean that the output of radix-2 butterflies is in bit/digit reverse order? b) Also, isn't the output the frequency spectrum, so what does bit/digit reverse order represent? c) Why would anyone want the output data with the indexes that are (binary number) reversed?

In version 3.1, when using Pipelined I/O the data had to be scaled, in version 3.2 user has a choice of scaled and unscaled data. Why would someone want to use unscaled (unless you always know what your inputs are)?

I don't know what you mean by "introduce floating point". AFAIK the main reason to use floating point (if it is available) is that it avoids the problems of overflow that must be handled explicitly in fixed-point implementations. Sure, the A/D converter output may be a nice integer, but it must be combined with sines and cosines in a way that requires some kind of real number handling (fixed or floating point). It also creates possible overflows of whatever fixed-point range you have established.

Bit-reversed ordering is an unavoidable consequence of the FFT algorithm. Either you handle it within the FFT or else you handle it in the application that uses the FFT. The "additional resources" that you speak of to present a natural frequency order to the output is no more memory and only a little bit more processing time (as compared to the butterfly rounds).

I am not familiar with the specific package, but I suspect that the scaling is needed to optimize the range vs. resolution in the fixed-point implementation. It would not have any meaning in a floating-point FFT implementation.

To address question c) above: If you use the FFT + processing + IFFT as a "black box", ignoring the bit-reversal issue might save some processing time and complexity. Both the FFT and the IFFT include a bit-reversal step. So if one computes, say, the spectrum of a filter, bit-reverses this once in the design stage, one can save run-time or system complexity by using the FFT without bit-reversal and then let the missing bit-reversal of the IFFT undo the effect upon return to

time domain.

Bit-reversals are only important if a human being needs to inspect the results in frequency domain.

What about an application that requires the software to inspect the frequency spectrum, looking for peaks, interpolating between adjacent bins, and finding patterns of harmonic-related peaks. It is hard to imagine that application being written without having the frequency spectrum in its natural order, even if no human interpretation is involved.

Thank You for the answers, they are all informative.

Another question: For testing the algorithm I had to use version 2.1 from Xilinx, since their latest version, for which I can use core generator provides structural model rather than behavioral. The input to the core is converted from integer to 2's complement (in the test bench file). My input file is digital sine wave (file from Xilinx website). The output is two's complement (datasheet for version 2.1 doesn't specify whether it is in bit reversed or natural order). My input is only 42 numbers (or lines). However, the output size is ~100 times greater (and of course it is not 100x the same number, than 100x th other number, values do not seem to corellate to the expected output). Any ideas as to why and what to do about it?

Block floating point has a common exponent for the entire block of data. In order to do this, the entire data set must pass through a max function to determine the max value before a common exponent can be extracted. This is fine if you are going into memory, then through the FFT then back to memory. It doesn't work, of course, if you are doing the FFT on the data as it arrives.

floating point is there to reduce the width of the operand by separating off an exponent. It is helpful for reducing hardware data path widths and for reducing memory. It generally becomes useful at widths greater than 14 bits; single precision ieee floats have 24 bit fixed point fields modified by an exponent. Most ADC's are 16 bits or less, so the overhead for floating point (all the logic to handle the exponents) is expensive compared to the hardware savings realized by using it.

Bit reversal is an artifact of the Cooley-Tukey FFT algorithm. Getting the data back into natural order requires an additional memory buffer which adds to the latency and increases the hardware complexity. The bit reversal refers to the address sequencing, which directly affects the order the samples are presented. The bit reversed addressing means the output frequency bins are not ordered in order of increasing freqency, rather they are ordered as though the address bits have been reversed. For a 16 point FFT, instead of bins 0,1,2,3,...,15 you get bins 0,8,4,12,2,...,7,15. The bit reorder might occur elsewhere in the system, or if you are doing an FFT followed by an IFT (e.g. for a fast convolution or fast correlation), the inverse FFT can accept data in bit-reversed order and output in natural order, and the frequency domain process is independent of the order as long as all inputs are ordered the same way. By skipping the reorder, you save both latency and hardware.

The FFT results vary depending on the nature of the input. An FFT is energy conserving, meaning the sum of the input energy is redistributed into the output bins. If your input is always fairly monochromatic (i.e. very narrowband), then most of the output energy will be focused in one FFT bin, and can be up to N times the input amplitude for an N point FFT. On the other hand, if your input is broadband noise, the output energy will be fairly evenly distributed among all the output bins, which means the outputs will have amplitudes close to the input amplitudes. For larger FFT sizes, this represents a sizeable dynamic range. If you know before-hand the typical and worst case characteristics of your input, you can use scaling to essentially ignore MSBs that are never anything but extended sign bits in your application.

So what? There is no reason why a computer system should not be able to

handle the bit-reversed order. This boils down to a matter of convenience. If somebody -- either computer or human -- wants to inspect the spectrum, the frequency domain processing is no longer a black box, and it might be *safer* to use the natural (non-bit-reversed) order. The the natural order is only *nescessary* is a human user is involved.

What I meant was that it requires another trip through memory to accomplish...it can't be done without temporary storage of the samples because there is a corner-turn. Implementation, of course can be done in many ways, including using a addressable shift register if that is what tickles your fancy. In a hardware implementation, this means either an additional memory (or registers) or additional sequencing and steering to push the result back into memory used earlier in the algorithm.

ElectronDepot website is not affiliated with any of the manufacturers or service providers discussed here.
All logos and trade names are the property of their respective owners.