FFT short time fourier transform (STFT) question

- J
- Jamie
  
  Contact options for registered users
posted
12 years ago

Thu, Aug 11, 2011 5:55 AM

Hi,

I have begun to use an FFT in software, and I started thinking about how to turn the intensity over frequency FFT chart into a frequency over time chart, ie, by making a strip chart, time/frequency domain "Gabor spectrum STFT" with each XY point of the chart having an intensity based on the power at that frequency. So I was thinking to take that a step further (and push the limits of a modern CPU), instead of generating the strip chart data with FFT blocks of 1000 samples, what would happen if there was an FFT generated for each incoming sample, ie 1000 FFT's per second for a 1kHz sample rate, with each new FFT centered on the next incoming sample. If this data was then stripcharted (1000 new columns per second added to the chart) for a sine wave input of 250Hz, what would the frequency over time greyscale image look like? For a normal Gabor spectrum STFT chart (ie. 1 FFT per 1000 samples) with a 250Hz sine wave input and 1kHz sampling frequency, the chart should just show a steady intensity peak from left to right half way up the chart (500Hz Nyquist frequency would be the top of the chart). Also this might be a good way to encode/decode data for communication, in a 2-dimensional representation that is converted via FFT's to and from voltage/time for each sample, as long as the hardware can do one FFT per sample.

Gabor spectrum (STFT short time fourier transform chart:

formatting link

cheers, Jamie

- J
- Jamie
  
  Contact options for registered users
Vote on answer
posted
12 years ago

Thu, Aug 11, 2011 6:10 AM

I think the real benefit of 1FFT per sample is it gives instant impulse response compared to 1FFT per 1000 samples. ie. if the 250Hz sine wave is changed to 225Hz, each new per sample FFT will have a moving intensity line and the slope of this line will be proportional to how fast the 250Hz was changed to 225Hz.

cheers, Jamie

- M
- Martin Brown
  
  Contact options for registered users
Vote on answer
posted
12 years ago

Thu, Aug 11, 2011 7:08 AM

The main effect of this if as seems likely from the question that you don't know what you are doing is that you will see variable ripple effects in the sidebands caused by the edge discontinuities at the start and end of the sampling window. You would be better off doing larger overlaps if you really need time domain resolution beyond what sequential blocking will give you. The term for it in their application is "Overlap". BTW such diagrams are often easier to interpret in false colour rather than monochrome since you can distinguish several orders of magnitude difference in signal strength more easily.

DAQARTA does this stuff out of the box and is pretty good at it too. Why reinvent the wheel?

Regards, Martin Brown

- J
- Jamie
  
  Contact options for registered users
Vote on answer
posted
12 years ago

Thu, Aug 11, 2011 7:46 AM

I'm just using a rectangular FFT windowing function so far.

You would be better off doing larger

I think there must be a formula to show that there is increased frequency resolution available based on the amount of FFT overlap. ie. like ADC voltage oversampling gives more effective voltage resolution bits, overlapped FFT's should give more frequency bins, up to the max overlap of the FFT window size - 1 sample. (1 FFT per sample). Theoretically to add the most extra frequency bins at the highest time resolution, the FFT window size should be as large as possible (n), and the FFT overlap should be the FFT window size - 1 sample.

I take a wild guess that for non-overlap FFT with 2^x bins, adding the full n-1 overlap can theoretically increase the effective number of frequency bins to 2^(x+p) where p is some small integer related to the FFT window size n. For an FFT window size n = 2^10, p might be 5 or 10 at most. That would require some form of post processing to generate the missing extra frequency bins from all that FFT data though, similar to decimation of ADC sample rate. Also 1 FFT per sample will generate data at 1024x the rate of the sample frequency for a FFT window size of

1024, and use 1024 times more CPU than normal FFT code, but its possible to get more effective frequency bins I am pretty sure with that technique.

cheers, Jamie

- J
- Jamie
  
  Contact options for registered users
Vote on answer
posted
12 years ago

Thu, Aug 11, 2011 8:19 AM

DAQARTA has the limitation that there is a tradeoff between frequency and temporal resolution in the spectrogram output.

I found this:

formatting link

sliding DFT, it is designed to do 1FFT per incoming sample, and once the first FFT is done, each successive FFT only takes "two real adds and one complex multiply" independent of the FFT window size! (assuming the last incremental FFT is available)

So now I just need to find an FFT decimation filter to crunch the 1 FFT per sample data down, then can increase frequency and temporal resolution on a spectrogram at the same time :D

cheers, Jamie

- J
- Jan Panteltje
  
  Contact options for registered users
Vote on answer
posted
12 years ago

Thu, Aug 11, 2011 9:41 AM

On a sunny day (Thu, 11 Aug 2011 00:46:09 -0700) it happened Jamie wrote in :

Rectangular window will not give very meaningful results I think.

- M
- Martin Brown
  
  Contact options for registered users
Vote on answer
posted
12 years ago

Thu, Aug 11, 2011 9:41 AM

There is *always* a tradeoff between frequency resolution and time. They are conjugate variables in the Fourier transform, confine one and the other spreads out - this is intrinsic to the mathematics.

If you want better frequency resolution and have decent signal to noise then you can use maximum entropy which gives the optimum tradeoff of resolution against signal to noise but with some risk of unfamiliar artefacts for practitioners not skilled in Fourier interpretation.

An introduction is in Numerical Recipes (whilst I don't think much of their algorithm or code it does sort of work)

formatting link

Better algorithms do exist. You can't use them realtime unless you have an exceptionally fast processor, but offline it will give you the most precise answer for the frequencies of significant components in a time series with an accuracy that depends on their intensity.

Strong signals have a much better defined frequency than weak ones. Overfit the system noise and they will spuriously split into two.

You do realise this only calculates a single known frequency component don't you?

To do a whole spectrum this way would be O(N^2) not fast at all. Or require massive hardware parallelism (as is done for sigint devices).

Regards, Martin Brown

- T
- tlbs101
  
  Contact options for registered users
Vote on answer
posted
12 years ago

Thu, Aug 11, 2011 10:10 PM

how

sed

the

f

r

ine

a

or

rt

n

=20

e

t

Five years ago I built a 2 GSPS real-time FFT engine using big Xilinx Virte= x FPGAs that spit out a 1024-point FFT every 1.2 ms (total frame rate was 1= .5 ms). Dozens of these FFTs were raster-scanned to a video display with i= ntensity displayed as different colors, but I am not at liberty to discuss = what was being viewed, or why. It was pretty awesome, though.

At the time, it was the 2nd-fastest of its kind in the world. The fastest = one usedVirtex-II devices, but I never heard if they actually got there sys= tem put to use, or not. Mine was put to use.

I am not so sure your idea *still* doesn't need hardware (as opposed to sof= tware) to run at the speed you describe.

Tom Pounds near Albuquerque

- J
- Jamie
  
  Contact options for registered users
Vote on answer
posted
12 years ago

Thu, Aug 11, 2011 10:55 PM

Hi,

Thanks, I missed that line: "the kth value of the DFT". I am starting to think the best method for what I would like to do is to use parallel bandpass filters, with slight overlap. Setting a threshold trigger for delta-power over time on the output of each of these filters can give accurate time resolution of frequency events that an FFT can't do.

cheers, Jamie

- J
- Jon Kirwan
  
  Contact options for registered users
Vote on answer
posted
12 years ago

Thu, Aug 11, 2011 11:27 PM

biquads?

formatting link

Jon

- J
- Jon Kirwan
  
  Contact options for registered users
Vote on answer
posted
12 years ago

Thu, Aug 11, 2011 11:35 PM

That's fast, but I was doing 1024 pt, complex-in, complex-out floating point FFTs on an integer-only ADSP-2111 and memory says it took slightly over 3ms -- back in 1992 and before FFTW was around. I'd have to dig up the details to be absolutely sure, but I coded it and that's what I recall now as the rate (we needed less than 4.5ms.) I'm not so impressed that in 2006 on fancy and big FPGAs the rate would be 1.2ms (and not knowing if it handled complex in/out values.) I learned out of Brigham's Fast Fourier Transform and Its Applications. A book I definitely can recommend, even if it is old. Probably some better stuff around now with more graphics to help hammer in the points. But I've not been exposed to recent writings.

Jon

- T
- tlbs101
  
  Contact options for registered users
Vote on answer
posted
12 years ago

Thu, Aug 11, 2011 11:43 PM

The input arrays were real, but to double-up the rate I fed two real arrays= as one complex array into the FFT engine, then I would extract the two FFT= s on the other side (I will have to dig up my design notes to get the algor= ithm, it's not something I can remember in detail right now).

Tom P.

- D
- DonMack
  
  Contact options for registered users
Vote on answer
posted
12 years ago

Fri, Aug 12, 2011 12:29 AM

Hi,

I have begun to use an FFT in software, and I started thinking about how to turn the intensity over frequency FFT chart into a frequency over time chart, ie, by making a strip chart, time/frequency domain "Gabor spectrum STFT" with each XY point of the chart having an intensity based on the power at that frequency. So I was thinking to take that a step further (and push the limits of a modern CPU), instead of generating the strip chart data with FFT blocks of 1000 samples, what would happen if there was an FFT generated for each incoming sample, ie 1000 FFT's per second for a 1kHz sample rate, with each new FFT centered on the next incoming sample. If this data was then stripcharted (1000 new columns per second added to the chart) for a sine wave input of 250Hz, what would the frequency over time greyscale image look like? For a normal Gabor spectrum STFT chart (ie. 1 FFT per 1000 samples) with a 250Hz sine wave input and 1kHz sampling frequency, the chart should just show a steady intensity peak from left to right half way up the chart (500Hz Nyquist frequency would be the top of the chart). Also this might be a good way to encode/decode data for communication, in a 2-dimensional representation that is converted via FFT's to and from voltage/time for each sample, as long as the hardware can do one FFT per sample.

Gabor spectrum (STFT short time fourier transform chart:

formatting link

cheers, Jamie

-------------------- It won't do any good. There is an innate inverse relationship between frequency and time(f = 1/t). The Fourier transform's basis functions are not localized and cannot describe local behavior well.

Take a square wave. The Fourier transform of this has infinite bandwidth. Yet the function is one of the most basic functions that exist. The Fourier transform has extreme difficulties describing functions that have sharp transitions because sharp transitions occur over short periods of time and short periods of time have high frequencies(the f = 1/t thing).

We generally like to think of a square wave or any other period function with period p as having a frequency of 1/p. The Fourier transform can recover this fundamental frequency but really it can't. Remember we have to apply a low pass filter to it(mathematically we can take the limit though).

So, if something simple as a square wave has infinite bandwidth no matter what kinda fourier transform you use you won't be able to describe it accurately with finite bandwidth. Your sliding windowed Fourier transform can't do any better than a multiresolution windowed Fourier transform.

There are several transforms that take a time based signal and transform it in a frequency and time. The wavelet transform is one such transform.

But no matter how you slice and dice it, frequency and time are inversely correlated. You can attempt to use a basis that is more local than the fourier basis and you'll end up with something like wavelets. You can go further and attempt to use the signal itself to determine the basis and get something like empirical mode decomposition.

Ultimately you'll always end up with the same problem of trying to narrow down the frequency and time and it just can't be done.

For example, what is the frequency of some signal s(t) at time t = 0? It's a nonsensical question. Ultimately what you have to do to answer such a question is not use the point at t = 0 but an interval around that point. This allows you to "zoom out" a bit and see what the signal is doing and hence determine it's frequency. For lower frequencies you have to zoom out more because they have a larger period(p = 1/f and when f is small p is large).

The real issue at hand is we define frequency as the rate of something occurring periodically. Mathematically we actually compare it to sinusoids. That is, sinusoids is the model for frequency. But a sinusoid has no instantaneous frequency. Therefore anything that is composed of sinusoids will not have instantaneous frequencies. Wavelets have more of a localized frequency but now frequency becomes more vague(in fact wavelets do not have frequency and we are comparing apples to oranges now).

There is a way to define instantaneous frequency as a mathematical generalization:

formatting link

The problem is that it seems to be useless to interpret as a real frequency or we just haven't found out how.

i.e., with the instantaneous frequency we can actually answer the above question about the frequency of s(t) at time 0 but don't expect for it to make a lot of sense.

The main point is that you may not get much benefit but you are limited mathematically so no matter your algorithms are they can't overcome the frequency/time limitations described above. It is not inherent in the mathematics itself but in the concept of frequency.

- J
- Jon Kirwan
  
  Contact options for registered users
Vote on answer
posted
12 years ago

Fri, Aug 12, 2011 5:35 AM

No need. I'm well aware of the technique. Used it many times. Can describe it, even.

Anyway, now I'm mostly curious if you are saying that you were doing _two_ 1024 pt real arrays in 1.2ms or if you were actually doing them in 2.4ms in a complex-in complex-out 1024 pt FFT that you packed and then disassembled. Completely different performance meanings to me, depending.

Jon

- M
- Martin Brown
  
  Contact options for registered users
Vote on answer
posted
12 years ago

Fri, Aug 12, 2011 9:00 AM

FPGAs that spit out a 1024-point FFT every 1.2 ms (total frame rate was 1.5 ms). Dozens of these FFTs were raster-scanned to a video display with intensity displayed as different colors, but I am not at liberty to discuss what was being viewed, or why. It was pretty awesome, though.

usedVirtex-II devices, but I never heard if they actually got there system put to use, or not. Mine was put to use.

I saw one a long time ago ~1983-4. On "loan" to a big dish radio observatory and it came with two US guards for the duration. At the time it was the fastest of its kind in the world.

software) to run at the speed you describe.

I think he will struggle. FFTW would be the first thing to try to see if the ballpark performance figures are even remotely close. Then stripping it down to a hardwired fixed length transform and choosing a CPU with appropriate cache lines to optimise a 1024 real transform. It is anybodies guess what the best factorisation will be these days.

Intels IPPS benchmarks about 20% faster than FFTW3 at these lengths.

Temporal resolution could be improved using sub blocks to preserve intermediate results and then twiddled together to avoid rework. The out of place transforms are usually faster so this should not hurt.

Regards, Martin Brown

- J
- Jamie
  
  Contact options for registered users
Vote on answer
posted
12 years ago

Fri, Aug 12, 2011 11:03 AM

yep good guess :)

I am using this one here:

formatting link

cheers, Jamie

- T
- tlbs101
  
  Contact options for registered users
Vote on answer
posted
12 years ago

Fri, Aug 12, 2011 5:34 PM

The system was composed of two 1 GSPS digitizers from Thompson CSF with the= companion Demux chips. They were run on opposite phase clocks. The Demux= ed outputs were fed into a Xilinx Virtex 2000 for framing and scaling (and = Gray to offset binary conversion), then sent to a Virtex 3000 series which = contained the FFT engine. =20

The first FPGA stored a frame of data in internal SRAM and passed framed da= ta as two real 1024-point arrays to the FFT engine, loaded as the {Re} and = {Im} components of one complex 1024-pt array. Then it would extract the tw= o FFTs from the output of the engine.

The FFT engine did its work in about 800 usec (IIRC). The rest of the 1.2 = ms was for extraction and data transfers. The total frame time was 1.5 ms = because another 300 us was devoted to transmitting the extracted FFT arrays= to a video processor DSP, off-board.

Now that I am remembering more about this, I recall there were four FFT eng= ines working in parallel. =20

Too many projects have come and gone since that one.

Tom Pounds near Albuquerque

- T
- Tim Wescott
  
  Contact options for registered users
Vote on answer
posted
12 years ago

Fri, Aug 12, 2011 5:51 PM

Rick Lyons, "Understanding Digital Signal Processing", Prentice-Hall. It's up to the 3rd edition now so I can't quote copyright dates. I know he goes over just this sort of thing in there.

--
www.wescottdesign.com

- J
- Jon Kirwan
  
  Contact options for registered users
Vote on answer
posted
12 years ago

Fri, Aug 12, 2011 9:30 PM

Now I understand much better and am more impressed, overall. Thanks very much for the detailed recollections. I really enjoyed reading this.

Jon

- J
- Jamie
  
  Contact options for registered users
Vote on answer
posted
12 years ago

Fri, Aug 12, 2011 11:08 PM

Hi,

A single FFT done over 1024 samples, will give a linear frequency bin output on the X-axis. However theoretically the frequency bin dispersion can be logarithmic, so that there are more frequency bins closer together at higher frequency for the same timestep. A single FFT can't produce this data though as it averages over the whole sample period. That is where using multiple overlapped FFT's is a possible advantage, as a spectrogram chart calculated using overlapped FFT's should be able to show a logarithm Y axis with the frequency bins getting closer together up the Y-axis, giving more time and frequency resolution for that timestep than lower frequency bins in the same timestep.

cheers, Jamie