Help comparing two FFT approaches

Years ago I thought I verified there was no difference between these two techniques, but may have done it wrong.

Trying to improve S/N, pulling signals out of noise; assume the noise is white and the signals are long-term stable. The comparison is: an FFT over a fixed length, done 100 times and all averaged together, versus one FFT over 100 times the fixed length.

Somehow it just seems like it should be better to do the 100 times fixed length vs doing the fixed length 100 times. I had expected it to be better, but at the time could NOT verify through simulations. Became a moot point, so moved on. However, today the system tradeoff again rears its ugly head, thus I ask the knowledgeable people here.

Reply to
RobertMacy

Are you using a windowing function?

A short FFT has more windowing problems than a longer one. Each short FFT will include some spurious stuff caused by truncating the ends, and 100 of them will all add up. One big long FFT sample has only one set of ends.
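
For what it's worth, a rough Octave sketch of that end effect (tone frequency and record lengths are assumed): a tone that doesn't land exactly on a bin leaks into its neighbours, and the leakage skirts of the short record cover a much larger fraction of the band than those of the long one.

  f0 = 0.1234;                               % tone that does not sit on a bin
  Nshort = 256;  Nlong = 100*Nshort;
  Ps = abs(fft(sin(2*pi*f0*(0:Nshort-1)))) / Nshort;
  Pl = abs(fft(sin(2*pi*f0*(0:Nlong-1))))  / Nlong;
  % plot against normalised frequency to compare the leakage skirts:
  semilogy((0:Nshort-1)/Nshort, Ps, (0:Nlong-1)/Nlong, Pl);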

--
John Larkin                  Highland Technology Inc 
www.highlandtechnology.com   jlarkin at highlandtechnology dot com    
Reply to
John Larkin

This may depend on the kind of signals involved and what you want to extract from the FFT. Consider a pure sine wave, for instance. The long FFT will give you better frequency resolution: you will get a narrow spike. Each short FFT will give you a blurred response (in the general case), and averaging won't help to improve this. Averaging the short FFTs will, however, require fewer resources.

It could well be that, at the frequencies you do get with the short FFT, the result is the same with averaging as with the single long transform. A quick thought about f=0 and f=fs/2 makes me think so... and a quick Octave simulation seems to confirm this!

(Tried abs(fft(x1)+fft(x2)) and abs(fft([x1,x2])) for two vectors x1 and x2.)
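
In case it is useful, here is a slightly more explicit Octave sketch of the same check (block length and data are my own assumptions). The even-numbered bins of the long FFT turn out to be exactly the sum of the two short FFTs, which is why those frequencies agree:

  N = 256;                          % block length (assumed)
  x1 = randn(1, N);  x2 = randn(1, N);
  Xsum  = fft(x1) + fft(x2);        % coherent sum of the two short FFTs
  Xlong = fft([x1, x2]);            % one long FFT of the concatenated data
  % DFT bins 0, 2, 4, ... of the long FFT (Octave indices 1, 3, 5, ...)
  % match the summed short FFTs to within rounding error:
  max(abs(Xlong(1:2:end) - Xsum))   % prints something around 1e-13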

Pere

Reply to
o pere o

No performance difference, but there is a cost difference. Shift-and-add is more efficient by a factor of roughly log2(100) ~ 7, but is harder to code and harder to get right.

Binning a longer transform is easy to code and easy to get right.

So unless you're hurting for CPU cycles, I'd start with binning. Even if you need to go to shift-and-add eventually, you can use the binning code to test the shift-and-add code.
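
If I'm reading "binning" as averaging groups of adjacent bins of the long power spectrum, a minimal Octave sketch of the two approaches might look like this (segment length, segment count, and test signal are all assumed, with no overlap or windowing):

  N = 1024;  M = 16;                               % segment length and count (assumed)
  x = randn(1, N*M) + sin(2*pi*0.1*(0:N*M-1));     % assumed test signal: sine in noise
  P_avg = zeros(1, N);
  for k = 1:M
    seg   = x((k-1)*N + (1:N));
    P_avg = P_avg + abs(fft(seg)).^2 / M;          % average of the short power spectra
  end
  P_long   = abs(fft(x)).^2;                       % one long power spectrum...
  P_binned = mean(reshape(P_long, M, N), 1);       % ...binned M adjacent bins at a time

Either way you end up with N numbers; the binned version just pays for the longer transform first.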

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs 
Principal Consultant 
Reply to
Phil Hobbs

You have to decide first what answer you are interested in and what frequency resolution you really need in the final result. If you only want the amplitude, then choosing a convenient transform length and averaging the (all-positive) amplitudes will get you the nicest-looking results, provided that the frequency resolution is adequate.

You will also have to pay attention to end effects if there is any long term drift or your answer will be dominated by the FFT of a sawtooth.

Windowing is the traditional fixup for trading resolution against end effects but you can do better at higher computational cost with the likes of maximum entropy depending on exactly what you need to do.

The 100x longer baseline allows you to distinguish finer frequency differences, which may or may not matter in your application. If it doesn't, then you are paying extra for the longer transform.

N*log2(N)

vs

M*(N/M)*log2(M) = N*log2(M)

Assuming, say, N = 2^23 and M = 2^16 done 2^7 times as an example:

23*2^23  vs  16*2^23

The time difference isn't huge but it favours using the shorter transform if it will give you the frequency resolution you need. It is easy to make mistakes taking short cuts for go faster stripes...

If applicable then using real to complex conjugate symmetric transform will also get you almost a factor of two performance improvement.

--
Regards, 
Martin Brown
Reply to
Martin Brown

Think average of the sum versus sum of the averages. If SNR is sufficiently high, then both are equivalent. At low SNR, the average of the sum is more accurate.

Vladimir Vassilevsky DSP and Mixed Signal Designs


Reply to
Vladimir Vassilevsky

What about truncation/windowing artifacts?

--
John Larkin         Highland Technology, Inc 

jlarkin at highlandtechnology dot com 
Reply to
John Larkin

Shift-and-add isn't hard to get right for power spectra. The effect of windowing depends a lot on how much you overlap the data sets. If you take, say, 1024 samples, apply a von Hann window, transform, take another 1024, and so on, you're wasting a lot of information, because not every sample has the chance to be in the middle of a window, where it can contribute more.

The other extreme is to compute the full windowed FFT at every sample point, so that every sample eventually occupies each position in the window function. This is called a "sliding transform".

Sampling theory comes to the rescue here. Each of the frequency channels of the sliding transform is equivalent to a FIR filter whose frequency response is the FT of the window function. This will be many times narrower than the Nyquist frequency of the original sampled data, and so the frequency channels don't have to be sampled as frequently.

A good window function will let you do a good job with an overlap of about 3/4, i.e. you compute a 1024-point windowed transform every 256 samples. Folks like Martin B, whose data points are very expensive, will probably argue for doing more computation than that, e.g. using an overlap of 15/16 or even more.
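
A minimal Octave sketch of that 3/4-overlap scheme (segment length, hop, and test signal are assumptions here, and normalisation is ignored):

  Nseg = 1024;  hop = Nseg/4;                          % 3/4 overlap -> advance 256 samples
  x = randn(1, 1e5) + 0.1*sin(2*pi*0.123*(0:1e5-1));   % assumed test signal: weak sine in noise
  w = 0.5*(1 - cos(2*pi*(0:Nseg-1)/Nseg));             % von Hann window, built by hand
  nseg = floor((length(x) - Nseg)/hop) + 1;
  P = zeros(1, Nseg);
  for k = 0:nseg-1
    P = P + abs(fft(x(k*hop + (1:Nseg)) .* w)).^2;     % accumulate windowed power spectra
  end
  P = P / nseg;                                        % incoherent (power) average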

[I've spent a good chunk of this evening discussing signal processing details with a contract-engineering company working for a client of mine. They don't understand this stuff in any great depth, but at least they're willing to learn. (It would have been good to know that they didn't understand before we started, but oh well.) They're nice guys who want to do a good job, but not much fire in the belly, unfortunately.]

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs 
Principal Consultant 
Reply to
Phil Hobbs

If all the operations are linear, the average of a sum is the same as the sum of the averages. I think you're referring to coherent vs. incoherent averaging, i.e. averaging power spectra vs. taking the power spectrum of the average. That's different, for sure. I don't think the OP has told us which he's actually contemplating.

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs 
Principal Consultant 
Reply to
Phil Hobbs

Seems to me the S/N ratio would be the same for "short" periods as for "long" periods - and so there would be concern if one saw a difference. Now if you did the equivalent of bandwidth limiting and filtering...

Reply to
Robert Baer

You need to read up on periodograms and specifically the Welch method. The proper technique uses overlapping segments.

Reply to
miso

The average of a sum and the sum of averages are identical to within a linear scaling factor, by definition, because addition and multiplication are both associative and distributive.

Utter nonsense. If it is all linear there is no difference.

The difference is between

|FFT{all data}|

and

Average( |FFT{N subsets of data}| ) useful

vs

| FFT{ Average(subsets of data) } | likely to be meaningless

It is taking the absolute modulus of the FFT that makes the process non-linear and allows independent data to pull signal out of noise.
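
A small Octave illustration of that distinction (segment length, tone, and noise level are assumed): the tone's phase differs from segment to segment, so it survives the average of the power spectra but largely cancels in the spectrum of the averaged segments.

  N = 1024;  M = 64;                              % segment length and count (assumed)
  t = 0:N*M-1;
  x = 0.2*sin(2*pi*0.1003*t) + randn(1, N*M);     % weak tone in unit-variance noise
  segs    = reshape(x, N, M);                     % one segment per column
  P_incoh = mean(abs(fft(segs)).^2, 2);           % Average( |FFT{subsets}|^2 )  -> tone visible
  P_coh   = abs(fft(mean(segs, 2))).^2;           % |FFT{ Average(subsets) }|^2 -> tone buried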

Also for the OP you need to decide whether the interesting frequencies are the high ones or the low ones and choose your subsets accordingly.

--
Regards, 
Martin Brown
Reply to
Martin Brown

Interesting. The idea of overlapping the sample strings did occur to me to try *when* the signals were changing. But [and I find this really hard to believe] there may be some advantage to overlapping even with a stable signal. It's just that, looking at the FFT equations, I just don't see the advantage of either technique, due to the summations that are the basis of an FFT. Nor any advantage to overlapping.

As stated in the original post, the assumption is that the signals are stable and the noise is white. Simulations comparing the two techniques yielded identification of the signals within numerical accuracy [10^-15, those kinds of ranges]. Yet recent data suggest something else is going on. There is a possibility that going to the 100-times-longer FFT 'culls' out the 'distortion related' harmonics of the signals by increasing the resolution to the point where the bins reject 'close', but unrelated, coherent energy; perhaps that is what I'm seeing.

That could be it, because the deterioration appears to go up with frequency, where more and more upper harmonics exist, but the difference does not appear to go up linearly; it rises more abruptly at the high end. But that apparently random increase could be just an accident of where the signal harmonics happen to fall.
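
A quick Octave sketch of that hypothesis (tone spacing and record lengths are assumed, not taken from the actual data): two coherent components closer together than one short-FFT bin merge into a single peak in the short transform, but land tens of bins apart in the long one.

  Nshort = 1024;  M = 100;  N = Nshort*M;
  t = 0:N-1;
  x = sin(2*pi*0.2000*t) + sin(2*pi*0.2003*t);    % two components ~0.0003 apart
  Pshort = abs(fft(x(1:Nshort))).^2;    % bin spacing 1/1024: the two merge into one peak
  Plong  = abs(fft(x)).^2;              % bin spacing 1/102400: they sit ~30 bins apart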

Reply to
RobertMacy

Years ago, from memory, simulating signals in white noise, there was NO difference [within numerical accuracy, 10^-15] between the two techniques. However, you have me curious to go back and do the experiments again, this time with 'normal' S/N and with extremely low S/N, just to see if the simulations predict a difference. I predict there will be NO difference in the simulation results, BECAUSE the results were identical except for numerical accuracy. It would seem that any potential differences related to the original S/N would have shown up. ...just thinking out loud BEFORE doing all that work.

In the ACTUAL data, when comparing the two techniques: the average of a series of short packet lengths caused the higher-frequency signals to completely disappear from the spectral display. With the very long packet length, the higher-frequency signals were very discernible.

That seems like the opposite of your statement that average of the sum of short lengths would yield better results.

Reply to
RobertMacy

Don't understand your comment. Why is it important to 'select' interesting frequencies?

At least simulations show that energy is energy, and S/N does not matter whether the signal is in the low-frequency band or the high-frequency band. The FFT process pulls the signals out regardless.

Reply to
RobertMacy

Think carefully about whether you are most interested in low frequency detail or high frequency detail if you intend to segment the data. There is more than one way of forming subsets of the raw data.

A small concrete example:

a b c d e f g h i j k l m n o p 16 values

Taken as subsets of 4 you have the most obvious sequential grouping:

a b c d   e f g h   i j k l   m n o p

This gives you maximum sensitivity to the alternating component.

Real implementations with windowing may overlap the data.

But you could study only lower frequencies from the same time series by subsampling every fourth data point as

a e i m   b f j n   c g k o   d h l p

If there is a real risk of out-of-band components being present (as there would be with white noise), you have to preconvolve the time series with a suitable short filter kernel to control it.

This is in essence what FFTs do behind the scenes for decimation in time, and since you don't care about the phases if you are going to average the power spectra, you can study detail in the lower frequency band more effectively this way (with some risk of aliasing).

The only question here is whether you can get to the same result faster. There is no free lunch but sometimes you can get a discount...
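
A tiny Octave illustration of the two groupings above (the data values are just placeholders); a single reshape gives both at once:

  x = randn(1, 16);             % stands in for a b c ... p
  B = reshape(x, 4, 4);         % columns: sequential blocks (a b c d), (e f g h), ...
                                % rows:    every-4th-sample subsets (a e i m), (b f j n), ...
  P_seq  = mean(abs(fft(B)).^2,   2);   % average spectra of the sequential blocks
  P_deci = mean(abs(fft(B.')).^2, 2);   % average spectra of the decimated subsets
  % (in practice, pre-filter x before decimating, as noted above, to control aliasing)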

--
Regards, 
Martin Brown
Reply to
Martin Brown

Fun thread! So I'm fairly ignorant when it comes to windows and FFTs. But if you were going to average 100 FFTs with no window, would the edge/clipping artifacts tend to average away? (Next time I have the SRS770 fired up I'll have to try. Nothing like real data!)

Windowing seems to cause a lot of confusion. A colleague was doing FFTs of ringing exponential decays and finally understood that he wanted no window.

Say, I think it's been mentioned before, but what's the classic text for DFTs (Discrete Fourier Transforms)?

George H.

Reply to
George Herold

Well not completely meaningless if you make it a conditional average. So feed noise into your DSO, take the FFT. Now average the input, but trigger near the top of the signal. And then ask for the FFT. (This also takes out all the end effects so a flat top window is OK.. oh assuming that the time base zero is in the center of the display and not at the left hand side.)

Hmm (thinking out loud) I guess this depends a bit on what the noise looks like. If it's a random series of delta functions, then I'm not sure if it still 'works'. George H.

Reply to
George Herold

Well, the old one on how to code FFTs is by E. Oran Brigham, but the best imo is Bracewell's "The Fourier Transform and Its Applications". Bracewell himself was a kick in the ass.

Cheers

Phil Hobbs

Reply to
Phil Hobbs
