Let's say I have an embedded system with a small amount of RAM and I want to make an FFT of sampled data. But I don't need to make the FFT of the whole frequency range. Is there some kind of shortcut one can make to achieve this and save RAM?
And let's say that the system isn't fast enough to make the FFT on the fly on the samples.
The FFT is a shortcut calculation method for doing the DFT on all frequency bins of a time sampled input stream. If you don't need all frequency bins the optimizations you can do on the FFT are very limited. You don't have to worry about FFT "on the fly" since by definition, an FFT is done in batches.
If you only need specific frequency bins, you can perform the DFT on each bin separately. This requires N steps, IIRC. An FFT takes N/2 * log_2(N). So if you need more than a small part of the frequency spectrum, the FFT will be easier anyway.
Perhaps if you told us more about the overall problem we could help you understand if this is the right way to approach the problem.
In addition to all the other comments, and if the samples are coming in and you have time as they arrive, and the number of bands is small, you could set up a digital filter for each frequency band you are interested in. I am sure Google would find something for "band pass FIR finite impulse response filter"
For N inputs K outputs, applying the DFT formula directly takes O(N K) time and O(K) memory (since you do a single pass through the inputs, you don't need to store them...you can process each input as it is read and then discard it). You can also use Goertzel, which has the same complexity but a better constant factor for the time (at the expense of some floating-point accuracy). Using this kind of method that doesn't require you to store the input is the only way to save memory over an FFT.
An FFT takes O(N log N) time (putting a particular constant factor like
1/2 in front of this is a fantasy) and O(N) memory. There are things called "pruned" FFTs to compute a few (K) outputs, which take O(N log K) time and O(N) memory. I've found pruned (1d) FFTs to be beneficial in practice only for the case where you want contiguous bins and 1