Cyclone V decimation

Hi,
the input signal is 14-bit signed at 750 ksps. I would like to decimate it by a modest factor of ~3000. What would be the best way of doing it on a Cyclone V, resource-wise? My usual approach would be a cascade of CIC decimators followed by a FIR corrector, but since the DSP blocks are there, it doesn't feel like the "right" (albeit correct) approach. I'm new to the V family and lack the proper intuitions, so could someone more versed suggest a good direction?
In fact, there will be 12 such channels, all running in sync, so maybe considerable resource sharing can be achieved?
Best regards, Piotr
Piotr Wyderski
To determine the "right" approach, you need to define "right" in some engineering terms. So what aspects of the design and implementation are important to your goals?
Rick C.
gnuarm.deletethisbit
Minimisation of resource usage, or in other words, a decimation technique that maps best onto the underlying primitives. I believe those 200+ DSP (multiply-accumulate) blocks are good for something...
Best regards, Piotr
Piotr Wyderski
If all you want is minimization of resource usage then just do CIC.
Something else makes sense only if you want a very flat pass band and a very sharp transition between pass band and stop band.
The problem with using a generic FIR for decimation is not computation, which for your requirements would be minimal, but storage, both for the coefficients and for the delay line. Decimation by 3000 would need something like 15K coefficients for a good filter shape, or twice as many for a very good shape. Coefficient storage could be cut in half thanks to the filter's symmetry, but I am not aware of a similar trick for the delay line. So, overall, you will need just 1 DSP block, but 40 to 80 M10K blocks.
Of course, you can always trade simplicity for storage by building your decimation chain as a cascade, probably sizing each stage so that its delay line fits in 1 M10K block. Then the whole chain will take 3 stages and only 6 M10K blocks, and the filter shape could still be excellent. Or maybe even 2 M10K blocks, if you are ready to complicate the control machine a little more by placing all the delay lines in a common M10K and doing the same for the coefficients. But is it worth the increased complexity? I am not sure.
And then there is a variant in the middle: a cascade of 2 stages instead of 3. Then each delay line and each set of FIR taps will fit in an M10K, but the two delay lines together wouldn't. So, with a bit of control acrobatics, you could fit the whole cascade in 3 M10K blocks.
Still, do it only if you care about the shape of the filter; don't do it for resources alone.
already5chosen
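As a rough cross-check of the block counts in the post above, here is a minimal Python sketch of the storage arithmetic. The 512 x 20 M10K configuration, the 18-bit coefficient width and the helper names are assumptions made for illustration; only the 14-bit sample width and the tap counts come from the thread.

```python
import math

# Assumed M10K configuration: 512 words of 20 bits each.
M10K_WORDS = 512
M10K_WIDTH = 20

def m10k_blocks(words, width):
    """Blocks needed for a memory of `words` entries of `width` bits."""
    assert width <= M10K_WIDTH  # wider words would need blocks side by side
    return math.ceil(words / M10K_WORDS)

def fir_storage(taps, data_bits=14, coeff_bits=18):
    """Return (delay-line blocks, coefficient blocks) for one long FIR.

    The delay line holds one sample per tap; symmetry halves the
    coefficient storage but not the delay line.
    """
    delay = m10k_blocks(taps, data_bits)
    coeffs = m10k_blocks(math.ceil(taps / 2), coeff_bits)
    return delay, coeffs

for taps in (15_000, 30_000):
    delay, coeffs = fir_storage(taps)
    print(f"{taps} taps: {delay} + {coeffs} = {delay + coeffs} M10K blocks")
```

Under these assumptions the totals come out at 45 and 89 blocks, which is in the same range as the 40 to 80 estimated in the post. A different memory configuration or coefficient width would shift the numbers.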
> Something else makes sense only if you want very flat pass band and very sharp transition between pass band and stop band.
There is very little to no energy in the upper part of the band. The high ADC speed is there for other reasons. Therefore, CIC will be more than enough, at least in the first stages of the cascade. I don't know yet if it would be sufficient for the final stage, but this is a detail that can be tweaked in a later phase.
So I have a licensing type of a question: can I instantiate DSP blocks in Quartus Lite? I know the DSP builder is an extra paid tool, but I don't need it -- a purely Verilog instantiation would be sufficient. This block appears to have a decent accumulator, so it could relieve the ALMs otherwise needed by the register-hungry CIC.
Thank you!
Best regards, Piotr
Piotr Wyderski
First of all, since your sample rates are pretty low, I'd see if it's possible to use a DSP chip instead of an FPGA. Everything is easier in software.
Everything depends on your specs, which you have not stated. Namely: what is the attenuation of the stopband, and what is the slope between the passband and the stopband? You say there is not much in the upper frequencies, so this makes it sound like your filtering requirements are very low. If there is nothing much at all up there, you don't even need to filter. Just decimate. Take every nth sample.
The point of the CIC is to reduce the need for multipliers, but you have plenty of multipliers and low sample rates. The CIC has big sidelobes. It might be better to do a cascade of FIRs, each with a low number of taps.
Kevin Neilson
Is that your only criterion? Along with the 200+ DSP blocks I would expect the chip has many thousands of LUTs and FFs. Why focus on DSP block usage?
I don't see a problem with using the CIC decimators if they otherwise work the way you want. A CIC filter has sharp nulls at particular points but doesn't do so much elsewhere, while being very logic- and energy-efficient. They are typically finished by a relatively short FIR, so the aggregate delay is not so large. Doing it all in a single filter would create a much longer delay, no?
Other than the power usage of a large decimating FIR filter, I can't think of other trade-offs.
Rick C.
gnuarm.deletethisbit
Well, basically, yes, it is the only degree of freedom. In other words: I can design any filtering structure that satisfies my requirements from the signal processing point of view, but not all structures are equally welcome by the FPGA, let alone an FPGA with DSP slices. Hence my question.
I've already done it with a multistage CIC alone, but the hardware was much simpler and CIC approach was the only viable one.
> Along with the 200+ DSP blocks I would expect the chip has many thousands of LUTs and FFs. Why focus on DSP block usage?
One reason is to learn them; the other is the ability to use a smaller chip. A DSP block is composed of two multipliers and an accumulator. The accumulator is what a CIC needs. There will be plenty of other functions occupying those FFs.
Best regards, Piotr
Piotr Wyderski
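To make the point about accumulators concrete, the following is a minimal bit-accurate Python model of a CIC decimator (differential delay of 1). It is a sketch, not anyone's actual design: the function names are hypothetical, and the register width uses the usual rule of input bits plus ceil(N * log2(R)), which lets the integrators wrap around without corrupting the output.

```python
import math

def cic_width(in_bits, stages, ratio):
    """Register width that tolerates integrator wrap-around."""
    return in_bits + math.ceil(stages * math.log2(ratio))

def cic_decimate(samples, stages, ratio, in_bits=14):
    """N-stage CIC decimator using fixed-width modular arithmetic."""
    width = cic_width(in_bits, stages, ratio)
    mask = (1 << width) - 1
    integ = [0] * stages   # integrator registers (input rate)
    comb = [0] * stages    # comb delay registers (output rate)
    out = []
    for n, x in enumerate(samples):
        acc = x & mask
        for i in range(stages):
            integ[i] = (integ[i] + acc) & mask   # accumulate, allow wrap
            acc = integ[i]
        if n % ratio == ratio - 1:               # keep every ratio-th sample
            y = acc
            for i in range(stages):
                # comb: difference with the previous value at this stage
                y, comb[i] = (y - comb[i]) & mask, y
            if y >> (width - 1):                 # reinterpret as signed
                y -= 1 << width
            out.append(y)
    return out

# A constant input settles to input * ratio**stages (the DC gain).
print(cic_decimate([1000] * 64, stages=4, ratio=4)[-1])
print(cic_width(14, 4, 3000), "bits for a single 4-stage decimate-by-3000 CIC")
```

Each integrator is exactly an accumulator of `cic_width` bits, which is why the register count grows quickly with the number of stages and the decimation ratio. Splitting the decimation over several cascaded CICs keeps the fast stages narrow.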
You haven't given us much to go on. As some have pointed out, you can do the decimation in multiple stages and use smaller FIR filters at each point, or use one ginormous FIR filter. In both cases a polyphase organization will reduce the number of calculations needed. Or you can use the CIC filter as a front end. I don't know any of the details, so I have no way of calculating the resource usage.
I think it is pretty obvious what the trade-offs are. Squeeze here and this toothpaste comes out there. Squeeze there and other toothpaste comes out somewhere else.
To know where to squeeze and how hard, the numbers are important.
Rick C.
gnuarm.deletethisbit
As an afterthought: given the number of channels, their relatively slow speed and the requirement of lockstep processing, perhaps a bit-serial CIC would be a good idea?
Other parts of the design can benefit greatly from massive application of this approach and it would be a powerful cerebral decalcifier. I think it is worth doing, even if just to learn that it makes no sense.
Thank you all for your help!
Best regards, Piotr
Piotr Wyderski
When I have looked at performing bit-serial calculations, I've found it to not be a large savings of logic, and often to use more FFs. If you use some form of RAM, either distributed or block, the FF savings can be good. I suppose the Xilinx LUT shift registers come in handy for this. I think they are still the only ones doing that.
I suppose once you get your head wrapped around the bit-serial thing, it can be easy to do. It can make it a bit harder to extend the precision at each stage, since that means the bit count changes, and so does the timing.
Rick C.
gnuarm.deletethisbit
This may be a better question over at comp.dsp.
That said, and given what you've said in other responses, your best answer may be to use a polyphase decimating FIR filter. In effect, you'd use a 12000 tap FIR filter, but only 4 taps of it at a time.
Understanding Digital Signal Processing (Lyons, 2011) has a good enough treatment of the subject for a general-purpose DSP book. Multirate Digital Signal Processing (Crochiere and Rabiner, 1983) has an excellent and extremely rigorous treatment of the subject, but it is out of print and a far more specialized book.
--
Rob Gaddi, Highland Technology -- www.highlandtechnology.com 
Email address domain is currently out of order.  See above to fix.
Rob Gaddi
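The polyphase idea suggested above can be sketched in a few lines of Python. This is a scaled-down illustration under assumed numbers (decimation by 3 with a 12-tap filter, so 4 taps per phase, mirroring the 12000-tap, 4-taps-at-a-time example); the function names and test data are hypothetical.

```python
def fir_then_decimate(x, h, m):
    """Reference: run the FIR at the full rate, then keep every m-th output."""
    y = []
    for n in range(len(x)):
        acc = 0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * x[n - k]
        y.append(acc)
    return y[::m]

def polyphase_decimate(x, h, m):
    """Compute only the outputs that survive decimation.

    Branch p holds taps h[p], h[p+m], h[p+2m], ... and sees only the
    input samples whose index is congruent to -p modulo m.
    """
    branches = [h[p::m] for p in range(m)]
    y = []
    for n in range(0, len(x), m):
        acc = 0
        for p, taps in enumerate(branches):
            for j, c in enumerate(taps):
                idx = n - p - j * m
                if idx >= 0:
                    acc += c * x[idx]
        y.append(acc)
    return y

h = [1, 2, 3, 4, 5, 6, 6, 5, 4, 3, 2, 1]               # 12 taps, 4 per phase
x = [(7 * i * i - 3 * i) % 23 - 11 for i in range(40)]  # arbitrary test data
print(fir_then_decimate(x, h, 3) == polyphase_decimate(x, h, 3))
```

Both functions produce the same samples, but the polyphase form does one multiply-accumulate per tap per output sample rather than per input sample, so each input sample is touched by only `len(h) / m` taps.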
You could also use halfband FIR filters; they are really efficient. Again, I really recommend Rick Lyons' DSP book. It is a really good book and not too mathy. Basically, a 16-tap halfband filter will only use 4 multipliers instead of 16.
Assuming you decimate by 2048, i.e. 2^11, you would need about 44 multipliers. Furthermore, you can time-multiplex and reuse the multipliers, so you could probably get by using one hardware multiplier per stage for a total of 11 multipliers.
Benjamin Couillard
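The multiplier count above follows from the structure of a halfband filter: every second tap is zero, the centre tap is 0.5 (a shift), and the rest are symmetric. Below is a minimal Python sketch using a Hamming-windowed sinc design. The window choice, the function names and the odd length of 15 (a 16-tap filter with one zero end tap) are assumptions for illustration, not the design from the post.

```python
import math

def halfband_taps(length=15):
    """Windowed-sinc halfband low-pass, odd length, centre tap 0.5."""
    mid = length // 2
    taps = []
    for i in range(length):
        n = i - mid
        if n == 0:
            ideal = 0.5
        elif n % 2 == 0:
            ideal = 0.0            # sin(pi*n/2) vanishes at even offsets
        else:
            ideal = math.sin(math.pi * n / 2) / (math.pi * n)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (length - 1))
        taps.append(ideal * window)
    return taps

def unique_multipliers(taps):
    """Distinct non-trivial coefficients: one side only, centre excluded."""
    mid = len(taps) // 2
    return sum(1 for n in range(1, mid + 1) if taps[mid + n] != 0.0)

taps = halfband_taps(15)
per_stage = unique_multipliers(taps)
stages = 11                        # decimation by 2**11 = 2048
print(per_stage, "multipliers per stage,", per_stage * stages, "in total")
```

With these assumptions each stage needs 4 multipliers and the 11-stage chain needs 44, matching the figures quoted above before any time-multiplexing.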
On Monday, 25 February 2019 at 22:38:02 UTC+1, Benjamin Couillard wrote:
> Furthermore, you can time-multiplex and reuse the multipliers, so you could probably get by using one hardware multiplier per stage for a total of 11 multipliers.
With each stage running at half the rate of the previous one, it should be possible to stagger the calculations so that you only need slightly less than twice the work of the first stage:
1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1....
-2---2---2---2---2---2---2---2---....
---3-------3-------3-------3-----....
-------4---------------4---------....
lasselangwadtchristensen
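The staggered schedule described above can be generated with a simple rule: stage k owns the time slots t for which t mod 2^k equals 2^(k-1) - 1. The Python sketch below is one way to express that rule (the function names are hypothetical); it reproduces the four-row diagram and shows that no two stages ever claim the same slot.

```python
def stage_for_slot(t, stages):
    """Stage (1-based) that owns time slot t, or None if the slot is idle."""
    for k in range(1, stages + 1):
        if t % (1 << k) == (1 << (k - 1)) - 1:
            return k
    return None

def schedule_rows(stages, slots):
    """One text row per stage, in the same style as the diagram (stages <= 9)."""
    return [
        "".join(str(k) if stage_for_slot(t, stages) == k else "-"
                for t in range(slots))
        for k in range(1, stages + 1)
    ]

for row in schedule_rows(4, 33):
    print(row)

# Fraction of slots in use: 1 - 2**-stages of all slots, i.e. just under
# twice the half of the slots that stage 1 uses on its own.
busy = sum(stage_for_slot(t, 11) is not None for t in range(2048))
print(busy, "of 2048 slots busy with 11 stages")
```

Because the residues are distinct for every stage, a single shared datapath can serve all stages as long as it runs at twice the rate the first stage needs.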
You are right, several initial attempts indicate that the savings are minor if I apply time multiplexing carefully. It was a refreshing experience, though, so no time wasted.
The large decimation factor implies the final bandwidth is narrow, so even a very modest 4-stage, decimate-by-4 CIC filter has about 100 dB of attenuation around the +/-20 kHz DC image frequencies. There will be considerable aliasing above that, but I'm going to filter it out later anyway, so why bother? The subsequent filters will work at a much lower data rate, so I can bump up their order or even change their topology to something other than a CIC.
Lesson learned: narrow-band CIC attenuation doesn't depend on the filter order considerably. Obvious when you think about it, but for some reason it wasn't.
OK, I have my answer, thank you all for your contribution!
Best regards, Piotr
Piotr Wyderski
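For anyone wanting to check attenuation figures like the one above, the textbook magnitude response of an N-stage CIC with decimation ratio R (differential delay 1, normalised to unity DC gain) is |sin(pi*R*f/fs) / (R*sin(pi*f/fs))|^N. The Python sketch below evaluates it near the first image at fs/R for the 750 ksps, 4-stage, decimate-by-4 case. The function name and the chosen frequency offsets are illustrative assumptions.

```python
import math

def cic_attenuation_db(f, fs, ratio, stages):
    """Attenuation in positive dB relative to DC for an N-stage CIC."""
    x = math.pi * f / fs
    den = ratio * math.sin(x)
    if den == 0.0:
        return 0.0                       # f = 0: unity gain by definition
    mag = abs(math.sin(ratio * x) / den)
    if mag == 0.0:
        return math.inf                  # exactly on a null
    return -20.0 * stages * math.log10(mag)

fs, ratio, stages = 750e3, 4, 4
image = fs / ratio                       # first frequency that aliases to DC
for offset in (125.0, 1e3, 10e3, 20e3):
    att = cic_attenuation_db(image - offset, fs, ratio, stages)
    print(f"{offset:8.0f} Hz below the image: {att:6.1f} dB")
```

The attenuation is very deep close to the nulls and falls off as the offset grows, so the figure that matters depends on how wide a band around each image has to be protected. In dB the attenuation at a given offset scales linearly with the number of stages.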
