Cyclone V decimation

Hi,

the input signal is 14-bit signed at 750 ksps. I would like to decimate it
by a modest factor of ~3000. What would be the best way of doing it on a
Cyclone V, resource-wise? My usual approach would be a cascade of CIC
decimators followed by a FIR corrector, but since the DSP blocks are
there, I don't feel it is the "right" (albeit correct) approach. I'm
new to the V family and lack the proper intuitions, so could someone
more versed point me in a good direction?

In fact, there will be 12 such channels, all running in sync,
so maybe considerable resource sharing can be achieved?

    Best regards, Piotr

Re: Cyclone V decimation
On Saturday, February 23, 2019 at 2:32:04 AM UTC-5, Piotr Wyderski wrote:

To determine the "right" approach, you need to define "right" in some engineering terms.  So what aspects of the design and implementation are important to your goals?  

Rick C.  

Re: Cyclone V decimation
snipped-for-privacy@gmail.com wrote:

Minimisation of resource usage, or in other words, a decimation  
technique that maps best onto the underlying primitives. I believe
those 200+ DSP (multiply-accumulate) blocks are good for something...

    Best regards, Piotr

Re: Cyclone V decimation
On Saturday, February 23, 2019 at 6:17:28 PM UTC+2, Piotr Wyderski wrote:

If all you want is minimization of resource usage then just do CIC.

Something else makes sense only if you want a very flat pass band and a very
sharp transition between pass band and stop band.

The problem with using a generic FIR for decimation is not computation, which
for your requirements would be minimal, but storage, both for the coefficients
and for the delay line. Decimation by 3000 would need something like 15K
coefficients for a good filter shape, or twice as many for a very good shape.
Coefficient storage could be cut in half thanks to the filter's symmetry, but I
am not aware of a similar trick for the delay line. So, overall you will need
just 1 DSP block, but 40 to 80 M10K blocks.
Of course, you can always trade storage for simplicity by building your
decimation chain as a cascade, probably sizing each stage so that its delay line
fits in 1 M10K block. Then the whole chain will take 3 stages and only 6 M10K
blocks, and the filter shape could still be excellent. Or maybe even 2 M10K
blocks, if you are ready to complicate the control machine a little more by
placing all delay lines in one common M10K and doing the same for the
coefficients. But is it worth the increased complexity? I am not sure.
And then there is a variant in the middle: a cascade of 2 stages instead of 3.
Then each delay line and each set of FIR taps will fit in an M10K, but the two
delay lines together wouldn't. So, with a bit of control acrobatics you could fit
the whole cascade in 3 M10K blocks. Still, do it only if you care about the shape
of the filter; don't do it for resources alone.
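
As a rough check on the single-stage numbers above, the arithmetic can be sketched in a few lines of Python. The 18-bit coefficient width and the idea that an M10K packs its full 10,240 bits are assumptions made here for illustration, not figures from the post; real M10K aspect ratios waste some bits, so the block counts are best read as a lower bound for one channel.

```python
import math

M10K_BITS = 10 * 1024  # raw capacity of one M10K, ignoring aspect-ratio limits (assumption)

def m10k_blocks(words, bits_per_word):
    """Lower bound on the number of M10K blocks needed to hold `words` entries."""
    return math.ceil(words * bits_per_word / M10K_BITS)

def single_stage_fir_cost(taps, fs_hz=750_000, decimation=3000,
                          sample_bits=14, coeff_bits=18):
    """Return (MACs per second, delay-line blocks, coefficient blocks) for one channel."""
    # A decimating FIR only computes the outputs it keeps: taps * output rate.
    macs_per_second = taps * fs_hz / decimation
    delay_blocks = m10k_blocks(taps, sample_bits)
    # Symmetric coefficients: only half of them need to be stored.
    coeff_blocks = m10k_blocks(math.ceil(taps / 2), coeff_bits)
    return macs_per_second, delay_blocks, coeff_blocks

for taps in (15_000, 30_000):
    macs, delay, coeff = single_stage_fir_cost(taps)
    print(f"{taps} taps: {macs / 1e6:.2f} MMAC/s, "
          f"{delay} + {coeff} = {delay + coeff} M10K blocks")
```

Under these assumptions the multiply rate stays at a few million MACs per second per channel, while the memory count runs into the tens of blocks, which is the imbalance the post describes.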


Re: Cyclone V decimation
snipped-for-privacy@yahoo.com wrote:

There is very little to no energy in the upper part of the band. The  
high ADC speed is there for other reasons. Therefore, CIC will be more
than enough, at least in the first stages of the cascade. I don't know
yet if it would be sufficient for the final stage, but this is a detail
that can be tweaked in a later phase.

So I have a licensing type of a question: can I instantiate DSP blocks
in Quartus Lite? I know the DSP builder is an extra paid tool, but I  
don't need it -- a purely Verilog instantiation would be sufficient.
This block appears to have a decent accumulator, so it could relieve the  
ALMs otherwise needed by the register-hungry CIC.
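
To get a feel for how "register-hungry" a CIC is, here is a minimal bit-exact sketch in Python. It uses Hogenauer's bit-growth rule (input width plus N*log2(R*M) bits) and wrap-around integer arithmetic; the function names and the example parameters are hypothetical, chosen only for illustration, and nothing here is specific to the Cyclone V DSP block.

```python
import math

def cic_register_width(input_bits, stages, decimation, diff_delay=1):
    """Register width needed so that the CIC output never overflows."""
    growth = math.ceil(stages * math.log2(decimation * diff_delay))
    return input_bits + growth

def cic_decimate(samples, stages, decimation, width):
    """Integrators at the input rate, then combs at the output rate.

    All registers wrap modulo 2**width, as they would in hardware.
    """
    mask = (1 << width) - 1
    integrators = [0] * stages
    comb_prev = [0] * stages
    out = []
    for n, x in enumerate(samples):
        v = x & mask
        for i in range(stages):
            integrators[i] = (integrators[i] + v) & mask
            v = integrators[i]
        if (n + 1) % decimation == 0:       # keep every R-th integrator output
            for i in range(stages):
                diff = (v - comb_prev[i]) & mask
                comb_prev[i] = v
                v = diff
            if v >= 1 << (width - 1):       # reinterpret as two's complement
                v -= 1 << width
            out.append(v)
    return out

print(cic_register_width(14, 4, 4))      # 4 stages, decimate by 4
print(cic_register_width(14, 4, 3000))   # 4 stages, decimate by 3000 in one go
print(cic_decimate([100] * 40, stages=4, decimation=4, width=22)[-1])
```

The integrators overflow repeatedly in this model, yet the settled output is still the input times R^N, because every register is wide enough for the final result and the wrap-around cancels in the combs.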

Thank you!

    Best regards, Piotr




Re: Cyclone V decimation
First of all, since your sample rates are pretty low, I'd see if it's
possible to use a DSP chip instead of an FPGA.  Everything is easier in
software.

Everything depends on your specs, which you have not stated.  Namely:  what
is the attenuation of the stopband, and what is the slope between the
passband and the stopband?  You say there is not much in the upper
frequencies, so this makes it sound like your filtering requirements are very
low.  If there is nothing much at all up there, you don't even need to filter.
Just decimate.  Take every nth sample.

The point of the CIC is to reduce the need for multipliers, but you have
plenty of multipliers and low sample rates.  The CIC has big sidelobes.  It
might be better to do a cascade of FIRs, each with a low number of taps.

Re: Cyclone V decimation
snipped-for-privacy@yahoo.com wrote:

As an afterthought: given the number of channels, their relative slow  
speed and the requirement of lockstep processing, perhaps a bit-serial
CIC would be a good idea?

Other parts of the design can benefit greatly from massive application  
of this approach and it would be a powerful cerebral decalcifier. I think
it is worth doing even if just to learn it makes no sense.

Thank you all for your help!

    Best regards, Piotr

Re: Cyclone V decimation
On Monday, February 25, 2019 at 2:36:33 AM UTC-5, Piotr Wyderski wrote:

When I have looked at performing bit-serial calculations, I've found that they
don't save much logic and often use more FFs.  If you use some form of RAM,
either distributed or block, the FF savings can be good.  I suppose the Xilinx
LUT shift registers come in handy for this.  I think they are still the only
ones doing that.

I suppose once you get your head wrapped around the bit-serial thing, it can
be easy to do.  It can make it a bit harder to extend the precision at each
stage, since that means the bit count changes and so does the timing.

Rick C.

Re: Cyclone V decimation
snipped-for-privacy@gmail.com wrote:

You are right, several initial attempts indicate that the savings are  
minor if I apply time multiplexing carefully. It was a refreshing
experience, though, so no time wasted.

The large decimation factor implies the final bandwidth is narrow, so  
even a very modest 4-stage decimating by 4 CIC filter has about 100dB
of attenuation around the +/-20kHz DC image frequencies. There will be
considerable aliasing above that, but I'm going to filter it out anyway
later, so why bother. The subsequent filters will work at a much lower  
data rate, so I can bump up their order or even change their topology to
something other than a CIC.

Lesson learned: narrow-band CIC attenuation doesn't depend on the filter  
order considerably. Obvious when you think about it, but for some reason
it wasn't.
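
For anyone who wants to try this, the attenuation near the image frequencies can be evaluated directly from the usual CIC magnitude formula, |sin(pi*f*R/fs) / (R*sin(pi*f/fs))|^N. The sketch below is a minimal Python version; the function name and the list of offsets are hypothetical, and the result depends strongly on how far from the image the band of interest is assumed to extend.

```python
import math

def cic_gain_db(f_hz, fs_hz, decimation, stages):
    """Normalised CIC magnitude in dB (0 dB at DC), differential delay of 1."""
    x = math.pi * f_hz / fs_hz
    den = decimation * math.sin(x)
    if abs(den) < 1e-15:            # DC or a multiple of fs: the gain is unity
        return 0.0
    mag = abs(math.sin(decimation * x) / den)
    if mag == 0.0:
        return -math.inf            # exactly on a null
    return 20.0 * stages * math.log10(mag)

FS = 750e3                          # input rate from the thread
R, N = 4, 4                         # decimate by 4, 4 stages
first_image = FS / R                # DC image that folds back to DC after decimation

for offset_hz in (20e3, 5e3, 1e3, 125.0):   # hypothetical band edges
    f = first_image - offset_hz
    print(f"{offset_hz:8.0f} Hz below the image: "
          f"{cic_gain_db(f, FS, R, N):8.1f} dB")
```

The narrower the band that must be protected, the closer it sits to the null at fs/R and the deeper the attenuation, which is why a large overall decimation factor makes even a modest first stage look good.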

OK, I have my answer, thank you all for your contribution!

    Best regards, Piotr

Re: Cyclone V decimation
On Saturday, February 23, 2019 at 11:17:28 AM UTC-5, Piotr Wyderski wrote:

Is that your only criterion?  Along with the 200+ DSP blocks I would expect
the chip has many thousands of LUTs and FFs.  Why focus on DSP block usage?

I don't see a problem with using the CIC decimators if they otherwise work the
way you want.  A CIC filter has sharp nulls at particular points but doesn't
do so much elsewhere, while being very logic and energy efficient.  They
are typically finished by a relatively short FIR, so the aggregate delay is
not so large.  Doing it all in a single filter would create a much longer
delay, no?

Other than the power usage of a large decimating FIR filter, I can't think  
of other trade offs.  

Rick C.

Re: Cyclone V decimation
snipped-for-privacy@gmail.com wrote:

Well, basically, yes, it is the only degree of freedom. In other words:
I can design any filtering structure that satisfies my requirements from  
the signal processing point of view, but not all structures are equally
welcome by the FPGA, let alone an FPGA with DSP slices. Hence my question.

I've already done it with a multistage CIC alone, but the hardware
was much simpler and CIC approach was the only viable one.

 > Along with the 200+ DSP blocks I would expect the chip has many  
thousands of LUTs and FFs.  Why focus on DSP block usage?

One reason is to learn them, the other is the ability to use a smaller chip.
A DSP block is composed of two multipliers and an accumulator. The
accumulator is what a CIC needs. There will be plenty of other functions
occupying those FFs.

    Best regards, Piotr

Re: Cyclone V decimation
On Sunday, February 24, 2019 at 1:23:21 AM UTC-5, Piotr Wyderski wrote:


You haven't given us much to go on.  As some have pointed out, you can do the
decimation in multiple stages and use smaller FIR filters at each point,
or use one ginormous FIR filter.  In both cases a polyphase organization will
reduce the number of calculations needed.  Or you can use the CIC filter
as a front end.  I don't know any of the details, so I have no way of
calculating the resource usage.

I think it is pretty obvious what the trade offs are.  Squeeze here and this
toothpaste comes out there.  Squeeze there and other toothpaste comes out
somewhere else.

To know where to squeeze and how hard, the numbers are important.

Rick C.

Re: Cyclone V decimation
On 2/22/19 11:31 PM, Piotr Wyderski wrote:


This may be a better question over at comp.dsp.

That said, and given what you've said in other responses, your best  
answer may be to use a polyphase decimating FIR filter.  In effect,  
you'd use a 12000 tap FIR filter, but only 4 taps of it at a time.
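
To make the "only 4 taps at a time" point concrete, here is a small pure-Python sketch of a polyphase decimator, checked against the brute-force filter-then-discard version. The names and the toy sizes (12 taps, decimation by 3, so 4 taps per phase) are made up for illustration; with 12000 taps and a decimation factor of 3000 the structure is the same.

```python
def fir_then_decimate(x, h, r):
    """Reference: full-rate FIR, then keep every r-th output."""
    y = []
    for n in range(0, len(x), r):
        acc = 0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * x[n - k]
        y.append(acc)
    return y

def split_phases(h, r):
    """Phase p holds h[p], h[p + r], h[p + 2r], ..."""
    assert len(h) % r == 0, "pad h so that its length is a multiple of r"
    return [h[p::r] for p in range(r)]

def polyphase_decimate(x, h, r):
    """Same outputs, organised as r short sub-filters of len(h) / r taps each."""
    phases = split_phases(h, r)
    y = []
    for n in range(0, len(x), r):
        acc = 0
        for p, sub in enumerate(phases):
            # Phase p only ever sees input samples x[n - p], x[n - p - r], ...
            for j, c in enumerate(sub):
                idx = n - p - j * r
                if idx >= 0:
                    acc += c * x[idx]
        y.append(acc)
    return y

x = [(-1) ** n * (n % 7) for n in range(60)]   # arbitrary integer test signal
h = list(range(1, 13))                          # 12 toy coefficients
assert polyphase_decimate(x, h, 3) == fir_then_decimate(x, h, 3)
print([len(sub) for sub in split_phases(h, 3)])  # taps per phase
```

Each incoming sample belongs to exactly one phase, so in hardware it meets only len(h) / r coefficients, and the accumulated sum is read out once per output sample.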

Understanding Digital Signal Processing (Lyons, 2011) has a good enough
treatment of the subject for a general-purpose DSP book.  Multirate
Digital Signal Processing (Crochiere and Rabiner, 1983) has an excellent
and extremely rigorous treatment of the subject, but is out of print and
a far more specialised book.

--  
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order.  See above to fix.

Re: Cyclone V decimation



You could also use halfband FIR filters, they are really efficient. Again,
I really recommend Rick Lyons' DSP book; it is a really good book and it is
not too mathy. Basically, a 16-tap halfband filter will only use 4 multipliers
instead of 16.

Assuming you decimate by 2048, i.e. 2^11, you would need about 44 multipliers.
Furthermore, you can time-multiplex and reuse the multipliers, so you could
probably get by using one hardware multiplier per stage, for a total of 11
multipliers.
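
The multiplier count can be checked with a short Python sketch. It builds the ideal (unwindowed) half-band impulse response, which is an assumption made here for simplicity; a real design would apply a window or an optimiser, but the pattern of zeros and the symmetry would be the same. A 15-tap filter is used because half-band lengths are normally of the form 4k + 3.

```python
import math

def halfband_taps(num_taps):
    """Ideal half-band low-pass truncated to num_taps taps (no window applied)."""
    assert num_taps % 4 == 3, "half-band lengths are normally 4k + 3"
    mid = num_taps // 2
    taps = []
    for i in range(num_taps):
        n = i - mid
        if n == 0:
            taps.append(0.5)            # centre tap: a shift, not a multiply
        elif n % 2 == 0:
            taps.append(0.0)            # every other tap is exactly zero
        else:
            taps.append(math.sin(math.pi * n / 2) / (math.pi * n))
    return taps

def multipliers_needed(taps):
    """Non-zero taps on one side of the centre; each is shared with its mirror."""
    mid = len(taps) // 2
    return sum(1 for c in taps[:mid] if c != 0.0)

taps = halfband_taps(15)
per_stage = multipliers_needed(taps)
print(per_stage, "multipliers per stage,", 11 * per_stage, "for 11 stages")
```

The symmetric pairs are added before multiplying, and the centre tap of 0.5 is a shift, which is how 15 or 16 taps collapse to 4 real multiplications per stage.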

Re: Cyclone V decimation
On Monday, February 25, 2019 at 22:38:02 UTC+1, Benjamin Couillard wrote:



With each stage running at half the rate of the previous one, it should be
possible to stagger the calculations so that you need only slightly less
than twice the throughput of the first stage:

1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1....
-2---2---2---2---2---2---2---2---....
---3-------3-------3-------3-----....
-------4---------------4---------....
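
One way to generate the staggered schedule drawn above is to note that a slot belongs to stage 1 plus the number of trailing one bits in the slot index. The Python sketch below reproduces the diagram and adds up the load; the function names are invented here, and the trailing-ones rule is just one convenient way to obtain this interleaving.

```python
def stage_for_slot(t):
    """Stage served in time slot t: 1 + number of trailing one bits of t."""
    stage = 1
    while t & 1:
        stage += 1
        t >>= 1
    return stage

def render(num_stages, num_slots):
    """One text row per stage, marking the slots in which that stage runs."""
    return ["".join(str(s) if stage_for_slot(t) == s else "-"
                    for t in range(num_slots))
            for s in range(1, num_stages + 1)]

def relative_load(num_stages):
    """Total work of all stages, in units of the first stage's work."""
    return sum(2.0 ** -(s - 1) for s in range(1, num_stages + 1))

for row in render(4, 33):
    print(row)
print(relative_load(11))   # 11 half-band stages, as in the 2^11 example
```

No two stages ever claim the same slot, and the total load approaches but never reaches twice that of the first stage, so a single shared datapath clocked at twice the first stage's rate would be enough under these assumptions.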
