Fixedpoint Multiply/Accumulate in DSP48

Hi,

I am a little confused about the capabilities of the DSP48. I would like to implement an 18x35 MACC in (hopefully) only two DSP48s. The 18-bit coefficient is a 0.18 fixed-point number, i.e. what I really want to implement is ((A18 x B36)>>17)+C48.
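As a point of reference, the target expression can be written as a small bit-accurate model. This is only a sketch: the function names are made up for illustration, the shift of 17 is taken from the expression in the post, and the signed 48-bit result width is an assumption.

```python
# Reference model for ((A * B) >> 17) + C using Python's unbounded integers.
# Names, signedness and the 48-bit wrap are assumptions for illustration.

def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def macc_reference(a, b, c, shift=17, p_bits=48):
    """Return ((a * b) >> shift) + c, wrapped to a signed p_bits-wide result."""
    product = a * b            # exact product, no overflow in Python
    scaled = product >> shift  # arithmetic shift, rounds toward minus infinity
    return to_signed(scaled + c, p_bits)
```

Note that the right shift floors, so small negative products scale to -1 rather than 0. Any hardware implementation should be checked against whichever rounding behaviour is actually wanted.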

Apparently I overlooked that the DSP48 slice only allows for common C inputs, which means that I cannot split C appropriately across the two adders.

What am I missing? Do I really need to implement the adder in LUTs?

Kolja Sulimma


Kolja, you can easily make an 18x35 multiplier with two DSP48s (see fig. 1-20 of [link]), but to do an accumulation you will have to add a third DSP48, using it only as an accumulator. (I wouldn't do this in fabric. See fig. 5-2 of the same doc to see the concept used in a semiparallel FIR.) You could do it in two DSP48s if you can spare a cycle, for example, when it's time to clear the accumulator. This is done by doing separate multiply-accumulates on the MSB and LSB DSP48s, and then at the end of the accumulation period, on the spare cycle, summing the two together by changing the opcode mux.
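The two-slice scheme described above can be checked arithmetically. The sketch below is a plain integer model, not a description of the actual slice configuration: the 17/18-bit split of B and the function names are assumptions, and the 48-bit accumulator width is not modeled.

```python
# Model of "separate MACCs on the MSB and LSB slices, combined at the end".
# The split point (low 17 bits unsigned, remaining bits signed) is an assumption.

MASK17 = (1 << 17) - 1


def split_b(b):
    """Split b so that b == (b_hi << 17) + b_lo, with 0 <= b_lo < 2**17."""
    b_lo = b & MASK17  # low 17 bits, always non-negative
    b_hi = b >> 17     # upper bits, carries the sign
    return b_hi, b_lo


def macc_two_accumulators(samples):
    """Accumulate a*b over (a, b) pairs using two independent accumulators."""
    acc_hi = 0  # stands in for the MSB slice's P register
    acc_lo = 0  # stands in for the LSB slice's P register
    for a, b in samples:
        b_hi, b_lo = split_b(b)
        acc_hi += a * b_hi
        acc_lo += a * b_lo
    # The "spare cycle": align the MSB sum by 17 bits and add the LSB sum.
    return (acc_hi << 17) + acc_lo
```

The result matches a direct sum of products because a*b = ((a*b_hi) << 17) + a*b_lo holds for every sample, so the two sums can be combined once at the end instead of on every cycle.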

-Kevin


Hi, thanks for that. It got crowded in the chip, so I kept banging my head against the user guide schematics and came up with the following. To compute A18*B35 + C35 I chain two DSP48s to form an 18x35 multiplier. Then I set C48


Kolja, I'm not sure exactly what you are describing. You could take the 30 bits from the upper (msb) slice's P output and leftshift 17 bits and route them to the C input so they can be summed with AxB from the lower slice, but then you still can't accumulate. When you multiply, you have to use both the X and Y muxes for the multiplier's partial sums, so then you are left with the Z mux, which you can use to add in either C, PCIN, or P. So you can't multiply, accumulate, and add in a third thing at the same time.
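The mux constraint can be made concrete with a toy model. This sketch covers only what is stated above (X and Y are taken by the multiplier, Z selects one of C, PCIN, or P); the function name and the string selectors are invented for illustration and do not correspond to real OPMODE encodings.

```python
# Toy model of one slice in multiply mode: P = A*B + Z, with a single Z source.
# Selector names are hypothetical; shifted inputs and subtraction are omitted.

Z_CHOICES = ("0", "C", "PCIN", "P")


def dsp48_multiply_add(a, b, z_sel, c=0, pcin=0, p_prev=0):
    """Return a*b plus exactly one of 0, c, pcin, or the previous P."""
    if z_sel not in Z_CHOICES:
        raise ValueError("Z mux can select only one of: " + ", ".join(Z_CHOICES))
    z_inputs = {"0": 0, "C": c, "PCIN": pcin, "P": p_prev}
    return a * b + z_inputs[z_sel]


# a*b + c works (Z selects C), and a*b + p works (Z selects P) ...
mult_plus_c = dsp48_multiply_add(3, 4, "C", c=10)
mult_plus_p = dsp48_multiply_add(3, 4, "P", p_prev=100)
# ... but there is no single selection that yields a*b + c + p.
```

In this model, a*b + c + p would need two Z selections in the same cycle, which is the limitation being described.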

I'm not sure what you mean by your suggestion about the mux to split the C between the two slices. The C input already goes to both slices. (By the way, the DSP48E in the V5 has independent C inputs for each slice. And it does a 25x18 multiply.)

-Kevin


I don't want to do P = a*b + c + p. I just want P = a*b + c.

I can't add C in the MSB tile, because the Z mux is used by PCIN. I can add C in the lower tile, but I can't reach the full dynamic range of the multiplier's result because only 30 bits will end up in the upper tile.

6 extra bits would allow me to add with the same dynamic range as the multiplier result, and 17 extra bits would allow me to add up to the full dynamic range of the P output.
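The arrangement with C added in the lower tile can be modeled as follows. This is a sketch under assumptions: the 17-bit split of B, the 17-bit right shift on the cascade path, and the function name are illustrative, and Python's unbounded integers mean the 48-bit adder width (the source of the limit discussed above) is not modeled.

```python
# Model of P = a*b + c with c injected in the lower of two cascaded slices.
# Unbounded integers are used, so register widths are NOT enforced here.

MASK17 = (1 << 17) - 1


def cascade_mult_add(a, b, c):
    """Compute a*b + c the way a two-slice cascade would split the work."""
    b_lo = b & MASK17  # low 17 bits of b, unsigned
    b_hi = b >> 17     # upper bits of b, signed
    # Lower slice: partial product plus C through the Z mux.
    p_lower = a * b_lo + c
    # Upper slice: partial product plus the lower result shifted right by 17.
    p_upper = a * b_hi + (p_lower >> 17)
    # Final result: upper slice supplies the high bits, lower slice the low 17.
    return (p_upper << 17) | (p_lower & MASK17)
```

With unlimited width the result equals a*b + c exactly. In the real slice, C and the lower P are limited to the adder width, so C lines up with the least significant end of the product and cannot span the full width of the combined result.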

Does the Virtex-5 DSP slice also resolve the MUX bottleneck?

Kolja
