Distributed Arithmetic Architecture - LUT Contents

I am trying to understand how a distributed arithmetic design can achieve a density of one 4-input LUT per four taps per input data bit. I have read the tutorial and many of the previous posts on distributed arithmetic but still cannot see it.

I understand how the scaling accumulator implements a bit-serial multiply, and I see how the partial product summation is moved in front of the scaling accumulator. What I can't see is how the partial products for four taps can be implemented in a single 4-input LUT. (I realise that a LUT is a 16x1 RAM, in Xilinx anyway.)

To calculate the partial product for four taps and a single bit position of our input data, we need to add four bits? If all four bits are 1's, then our sum needs 3 bits (or 2 bits and a carry out). How can a single LUT4 represent that? A single LUT has only 1 output bit...
Andrew FPGA

Ok, consider the case where you have a single tap. You'd need to compute a 1-bit by n-bit partial product for each bit in the serial input, and then sum those partial products with a scaling accumulator. In that case the one-bit input gates the coefficient: if it is '1' you get the coefficient out (1 x coefficient = coefficient); if it is '0' you get '0' out in all the bits. To do this you have a 1-input, n-output logic function (n outputs to handle the n bits in the coefficient). This is equivalent to n AND gates.
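As a rough software sketch of that single-tap case (not hardware code), the gating and the scaling accumulator can be modelled as below. The function name `bit_serial_multiply` is made up for illustration, the input is assumed to be unsigned and fed LSB first, and the sketch shifts the partial product left where a real scaling accumulator would more likely shift the running sum right.

```python
def bit_serial_multiply(x, coeff, x_bits):
    """Multiply unsigned x (x_bits wide) by coeff, one input bit per step.

    Illustrative model only: assumes unsigned data, LSB first.
    """
    acc = 0
    for k in range(x_bits):
        bit = (x >> k) & 1
        # The single input bit gates the coefficient (the "n AND gates").
        partial = coeff if bit else 0
        # Scaling accumulator: weight this partial product by 2**k.
        acc += partial << k
    return acc
```

For any unsigned `x` that fits in `x_bits`, the result should match `x * coeff`.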

Now onto the 4-tap version. In this case you have the sum of 4 of these 1xN functions. If a tap input bit is '1', then the corresponding coefficient is added to the output; if it is '0', then the coefficient is not added (i.e., you add either 1x or 0x the coefficient for each of the inputs). If all 4 input bits are '0's, then you have 0*c0 + 0*c1 + 0*c2 + 0*c3 = 0. If only one input bit is a '1', then the n-bit output is equal to the corresponding coefficient. If two input bits are '1', then the n output bits are the sum of the two coefficients corresponding to those inputs. Do you see, then, that there are 16 possible combinations of inputs, and that the 4 input bits form a 4-bit address into the LUT?
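To make the 16 combinations concrete, here is a minimal sketch that builds the table contents for four taps. The coefficient values and the name `build_da_lut` are hypothetical, and the sketch assumes unsigned coefficients with tap 0 on address bit 0.

```python
def build_da_lut(coeffs):
    """Return the 16 words of a 4-tap DA table.

    Entry `addr` is the sum of every coefficient whose tap bit is set
    in `addr` (tap i is assumed to drive address bit i).
    """
    assert len(coeffs) == 4
    lut = []
    for addr in range(16):
        word = sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
        lut.append(word)
    return lut


# Hypothetical coefficients c0..c3, chosen only for illustration.
lut = build_da_lut([3, 5, 7, 9])
# lut[0b0000] is 0, lut[0b0001] is c0, lut[0b0101] is c0 + c2,
# and lut[0b1111] is the sum of all four coefficients.
```

The adds happen once, when the table is filled in, so at run time the four serial input bits only select a precomputed sum.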

I'm guessing what you were missing is that the DA-LUT is n bits wide, ie it is comprised of n 4-LUTs.
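A sketch of that last point: each bit of the table word can be held in its own 16x1 LUT, and the full word is reassembled from those bit planes before it goes into the scaling accumulator. All names here (`bit_plane_masks`, `da_fir4`, `COEFFS`) are invented for the example. The sketch assumes unsigned data and coefficients, sizes the word to the largest table entry, and packs each plane into a 16-bit mask with bit `addr` holding the output for that address; that packing is just a convention chosen here, not a statement about any vendor's attribute format.

```python
COEFFS = [3, 5, 7, 9]  # hypothetical coefficients c0..c3


def da_word(addr, coeffs):
    """Sum of the coefficients selected by the 4 address bits."""
    return sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)


def bit_plane_masks(coeffs):
    """Split the 16-word table into one 16x1 LUT (a 16-bit mask) per output bit."""
    width = max(da_word(a, coeffs) for a in range(16)).bit_length()
    masks = []
    for j in range(width):
        mask = 0
        for addr in range(16):
            mask |= ((da_word(addr, coeffs) >> j) & 1) << addr
        masks.append(mask)
    return masks


def da_fir4(samples, coeffs, x_bits):
    """Dot product of 4 unsigned samples with coeffs, one input bit position per step."""
    masks = bit_plane_masks(coeffs)
    acc = 0
    for k in range(x_bits):
        # Bit k of each of the 4 samples forms the 4-bit LUT address.
        addr = 0
        for i, x in enumerate(samples):
            addr |= ((x >> k) & 1) << i
        # Every 16x1 LUT sees the same address and supplies one output bit.
        word = 0
        for j, mask in enumerate(masks):
            word |= ((mask >> addr) & 1) << j
        # Scaling accumulator: weight the looked-up sum by 2**k.
        acc += word << k
    return acc
```

With these coefficients the widest entry is 24, so the sketch uses five 16x1 LUTs side by side, and `da_fir4([1, 2, 3, 4], COEFFS, 4)` should agree with the direct sum 3*1 + 5*2 + 7*3 + 9*4.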

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com  
http://www.andraka.com  

 "They that give up essential liberty to obtain a little 
  temporary safety deserve neither liberty nor safety."
                                          -Benjamin Franklin, 1759