Unique uses for the DSP48

K

Kevin Neilson 7 years ago

I've tried to figure out how to use the Xilinx DSP48s for Galois arithmetic , but they really aren't that useful for that. The new ones can do a 96-bi t unary XOR, which can be used for GF(2) matrix multiplication, but the mul tipliers themselves aren't of much use for Galois math. I wondered what un usual uses (besides FIR filters or integer matrix multipliers) people have used these for. Here are some of mine:

- Transposers (shifting rows up a DSP column in A/B, latching into P, and s hifting columsn serially out of P using the pattern matcher)

- Barrel shifters (Not that good for wide buses, though)

- Modulo by a constant (using Barrett's Reduction)

- GF(2) bit-by-vector multiply-accumulate (using the ALU as an XOR)

Vote

R

Rick C 7 years ago

ic, but they really aren't that useful for that. The new ones can do a 96- bit unary XOR, which can be used for GF(2) matrix multiplication, but the m ultipliers themselves aren't of much use for Galois math. I wondered what unusual uses (besides FIR filters or integer matrix multipliers) people hav e used these for. Here are some of mine:

shifting columsn serially out of P using the pattern matcher)

I'm no expert, but I believe Galois filters use modulo 2 arithmetic without carries, so multipliers are not what you want. Since there is no carry, t here is no need to use any special features. The typical fabric logic will do the job quite nicely.

Certainly barrel shifters would be useful. Don't know what Barrett's Reduc tion is.

I don't think GF(2) would be useful since you can't use a multiplier withou t the carry as far as I know. Am I mistaken?

Rick C. - Get 1,000 miles of free Supercharging - Tesla referral code - https://ts.la/richard11209

Vote

G

gtwrek 7 years ago

I've used one to implement a 3-input median filter (12-bit max inputs). Used the SIMD modes to evaluate each comparison between the three inputs.

Regards,

Mark

Vote

K

Kevin Neilson 7 years ago

I have to do a lot of GF(2) multiplications, which end up being a vector ti mes a matrix, with the matrix being sometimes 128x128 bits or bigger. Ther e are no carries, like you say. Mostly it's an AND of the vector and a col umn of the matrix and then an XOR of that, and that is done for each column . It can use up a lot of LUTs and the routing can get congested, especiall y when the synthesizer tries to share terms. The XOR is a tree with 3-4 le vels of logic.

I think some processors can do a "carryless" multiply for this, so the colu mns are added but no carries are taken to the next column. You end up with a result that is wider than the field, so you have to do a reduction stage , but it's still advantageous. If you could disable the carries in the DSP

48s, you could use them in the GF(2) multipliers, but unfortunately there's no such option.

Vote

D

David Brown 7 years ago

Do you really mean GF(2) ? That is just single bits. Addition is XOR, multiplication is AND.

If you meant something like GF(2^8), which is popular in RAID and other forward error correction systems on byte-wide data, then I can't think of any smart way to handle multiplication in an FPGA. Multiplying by 2 is easy enough, but arbitrary multiplication is done using log tables in software. If there is a nice hardware method, I'd love to know.

Vote

A

Allan Herriman 7 years ago

I think he does mean GF(2). Here's the user guide:

Allan

Vote

K

Kevin Neilson 7 years ago

I normally work in fields like GF(2^8) or GF(2^11) (or, for AES, GF(2^128)) but all the operations decompose into GF(2) operations. For example, to d o a x b = c in GF(2^8), all these values are 8-bit vectors. You can expa nd b into an 8x8-bit matrix B in which b is the bottom row, b*alpha is the next row up, and b*alpha**7 is the top row, where alpha is the primitive el ement of the field. Then you multiply a x B using GF(2) to get the bits of c. So all the operations of a characteristic-2 field (based on a power of 2) can be broken down into GF(2) operations.

I always break down matrices in larger fields to GF(2) matrices to make it easier on the synthesizer. It can get bogged down otherwise.

Vote

D

David Brown 7 years ago

Thanks for that information. I did not know GF(2^8) calculations could be done by matrices over GF(2) like that, but it makes sense. I have only used the field in software - it is key to RAID6 and higher RAID versions - and there you do the multiplication and division using log tables.

Vote

K

Kevin Neilson 7 years ago

I could see how the log tables might be better in software, but in hardware a pair of log/antilog lookup tables (and a mod 2**m-1 adder) will normally be bigger and slower than a hardware multiplier, especially as m gets bigg er (where the field is GF(2**m)). I do use lookup tables for reciprocals, but above some m, tables become inefficient for that as well.

Vote

Unique uses for the DSP48

Join the Discussion

Didn't find your answer?