Optimizing out a divide on altera cyclone fpga

Ok, hi all, I'm new to fpgas and am having some fun with an Altera UP3 kit.

In the app I'm developing I have a component that I use 8 times in parallel and the problem is that in the logic of this component are two divide by 5 performed on integer variables. I can't use bit shifting obviously, but is there cheap in terms of LEs way to do a divide by 5 other than with a divide? This is killing me because the divides use something like 1500 LEs which is almost 2X larger than the rest of the logic.

If there's no other reasonable way I think I can rearchitect it and make it a divide by 4 and thus make available the use of bit shifting.

Any help appreciated, thanks.

Here's a dump of all the crap quartus seems to be adding for the divide:

Info: Found 1 design units, including 1 entities, in source file ../../../../../../../altera/quartus50sp1/libraries/megafunctions/lpm_divide.tdf Info: Found entity 1: lpm_divide Info: Found 1 design units, including 1 entities, in source file db/lpm_divide_smf.tdf Info: Found entity 1: lpm_divide_smf Info: Found 1 design units, including 1 entities, in source file db/sign_div_unsign_uig.tdf Info: Found entity 1: sign_div_unsign_uig Info: Found 1 design units, including 1 entities, in source file db/alt_u_div_1od.tdf Info: Found entity 1: alt_u_div_1od Info: Found 1 design units, including 1 entities, in source file db/add_sub_ke8.tdf Info: Found entity 1: add_sub_ke8 Info: Found 1 design units, including 1 entities, in source file db/add_sub_le8.tdf Info: Found entity 1: add_sub_le8 Info: Found 1 design units, including 1 entities, in source file db/add_sub_me8.tdf Info: Found entity 1: add_sub_me8 Info: Found 1 design units, including 1 entities, in source file db/add_sub_ne8.tdf Info: Found entity 1: add_sub_ne8 Info: Found 1 design units, including 1 entities, in source file db/add_sub_la8.tdf Info: Found entity 1: add_sub_la8

Reply to
Shanon Fernald
Loading thread data ...

Shannon :

Divide by 5 can expressed as (multiply by a constant, then divide by 2^N), obviously the divide by 2^N is a shift. So depending on your necessary accuracy you can :

  • 3 / 16 (divides 5.3333) * 13 / 64 (divides 4.92) * 51 / 256 (divides 5.02) etc

To implement this, you multiply by your chosen constant then discard N bits. A multiplier much smaller than a divider

Actually an even smaller way would be to do * 3.25 / 16. ie. Y = (X + X + X

  • (X >> 2)) >> 4 (this is a 4.92 divisor)

I'm sure there are better ideas

Gary

Reply to
Gary Pace

This is often easier to understand when expressed in binary.

1/5 equals 0.00110011001100110011 etc.

0.00110011 etc equals 11 multiplied by 0.000100010001 etc.

0.000100010001 etc equals 10001 multiplied by 0.0000000100000001 etc.

This suggests the following result (in C syntax):

Y1 = (X + X

Reply to
allanherriman

You guys are awesome. Thanks. I will try these ideas tonight!

Reply to
Shanon Fernald

Gary, your algorithm works out on the trusty calculator, but Allan, your algorithm "does not compute" :) Am I missing something or could you walk me through this?

Y1 = (X + X

Reply to
Shanon Fernald

there is an implied shift of the radix point. Also, I think it may be more intuitive to replace '' in the above algorithm although in this case both are correct. The main thing you are missing is that the implied shift.

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
 Click to see the full signature
Reply to
Ray Andraka

"there is an implied shift of the radix point."

I'm still not quite understanding this, but I found a good tutorial on radix points and bit shifting, so hopefully it will make sense after reviewing this.

In any case, I implemented Gary's algorithm and it works, with perhaps a slight glitch at the edge case, still looking into that. But the new version saved about 1100 LEs. Very nice.

Reply to
Shanon Fernald

I'm sorry, I though the implied shift was obvious. The important thing is that repeating patterns of bits in a multiplicand result in simple chains of two input adders.

In the above example, Y1 is X multiplied by 11 (= 3 decimal) Y2 is X multiplied by 110011 (= 51 decimal) Y3 is X multiplied by 11001100110011 (= 13107 decimal) Y4 is X multiplied by 110011001100110011001100110011 (= 858993459 decimal)

(Y1 >> 4) is X multiplied by 0.0011 (= 0.1875 decimal) (Y2 >> 8) is X multiplied by 0.00110011 (= 0.19921875 decimal) (Y3 >> 16) is X multiplied by 0.0011001100110011 (= 0.199996948242188 decimal) (Y4 >> 32) is X multiplied by 0.00110011001100110011001100110011 (=

0.199999999953434 decimal)

It's really just moving the binary point around. I get used to thinking in 'fixed point' with implied binary points all over the place. The underlying arithmetic is based on integers. Ray likes to think of his digits as being to the right of the point; I like them to the left. I think Ray's view may be more prevalent amongst designers (hence the comment about right shift being more intuitive).

Reply to
allanherriman

ElectronDepot website is not affiliated with any of the manufacturers or service providers discussed here. All logos and trade names are the property of their respective owners.