How do I scale 9-bit signed 2's complement data by 17/sqrt(21)?

- John_H

Wed, May 31, 2006 1:21 PM

The 115/31 was the strangest idea offered. If you need the result in a single clock, please look *seriously* at the simple multiplier. These are designed as library elements for very fast results and can easily accommodate your "one clock cycle" requirement.

If your clock is 20 MHz, doing the 115/31 might be reasonable but it sure isn't single-clock friendly!

Another consideration: does this value get used somewhere that you can algebraically manipulate the values so a /31 or /sqrt(21) can be "pulled in" to other number manipulation?

PLEASE consider the multiplier.

- Stephen Craven

Wed, May 31, 2006 2:29 PM

I may be missing something - I didn't read all 60 posts by John, but couldn't a LUT implementation work here?

If he only needs limited precision in the final answer and his input data is 9 bits wide, that's a 2^9 = 512-entry LUT. Assuming 16-bit output accuracy, we've got 1 kB of data that will fit in a single BRAM.
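A minimal sketch of how such a table could be generated. It assumes the raw 9-bit two's complement pattern is used as the ROM address and that each entry is a 16-bit signed word with 5 fractional bits; the output format and the names `build_lut` and `scale` are illustrative assumptions, not something specified in the thread.

```python
import math

SCALE = 17 / math.sqrt(21)   # about 3.7097
IN_BITS = 9                  # signed two's complement input width
FRAC_BITS = 5                # assumed output format: 16-bit signed, 5 fractional bits

def build_lut():
    """Return 512 table entries indexed by the raw 9-bit input pattern."""
    lut = []
    for addr in range(1 << IN_BITS):
        # Reinterpret the address as a signed value in [-256, 255].
        x = addr - (1 << IN_BITS) if addr >= (1 << (IN_BITS - 1)) else addr
        lut.append(round(x * SCALE * (1 << FRAC_BITS)))
    return lut

def scale(x, lut):
    """Look up the scaled value for a signed input x in [-256, 255]."""
    return lut[x & ((1 << IN_BITS) - 1)]

lut = build_lut()
```

With 5 fractional bits the largest magnitude entry (for x = -256) still fits in a signed 16-bit word, which is why that split was picked for this sketch.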

- Stephen Craven

Wed, May 31, 2006 5:06 PM

I did miss something. He wants an ASIC implementation.

Does anyone know how a ROM-based LUT approach would compare to an arithmetic-based approach in terms of area? I suppose one would just need a single transistor per ROM bit plus the associated row / column mux / decoders.

Even if it is larger in area, it might be advantageous from a speed / simplicity standpoint.

- Kolja Sulimma

Wed, May 31, 2006 5:46 PM

Mr. Ken wrote:

Note that a division or multiplication by a power of 2 is free in hardware. Also note that multiplication by a constant is a lot cheaper and faster than general multiplication.

Let's see:

(X*59)/16 = (X*64 - X*4-X)/16

So X*59/16 uses only two adders.

(X*119)/32 = (X*128 - X*8 - X)/32

Two adders, even more precision.

Y = X*4 + X*32

(X*15196)/4096 = (X*16*1024 - Y*32 - Y)/4096

Three adders. Very high precision. (The nearest 12-bit value, 15195/4096, would take one more subtraction of X.)

I doubt you can beat this with any other approach.
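The shift-and-add forms above can be sketched in Python to compare their accuracy over the full 9-bit signed input range. This is only an illustration: the function names are made up here, each expression is evaluated exactly as written, and the final right shift truncates (floors) the result, which is an assumption about how the division would be done.

```python
import math

TARGET = 17 / math.sqrt(21)  # about 3.7097

def scale_59_16(x):
    # 59 = 64 - 4 - 1, so two adders (subtractors) on shifted copies of x
    return ((x << 6) - (x << 2) - x) >> 4

def scale_119_32(x):
    # 119 = 128 - 8 - 1, again two adders
    return ((x << 7) - (x << 3) - x) >> 5

def scale_three_adders(x):
    # y = 36*x costs one adder; 16384*x - 32*y - y costs two more.
    # The arithmetic works out to 16384 - 1152 - 36 = 15196.
    y = (x << 2) + (x << 5)
    return ((x << 14) - (y << 5) - y) >> 12

def worst_error(fn):
    """Largest absolute error over all 9-bit signed inputs (result is truncated)."""
    return max(abs(fn(x) - x * TARGET) for x in range(-256, 256))

for fn in (scale_59_16, scale_119_32, scale_three_adders):
    print(fn.__name__, round(worst_error(fn), 3))
```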

Kolja Sulimma

- John_H

Wed, May 31, 2006 6:13 PM

My apologies, folks. I'll try to reinstall my newsreader software at home in an attempt to avoid the posting flood. I just hope it's not the ISP that's giving me the problem.

LUTs might be the best answer!

- Peter Alfke

Wed, May 31, 2006 9:00 PM

- Mr. Ken

Thu, Jun 1, 2006 2:53 AM

My clock is only 3.92 MHz and I am designing in a 0.15 um process, so timing won't be an issue here. Yeah, 1/31 can be factored into other number multiplications, but again, it will affect precision there. It's all a matter of compromise between the different choices. Thank you for the ideas.

- Mr. Ken

Thu, Jun 1, 2006 2:59 AM

Yeah, in the implementation this technique is used for all my multipliers, since I have another set of scalings as well, like 17/sqrt(10), 17/sqrt(20), etc. I will make use of this saving.

Thank you for your input.

- Herman Dullink

Thu, Jun 1, 2006 5:02 AM

If best precision is what you need, just add more bits. Multiply by 237/64, 475/128, 950/256 or 1899/512.
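These fractions are just 17/sqrt(21) rounded to the nearest integer over a power-of-two denominator. A small sketch (the helper name `best_numerator` is made up for illustration) that reproduces them and shows the remaining error at each width:

```python
import math

TARGET = 17 / math.sqrt(21)  # about 3.7097

def best_numerator(frac_bits):
    """Nearest integer numerator for a denominator of 2**frac_bits."""
    return round(TARGET * (1 << frac_bits))

for frac_bits in (6, 7, 8, 9):
    num = best_numerator(frac_bits)
    den = 1 << frac_bits
    print(f"{num}/{den} = {num / den:.6f}, error = {num / den - TARGET:+.6f}")
```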

H.

- Kolja Sulimma

Thu, Jun 1, 2006 6:36 AM

Mr. Ken wrote:

If you multiply the same X by different scales, you can share intermediate results between the constant coefficient multipliers.
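As a rough sketch of what that sharing can look like, take three coarse approximations chosen here purely for illustration: 59/16 for 17/sqrt(21), 61/16 for 17/sqrt(20), and 43/8 for 17/sqrt(10). The partial products 3*x and 5*x are computed once and reused, so all three outputs cost five adders in total, where straightforward separate implementations would need about seven. The function name and the choice of constants are assumptions, not taken from the thread.

```python
def scale_all(x):
    """Return (x*59/16, x*61/16, x*43/8) using five adders in total."""
    t3 = (x << 1) + x          # 3*x, shared
    t5 = (x << 2) + x          # 5*x, shared
    p59 = (x << 6) - t5        # 59*x = 64*x - 5*x
    p61 = (x << 6) - t3        # 61*x = 64*x - 3*x
    p43 = (t3 << 4) - t5       # 43*x = 48*x - 5*x
    # Final shifts divide by 16, 16 and 8, truncating toward minus infinity.
    return p59 >> 4, p61 >> 4, p43 >> 3
```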

Kolja Sulimma

- Mr. Ken

Thu, Jun 1, 2006 6:48 AM

True. Intermediate results will be shared.

After studying the following document, I realized that dividing by a constant can be implemented with the same technique.

formatting link

- John_H

Fri, Jun 16, 2006 6:46 PM

Use more bits. If you were looking at a simple shift for the division, you're on the right track, but you need more digits, such as 3799/1024.

If ASICs have dedicated multipliers as a simple element, you probably have what you need with a multiplier.

If you have loads of time, a bit-serial approach can give you a tiny implementation.

If you want abstruse, you can do a 115/31 where the divide by 31 is a bunch of 5-bit adds and a few conditionals around the digit 31 (and a bit of latency).

Where do you want to go?

- Ray Andraka

Fri, Jun 16, 2006 7:48 PM

17/sqrt(21) is a constant = 3.7097..., which represented as a binary number is 011.1011010111 when rounded off to 10 bits right of the radix point. You said your input is 9 bits, so you already have an error of +/- 1/2 of the LSB weight. In most cases it doesn't make sense to multiply by any more precision than you have in the input. Rounding your constant to 9 bits (and treating it as unsigned) gives: 11.1011011 = 2^1 + 2^0 + 2^-1 + 2^-3 + 2^-4 + 2^-6 + 2^-7 = 3.7109375

This can be done with 4 adders arranged in 3 layers to sum shifted terms of input N:

    a = N + N*2;
    b = a + 8*a;
    c = 8*N - N;
    d = b + 64*c;
    result = d/128;

This will give you pretty much the fastest logic solution assuming no fast memories.
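The adder tree can be checked exhaustively in a few lines of Python. This is a sketch, not a hardware description: the function name is made up, and the final divide by 128 is assumed to be a plain arithmetic right shift (truncation, no rounding).

```python
def scale_475_128(n):
    """Four adders in three layers: layer 1 (a, c), layer 2 (b), layer 3 (d)."""
    a = n + (n << 1)      # 3*n
    c = (n << 3) - n      # 7*n
    b = a + (a << 3)      # 27*n
    d = b + (c << 6)      # 27*n + 448*n = 475*n
    return d >> 7         # divide by 128, truncating toward minus infinity

# Exhaustive check over the 9-bit signed input range.
assert all(scale_475_128(n) == (475 * n) >> 7 for n in range(-256, 256))
```

Since a and c depend only on N, they can be computed in parallel, which is what keeps the depth at three adder layers.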

Your input is only 9 bits. If you are doing it in an FPGA, just program a block RAM as a look-up table of (0:511)*17/sqrt(21) and be done with it. If block RAMs are at a premium and your FPGA has embedded multipliers, use an embedded multiplier to multiply the 9-bit input by 0x1DB (or more bits if you so desire) and be done with it.