# Fixed-point Math help

• posted

Before I start Googling my brains out I thought I would ask the group:

I'm looking for any information on implementing floating-point algorithms in fixed-point math.

Thanks.

• posted

You've searched Embedded.com?

If not:

Why? Do you really mean floating point, or do you mean fixed-point non-integer? If you really mean floating point, what toolset are you using that doesn't have a perfectly good floating point library to use? Or are you coding in assembly and not C?

For the most part all the processors I've used in the last 8 years have had decent floating point support. The exceptions have been the 186 using the Borland tools that required a patch from US Software, and the 196 with the old Intel tools that crashed the ICE any time you attempted floating point operations.

```
--
Tim Wescott
Wescott Design Services
```

• posted

Don't do that. Understand the problems/algorithms and then implement them directly in fixed point. Sometimes you really do need to carry the scale around with every computation, but that's actually pretty unusual. More usually, you can work with some notion of "full scale" and a corresponding "noise floor" and just leave it at that. If you try to translate directly from floating point you're unlikely to really understand the numeric issues, and the resulting code will be inefficient as a result.

Cheers,

```
--
Andrew
```

• posted

The basic idea is to look at the maximum/minimum values of your floating point inputs, and the maximum/minimum values of each computation, and move the "decimal" point as needed for each computation to prevent overflow AND preserve the maximum resolution. If you're using a large enough data type (32 bits) you may be able to get away with keeping the "decimal" point fixed throughout the calculations. If you are using a 16-bit processor, many calculations can be done with 16 bits, but for things like integrations you will need 32 bits.
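
A minimal sketch of that bookkeeping in C, assuming Q15 values (15 fractional bits in a 16-bit word) and a 32-bit Q30 accumulator for the integrator. The names `q_rescale`, `q15_mul` and `integrator_step` are made up for illustration and do not come from any particular library.

```c
#include <assert.h>
#include <stdint.h>

/* Move the binary point: convert a value with from_frac fractional bits
   into one with to_frac fractional bits. The caller is responsible for
   making sure the result still fits in 32 bits. */
static int32_t q_rescale(int32_t x, int from_frac, int to_frac)
{
    assert(from_frac >= 0 && from_frac <= 30);
    assert(to_frac >= 0 && to_frac <= 30);
    if (to_frac >= from_frac)
        return x * ((int32_t)1 << (to_frac - from_frac));
    return x / ((int32_t)1 << (from_frac - to_frac)); /* truncates toward zero */
}

/* Q15 * Q15 gives a Q30 product in 32 bits; dividing by 2^15 brings it
   back to Q15. No saturation here: -1.0 * -1.0 does not fit in Q15. */
static int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t product_q30 = (int32_t)a * b;
    return (int16_t)(product_q30 / 32768);
}

/* Integrator kept in a 32-bit Q30 accumulator so that small x*dt
   contributions are not rounded away at every step. */
static int32_t integrator_step(int32_t acc_q30, int16_t x, int16_t dt)
{
    return acc_q30 + (int32_t)x * dt;
}
```

With raw inputs of 1 (roughly 0.00003 in Q15), `q15_mul(1, 1)` rounds to 0, while `integrator_step(0, 1, 1)` still records the contribution. That is the reason integrations tend to need the wider word.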
• posted

For high volume applications, fixed point is usually chosen over floating point because of the reduced die size, cost, and power requirements of the processor. I think 90% of DSP processors sold are still fixed point for this reason. Emulation of floating point math on a fixed point processor is usually not an option, as the throughput is reduced by 100x or more. In this case, even if the application could still run with the reduced throughput, it may still be converted to fixed point math so that the processor clock can be dropped from, say, 40 MHz to 1 MHz (reduced power, lower EMI, etc). For low volume applications, it is easier to use floating point.
• posted

I rarely need floating point either. Someone else has already raised the issue of the speed hit when you do a soft floating point calculation on hardware without floating point support. However, if you really do need to do it then take a look at some Forth floating point code. That is usually quite good as a basis (mind you, I am already part way there as I use Forth for most of my systems anyway. Still, the techniques should be translatable).

In my current project I am using 48bit intermediaries on a 16 bit machine in order to avoid delving into floating point calculations. It is still a speed win and maintains the accuracy I need in the result.
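
The general trick of carrying a wider intermediate can be sketched in C as below. This is not the poster's code: it assumes 16-bit operands and a 32-bit intermediate rather than 48 bits, and `scale_ratio` is a hypothetical name. The idea is similar in spirit to Forth's `*/` word, which multiplies and then divides through a double-width intermediate.

```c
#include <assert.h>
#include <stdint.h>

/* Compute x * num / den without losing the high bits of the product.
   The 16x16 multiply is done in 32 bits, so the product cannot overflow;
   only the final quotient has to fit back into 16 bits. */
static int16_t scale_ratio(int16_t x, int16_t num, int16_t den)
{
    int32_t wide;
    int32_t quotient;

    assert(den != 0);
    wide = (int32_t)x * num;        /* full-precision intermediate */
    quotient = wide / den;          /* truncates toward zero */
    assert(quotient >= INT16_MIN && quotient <= INT16_MAX);
    return (int16_t)quotient;
}
```

For example, `scale_ratio(10000, 355, 113)` scales by the rational approximation 355/113 of pi and returns 31415, even though the intermediate product 3550000 would never fit in 16 bits.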

```
--
********************************************************************
Paul E. Bennett ....................
```

• posted

I'm not sure how good Forth floating point code is, but for the floating point routines I am using on a 16-bit fixed-point DSP, this is what I'm getting (in clock cycles) for fixed vs. floating point math:

| Operation | 16-bit fixed (cycles) | Single-precision float (cycles) |
|---|---|---|
| Addition | 1 | 122 |
| Subtraction | 1 | 124 |
| Multiplication | 1 | 109 |
| Division | 16 | 361 |

This is custom floating point assembly code from the manufacturer, optimized for the processor. So I'm basically getting over a 100x performance boost when using fixed point. It's really hard to throw away that improvement, though I still use floating point for non-critical and debug purposes.

• posted

While I agree that doing floating point addition and subtraction in software can be quite time consuming due to the denormalisation and normalisation phases, I really do not understand how the multiplication can take that long. Basically you just multiply the mantissas and add the exponents.

This should not take too long, unless the mantissa size is larger than the integer register size. On a 16 bit integer processor, it would be sensible to use a floating point format with 8 bit exponent and 16 bit mantissa.
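
A rough sketch of such a multiply in C, assuming a made-up, non-IEEE format with a signed 16-bit mantissa and a signed 8-bit exponent (value = mant * 2^exp). The type `sfloat_t` and the functions below are hypothetical. There is no rounding, and exponent overflow is only caught by an assertion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical format: value = mant * 2^exp.
   Normalized means mant == 0 or 16384 <= |mant| <= 32767. */
typedef struct {
    int16_t mant;
    int8_t  exp;
} sfloat_t;

static int sf_is_normalized(sfloat_t x)
{
    int m = x.mant;
    if (m < 0)
        m = -m;
    return m == 0 || (m >= 16384 && m <= 32767);
}

static sfloat_t sf_mul(sfloat_t a, sfloat_t b)
{
    sfloat_t r = {0, 0};
    int32_t p;
    uint32_t mag;
    int e;

    assert(sf_is_normalized(a) && sf_is_normalized(b));
    if (a.mant == 0 || b.mant == 0)
        return r;

    p = (int32_t)a.mant * b.mant;          /* |p| lies in [2^28, 2^30) */
    mag = (uint32_t)(p < 0 ? -p : p);
    mag >>= 14;                            /* now in [2^14, 2^16) */
    e = a.exp + b.exp + 14;                /* add exponents, account for shift */
    if (mag > 32767u) {                    /* at most one extra normalizing shift */
        mag >>= 1;
        e++;
    }
    assert(e >= -128 && e <= 127);         /* exponent overflow is not handled */

    r.mant = (int16_t)(p < 0 ? -(int32_t)mag : (int32_t)mag);
    r.exp = (int8_t)e;
    return r;
}
```

Because both inputs are normalized, the product needs one fixed shift and at most one conditional shift, with no loop and no special cases beyond zero.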

Paul

• posted

Those sort of numbers are almost certainly for IEEE-conformant floating point emulation. So you have full subroutine call overhead, packing and unpacking the 32-bit (or 64-bit) IEEE format on a 16-bit DSP that wasn't necessarily designed for such operations, and then taking care of the special cases (denorms, NaNs and Infs). That would be likely to be very ugly on most 16-bit fixed point DSPs.

I don't think that the C standard stipulates IEEE arithmetic yet, does it? Many users probably expect it, though.
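
To give a feel for the unpacking and special-case handling mentioned above, here is a sketch in C that splits a 32-bit IEEE-754 single into its fields and classifies it. It assumes the platform's `float` is an IEEE-754 binary32; the type and function names are hypothetical and not taken from any vendor library.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef enum { K_ZERO, K_DENORMAL, K_NORMAL, K_INF, K_NAN } fp_kind_t;

typedef struct {
    unsigned  sign;      /* 1 if negative */
    int       exponent;  /* unbiased; value = mantissa * 2^(exponent - 23) */
    uint32_t  mantissa;  /* 24 bits, hidden bit included for normal numbers */
    fp_kind_t kind;
} unpacked_t;

static unpacked_t unpack_bits(uint32_t bits)
{
    unpacked_t u;
    uint32_t biased = (bits >> 23) & 0xFFu;   /* 8-bit biased exponent */
    uint32_t frac = bits & 0x7FFFFFu;         /* 23-bit stored fraction */

    u.sign = (unsigned)(bits >> 31);
    if (biased == 0) {                        /* zero or denormal */
        u.kind = frac ? K_DENORMAL : K_ZERO;
        u.exponent = -126;
        u.mantissa = frac;                    /* no hidden bit */
    } else if (biased == 255) {               /* infinity or NaN */
        u.kind = frac ? K_NAN : K_INF;
        u.exponent = 128;
        u.mantissa = frac;
    } else {                                  /* ordinary normal number */
        u.kind = K_NORMAL;
        u.exponent = (int)biased - 127;
        u.mantissa = frac | 0x800000u;        /* restore hidden bit */
    }
    return u;
}

static unpacked_t unpack_float(float f)
{
    uint32_t bits;
    assert(sizeof f == sizeof bits);          /* assumes 32-bit float */
    memcpy(&bits, &f, sizeof bits);
    return unpack_bits(bits);
}
```

Every emulated operation has to do something like this for each operand and then the reverse for the result, which is part of why conformant routines cost so many cycles on a 16-bit machine.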

```
--
Andrew
```

• posted

... snip ...

Doesn't Forth have some other form for handling many of these problems, something like rational fractions is jiggling my memory.

```
--
Chuck F (cbfalconer@yahoo.com) (cbfalconer@worldnet.att.net)
Available for consulting/temporary embedded and systems.
```

• posted

That is the thing about Forth. If you need it you either find it in the Forth Scientific Library or, more likely, you end up rolling your own. My 48-bit intermediary is quite necessary to maintain the accuracy up until the point I do the cube root. It is all back to 16 bits then.

```
--
********************************************************************
Paul E. Bennett ....................
```

• posted

The floating point functions are IEEE-754 compliant (32-bit, not 64-bit), with signed zero, signed infinity, NaN (Not a Number) and denormal support, and operate in "round to nearest" mode. I suppose if I rolled my own floating point format as you suggest, the multiply would be much faster.

• posted

I've used floating point on production code for one of four reasons:

1. For startup, to read parameters out of EEPROM and calculate the appropriate fixed-point parameters in a way that is easily maintainable.

2. For scientific instruments with complicated math and without a need for said math to happen quickly.

3. In secondary processes that were not time critical, but where maintainability was enhanced by using floating point.

4. Because, much to my surprise, floating point on a TI '2812 is only a few times slower (rather than 100x) than fixed point math.
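
The first use, converting human-readable parameters to fixed point once at startup, might look like the sketch below. It assumes the parameters are held as `float` and the time-critical code wants Q15; `q15_from_float` is a hypothetical helper, not part of any library mentioned in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a float in roughly [-1, 1) to Q15, rounding to nearest and
   saturating at the ends of the 16-bit range. Intended to run once at
   startup, so the cost of the floating point math does not matter. */
static int16_t q15_from_float(float x)
{
    float scaled;

    assert(x == x);                       /* reject NaN */
    scaled = x * 32768.0f;
    scaled += (scaled >= 0.0f) ? 0.5f : -0.5f;
    if (scaled >= 32767.0f)
        return 32767;
    if (scaled <= -32768.0f)
        return -32768;
    return (int16_t)scaled;               /* cast truncates toward zero */
}
```

A gain of 0.5 read from EEPROM becomes 16384, and anything at or above 1.0 is clamped to 32767 rather than wrapping around.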

```
--
Tim Wescott
Wescott Design Services
```

• posted

Sorry, but there is no way floating point emulation on a TI 2812 is only a few times slower than fixed point; if it were, TI wouldn't need to make floating point DSPs anymore! Floating point on the 2812 (or any fixed point DSP) is about 100 times slower than fixed point.

• posted

I _do_ have to fess up that I have only compared it to an integer package that includes bounds checking, which seriously slows down the integer computation -- and we didn't use it on that processor for anything other than what I've already advocated; the "real" computation happened in fixed point.

```
--
Tim Wescott
Wescott Design Services
```

• posted

On 14 Dec 2004 10:05:09 -0800, bungalow_steve wrote:

Maybe it's 100 times slower in the worst case, for example a MAC operation done in assembly with the MAC-specific hardware or in C with double precision math. But for "generic" operations, if you compare fixed-point 32-bit C code and float C code, the "few times slower" statement looks very familiar to me. The same applies to the C5400 family.

```
--
asd
```

• posted

This is my experience -- and I failed to point out that I was comparing MAC-less integer arithmetic to floating point. With a MAC, of course, integer arithmetic is way faster.

```
--
Tim Wescott
Wescott Design Services
```

• posted

No, I'm saying that a simple add is 100 times slower. You're saying floating point is a "few times slower" than fixed point. OK, I assume a C5400 performs a 16-bit add in 1 cycle, so you're saying that in 2 to 3 cycles (i.e., a few times slower) it can perform the overhead of a subroutine call, denormalize/normalize, take care of all the special conditions and return a 32-bit result? Sorry, I can't see it. Do you have an assembly listing of a C5400 floating point add routine?

• posted

TI doesn't publish floating point emulation specs for its fixed point processors, that I know of anyhow; it depends on the compiler. If you have a C compiler, just write a simple fixed point routine and compare it to an emulated floating point one.

Here are the benchmarks for a Microchip 16-bit DSP (single-cycle integer multiplies/adds) that I posted earlier, probably similar in performance to a 16-bit TI chip.

• posted

No, I'm asking _you_ for the benchmarks that _you_ are using to back up your ever so strongly voiced opinions.

_I_ know how fast the damn processor is -- when I benchmarked it against my fixed point math package I nearly fell out of my chair. Its ratio of floating point math vs. fixed point math is between 20x and 50x better than a Pentium's.

With a fixed point math package that does 1r15 arithmetic, in ANSI C, with saturation, a Pentium is about 20-50 times faster than it is with floating point. The '2812 runs neck and neck. It certainly doesn't do floating point as fast as pure integer math, but it certainly _does_ knock the socks off of anything else I've had occasion to use.

Frankly I would have responded as you did if I hadn't done the experiment myself.

So, upon what benchmarks are you basing your claim?
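
For readers unfamiliar with the kind of package being benchmarked, here is a sketch of 1r15 (Q15) add and multiply with saturation in portable C. It is not the package discussed above, just an illustration under the assumption that values are 16-bit with 15 fractional bits; all names are made up. It shows why such code is far from a single-cycle integer add: every operation widens, computes, and then range-checks.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a 32-bit intermediate into the 16-bit Q15 range. */
static int16_t sat16(int32_t x)
{
    if (x > 32767)
        return 32767;
    if (x < -32768)
        return -32768;
    return (int16_t)x;
}

/* Saturating Q15 add: widen, add, clamp. */
static int16_t q15_add_sat(int16_t a, int16_t b)
{
    return sat16((int32_t)a + b);
}

/* Saturating Q15 multiply: the Q30 product is scaled back by 2^15
   (truncating toward zero) and clamped, so -1.0 * -1.0 gives the
   largest positive value instead of wrapping. */
static int16_t q15_mul_sat(int16_t a, int16_t b)
{
    return sat16(((int32_t)a * b) / 32768);
}

/* Small usage example doubling as a self-check. */
static void q15_self_check(void)
{
    assert(q15_add_sat(30000, 10000) == 32767);     /* clamps high */
    assert(q15_add_sat(-30000, -10000) == -32768);  /* clamps low */
    assert(q15_mul_sat(16384, 16384) == 8192);      /* 0.5 * 0.5 = 0.25 */
    assert(q15_mul_sat(-32768, -32768) == 32767);   /* -1 * -1 saturates */
}
```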

```
--
Tim Wescott
Wescott Design Services
```
