# floating point to fixed point conversion

• posted

Hello guys,

I need some help from you. I am doing a DSP project, and for it I need to write some C code to convert sample data from floating-point representation to fixed-point representation. The sample data is in floating point, like:

2.296968

-0.448350

-2.779426

My DSP algorithm is implemented in C and is supposed to use fixed-point representation. The above data is intended to be converted to a fixed-point integer format. I would appreciate your help with this conversion, and I will be very glad if you can give me some hints or algorithms for it.

• posted

If you must post the same question to multiple newsgroups please cross-post.

```
--
Tim Wescott
Wescott Design Services
```
• posted

I will use single precision.

As you may or may not know the IEEE754 format is as follows:

SEEEEEEEEMMMMMMMMMMMMMMMMMMMMMMM

S = Sign bit (0 = +, 1 = -)

E = 8-bit biased exponent (Bias = 0x7F)

M = Fractional portion / Significand

The IEEE754 has a single implied integer bit of 1 (which is excluded from the mantissa).

Really, conversion from FP to fixed point comes down to shifting, and maybe negation as well (negate only the whole part; DO NOT touch the fractional portion of the fixed point value).

Basically, add the implied integer bit (mask off the mantissa and OR it with 0x800000). Zero-extend the result to 32 bits (ideally larger, since you risk losing some integer bits; if the values that you have given represent the range of FP values expected, then 32 bits will be sufficient).

Shift left this value, and decrement the exponent with each shift, if unbiased exponent is positive.

Shift right this value, and increment the exponent with each shift, if the unbiased exponent is negative. Repeat until the exponent = 0 (remember to remove the bias first).

Because the implied 1 sits at bit 23, bits 31-23 will be your integer portion and bits 22-0 will be your fractional portion. If you are using 16.16 fixed point you will have to truncate the fractional portion, so its 7 least significant bits will have to be discarded.

Last you will need to test the sign value to determine if negation of the whole portion should take place.
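The steps above can be sketched in C roughly as follows. This is only an illustration under stated assumptions: `float` is a 32-bit IEEE 754 single, the target is signed 16.16, and the helper name `float_bits_to_q16` is made up for this example. It uses one shift by the unbiased exponent instead of a loop, treats the significand as having 23 fractional bits (the implied 1 is at bit 23), and applies the sign by negating the whole two's-complement word, which is a different convention from negating only the whole part.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: IEEE 754 single -> signed Q16.16 by bit manipulation.
 * Truncates toward zero. Zero and denormals map to 0; values that do not
 * fit (including Inf/NaN) saturate. */
int32_t float_bits_to_q16(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);              /* reinterpret the bits safely */

    uint32_t sign  = bits >> 31;                 /* S */
    uint32_t efld  = (bits >> 23) & 0xFFu;       /* E, still biased */
    int32_t  e     = (int32_t)efld - 0x7F;       /* remove the bias */
    uint32_t sig   = (bits & 0x007FFFFFu) | 0x00800000u;  /* add implied 1 */

    if (efld == 0)                               /* zero or denormal */
        return 0;
    if (e >= 15)                                 /* |f| >= 32768, Inf or NaN */
        return sign ? INT32_MIN : INT32_MAX;

    /* sig has 23 fraction bits; Q16.16 wants 16, so the net shift is e - 7. */
    int32_t  shift = e - 7;
    uint32_t mag;
    if (shift >= 0)
        mag = sig << shift;                      /* at most 7 places: no overflow */
    else if (-shift < 24)
        mag = sig >> -shift;                     /* low bits are truncated */
    else
        mag = 0;                                 /* too small to represent */

    return sign ? -(int32_t)mag : (int32_t)mag;
}
```

For instance, passing `1.0f` should give 65536 (0x00010000) and `-0.5f` should give -32768.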

I don't know how efficient this algorithm is or if there are any mistakes, hopefully someone will point this out. If I was given this task, that is how I would attack it.

-Isaac

• posted

Depending on what precision you need, the simplest way is just to multiply the floating point numbers by an integer constant, then do all your maths processing in integers. To convert back to the floating point values, just divide by that same integer constant. Think of it like changing your units of measurement, e.g. doing your calculations in millimetres instead of metres. For example, you could use 32-bit integers as your fixed point numbers: multiply your floating point numbers by 65536 (or shift 16 bits) to get the fixed point numbers, then divide by 65536 to get back.
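A minimal sketch of that scaling in C, assuming a 16.16 layout in a 32-bit integer. The names `Q16_ONE`, `float_to_q16` and `q16_to_float` are invented for this example, and there is no range check, so inputs are assumed to lie within roughly ±32767.

```c
#include <stdint.h>

#define Q16_ONE 65536   /* 2^16: hypothetical name for the scale factor */

/* Float -> fixed: multiply by the scale and round to nearest.
 * Assumes |x| < 32768 so that the result fits in 32 bits. */
int32_t float_to_q16(float x)
{
    float scaled = x * (float)Q16_ONE;
    return (int32_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Fixed -> float: divide by the same scale. */
float q16_to_float(int32_t q)
{
    return (float)q / (float)Q16_ONE;
}
```

With this, `float_to_q16(2.296968f)` should come out as 150534, and converting back gives the original value to within about 1/65536.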

- Charles

• posted

This will work and will be a lot easier if your target has the capability (or instruction) to convert a fp value to an integer value (like x86).

• posted

It's not that simple. If you start with metres and you want to calculate 1m x 1m, the result is obvious. If you now convert 1m to 1000mm first and simply do 1000mm x 1000mm, the result is not quite what you want. DSPs solve this by switching the MAC into either integer or fractional mode, where the latter shifts the result one bit left after each multiply.

So you have to follow some rules of thinking. If you convert your floats to, say 1.15 fixed point and you multiply two of these, the result is 2.30, possibly truncated to 2.14 (16 bits). As long as you keep this in mind, you can multiply anything like this.
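To make the bookkeeping concrete, here is a small C sketch of a 1.15 x 1.15 multiply. The function names are hypothetical, division is used instead of a right shift so that negative products are handled portably (it truncates toward zero), and the last function only imitates what a DSP's fractional mode does; it is not any particular DSP's instruction.

```c
#include <stdint.h>

/* Q1.15 x Q1.15 -> Q2.30: the raw 32-bit product. */
int32_t q15_mul_q30(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}

/* Keep only the top 16 bits of the Q2.30 product: a Q2.14 result. */
int16_t q15_mul_q14(int16_t a, int16_t b)
{
    return (int16_t)(q15_mul_q30(a, b) / 65536);
}

/* Imitation of "fractional mode": doubling the product (one left shift)
 * gives Q1.31, and its top 16 bits are Q1.15 again. Doubling and then
 * dividing by 65536 is the same as dividing by 32768.
 * (-1.0) * (-1.0) is the one case that does not fit, so it saturates. */
int16_t q15_mul_frac(int16_t a, int16_t b)
{
    int32_t p = q15_mul_q30(a, b);
    if (p == 0x40000000)          /* only -32768 * -32768 reaches 2^30 */
        return INT16_MAX;
    return (int16_t)(p / 32768);
}
```

With 0.5 stored as 16384 in 1.15, the three functions should return 268435456 (0.25 in 2.30), 4096 (0.25 in 2.14) and 8192 (0.25 in 1.15) respectively.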

If you look at the numbers the OP gave:

2.296968 needs a 3.13 format in a signed 16-bit word (signed 2.14 only covers -2 up to just under 2), while -0.448350 fits in 1.15. The result of multiplying them will be in 4.28 format. Additions do not have this effect.
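As a rough illustration with the OP's numbers, the sketch below stores 2.296968 with 13 fraction bits (a signed 16-bit 2.14 word tops out just below 2.0, so one more integer bit is assumed here) and -0.448350 with 15 fraction bits. The helper names are made up for this example, and nothing checks that a value fits its format.

```c
#include <stdint.h>

/* Hypothetical helper: scale x by 2^frac_bits and round to nearest.
 * Assumes the result fits in 16 bits; there is no range check. */
int16_t to_fixed16(double x, int frac_bits)
{
    double scaled = x * (double)((int64_t)1 << frac_bits);
    return (int16_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}

/* Multiplying an a.b number by a c.d number gives a raw (a+c).(b+d) result:
 * the fraction bits of the two operands simply add up. */
int32_t fixed_mul(int16_t x, int16_t y)
{
    return (int32_t)x * (int32_t)y;
}

/* Interpret a raw value as having the given number of fraction bits. */
double fixed_to_double(int32_t v, int frac_bits)
{
    return (double)v / (double)((int64_t)1 << frac_bits);
}
```

Here `to_fixed16(2.296968, 13)` is 18817 and `to_fixed16(-0.448350, 15)` is -14692. Their raw product, read back with 13 + 15 = 28 fraction bits via `fixed_to_double(..., 28)`, is about -1.0299, close to the exact product of about -1.0298.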

So as long as you keep the resulting format in mind, you can indeed multiply each float by, for instance, 2^16 to convert them into fixed-point numbers.

No, you must divide by 65535 * 65536. Shift the input 16 bits left, shift the output *17* bits right.

Meindert

• posted

You're dead right - I got confused because I was working on a calculation at the time where only one of the numbers being multiplied had been scaled by 16 bits :( The difficult part about doing it yourself (with no support from your processor) is that you have to go through each calculation and check that you are scaling the output correctly each time and that your calculations are not overflowing the integer size.
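One common way to handle both concerns in C is to do the multiply in a wider type, rescale, and clamp. The sketch below assumes 16.16 operands and a compiler with a 64-bit integer type; `q16_mul_sat` and `q16_add_sat` are hypothetical names, and saturation is just one possible overflow policy.

```c
#include <stdint.h>

/* Q16.16 x Q16.16 with a 64-bit intermediate. The raw product carries
 * 32 fraction bits, so it is divided by 2^16 to get back to Q16.16
 * (truncating toward zero), then clamped if it no longer fits. */
int32_t q16_mul_sat(int32_t a, int32_t b)
{
    int64_t p = ((int64_t)a * (int64_t)b) / 65536;
    if (p > INT32_MAX) return INT32_MAX;
    if (p < INT32_MIN) return INT32_MIN;
    return (int32_t)p;
}

/* Addition needs no rescaling, but it can still overflow. */
int32_t q16_add_sat(int32_t a, int32_t b)
{
    int64_t s = (int64_t)a + (int64_t)b;
    if (s > INT32_MAX) return INT32_MAX;
    if (s < INT32_MIN) return INT32_MIN;
    return (int32_t)s;
}
```

For example, 1.5 x 2.0 (98304 and 131072 in 16.16) gives 196608, which is 3.0, while 200.0 x 200.0 does not fit in 16.16 and clamps to `INT32_MAX`.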

- Charles

• posted

But the good part of this is when you get the hang (and the discipline) of it, you can do the calculations in any fixed point representation you like. You can even treat numbers in one format at a certain stage in the calculation and then move on just "thinking" them in a different format in the next stage.

Meindert
