conversion problem

function int_to_float (x: signed) return signed

I wish to convert an input signed vector to a single-precision 32-bit floating-point number. I do not know what the input vector length is going to be. The output length is fixed at 32 bits.

I do know the worst-case minimum and maximum values of the integer, so my worst-case input is 64 bits wide.

Since it is a signed number:

1) How do I check for the sign: just the MSB of the input, or the MSB nibble, or something else?

2) The binary point is after the LSB, since it's an integer. How do I normalise this number, given that it is signed and could be sign-extended? Is it advisable to convert this signed vector to an integer first and then to binary?

The formula to be used is ((-1)^sign) * (Base^exp) * Significand.

Do we need to consider rounding in this case?
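As a concrete instance of that formula (a small Python check of my own, not part of the original post): the value 11 packs as sign 0, exponent 3, significand 1.375, since 1.375 * 2^3 = 11.

```python
import struct

# Pack 11.0 as an IEEE-754 single and pull the three fields back out.
bits = struct.unpack(">I", struct.pack(">f", 11.0))[0]
sign = bits >> 31
exp = ((bits >> 23) & 0xFF) - 127            # remove the bias of 127
significand = 1 + (bits & 0x7FFFFF) / 2**23  # restore the implicit leading 1

value = (-1) ** sign * 2 ** exp * significand
print(sign, exp, significand, value)         # 0 3 1.375 11.0
```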

I am really confused.

Reply to
FPGA

No, VHDL integers are only 32 bits.

If x is of type integer or x is of type ieee.numeric_std.signed then you should...

if x < 0 then... -- Do whatever you want to do when x is negative.
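A quick software check (my own Python illustration, not from KJ's post) that the x < 0 test and the plain MSB test agree for two's-complement values, so a single bit is all the sign detection you need:

```python
WIDTH = 64  # worst-case input width from the question

def is_negative(bits: int, width: int = WIDTH) -> bool:
    """Sign test on a two's-complement bit pattern: just the MSB."""
    return bool(bits >> (width - 1))

for x in (11, -11, 0, -2**63, 2**63 - 1):
    encoding = x % 2**WIDTH            # two's-complement bit pattern of x
    assert is_negative(encoding) == (x < 0)
```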

To do what??

KJ

Reply to
KJ

[...]

So it would seem. Normally I'm fairly impatient with people who bring standard homework problems here, but at least you've been honest and you've made some suggestions. So I'll try to make some suggestions in return.

I think you may be a little confused about the way numbers are represented in binary. Let's start by looking at a really simple *unsigned* integer - try 11:

11 decimal == 001011 binary

Obviously it doesn't matter how many leading zeros you have, so I've used a total of 6 bits just for the example.

Each binary digit represents a power of 2:

11 decimal =   0    0    1    0    1    1
             [32] [16]  [8]  [4]  [2]  [1]

I guess you know that. Now let's go to SIGNED numbers. There are many, many possible ways to represent signed values in binary, but the method that's by far the most common for integers, and is used by "signed" data in the numeric_std package, is two's complement. In this form, the weight of the MOST SIGNIFICANT bit is negated:

+11 decimal =   0    0    1    0    1    1
             [-32] [16]  [8]  [4]  [2]  [1]
-30 decimal =   1    0    0    0    1    0

And, in general, any signed number less than 0 will have its most significant bit set to 1.
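Spelled out as arithmetic (my own check, mirroring the table above), the weighted sum with a negated MSB weight recovers the value:

```python
# Weighted sum for a 6-bit two's-complement value: MSB weight is -32.
bits = [1, 0, 0, 0, 1, 0]
weights = [-32, 16, 8, 4, 2, 1]
value = sum(b * w for b, w in zip(bits, weights))
print(value)  # -30
```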

For floating-point representation, however, most people (you included) use SIGN-AND-MAGNITUDE representation. In this form, we keep one bit - the sign bit - as a flag to say whether the number is negative or positive. Otherwise, we represent the number as a positive number:

+11 decimal =    0    0    1    0    1    1
              [SIGN] [16]  [8]  [4]  [2]  [1]
-11 decimal =    1    0    1    0    1    1
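In software, that two's-complement to sign-and-magnitude step is just a sign flag plus an absolute value (a sketch of my own; the function name is made up):

```python
def to_sign_magnitude(x: int):
    """Split a signed integer into the (sign_bit, magnitude) pair
    that a floating-point representation wants."""
    sign = 1 if x < 0 else 0
    return sign, abs(x)

print(to_sign_magnitude(11))   # (0, 11)
print(to_sign_magnitude(-11))  # (1, 11)
```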

OK, so that's integer representation sorted out. Now let's think about floating-point.

Have you looked at

formatting link

??? It seems to me to be splendidly clear.
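To tie the pieces together, here is a software sketch of the whole signed-integer-to-single conversion, including round-to-nearest-even for inputs wider than 24 bits. This is my own construction (Python, cross-checked against the host's float packing), not code from the thread; a hardware version would use a priority encoder for the leading-one search:

```python
import struct

def int_to_float32_bits(x: int) -> int:
    """Signed integer -> IEEE-754 single-precision bit pattern,
    rounding to nearest, ties to even.  Illustration only."""
    sign = 1 if x < 0 else 0
    mag = abs(x)
    if mag == 0:
        return sign << 31
    n = mag.bit_length() - 1           # exponent = index of the leading 1
    if n <= 23:
        frac = mag << (23 - n)         # narrow enough: result is exact
    else:
        shift = n - 23                 # wider than 24 bits: must round
        frac = mag >> shift
        rem = mag & ((1 << shift) - 1)
        half = 1 << (shift - 1)
        if rem > half or (rem == half and frac & 1):
            frac += 1                  # round to nearest, ties to even
            if frac >> 24:             # carry rippled past the hidden 1
                frac >>= 1
                n += 1
    return (sign << 31) | ((n + 127) << 23) | (frac & 0x7FFFFF)

# Cross-check against the host's own int -> float conversion:
for x in (0, 11, -11, 123456789, 2**24 + 1, 2**25 - 1, -(2**63)):
    ref = struct.unpack(">I", struct.pack(">f", float(x)))[0]
    assert int_to_float32_bits(x) == ref
```

Note that the 2**24 + 1 and 2**25 - 1 cases exercise the rounding path, which answers the original "do we need rounding?" question: yes, whenever the input is wider than 24 bits.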

--
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
Reply to
Jonathan Bromley

Thank you Jonathan. Your explanation helped a lot.

I would like to know the bit width of my result (a signed vector interpreted as an integer) when I convert a single-precision 32-bit number to a signed output (integer range). I understand that in the worst case I would require a 32-bit register to represent all possible integers. Could we instead make this output variable-width (depending on the magnitude of the input) rather than keeping it 32 bits? Which would be the best approach?

Your advice is appreciated.

Reply to
FPGA

If you understand how a single-precision number is represented in binary, you will know how many bits you need as well.

No.
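To make KJ's point concrete (a Python sketch of my own; the function name is hypothetical): the exponent field alone bounds how many integer bits a given single-precision value needs, which is also why a fixed 32-bit output cannot hold every possible single.

```python
import struct

def int_bits_needed(f: float) -> int:
    """Upper bound on the signed-integer width needed to hold the
    integer part of a single-precision value."""
    bits = struct.unpack(">I", struct.pack(">f", f))[0]
    exp = ((bits >> 23) & 0xFF) - 127   # unbiased exponent
    return max(exp, 0) + 2              # exp+1 magnitude bits + 1 sign bit

print(int_bits_needed(11.0))     # 5  (11 is 01011 as a signed value)
print(int_bits_needed(-3.0e9))   # 33 (does not fit a 32-bit integer)
```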

KJ

Reply to
KJ
