I wish to convert a signed input vector to a single precision 32-bit floating point number. I do not know in advance what the input string length is going to be; the output string length is fixed.
I do know the minimum and maximum integer values in the worst case, so my worst-case input is 64 bits.
Since it is a signed number:
1) How do I check for the sign: just the MSB of the input, the MSB nibble, or something else?
2) The binary point is after the LSB, since the value is an integer. How do I normalise this number, given that it is signed and could be sign-extended? Is it advisable to convert the signed stream to an integer first and then to binary?
The formula I have to use is (-1)^sign * Base^exp * Significand.
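To make the formula concrete, here is a small Python sketch (hypothetical helper, not from the thread) that splits a signed integer into the three IEEE-754 single precision fields the formula describes. It assumes the magnitude fits the 23-bit fraction without rounding, which is fine for narrow inputs but not for the full 64-bit worst case.

```python
def int_to_float32_fields(value: int):
    """Decompose an integer into IEEE-754 single precision fields:
    (sign, biased exponent, 23-bit fraction), mirroring
    (-1)^sign * 2^exp * significand. Rounding is ignored for
    magnitudes wider than 24 bits (illustration only)."""
    sign = 1 if value < 0 else 0
    mag = abs(value)
    if mag == 0:
        return sign, 0, 0                      # IEEE-754 zero: all fields 0
    exp = mag.bit_length() - 1                 # position of the leading 1
    if exp <= 23:
        frac = (mag << (23 - exp)) & 0x7FFFFF  # shift leading 1 out, keep fraction
    else:
        frac = (mag >> (exp - 23)) & 0x7FFFFF  # truncate (no rounding) if too wide
    return sign, exp + 127, frac               # exponent bias for single precision is 127

print(int_to_float32_fields(-11))  # → (1, 130, 3145728): -11 = -1.011 * 2^3
```

Note that the sign and the magnitude are handled separately, which is exactly the point the answer below makes about sign-and-magnitude representation.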
So it would seem. Normally I'm fairly impatient with people who bring standard homework problems here, but at least you've been honest and you've made some suggestions. So I'll try to make some suggestions in return.
I think you may be a little confused about the way numbers are represented in binary. Let's start by looking at a really simple *unsigned* integer - try 11:
11 decimal == 001011 binary
Obviously it doesn't matter how many leading zeros you have, so I've used a total of 6 bits just for the example.
I guess you know that. Now let's go to SIGNED numbers. There are many, many possible ways to represent signed values in binary, but the method that's by far the most common for integers, and is used by "signed" data in the numeric_std package, is twos complement. In this form, the MOST SIGNIFICANT bit carries a negative weight:
And, in general, any signed number less than 0 will have its most significant bit set to 1.
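A short Python sketch of that rule, reusing the 11 example (in 6-bit twos complement, -11 is 110101):

```python
def from_twos_complement(bits: str) -> int:
    """Interpret a bit string as twos complement: the MSB carries
    weight -2^(n-1); every other bit has its usual positive weight."""
    n = len(bits)
    value = int(bits, 2)               # value as if unsigned
    return value - (1 << n) if bits[0] == "1" else value

print(from_twos_complement("001011"))  # → 11
print(from_twos_complement("110101"))  # → -11: -32 + 16 + 4 + 1
```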
For floating-point representation, however, most people (you included) use SIGN-AND-MAGNITUDE representation. In this form, we keep one bit - the sign bit - as a flag to say whether the number is negative or positive. Otherwise, we represent the number as a positive number:
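For contrast, a sketch of sign-and-magnitude for the same value (helper name is mine, not from the thread):

```python
def to_sign_magnitude(value: int, mag_bits: int) -> str:
    """Sign-and-magnitude form: one sign bit, then the magnitude
    as an ordinary unsigned field of mag_bits bits."""
    sign = "1" if value < 0 else "0"
    return sign + format(abs(value), f"0{mag_bits}b")

print(to_sign_magnitude(-11, 6))  # → '1001011': sign 1, magnitude 001011
print(to_sign_magnitude(11, 6))   # → '0001011': same magnitude, sign 0
```

Notice that negating a number only flips the sign bit; the magnitude bits are unchanged, unlike twos complement. That is why the first step of the float conversion is to strip the sign and work with the positive magnitude.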
Thank you Janathan. Your explanation helped a lot.
I would like to know the bit width of my result (a signed bit stream interpreted as an integer) when I convert a 32-bit single precision number to a signed output (integer range). I understand that in the worst case I would require a 32-bit register to represent all possible integers. Can we instead make this output of variable width (depending on the magnitude of the input) rather than keeping it at 32 bits? Which would be the best approach?
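The variable-width idea can be sketched in software; the width needed is driven by the float's exponent (magnitude), roughly the magnitude's bit length plus a sign bit. This is a hypothetical helper to show the arithmetic only; in synthesizable hardware, registers have a fixed width chosen at design time, so a fixed 32-bit output is the usual approach.

```python
import math

def min_signed_width(f: float) -> int:
    """Minimum twos-complement width that can hold trunc(f).
    A sketch of the variable-width idea: width grows with the
    magnitude rather than being fixed at 32 bits."""
    i = math.trunc(f)
    if i == 0:
        return 1
    # n-bit twos complement covers -2^(n-1) .. 2^(n-1)-1
    n = i.bit_length()
    return n if i == -(1 << (n - 1)) else n + 1

print(min_signed_width(-1000.7))  # → 11: -1024 <= -1000 <= 1023
print(min_signed_width(7.0))      # → 4:  -8 <= 7 <= 7
```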