Fixed Point Arithmetic is a PITA!

I was working on doing various calculations in fixed point math, but the sources provide equations that are intended to be done in floating point. There are squares and coefficients with 10^-6 and other items that make using fixed point a PITA. So I'm going to use floating point in the FPGA. It's actually not so hard. Many years ago I worked on a machine that did 32 bit floating point math in hardware, so I know the basics. It's a bit odd that addition takes more steps than multiplication: the exponents must be the same value, so the lesser exponent must be adjusted up and the mantissa adjusted down before the add. Multiplication can be done without this adjustment. Both must be normalized after the operation, but the result of the multiply only ever needs adjusting by one bit position, while the addition may need normalizing by the length of the mantissa.
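To make that asymmetry concrete, here is a minimal C sketch of the two operations on a toy format with a 24-bit mantissa (my illustration, not the FPGA code; signs, rounding, zero, and overflow are all ignored):

#include <stdint.h>

/* Toy format: value = mant * 2^exp, with mant normalized to [2^23, 2^24). */
typedef struct { uint32_t mant; int exp; } toyfloat;

static toyfloat toy_add(toyfloat a, toyfloat b)
{
    if (a.exp < b.exp) { toyfloat t = a; a = b; b = t; }
    int d = a.exp - b.exp;
    b.mant = (d >= 32) ? 0 : (b.mant >> d);  /* align: lesser mantissa shifts down */
    uint32_t m = a.mant + b.mant;
    int e = a.exp;
    if (m >= (1u << 24)) { m >>= 1; e++; }   /* renormalize; a near-cancelling
                                                subtract (not shown) can instead
                                                demand shifts up by the whole
                                                mantissa width */
    return (toyfloat){ m, e };
}

static toyfloat toy_mul(toyfloat a, toyfloat b)
{
    uint64_t m = (uint64_t)a.mant * b.mant;         /* product lands in [2^46, 2^48) */
    int e = a.exp + b.exp + 24;
    if (m < ((uint64_t)1 << 47)) { m <<= 1; e--; }  /* fixup is never more than one bit */
    return (toyfloat){ (uint32_t)(m >> 24), e };
}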

The FPGA I'm working with has various limitations in the DSP MAC units that make it a bit harder to use than I would like. The adder can be set for addition or subtraction, but not switched in real time. So I'm ditching the accumulator and only using the multiplier, with an add/sub chain in the FPGA fabric. With that, the floating point ops can be done in multiple cycles each.

In the end this will be easier than trying to adjust all the calculations and coefficients to keep the fixed point calculations in a range that preserves significant bits. In particular, a flow sensor that requires a square root calculation and involves a lot of detailed coefficients for various compensation factors could have been very problematic in fixed point.

Fortunately the clock speed is not all that fast, so it won't require the full speed of the DSP block, and external logic can be used to implement all the operations of floating point. The calculations aren't all that involved and there are many clock cycles available, so it won't be hard to configure this. I'm a bit disappointed the DSP block wasn't more useful. I found that in many configurations the accumulator output did not include the msb of the actual register, but rather it either duplicated the next-to-msb or was set to zero depending on the signed/unsigned setting of the output. Why calculate it and then not make it available? The user can always choose to use it or not on their own, but if you don't give the choice...?

--

Rick C. 

- Get 1,000 miles of free Supercharging 
- Tesla referral code - https://ts.la/richard11209
Reply to
Rick C

I'd be more inclined to read your article if there wasn't a Tesla ad at the bottom. Any way you can turn that off?

I find those 'something free' ads annoying, as they are designed to catch one's eye - and I am trying to read the article...

John

--
Free, get a new tax free life, see the mysteries of the orient (see what I mean?)
Reply to
John Robertson

I don't know what to tell you. If you read posts from the bottom up, my entire post will likely not make sense to you. I'm just sayin'...

All I can say is... Wow! But I'm sure you won't read this post either.

--

Rick C. 

+ Get 1,000 miles of free Supercharging 
+ Tesla referral code - https://ts.la/richard11209
Reply to
Rick C

There's no law that you have to use one or the other; if only one part of the math is challenging to implement in fixed point, then just do that one part in floating point.

It's probably not worth the trouble when you have clock cycles/power to burn, but on e.g. a more resource-constrained device, if you only use floating point in one place, the compiler/optimizer will pull in only what's needed to do that one calculation and not the whole floating point library/core.

Do you know the fast inverse square root?

Reply to
bitrex

Not just your calculations; astronomy, physics, statistics just do NOT always fit the fixed point (or even floating point) standard of your machine.

I really miss VAX/VMS with G_float and H_float... but even that goes belly-up sometimes. Grit your teeth, and normalize everything, before you try to use _Numerical_Recipes_in_XXX_, too.

Reply to
whit3rd

What do you need the subtraction for? Just add negative values. To make a negative value, simply invert all bits (1's complement) and then add 1 (2's complement).
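In C the whole trick (assuming a 32-bit word) is one invert plus a carry-in:

#include <stdint.h>

/* a - b on an add-only datapath: a + ~b + 1.  In hardware the +1
   usually rides in on the adder's carry-in, so subtract costs
   nothing beyond the inverters. */
static uint32_t sub_with_adder(uint32_t a, uint32_t b)
{
    return a + ~b + 1u;
}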

Alternatively, set the ALU to subtraction and use similar tricks for addition, as in the very old IBM 1620 CADET (Can't Add, Does Not Even Try :-) computer.

Are you designing airplane or turbine wings that require a high sampling rate flow measurement? In other applications the sample rate requirement is not that high, so even a simple 8-bit uC would do.

The conversion between int and float is costly, so try to avoid it. Of course, if the hardware has a FindFirstBitSet instruction and multi-position arithmetic shifts with the shift count in a register, the conversion can be fast.
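As an illustration (mine, not from the post), with GCC/Clang's __builtin_clz standing in for FindFirstBitSet, the normalization half of an int-to-float conversion collapses to a shift and a subtract:

#include <stdint.h>

/* Split a non-negative integer into a left-justified mantissa and an
   exponent, the core of a fast int -> float conversion. */
static void normalize_int(uint32_t v, uint32_t *mant, int *exp)
{
    if (v == 0) { *mant = 0; *exp = 0; return; }
    int lz = __builtin_clz(v);    /* count leading zeros, one instruction on many CPUs */
    *mant = v << lz;              /* mantissa left-justified in the word */
    *exp  = 31 - lz;              /* bit position of the leading 1 */
}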

Since there appears to be quite a lot of bits available, use some fixed point arithmetic, i.e. some bits are allocated to the integer part and some bits to the fractional part. Addition does not require normalization, and in multiplication you need to discard a fixed number of bits from the double word result (possibly adding 1 to the last retained bit, for rounding, if the discarded part was large enough).
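A minimal C sketch of that scheme (my illustration, assuming a Q16.16 split):

#include <stdint.h>

typedef int32_t q16_16;                      /* 16 integer bits, 16 fraction bits */

static q16_16 q_add(q16_16 a, q16_16 b)
{
    return a + b;                            /* plain integer add, no normalization */
}

static q16_16 q_mul(q16_16 a, q16_16 b)
{
    int64_t wide = (int64_t)a * b;           /* double word product, Q32.32 */
    wide += (int64_t)1 << 15;                /* add half an LSB so the truncation rounds */
    return (q16_16)(wide >> 16);             /* discard the extra fraction bits */
}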

Reply to
upsidedown

Provided that you keep track of the magnitudes, an integer or fixed point implementation is much faster. I have worked on problems where the scaling was critical even in REAL*4 floating point; otherwise some of the terms would overflow or denormalise during the calculations.

The first approach should always be to look for a normalisation of the problem scaling that allows a pure integer implementation.

Another one that is handy (and missing from that review), depending on how fast your hardware divide is, is the Pade approximation for sqrt:

sqrt(1+x) = 1 + 2x/(4+x)

For -0.5 < x < 1 it is good to better than 1%.

At the extremes it gives sqrt(0.5) = 0.7142 and sqrt(2) = 1.4

Tweaks to the coefficients can get it closer still, or use it over a narrower range (0.25 either side of x=0): it gives sqrt(1.21) = 1.09976 against the true 1.1. That might be good enough already. A single NR loop will get you more than enough digits for most measured sensor data.

The next highest order one is even more accurate, and might well be amenable to FPGA use since it is very 2^n based:

sqrt(1+x) = 1 + x*(4+x)/(8+4x)

which gives 0.70833... and 1.41666... at the same extremes.
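Both forms and the NR polish fit in a few lines of C; this is my sketch (in double, just to show the structure; a fixed-point version would have the same shape), not code from the post:

static double pade_sqrt1(double x)           /* sqrt(1+x) ~ 1 + 2x/(4+x) */
{
    return 1.0 + 2.0*x/(4.0 + x);
}

static double pade_sqrt2(double x)           /* sqrt(1+x) ~ 1 + x*(4+x)/(8+4x) */
{
    return 1.0 + x*(4.0 + x)/(8.0 + 4.0*x);
}

static double nr_polish(double y, double s)  /* one Newton-Raphson step toward sqrt(s) */
{
    return 0.5 * (y + s/y);
}

For the sensor example above, pade_sqrt1(0.21) returns 1.099762..., and a single nr_polish(1.099762, 1.21) step lands within about 3e-8 of the true sqrt(1.21) = 1.1.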

--
Regards, 
Martin Brown
Reply to
Martin Brown

On 16.01.21 at 09:28, bitrex wrote:

That's behind the paywall. Is it this trick from the game Quake 3?

From my notes, don't ask me where the numbers come from.

The 1/sqrt(x) trick from Quake 3 comes up when normalizing vectors:

float InvSqrt (float x)
{
    float xhalf = 0.5f*x;
    int i = *(int*)&x;           /* reinterpret the float's bits as an integer */
    i = 0x5f3759df - (i >> 1);   /* magic constant minus the halved exponent field */
    x = *(float*)&i;             /* reinterpret back: a rough first approximation */
    x = x*(1.5f - xhalf*x*x);    /* one Newton-Raphson step sharpens it */
    return x;
}

cheers, Gerhard

Reply to
Gerhard Hoffmann

I think it was from Quake 1, circa 1996, when a lot of consumer PCs still didn't have a dedicated 3D-capable GPU; the Pentium was also relatively new and the SSE/MMX instruction sets hadn't been introduced yet, so no help there either initially.

The application was calculating per-pixel surface normals, e.g. for taking dot products with the ambient lighting vectors and shading each pixel appropriately. It had to be done on the general-purpose CPU if no GPU was available, and since it had to be computed for every pixel of every frame, it had to be fast.

By the time Quake 3 came out, most PCs had some fashion of onboard pipelined GPU for pixel and vertex "shader" computations, so the trick was redundant for PC graphics.

Reply to
bitrex

Wow - there is a blast from the past. The IBM 1620 - you had to make sure the core heater had the memory up to temperature before using it.

Reply to
Dennis

What was so critical about core memory temperatures? Later core memories did not need such a preheat.

In very old computers using mercury filled tubes as acoustic delay lines for recirculating shift registers, both the absolute temperature and equal temperature between mercury tubes were critical. The speed of sound in mercury varies with temperature. In order to get the serial memory words out of two or more shift registers at the same time, the tubes had to be at the same temperature.


Apparently in the 1950s/60s it was popular to have an indirect bit in every memory word. If a memory address load contained the indirect bit set, the low bits of the memory word were used for a new memory address access, which again could contain the indirect bit set, and so on. Pointing the address part at itself and setting the indirect bit caused a single-instruction infinite loop :-).

This was possible on the IBM 1620 but also on Honeywell DDP-x16 series computers. Since a core memory read is actually a read-modify-write, each memory reference warms the core in that location. Using self-referencing pointers and letting it run too long would permanently destroy that core memory location on the DDP. Some DDP models had hardware protection against self-referencing indirect access.

Did the IBM 1620 have such problems with self-referencing indirect pointers?
Reply to
upsidedown

Apparently the core material was temperature sensitive, so it was heated to a bit above room temperature (40 C?).

I only used a 1620 briefly, with FORTRAN, so I am not familiar with the architecture details. It was replaced with an IBM 360 shortly after I started programming, which is what I learned my early programming on (FORTRAN and assembler).

My research advisor had used a 1620 (Mod I, apparently) extensively. They had even replaced the lookup tables for the arithmetic ops to create new instructions to assist in x-ray crystallography. He had also used an IBM 650 and a 7094. He was not impressed with the 360 floating point.

Reply to
Dennis

This is what made the Elliott 803 so sensitive to temperature and to the three-phase clock frequency in the computer. The registers were delay lines and most of the logic was memory cores.

There was usually also another bit, own-page/zero-page. If the own page was selected, the top bits of the address register were used as left by the instruction fetch, giving an address on the same page as the instruction. If the zero page was selected, the top address bits were cleared, giving an address on the page starting at zero.

IIRC, the things that overheated were not the cores but the core line drivers. The cores are arranged in an x-y matrix, and each matrix plane represents one bit. There is a third line through each core, common to all cores in a bit plane. This line is usually called the Z line, and it is used for reading out the results and for setting the proper bit value on write. All three groups of drivers have to be fast and at the same time handle substantial voltages and currents, which put the chips at the limits of the technology available at the time.

Indirect addressing was very common in the (mini)computers of the era: the HP 21xx series, Digital PDP-8, Data General Nova, Honeywell DDP-116 and others.

On the PDP-11, there was a bit in the addressing mode for selecting indirect (deferred) addressing, but the pointer itself did not have an indirect bit, as the addressing is byte-based instead of based on 16 bit words.

--

-TV
Reply to
Tauno Voipio

Some famous person said "If you really have to use floating point, you don't understand the problem."

I wrote a math library for the 68K, called Zform, in signed 32.32 format. That's as good as float for practical applications, and is a lot faster. Z-to-int conversion time is zero. Add/sub are fast because there's no need to normalize. It was saturating too, in all cases, so it would cheerfully divide by zero or add two huge numbers. Great for control loops.
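Not the actual Zform code, but a sketch of the saturating-add idea in C, assuming a 64-bit word holding 32 integer and 32 fraction bits:

#include <stdint.h>

typedef int64_t z32_32;   /* signed 32.32 fixed point */

/* int -> 32.32 is just a move into the integer field, which is why
   conversion time is "zero": no normalization needed. */
static z32_32 z_from_int(int32_t i)
{
    return (int64_t)i * ((int64_t)1 << 32);
}

static z32_32 z_add_sat(z32_32 a, z32_32 b)
{
    int64_t s = (int64_t)((uint64_t)a + (uint64_t)b);  /* wrap-free add in unsigned */
    /* overflow iff both operands share a sign and the sum's sign differs */
    if (((a ^ s) & (b ^ s)) < 0)
        return (a < 0) ? INT64_MIN : INT64_MAX;        /* clamp instead of wrapping */
    return s;
}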
--

John Larkin      Highland Technology, Inc 

The best designs are necessarily accidental.
Reply to
jlarkin
