Fixed-point Math help

The topic at hand is relative performance between emulated floating point vs fixed point math executed on the same processor; not sure why you're talking about Pentiums vs whatever, that's totally irrelevant to the discussion. I can only guess you didn't understand the .pdf file I referenced, or we are talking about two different topics. The reference says that a floating point add takes 122 cycles, vs 1 cycle for fixed point. This is one example of the 100 to 1 ratio I'm talking about. My personal experience with Analog Devices fixed point DSPs indicates two orders of magnitude difference between fixed point and emulated floating point, but I just thought you would be more interested in a published benchmark that backs up my claim, that's all.

Reply to
bungalow_steve

I tend to agree that one shouldn't rely on P-II comparisons, for example, as a means of comparing integer vs floating point performance when discussing the general turf of embedded processors. Nor should one compare apples and oranges.

The OP was asking about how to IMPLEMENT floating point routines using fixed point, by the way, and this branch here has decidedly moved away from anything helpful there -- though perhaps still interesting. I'm not sure how Tim's segue comment addressed this (seems to me it was arguably on topic to suggest searching google, but otherwise boils down to telling someone that they don't need to know how to implement floating point because the support is already there, 'so why ask' at all...)

I agree with your comment, "fixed point is usually chosen over floating point because of the reduced die size, cost, and power requirements of the processor." In our case, cost was certainly a consideration though not a large one (at first.) However, power requirements and dissipation were vital issues for us as well as getting the necessary processing done on time (of course.) I don't think die size directly mattered, though I'm sure that an excessively large package would have been a problem then. Package size *is* more important for some applications I work on, so that puts a more direct pressure on the die, of course.

I cannot agree with your rejoinder, though, that "Emulation of floating point math on a fixed point processor is usually not an option as the throughput is reduced by 100x or more." First off, my very first application on the ADSP-21xx from Analog Devices dealt with values that are important to maintain over a very wide dynamic range. At least 16 bits of precision had to be maintained for some 6-8 orders of magnitude, for example. Some kind of floating point features were essential, even while using a very cool running DSP like these were at the time. (What kept us from using TI integer DSPs at the time was a different issue that was inescapable with their parts and required for the application.) So floating point on a fixed point processor wasn't only an option, it was vital.

One of the things that is glossed over in your comment here is that floating point processing doesn't have to be used 24/7 by the application. If it were always the case that the CPU was bottlenecked doing floating point continually, then yes -- for a given performance level you'd probably be better off with a DSP supporting floating point in hardware than with a super-high speed integer processor emulating it at a similar rate. But if what you need is modest bursts of floating point operations as well as very low power requirements and low cost, etc., and when you could well use the boost of a MAC or fully combinatorial barrel shifter to help it along, then an integer DSP is probably quite reasonable. The price of floating point hardware you don't need all the time is a continuous drain and excessive power consumption, which adds unnecessary cost both to the processor and to all of the surrounding circuitry and dissipation support required. And further, if your application requires a wide dynamic range, some kind of floating point support remains necessary.

So floating point is not only an option on integer processors... sometimes, it is a requirement for them.

But you've made me slightly curious. I use ADSP-21xx integer DSPs routinely (not moved up to BlackFin, though) and my experience using the barrel shifter with integer operations for floating point purposes hasn't been as bad as 100:1 versus fixed point for operations that reasonably might be considered similar in precision (but not in dynamic range, of course.) But I write my own code and do NOT use libraries nor do I use C, and I use the full capability of packing instructions. Can you be precise about what you are comparing here, so I can consider some specific cases just for my own sake?

Jon

Reply to
Jonathan Kirwan

No one has directly addressed this, I think. Probably because there is a lot of truth in the idea that "integer ADC --> processing --> integer DAC" means you shouldn't insert FP into the "processing" block if you can avoid it. Most applications just don't have the wide dynamic range that floating point supports, and there are "gotchas" with using floating point that require care to use well. And why add it, if everything coming in is integer and everything going out is integer? Just stay in integer... if you can.

But if you are really interested in learning how, Analog Devices has a book on implementing floating point with their ADSP-21xx processors that I believe can be downloaded. You can also examine the floating point formats commonly used, along with a tedious description of the special cases, in Intel's documentation on their processors, available from Intel's web site. Some of the older DEC manuals included details on implementing floating point, too. (In the earlier days, teaching programmers about floating point details was important, as it was a required skill for everyday programmers. That's far less true today.)

I don't have a convenient web site in mind, but Tim Wescott's suggestion of using google is probably a good one. Use "floating point" and "normalization" and "denormalization" and "exponent" and "mantissa" and "hidden bit" and perhaps the four common operations to help track something down. This is your due diligence, until you've done it yourself and can explain why it's not getting you there.

The basic idea is that you have an exponent (signed, twos-complement) and an unsigned mantissa (with a possible hidden bit for non-zero values) and a separate sign bit. These can be packed in any format you like or find convenient. Each of these is an integer. There is no explicit radix point (the binary analog of a decimal point), but one is usually assumed for the mantissa at any convenient place and the exponent then adjusts this, left or right, for - or + values of the exponent. The mantissa is usually stored 'normalized', which means that it is shifted until the leading bit is always a '1' (which is always possible unless the value is actually zero, but that is easily detected). Some formats simply throw away the leading bit, because it is always '1', and put it back when needed in order to gain one apparent bit of precision.
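
To make that concrete, here is a minimal C sketch of such a format. The field widths (16-bit mantissa, 8-bit exponent) and the names are just illustrative assumptions on my part, not any standard layout:

#include <stdint.h>

typedef struct {
    uint16_t mantissa;  /* normalized: top bit set unless value is zero */
    int8_t   exponent;  /* signed, two's complement */
    uint8_t  sign;      /* 0 = positive, 1 = negative */
} sfloat;

/* Shift the mantissa left until its top bit is '1', adjusting the
   exponent to compensate; zero is the one value that can't be
   normalized and gets a canonical representation instead. */
static void sf_normalize(sfloat *v)
{
    if (v->mantissa == 0) {
        v->exponent = 0;              /* canonical zero */
        return;
    }
    while (!(v->mantissa & 0x8000u)) {
        v->mantissa <<= 1;
        v->exponent -= 1;
    }
}

Every operation (add, subtract, multiply, divide) then boils down to integer work on these fields followed by a renormalize step like the one above.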

The rest is just software. Try a paper exercise and see where it takes you. That's a good start, if you plan to try and implement something yourself. Another choice would be to examine library code -- again, search google.

Jon

Reply to
Jonathan Kirwan

If you're doing floating point on a fixed point DSP, for dynamic range reasons, and you have no particular reason to comply with IEEE floating point formats, why would you bother with an unsigned mantissa or implied leading bit? Is it because you knew that you absolutely needed that extra bit of precision? One can go a long way with a simple two-word format: mantissa and exponent, with nothing special about the mantissa, so that the chip's normal signed multiplies and adds work fine. (I never used it, but I believe that the Motorola C compiler for the 56000 series used this format. At least, one of the debuggers knew how to display memory blocks in that format...)

Many (most?) DSP processors have "normalize" or "count leading zeros/leading ones" instructions too, which makes the normalization/alignment process a bit of a slam-dunk.
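
As a hedged illustration of that two-word idea, here is a C multiply for a plain signed-mantissa format. The names, the scaling convention (value = m * 2^e), and the renormalization loops (standing in for a single-cycle normalize/CLZ instruction, and assuming arithmetic right shifts) are all assumptions for the example:

#include <stdint.h>

typedef struct {
    int16_t m;  /* plain signed mantissa, no hidden bit */
    int16_t e;  /* exponent: value = m * 2^e */
} twf;

static twf twf_mul(twf a, twf b)
{
    int32_t p = (int32_t)a.m * b.m;  /* the chip's ordinary signed multiply */
    int     e = a.e + b.e;
    twf     r = { 0, 0 };

    if (p == 0)
        return r;

    /* Renormalize: a DSP's normalize instruction does this in one
       cycle; the loops are the portable stand-in. */
    while (p > INT16_MAX || p < INT16_MIN) { p >>= 1; e++; }
    while (p <= INT16_MAX / 2 && p >= INT16_MIN / 2) { p <<= 1; e--; }

    r.m = (int16_t)p;
    r.e = (int16_t)e;
    return r;
}

Note there is no sign field and no hidden bit to unpack: the ordinary signed multiply does all the work, which is exactly the appeal of the format.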

There are some fairly good introductions on the web (some in pdf, from memory), but I'm afraid that I don't have them handy. The suggestions to google are good ones.

Here are a couple of other random suggestions:

If your need for floating point (for dynamic range reasons) is on the real-time critical path, so it has to be time/power efficient, you can often get away with what's known as "block floating point". That is, a collection of calculations (the passes of an FFT, for example) might usefully share a single exponent. That doesn't give you quite as even a dynamic range/precision trade-off as conventional floating point, but it makes the bulk of the work look more like fixed point, while still retaining some of the dynamic range advantage.
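
For illustration, a minimal C sketch of the idea: one shared exponent per buffer, renormalized against the block's peak magnitude. The names, the 16-bit data width, and the arithmetic-shift assumption are mine, not from any particular implementation:

#include <stdint.h>

typedef struct {
    int16_t *data;  /* the whole buffer shares one exponent */
    int      len;
    int      exp;
} bfp_block;

/* Scale the block so its largest sample is fully left-justified,
   folding the common shift into the single shared exponent. */
static void bfp_normalize(bfp_block *b)
{
    int16_t peak = 0;
    int     shift = 0, i;

    for (i = 0; i < b->len; i++) {          /* find peak magnitude */
        int16_t v = (b->data[i] < 0) ? (int16_t)~b->data[i] : b->data[i];
        if (v > peak) peak = v;
    }
    if (peak == 0)
        return;
    while (peak < 0x4000) {                 /* count reclaimable headroom */
        peak <<= 1;
        shift++;
    }
    for (i = 0; i < b->len; i++)            /* values fit by construction */
        b->data[i] = (int16_t)(b->data[i] << shift);
    b->exp -= shift;
}

Between FFT passes you run something like this once per block, so the per-sample work in the butterflies stays pure fixed point.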

One application area that I am familiar with that requires vast dynamic range is anything that does pattern matching with hidden Markov models (or similar). Most of the fixed-point DSP implementations of these algorithms meet the precision/range trade-off by performing the arithmetic in the log domain. This requires log() and exp() functions to get in and out, but the win can be large if a large amount of processing has to take place in between. [The use of log arithmetic also helps to explain a virtue of Viterbi searches, as opposed to forward/backward or the like: additions are replaced by maximums.]
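
A small illustrative sketch of that point, in C: multiplies of probabilities become adds in the log domain, a true probability add needs the "log-sum-exp" identity, and Viterbi sidesteps even that by keeping only the maximum:

#include <math.h>

/* log(exp(la) + exp(lb)), computed without leaving the log domain.
   Pulling out the larger argument keeps exp() from underflowing. */
static double log_add(double la, double lb)
{
    if (la < lb) { double t = la; la = lb; lb = t; }
    return la + log1p(exp(lb - la));   /* lb - la <= 0, so exp() is safe */
}

/* The Viterbi replacement: the addition becomes a maximum. */
static double log_max(double la, double lb)
{
    return la > lb ? la : lb;
}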

Hope some of these rambles help. Or at least offer some more search terms to help narrow down google's focus.

Cheers,

--
Andrew
Reply to
Andrew Reilly

Actually, I don't use hidden bit notation, at all. Everything explicit. The mantissas are signed (or unsigned) as needed, and I don't use a separate sign bit in some other field. I was talking generally at that point and hinting at common format standards.

Yup. On the ADSP-21xx processors I referenced there is a fully combinatorial 32-bit barrel shifter with the ability to find the leading '1' in a single cycle and report the required shift.

Jon

Reply to
Jonathan Kirwan

... snip ...

You can find a complete example in the Dr. Dobb's Journal archives. I published a complete system for the 8080 there about 25 years ago. Its purpose was to supply dynamic range, and it used a 16 bit significand with an 8 bit exponent. The result was much faster than anything else available at the time, because it could all be done in registers, and in addition it was re-entrant. The system included i/o procedures, transcendentals, etc. and had over/underflow detection. The system underwent minor revisions and continuous use in the ten years or so after publication, and processed the majority of tests in a 1000 bed hospital for much longer; i.e., it was reliable and accurate.

--
Chuck F (cbfalconer@yahoo.com) (cbfalconer@worldnet.att.net)
   Available for consulting/temporary embedded and systems.
     USE worldnet address!
Reply to
CBFalconer

You misunderstand. Please actually read my posts before you argue with things I did not say.

That's _your_ topic at hand, and I understand you; for the most part I agree with you. If you read my retraction, where I remembered (and fessed up) that I was comparing a whole integer package that does slow things down with floating point, you'd realize that.

_My_ topic is that however it's done, the '2812 in specific is _very_ good at emulated floating point -- probably not 1:1, but I believe it's way better than 100:1. That's why I didn't waste my time reading the paper about the dsPIC (unless you're trying to point out that its relative performance is as good as the '2812? Do you have benchmarks?).

I quoted the speedup (or lack of slowdown) between the '2812 and the Pentium because the Pentium has your floating point hardware AND IT IS SLOWER than the '2812.

So far you've quoted the ADI part and the Microchip part, but you haven't addressed _my_ topic, which is that the '2812 IN SPECIFIC has better floating point vs. integer performance than anything else I've personally worked with -- including the Pentium, which has floating point hardware and should blow it away.

--

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
Reply to
Tim Wescott

And why add it, if everything coming in is integer and everything going out is integer?

The problem is, what if you developed/tested all your floating point control laws code on a PC based simulator (e.g., Matlab/Simulink) and now want to put that code between an A/D and D/A in an embedded product? It's easier just to drop it into a processor that supports the identical floating point format as the simulator than to rewrite it in fixed point math.

Reply to
bungalow_steve

Cheap, fast (easy), good: pick any two.

Best regards, Spehro Pefhany

--
"it's the network..."                          "The Journey is the reward"
speff@interlog.com             Info for manufacturers: http://www.trexon.com
Embedded software/hardware/analog  Info for designers:  http://www.speff.com
Reply to
Spehro Pefhany

I think of it more as a nonrecurring vs recurring expense tradeoff: rewrite the code in fixed point and save on recurring costs, or don't rewrite and save on nonrecurring costs. Old business problem.

Reply to
bungalow_steve

For light DSP on conventional processors I usually end up writing a little fixed-point math package that (a) allows multiplication as fractional numbers and (b) automatically saturates additions and subtractions. This is usually a much better fit for implementing filters and the like, yet is much faster than floating point on most machines (including the Pentium, oddly enough).

It's easy enough to simulate the effects of the fixed-point math in Simulink or whatnot by the judicious use of quantization. If you get ambitious enough you can even build a filter library in C or C++, with a matching library of blocks in your simulation program.
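
As a rough sketch of what such a package might look like (the q15 name and Q15 scaling are my assumptions for illustration; Tim's actual package may differ):

#include <stdint.h>

typedef int16_t q15;   /* fraction in [-1, 1), scaled by 2^15 */

/* Fractional multiply: Q15 x Q15 -> Q30, round, shift back to Q15.
   The one overflow case (-1 * -1) is clamped to the largest fraction. */
static q15 q15_mul(q15 a, q15 b)
{
    int32_t p = ((int32_t)a * b + (1 << 14)) >> 15;
    return (q15)(p > INT16_MAX ? INT16_MAX : p);
}

/* Saturating add: clamp at the rails instead of wrapping around,
   which is almost always what a filter wants. */
static q15 q15_add(q15 a, q15 b)
{
    int32_t s = (int32_t)a + b;
    if (s > INT16_MAX) s = INT16_MAX;
    if (s < INT16_MIN) s = INT16_MIN;
    return (q15)s;
}

The saturation is the point: a wrapped overflow in a feedback filter produces a full-scale glitch, while a saturated one merely clips.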

--

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
Reply to
Tim Wescott

It would be wonderful if that were on the web somewhere or in a .DOC from a site you have. Do you have separate rights to it? Or can you modify it and regain the right to make it publicly available?

Jon

Reply to
Jonathan Kirwan

There's nothing wrong with capitalizing on your prior efforts. If you've already tested your algorithms extensively using floating point *and* if you will be using a floating point package that EXACTLY duplicates the behavior you tested on the PC, then you are in luck and it makes sense. But your comment seems just a little cavalier to me, so I'll expand on my point.

What many people do NOT realize is that floating point arithmetic introduces unexpected (to some) issues, such as the fact that A*(B-C) is not the same as A*B-A*C in floating point, while it IS the same in integer domains. Floating point is ALSO not the same in behavior from one implementation to another, with many implementations introducing unexpected behaviors in the lower order bits that differ widely; these include non-monotonicities in various functions of various sizes, different support for features, and hidden rounding controls that either aren't well documented and/or aren't even properly initialized at startup and will have to be tracked down on the target.
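
A tiny, self-contained demonstration of that first point; the specific values are arbitrary, and any values that round differently will do:

#include <stdio.h>

int main(void)
{
    float a = 1.0e7f, b = 1.0000001f, c = 1.0f;
    float lhs = a * (b - c);    /* the subtraction here is exact */
    float rhs = a * b - a * c;  /* a*b rounds away the tiny difference */

    printf("a*(b-c) = %.9g\n", lhs);
    printf("a*b-a*c = %.9g\n", rhs);
    printf("equal: %s\n", (lhs == rhs) ? "yes" : "no");  /* "no" here */
    return 0;
}

On IEEE single precision this prints roughly 1.19209 for the first form and exactly 1 for the second, because a*b lands in a region where floats are spaced a whole unit apart.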

Just supporting an "identical floating point format" is often enough NOT CLOSE to supporting identical behaviors in operating on those formats. And though it may look good at first blush, you may discover problems well after you've started to ship product. And then it can become very expensive to remedy.

I spend a LOT of time carefully studying any floating point software package before I dare to rely on it and I use what I learn about its behavior in my design analysis for the algorithms. This extra work is one reason I often avoid floating point entirely. It's hard to do and it takes time to verify the impact upon equations and to assure one's self through theory that the bounds of the errors are in the worst case entirely acceptable.

I can't say this about anyone here in particular, but my long experience in this has been that programmers I've been exposed to simply do not have the training (self-taught or otherwise) to do the analysis carefully, are almost completely ignorant of the pitfalls in using floating point, are blindly cavalier and overconfident about its safety, and instead depend on "random" testing of the entire system to assure themselves that things are okay.

Of course, sometimes it is a trivial result that the use of the floating point is entirely safe -- for example, say, in converting an integer fixed point internal Fahrenheit value into a rounded/truncated Celsius value for display (there are all-integer ways, but FP is ... convenient and readable.) But quite often equations are developed and handed over to programmers by physicists or engineers who are NOT cognizant of FP "gotchas" and implemented by programmers who are also NOT cognizant of them and things "slip through the cracks" between those wielding equations in a perfect mathematical world and those converting them to coded sequential operations in an embedded processor.
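
For instance, here is a hedged sketch of that conversion both ways -- FP for readability, and an all-integer version with explicit round-to-nearest for comparison (the function names are made up for the example):

#include <stdio.h>

/* Readable FP version: scale, then round toward nearest. */
static int f_to_c_float(int f)
{
    return (int)((f - 32) * (5.0 / 9.0) + ((f >= 32) ? 0.5 : -0.5));
}

/* All-integer version: the numerator is exact, and the divide by 9
   is biased by half the divisor to round to nearest. */
static int f_to_c_int(int f)
{
    int num = (f - 32) * 5;
    return (num >= 0) ? (num + 4) / 9 : (num - 4) / 9;
}

int main(void)
{
    int f;
    for (f = -40; f <= 212; f += 42)
        printf("%4d F -> %4d C (fp) %4d C (int)\n",
               f, f_to_c_float(f), f_to_c_int(f));
    return 0;
}

Here the FP path is trivially safe because the values and the rounding are well inside what single or double precision handles exactly enough; that triviality is exactly what has to be *established*, not assumed, in the general case.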

So I tend to be one of those suggesting that unless you are competent at FP you stay away from it. And if your application requires a wide dynamic range and you need to maintain a similar precision throughout, then by all means use floating point, but think carefully about its use. I consider getting the algorithms working well on a PC a reasonably good idea to ferret out and eliminate certain kinds of important errors. But I also do NOT think, assuming that the same format is used (and having the same number of bits does NOT mean the 'same format'), that it is then a necessary consequence that all is right with the world. There are just too many vagaries to contend with, and in the floating point domain they can rise up and bite you where you least expect it if you haven't bothered to think through them to eliminate their threats.

(1) if your application isn't floating-point heavy and most of it is in dealing with the external world, timing, stuff like that... then perhaps the risks of being ignorant are lower. But still, (a) it probably isn't really that much work to just avoid the floating point altogether, (b) your memory footprint will benefit from not linking in FP code, too, and (c) any testing you did on the PC for those floating point operations should really just be duplicated on the target where you *know* it applies (and it won't be so difficult to do, since they are a small part of the overall application);

(2) if your application is floating-point heavy, then the risks of some difference in handling is just that much more likely to accumulate into a serious problem. For all that testing you did on the PC, which was decidedly convenient and important to do, it still cannot be assumed to operate exactly the same on your target (unless the target is a Pentium, I suppose.) And you'll need to either carefully think through the differences or else duplicate the work, perhaps.

Floating point is one of those very convenient Ginsu knives that you see doing such wonderful things in the hands of a skilled practitioner. But in the hands of someone who is unskilled and cavalier and ignorant of its dangers, while it will often do what is expected and thus feed overconfidence in its use, it's also much more likely to cut off one's finger than to chop carrots.

Of course, time to market, etc., will impact choices and risks taken. Your mileage may vary, etc. Just a word to the wise, is all.

Jon

Reply to
Jonathan Kirwan

"CBFalconer" wrote

I found a lot of older (and some current) medical applications use floating point BCD for results calculation. Data was processed serially a nibble/digit at a time, as in a 4-bit calculator CPU.

The math-pac was always a home brew, and buggy - but then in consulting all you get to see are other folks' bugs. Nobody hires a consultant to come in and fix a success (although if I did government work I am sure that would change).

I have never had a client give a rational reason for using BCD. Lots of paranoia, but nothing rational.

--
Nicholas O. Lindan, Cleveland, Ohio
Consulting Engineer:  Electronics; Informatics; Photonics.
Remove spaces etc. to reply: n o lindan at net com dot com
psst.. want to buy an f-stop timer? nolindan.com/da/fstop/
Reply to
Nicholas O. Lindan

Equivalent to saying to the client "How would you like your project: late, over budget or buggy?"

--
Nicholas O. Lindan, Cleveland, Ohio
Consulting Engineer:  Electronics; Informatics; Photonics.
Remove spaces etc. to reply: n o lindan at net com dot com
psst.. want to buy an f-stop timer? nolindan.com/da/fstop/

Reply to
Nicholas O. Lindan

"Tim Wescott" wrote

When the filter parameters are constant, the fastest method is to hardcode the math in assembly as sequences of shifts and adds.
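
For example (coefficient chosen purely for illustration): 29/32 = 1 - 1/16 - 1/32, so a multiply by 0.90625 collapses to two shifts and two subtractions, shown here in C as a stand-in for the assembly:

/* x * 29/32 = x * 0.90625 via shifts and subtracts; exact when x is
   a multiple of 32, truncated/floored otherwise (arithmetic shifts
   assumed for negative x). */
static int scale_29_32(int x)
{
    return x - (x >> 4) - (x >> 5);
}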

--
Nicholas O. Lindan, Cleveland, Ohio
Consulting Engineer:  Electronics; Informatics; Photonics.
Remove spaces etc. to reply: n o lindan at net com dot com
psst.. want to buy an f-stop timer? nolindan.com/da/fstop/
Reply to
Nicholas O. Lindan

If I were to guess, I'd say it comes down to visualization or perhaps mental laziness. Everyone can make the jump from ten fingers to a BCD digit. And not having to write the conversion routines saves half the work.

Reply to
Jim Stewart

True, and if I were working on products at high enough volumes, or on small enough processors, to justify it, that's just what I'd do. In fact that's how my first few experiences with doing DSP on conventional processors went.

Aside from those early experiences I've always worked on things that ship a few hundred units a year at best, and in the context of a large SW engineering team of whom I'm the most accomplished at DSP. Given those conditions it's a better economic tradeoff to buy a faster processor and code in a high-level language -- that keeps the engineering time down and gives me the hope that I can do other things with my time than write numeric processing code.

On a DSP the way I've done it is to write a fast vector dot product and a fast matrix multiply in assembly, then wrap that with C or C++ to generate the gain vectors (or matrices). I get nearly all the ease-of-use of the higher level language and nearly all the speed of assembly, which is pretty nice.
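
A sketch of that structure, with the kernel shown as portable C standing in for the hand-tuned assembly MAC loop (the names are invented for the example):

#include <stdint.h>

/* The kernel: on the DSP this is the hand-written single-cycle
   multiply-accumulate loop. */
static int32_t dot_q15(const int16_t *a, const int16_t *b, int n)
{
    int32_t acc = 0;
    while (n--)
        acc += (int32_t)(*a++) * (*b++);
    return acc;
}

/* The C wrapper: an FIR filter step expressed in terms of the
   kernel. No saturation shown; a real wrapper would clamp. */
static int16_t fir_step(const int16_t *coef, const int16_t *hist, int taps)
{
    return (int16_t)(dot_q15(coef, hist, taps) >> 15);  /* Q30 -> Q15 */
}

All the gain-vector generation and bookkeeping stays in the high-level language; only the inner loop pays the assembly tax.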

--

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
Reply to
Tim Wescott

I never gave up any rights - when originally published DDJ did not pay anything. When they published their later book of "DDJ for the year ..." I gave them further permission to reprint. I lost my sources some years ago in a disk crash, although I had promulgated them to some others before then. Now all I have is a hard copy listing of the version used in my Pascal system, and possibly faulty copies typed by a French gentleman in comp.os.cpm.

Whether DDJ has it available in anything other than scans I do not know. There is no great demand for 8080 assembly code today, especially since the Rabbit doesn't even implement the full 8080 instruction set. I believe one of the critical things it misses is the XTHL instruction. So does the 8086, thus making it impossible to preserve all registers at all times.

Talk in this thread of emulating FP processors seems ridiculous. The FP processors themselves were attempts to speed up the FP routines. Other methods included hardware instructions to ease justification, multiplication, division, etc. Some systems even broke up division by having a divide-step instruction. For example, as eventually (not in the DDJ issue) implemented in my system, 16x16 -> 32 multiplication was done by two 8x16 -> 24 bit operations and a summation. This was about 50% faster.
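
In C terms, the decomposition looks something like this; mul8x16 stands in for the 8080 routine that produced each 24-bit partial product, and the unsigned case is shown for simplicity:

#include <stdint.h>

static uint32_t mul8x16(uint8_t a, uint16_t b)
{
    return (uint32_t)a * b;   /* 8x16 -> at most 24 bits */
}

/* 16x16 -> 32 as two 8x16 partial products and one shifted sum. */
static uint32_t mul16x16(uint16_t a, uint16_t b)
{
    uint32_t lo = mul8x16((uint8_t)(a & 0xFF), b);   /* low byte of a */
    uint32_t hi = mul8x16((uint8_t)(a >> 8), b);     /* high byte of a */
    return lo + (hi << 8);                           /* align and sum */
}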

If there is any real demand I can put it up for download on my page, in the form I received it from Arobase, i.e., totally unverified. I do not have facilities for scanning my hard copy.

--
Chuck F (cbfalconer@yahoo.com) (cbfalconer@worldnet.att.net)
   Available for consulting/temporary embedded and systems.
     USE worldnet address!
Reply to
CBFalconer

Yes it seems we are both talking about different topics, limitations of newsgroup communication I suppose. See ya.

Reply to
bungalow_steve
