fixed point math library?

Hi all,

I use Newlib for an embedded system that supports a fixed point format. The Newlib math library uses IEEE 754, so I have to fix it. After porting a few math functions, I found things are not easy. I tried to look for an open source fixed point math library but found nothing. Can anyone help me?

Thanks a lot.

Reply to
ovn

Hmmm, there isn't a magic switch to convert from floating point computation to fixed point. In general you need to consider the precision and range requirements of all your variables. You have to know the value ranges well.

You choose a scaling factor for each variable or variable class that prevents overflow for the largest magnitude values and preserves sufficient accuracy for all values. The scaling factor can change from one variable to another. To perform arithmetic between numbers with different scaling factors, you need to do adjustments: align operands for addition and subtraction, do appropriate scaling after (possibly before) multiplication and division. You typically need more precision for a temporary result than for the final value. The scaling of the result is also determined by the scaling for the variable in which it is stored.
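A minimal sketch of that bookkeeping in C (the Q-format conventions and helper names here are mine, purely illustrative, assuming 16-bit storage and 32-bit intermediates):

#include <stdint.h>

/* A variable scaled Qn holds (real_value * 2^n) in an int16_t. */

/* Add values with different scale factors: align both operands to
   the result's scale first.  Assumes out_frac >= a_frac and b_frac,
   and that the caller has ruled out overflow. */
int16_t fx_add(int16_t a, int a_frac, int16_t b, int b_frac, int out_frac)
{
    int32_t wa = (int32_t)a << (out_frac - a_frac);
    int32_t wb = (int32_t)b << (out_frac - b_frac);
    return (int16_t)(wa + wb);
}

/* Multiply: Qa times Qb yields Q(a+b), so widen to 32 bits and shift
   back down to the destination's scale.  Assumes a_frac + b_frac >=
   out_frac. */
int16_t fx_mul(int16_t a, int a_frac, int16_t b, int b_frac, int out_frac)
{
    int32_t wide = (int32_t)a * b;          /* 16x16 -> 32 */
    return (int16_t)(wide >> (a_frac + b_frac - out_frac));
}

For example, multiplying a Q12 value by a Q8 value into a Q10 destination shifts the 32-bit product right by 10.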

I seem to recall that an ISO C subcommittee proposed a set of extensions for embedded use. Included in the proposal was a fixed point arithmetic facility, but it only had a single fixed scale factor, which makes it fairly limited, in my opinion. I don't know of any implementation of those proposals, but you could search.

FWIW, I once used a compiler with built-in fixed point arithmetic in which you could specify the scaling for each variable. It was TI Pascal, some 27 years ago. The bugs I found told me that it hadn't been used much. I believe that PL/I also had binary fixed point.

--
Thad
Reply to
Thad Smith

Ada has built-in support for fixed point math. Ada is supported by GNAT, part of the open source GCC.

Reply to
Colin Paul Gloster

This is a real weakness in the embedded systems community, and I share your pain. Especially for low-end micros, these building blocks should be available.

I also have never found such a thing.

The closest thing I've seen is GMP (the GNU Multiple Precision arithmetic library). Some algorithms I use have been obtained from examining GMP, but I have to trim them for fixed-size operands.

Dave.

Reply to
David T. Ashley

Crenshaw's "Math Toolkit for Real-Time Programming" might be helpful.

Regards, -=Dave

--
Change is inevitable, progress is not.
Reply to
Dave Hansen

You might also look to David Hanson's "C Interfaces and Implementations," where he develops an arbitrary precision library in three major steps and explains the process as he goes.

For embedded systems, though, the problem usually isn't access to such arbitrary precision libraries. GMP is aimed at impressive numbers of digits, well beyond the precision of a double; it also appears to include arbitrary precision integers and rational numbers.

(I've developed a rational fraction library for my own use, relying in part on GCD to keep them in lowest terms, where possible, and using continued fractions to develop nearby rational values where the exact one cannot be represented. But I can't recall being tempted to use the package in an embedded application.)
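For what it's worth, the lowest-terms part is just Euclid's GCD applied after each operation. A minimal sketch (my own illustrative code, not the library mentioned above):

#include <stdint.h>

/* Euclid's algorithm. */
static uint32_t gcd_u32(uint32_t a, uint32_t b)
{
    while (b != 0) {
        uint32_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

typedef struct { int32_t num; uint32_t den; } rat_t;

/* Reduce a rational to lowest terms.  Assumes den != 0 and
   num > INT32_MIN (so the negation below cannot overflow). */
static rat_t rat_reduce(rat_t r)
{
    uint32_t g = gcd_u32((uint32_t)(r.num < 0 ? -r.num : r.num), r.den);
    if (g > 1) {
        r.num /= (int32_t)g;
        r.den /= g;
    }
    return r;
}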

For embedded use, wanting more than bare integers, but without the memory and execution-time cost of doubles and floats, usually means getting integer performance and little or no library footprint (inline integer code instead), yet with the expressive simplicity of using doubles. I'm not sure that even the ISO/IEC JTC1 SC22 WG14 N1021 (DTR 18037) extensions to C get what's often desired. And I haven't ever used a compiler supporting them, anyway.

For example:

f = c * 1.8 + 32.0; /* convert degrees Celsius to Fahrenheit */

But let's say you want to avoid floating point library overheads and would like fairly fast execution and only need things in tenth-degrees for either Fahrenheit or Celsius. So you'd like a fixed format for both of them and you'd like to write as simply as above, yet get all the speed and tiny size from using integers. Often, this just means mentally deciding that 'c' is in tenths and 'f' is, too:

f = (c * 18 + 3200) / 10;

This uses an integer 'c' and integer 'f'. Let's assume a 16-bit default width here. Now the problem might come from the following: in this application, the largest 'c' value will be 500 degrees Celsius. Stored in tenths, that is up to 5000. And if we multiply by 18, the result will be 90000 before we get a chance to divide by 10, not to mention the 3200 that must be added. So this is a problem.

One might then think to just:

f = (int) ((c * 18L + 3200L) / 10L);

But now you've invoked a conversion of 'c' to a long, a long-by-long multiplication, and a long-by-long division. What is really wanted is an integer-by-integer multiplication producing a long, followed by a division of the 32-bit numerator by a 16-bit integer, which is actually quite easy and fast and very sensible for an application like this. But it is a mixed-size operation, and the C language doesn't support the semantic concept, except through library calls.
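To make the point concrete, here is roughly what such helpers look like; the names are hypothetical, and whether the compiler actually narrows the multiply to a single 16x16->32 operation (rather than a full 32x32) depends entirely on the target and the compiler:

#include <stdint.h>

/* Multiply two 16-bit values into a 32-bit intermediate, then divide
   by a 16-bit value.  A good 16-bit compiler may recognize the
   (int32_t)a * b idiom and emit one hardware 16x16->32 multiply; the
   caller must ensure the quotient fits and div is nonzero. */
static int16_t muldiv16(int16_t a, int16_t b, int16_t div)
{
    int32_t num = (int32_t)a * b;
    return (int16_t)(num / div);
}

/* The Celsius-tenths to Fahrenheit-tenths conversion, written with
   the same idiom: */
int16_t c_to_f_tenths(int16_t c)
{
    return (int16_t)(((int32_t)c * 18 + 3200) / 10);
}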

Which brings us to library calls. And then, unfortunately, the loss of the simplicity of algebraic expressions and the ease with which we can read and maintain them. The order of calls becomes important, etc.

Which brings this all back to wanting a compiler to accept:

f = c * 1.8 + 32.0; /* convert degrees Celsius to Fahrenheit */

But where it somehow manages to "know" what we really want, if we are good enough to specify the 'f' and 'c' as being in a fixed format. But I've yet to see a crafted analysis on exactly how a compiler might do all of the internal analysis to get there. Perhaps someone can point me to it?

To make this issue more pointed, here is an exact equation I've needed to use:

floor( (WN*BN) / (WD*BD) * [2*(Q+12)] + 0.5 )

In this case, WN, BD, WD, BN, and Q are all variables. WN, WD, and Q may range over almost the full span of 16-bit integers, while BN and BD are probably limited to 8-bit ranges. I need to compute an accurately rounded value (which is why the +0.5 at the end). Typical (and real) values might be: WN=3125, WD=2176, BN=17, BD=10. Q will range from no less than 1 up to about 40,000.

But let's say that these actual values aren't quite always as big as 16 bits, so that I can simplify the point a little bit. In other words, allow me to say that I, as a programmer, know a priori that the product of WN and BN will fit in a 16-bit unsigned integer, in all cases. Let's also say I happen to know that WD*BD will similarly always fit inside an unsigned 16-bit integer, in all cases. I can also, let's say, assert that 2Q+24 always fits within an unsigned 16-bit integer. I know all these things, up front.

So, what I'd really like to have done is to perform a 16x8 multiply (or maybe a 16x16) into a 16-bit result, then perform a 16x16 multiply into a 32-bit result (which would now be my 32-bit numerator). I'd then like to perform a 32-bit by 16-bit divide (which is fast and produces both a quotient and a remainder value). I then examine the remainder and round, as appropriate. Very simple, very fast. No floating point, no complex libraries. Just a simple multiplication routine and a simple division routine, with quotient and remainder as results.
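Under exactly those assumptions, the sequence sketches out like this in C (the function name is mine; a hardened version would add the overflow pre-checks discussed below):

#include <stdint.h>

/* floor( (WN*BN)/(WD*BD) * [2*(Q+12)] + 0.5 ), all in integers.
   Assumes wn*bn, wd*bd, and 2*q+24 each fit in a uint16_t, and that
   wd*bd is nonzero. */
uint32_t scaled_ratio(uint16_t wn, uint8_t bn,
                      uint16_t wd, uint8_t bd, uint16_t q)
{
    uint16_t num16 = wn * bn;                /* 16x8 -> 16, by assumption */
    uint16_t den   = wd * bd;                /* 16x8 -> 16, by assumption */
    uint16_t scale = 2u * q + 24u;           /* 2*(Q+12), by assumption   */

    uint32_t num  = (uint32_t)num16 * scale; /* 16x16 -> 32 numerator     */
    uint32_t quot = num / den;               /* 32/16 divide...           */
    uint32_t rem  = num % den;               /* ...quotient and remainder */

    if (2 * rem >= den)                      /* round to nearest */
        quot++;
    return quot;
}

With the typical values above and, say, Q=100: num16 = 3125*17 = 53125, den = 2176*10 = 21760, scale = 224, so num = 11900000, giving quotient 546 and remainder 19040; since 2*19040 >= 21760, the rounded result is 547.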

There is more, though. What if this calculation is used where precision is important to maintain? Let's say I need to maintain a fixed number of bits of precision and would like a separate exponent tracked, which may range from 2^-2 to 2^+2, hypothetically. What about overflow, too? Now my 32-bit by 16-bit divide routine needs to pre-check its inputs to make sure that there won't be overflow (easily checked, but necessary), and also to shift both the numerator and denominator appropriately so as to maximize the significant bits in the quotient, while returning the net shift count as well.
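A hypothetical sketch of such a normalizing divide (the interface and names are mine; it shifts only the numerator, though shifting the denominator too would preserve more precision, and it assumes a nonzero denominator):

#include <stdint.h>

/* 32/16 -> 16 divide that first normalizes the numerator so the
   quotient keeps as many significant bits as possible.  Returns the
   quotient; *net_shift receives the binary exponent such that the
   true value is quotient * 2^net_shift. */
uint16_t norm_div(uint32_t num, uint16_t den, int8_t *net_shift)
{
    int8_t shift = 0;

    if (num == 0) {
        *net_shift = 0;
        return 0;
    }

    /* Pre-check: a straight divide overflows a 16-bit quotient when
       num >= den * 2^16, so shift the numerator down until it fits. */
    while ((num >> 16) >= den) {
        num >>= 1;
        shift++;                  /* result halves, exponent goes up */
    }

    /* Then shift the numerator up while the quotient still fits, to
       maximize the significant bits of the result. */
    while (num < 0x80000000UL && (num >> 15) < den) {
        num <<= 1;
        shift--;                  /* result doubles, exponent goes down */
    }

    *net_shift = shift;
    return (uint16_t)(num / den);
}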

All this can be done in integer formats on processors without floating point support in the hardware and without any real need for a generalized IEEE 754 support package. It's very much faster than floating point, yet quite useful and supporting a wider dynamic range than one might normally expect with fixed point.

Fixed point, taken broadly to mean not just fixed point proper but any all-integer approach that comes close to floating point while keeping close control over the exact operations used, is interesting. I'd like to see some careful thinking on how it might be done in a useful, tight fashion that brings in only what is needed, when it is needed, and with ease of understanding.

Jon

Reply to
Jonathan Kirwan

There's no reason why a compiler couldn't be made to analyze the expression above, and optimize it to use 16x16->32 bit multiplication, and 32/16->16 bit division. I've seen compilers call dedicated library functions for divisions by 10, which is a similar optimization technique.

Reply to
Artenz

The C compiler (assuming C for a moment) would need #pragmas in order to understand the ranges allowed for the variables. Otherwise, it may not be able to properly guess about them.

I'd like to see a clear enunciation of the algorithms applied to the DAGs for this. I can imagine something, but it requires more information provided by the programmer.

Jon

Reply to
Jonathan Kirwan

Not necessarily. If the compiler sees a 32x32 bit multiplication where both operands are created by promoting a 16 bit to 32 bit, the compiler can safely replace the 32x32 bit by 16x16->32 bit. This would capture your example. In fact, I would expect a compiler to optimize the multiplication by 18 by replacing it with c*16 + c*2 on platforms where this is faster.
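For instance (illustrative only; shown unsigned to sidestep the question of shifting negative values):

#include <stdint.h>

/* c*18 as two shifts and an add: 18 = 16 + 2. */
uint32_t times18(uint16_t c)
{
    uint32_t w = c;
    return (w << 4) + (w << 1);
}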

Of course, if you only effectively use 10 bits of your variable 'c', a C compiler would typically not be able to infer that without #pragmas.

Reply to
Artenz

I suppose it could. Do you know of a C compiler that does this? My understanding is that the standard is explicit about the result type in cases like this, and changing that would change the meaning of the program.

Also, my other examples capture the broader question, which you don't deal with here: it's possible for these values to be combined so that there is a 16x16x16 in the numerator, but where it is known by the programmer that the result fits in a 32-bit result. The compiler cannot know this. The same point holds for the denominator, where there may be a 16x16 whose result is known a priori to fit in a 16-bit result.

And then there are other questions I also brought up.

The point being that a generalized approach, though I suspect it could be carefully considered and worked through, hasn't really been done with close attention to the needs of folks working on 8-bit CPUs.

It could, of course, if that value is a constant. My problem with this argument is that it focuses on a few narrow cases without really considering the broad issues that David brought up in lamenting a lack. A few special cases just don't cut it.

A lot of issues pull us right back to that.

Jon

Reply to
Jonathan Kirwan

As long as a compiler produces the same results as the standard requires, it is allowed to perform any transformations on the code. That said, I'm not aware of any compilers inferring 16x16->32 bit multiplication, though, and certainly not more exotic variants, like

14x18->32

I agree. However, I wonder if generating optimized versions of the multiplication functions is a good idea. If you're working with a number of different precisions in one project, this would require multiple library functions, adding to the code size. On most 8 bit projects I've worked on, code size was a bigger concern than speed. YMMV, of course.

The market for such compilers on 8 bit platforms is probably limited too. Dedicated hard-core developers can implement (parts of) the code in assembly, and the growing group of lazy C programmers may not care enough to notice.

Given the shrinking geometries, it makes less and less sense to use 8 bit CPUs anyway. When they were first invented, a transistor was 10 microns across. Right now, 90 nm is getting popular, which means that you can now fit a simple 8 bit CPU in the area of a single old transistor. The cost differential between 8 and 32 bit, based on real silicon area, is close to zero. Given that, I don't expect compiler/language designers to spend significant efforts on optimizing a dying platform.

Reply to
Artenz

I think the key here is "same results." The promotions are part of the result.

Well then, that's your argument for David, I suppose. There's no market for it.

I carefully think through the numerics of the code I write -- or try to. This means that I do write my own customized math libraries, as needed. I have existing sources which guide me and save time, of course. Which brings us to David's comment and a desire for something that may be missing and useful for embedded applications in the "tiny arena."

And I don't, not at all, agree with your tendency to regard 32 bit CPUs as nearly equivalent from all meaningful perspectives (or soon to be.) Power consumption and die size are important factors to me. I need to be able to hybridize modules using dice, for example -- small is VERY IMPORTANT and, when these are cooled, so is power dissipation, as that directly impacts the size and cost and utility of the final hybrid. The "cost differential" as you too glibly put it above is very wide in such cases. And flash and RAM take up very large areas, by the way. I think the ALU is rather tiny, for most I've examined, by comparison. But perhaps an expert on this can clarify the details.

There are other reasonings, but it won't serve to debate them. I simply tell you that I'm not using any 32-bit CPUs right now (other than my PC, of course ;)

Jon

Reply to
Jonathan Kirwan

Certainly. The compiler can only optimize if the promotions don't matter, or are cast away by converting the result to a smaller type. For example, the standard says that chars are to be promoted to ints before arithmetic operations, but (char)(a + b), where a and b are chars, can be implemented by an 8 bit addition, with equivalent results.
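For example (my illustration):

/* Both operands are promoted to int before the add, but the result
   is immediately narrowed back, so an 8-bit add gives the identical
   answer and the compiler is free to emit one. */
unsigned char add8(unsigned char a, unsigned char b)
{
    return (unsigned char)(a + b);
}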

Note that with "32 bit" I don't necessarily mean ARM, or other RISC CPUs. These cores are fairly big (although you can fit several modern ARM cores in a square mm), because they've been optimized for speed, not size. I think there would be a market for 32 bit cores better optimized for embedded needs, using a more compact instruction set, less pipelining, no memory management, etc. If you could do a 32 bit add or multiply in 1 compact instruction, that would actually save memory and power compared to the dozen or so instructions you'd need on a simple 8 bit core, and with modern geometries, you could fit the whole thing in 25x25 microns. For most applications, that's so small compared to the overhead of I/O area, packaging, or other functions embedded on the same die that it's pretty much negligible.

Reply to
Artenz

That's been on my mind through all this discussion. Another is nested functions, something I really wish I had support for in C. It would make coroutines rather trivial to implement.

So, are we ready for a community project to develop a better embedded mouse-trap? ;)

Jon

Reply to
Jonathan Kirwan

Sure. Go ahead and invent Pascal and Modula. Oh, and Ada while you're at it. ;)

--
Grant Edwards                   grante             Yow!  What PROGRAM are
                                  at               they watching?
                               visi.com
Reply to
Grant Edwards
