I'm looking for recommendations for a software-only, fixed-point math library for an ARM9.
We have a task that must run at high priority and that eats up to 65% of the CPU time under certain circumstances.
The algorithm uses floating point math on our particular ARM9 which has no floating point coprocessors at all.
The only real limitations are that we can use a commercial, paid for implementation, or an LGPL (or similar) licensed product that would not require open-sourcing our entire embedded application. We can't use anything under GPL or equivalent that would require open-sourcing the rest of the product source.
I'll be happy to hear about anything you know of. Best of all would be pointers to packages that you have used yourself and know are fast and accurate.
A quibble about your licensing requirements - the LGPL also requires you to release your source as LGPL (or at least linkable object code plus lots of info) unless you are using dynamic linking. There is seldom a real difference between the GPL and the LGPL in embedded systems. As with any request for help or information, describe the problem ("I want code that doesn't affect the licensing of the rest of my code"), not what you think is the answer to the problem.
Regarding the real issue, you'll have to give us some more information. Are you looking for simple arithmetic functions, or do you need things like trig functions? Are you looking for fixed point or would it make more sense to use integer arithmetic? Can you make use of table lookups? Do you know your accuracy requirements? Would you be happy with a faster software floating point library? Software FP libraries vary wildly in speed - if you let us know what you are using at the moment, someone might suggest a faster alternative. Have you considered alternative algorithms? Have you worked through your code to make sure it is reasonably optimal (such as avoiding accidental doubles)? Are you using C, C++, or another language?
On Sat, 28 Mar 2009 16:11:31 +0100, David Brown wrote in comp.arch.embedded:
Mea culpa, and I've said the same to many others on usenet over the years.
Yes, I am looking for a solution that does not affect the licensing of the existing code. We are quite willing to look at purchasing commercial libraries, and have done so in some cases in the past. There is no Not Invented Here issue.
I have said the bit about providing more information many times on usenet.
So I'll be as specific as I can be.
The application runs on top of an in-house developed RT micro kernel. It is mostly C, with a small amount of ARM assembly language. Neither the kernel nor the OS use any ARM9 specific features. At the time it was first developed, we were using an older tool chain that only supported through ARM7.
The specific task that is causing a problem performs two tasks:
Validating motion requests based on the current sensors.
Validating motion requests based on current positions of multiple moveable and non-moveable parts of the system, then if the motion is allowed to start, projecting the current motions a short amount of time in the future, to detect upcoming collisions and halting motions before the collision happens.
The prediction of motions in progress for collision avoidance must run a number of times per second to be effective. If it were run less frequently, it would need to project further into the future and would generate more false positives, getting into the user's way.
The algorithm involves calculating polygons, then evaluating the nearest point between them. It uses the C float type, not double. There is already a large look-up table containing the sine and cosine values for every 0.1 degrees between 0 and 360 degrees. The resolution of all the moving axes at this level is 0.1 degree or millimeter.
I haven't reviewed the existing code, but as certain parts of the system move close to each other it starts to eat up more execution time, perhaps because it is putting progressively more points into the polygons to yield higher resolution results as they get closer.
The trig tables, originally put in the code for this task, are now used by several other tasks where they are not a performance problem, so they would have to stay. There is plenty of room in the flash, and in the external SDRAM where the flash contents are copied at power up, to put in a table of 3600 sine/cosine pairs in 32-bit fixed point, with whatever split.
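A rough sketch of how such a fixed-point table might be indexed and used, not taken from the actual code base. The Q2.30 split, the 0.1 mm length unit in the comment, and every name (trig_pair_t, trig_index, trig_lookup, scale_q30) are assumptions made for illustration; the table contents would be generated offline and are not shown.

```c
#include <stdint.h>

/* All names here are hypothetical; Q2.30 is only one possible split. */
#define TRIG_STEPS 3600                 /* one entry per 0.1 degree */
#define Q30_ONE    (INT32_C(1) << 30)   /* 1.0 in Q2.30             */

typedef struct {
    int32_t sin_q30;
    int32_t cos_q30;
} trig_pair_t;

/* Wrap any angle in tenths of a degree into the table range 0..3599. */
static int32_t trig_index(int32_t tenths)
{
    int32_t i = tenths % TRIG_STEPS;    /* C99: remainder has the sign of tenths */
    return (i < 0) ? i + TRIG_STEPS : i;
}

/* Fetch a sine/cosine pair from a table of TRIG_STEPS entries that is
   assumed to be generated offline and placed in flash. */
static trig_pair_t trig_lookup(const trig_pair_t *table, int32_t tenths)
{
    return table[trig_index(tenths)];
}

/* Scale an integer length (for example in 0.1 mm units) by a Q2.30
   value, rounding to nearest. The 64-bit product cannot overflow. */
static int32_t scale_q30(int32_t len, int32_t trig_q30)
{
    int64_t p = (int64_t)len * trig_q30;
    p += (p >= 0) ? (Q30_ONE / 2) : -(Q30_ONE / 2);
    return (int32_t)(p / Q30_ONE);      /* divide by a power of two */
}
```

With this layout, projecting a length of 1000 units onto an axis is one lookup and one call such as scale_q30(1000, trig_lookup(table, angle).cos_q30), with no floating point involved.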
I am quite experienced using fixed point on TI 2812 DSPs, although there not only is a library provided by TI, it is also sped up by hardware support on the DSP.
The problem surfaced when we changed from the original commercial tool chain to another commercial tool chain that is much better in several respects. It turns out that certain floating point operations are slower in the library supplied with the new compiler. I found out a few days ago that my coworkers working on this never thought to contact the new tool chain vendor about the library performance issue, although the vendor has an excellent reputation for good support.
It sounds like you've already covered the low-hanging fruit with your tables and having been careful about only using single-precision floats (are you also careful to write your float constants in the form 3.1415f to avoid these causing unwanted promotions to double?). Have you also checked your compiler flags - in gcc, the flags -ffast-math, -fsingle-precision-constant and -fshort-double can make a big difference. Your compiler vendor may have some advice if you ask them.
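To illustrate the point about constant suffixes, here is a minimal sketch with two hypothetical functions that differ only in the f suffix. In standard C an unsuffixed floating constant has type double, so the first version is evaluated in double and converted back to float on return, which on a soft-float target means calls into the double-precision routines.

```c
/* Unsuffixed constants are double: r is promoted, the arithmetic is
   done in double, and the result is converted back to float. */
float circumference_promoted(float r)
{
    return 2.0 * 3.1415 * r;
}

/* With the f suffix the whole expression has type float. */
float circumference_single(float r)
{
    return 2.0f * 3.1415f * r;
}
```

A quick way to check a suspicious expression is to compare sizeof(2.0 * 3.1415 * r) with sizeof(float); the type of the expression is what sizeof reports.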
I don't know anything about your algorithm (and that's a lot more detail than you want to post here!), but it is important to look through it in detail and talk it over with others to see if there are better ways of solving the same problem. Algorithmic changes can make a much larger impact than anything else.
Talking about physical objects, fixed point is often quite adequate. If your smallest dimension is 10 nm, measuring and calculating in nm will lose mathematical precision, but no physical precision that is not there. (An impressive demonstration is the designing of the SEAForth chip by Chuck Moore.)
Probably you knew that. So what is the problem in expressing everything in appropriate physical quantities (uV, m/us) and chugging along? If the problem lies in math functions, you may use a package at my site below to calculate the so-called Chebychev coefficients for any function you care to specify. (This may even be a special function, e.g. log(1+alpha(sin(x))) may be one function.) Applying them requires only evaluating a polynomial in fixed point. You can trim the order of the polynomial so as not to have more precision than you need.
There are no license issues as you are using the results of a program.
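As a sketch of what applying such coefficients in fixed point could look like, the following evaluates a Chebychev series with the Clenshaw recurrence in Q16.16. It is not the output of the package mentioned above: the Q16.16 format, the names q16_t, q16_mul and cheb_eval_q16, and the convention f(x) = c[0]*T_0(x) + c[1]*T_1(x) + ... (with no halving of c[0], which some packages do use) are all assumptions. The argument must already be mapped to [-1, 1], and the intermediate sums are assumed to stay within the 32-bit range.

```c
#include <stdint.h>

typedef int32_t q16_t;                  /* Q16.16 fixed point (hypothetical type) */
#define Q16_ONE (INT32_C(1) << 16)      /* 1.0 in Q16.16 */

/* Q16.16 multiply through a 64-bit product, rounded to nearest. */
static q16_t q16_mul(q16_t a, q16_t b)
{
    int64_t p = (int64_t)a * b;
    p += (p >= 0) ? (Q16_ONE / 2) : -(Q16_ONE / 2);
    return (q16_t)(p / Q16_ONE);
}

/* Evaluate sum over k = 0..n-1 of c[k] * T_k(x), for n >= 1 and x in
   [-1, 1], using the Clenshaw recurrence
       b_k = c_k + 2*x*b_{k+1} - b_{k+2}.
   Dropping trailing coefficients lowers the order of the polynomial. */
static q16_t cheb_eval_q16(const q16_t *c, int n, q16_t x)
{
    q16_t b1 = 0, b2 = 0;
    for (int k = n - 1; k >= 1; --k) {
        q16_t b0 = c[k] + 2 * q16_mul(x, b1) - b2;
        b2 = b1;
        b1 = b0;
    }
    return c[0] + q16_mul(x, b1) - b2;
}
```

As a sanity check, x*x equals 0.5*T_0(x) + 0.5*T_2(x), so the coefficient array {Q16_ONE/2, 0, Q16_ONE/2} evaluated at x = 0.5 should give 0.25 in Q16.16.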
(Fair to say, Chebychev polynomials are nowadays trumped by a quotient of polynomials, especially near poles of the function. They are still quite practical, especially if one has a pragmatic attitude oneself. E.g. for a tangent, with poles at pi/2, you would use sin/cos or limit the range to -pi/8,pi/8.)
Second general remark. If you find yourself using sin and cos a lot, you may discover that rewriting as matrix equations gets rid of that, and as a bonus generates fewer boundary cases.
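One way to read the matrix remark, sketched under assumptions: keep each orientation as the pair (cos, sin) of a 2x2 rotation matrix instead of as an angle. Rotating polygon vertices and combining rotations then need only multiplies and adds, with no sin/cos call and no angle wrap-around to handle. The Q2.30 format and the names rot_q30_t, point_t, rot_apply and rot_compose are hypothetical.

```c
#include <stdint.h>

typedef struct { int32_t c, s; } rot_q30_t;  /* matrix [c -s; s c], entries in Q2.30 */
typedef struct { int32_t x, y; } point_t;    /* integer units, e.g. 0.1 mm */

/* Convert a 64-bit Q30-scaled sum back to an integer, rounding to nearest. */
static int32_t round_q30(int64_t p)
{
    const int64_t one = INT64_C(1) << 30;
    p += (p >= 0) ? one / 2 : -(one / 2);
    return (int32_t)(p / one);
}

/* Rotate one point: two multiplies and one add or subtract per coordinate. */
static point_t rot_apply(rot_q30_t r, point_t p)
{
    point_t out;
    out.x = round_q30((int64_t)r.c * p.x - (int64_t)r.s * p.y);
    out.y = round_q30((int64_t)r.s * p.x + (int64_t)r.c * p.y);
    return out;
}

/* Compose two rotations. This is the matrix product, which is the same
   as adding the two angles, but no angle is ever computed. */
static rot_q30_t rot_compose(rot_q30_t a, rot_q30_t b)
{
    rot_q30_t out;
    out.c = round_q30((int64_t)a.c * b.c - (int64_t)a.s * b.s);
    out.s = round_q30((int64_t)a.s * b.c + (int64_t)a.c * b.s);
    return out;
}
```

Repeated composition accumulates rounding error, so in practice the pair would probably be refreshed from a table or renormalised from time to time.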
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- like all pyramid schemes -- ultimately falters.