Fixed-point algorithm for exponential function fitting for an embedded application

Hello,

We are seeking a "light" algorithm for fitting an exponential function (y = A*exp(-k*x)). A typical curve is composed of 200 16-bit integers. As the code will be executed by an MCU, fixed point is preferable, but if floating point is required for accuracy reasons, it could be used as long as the amount of calculation is compatible with a reasonable processing time (a few hundred ms).

Has anyone already implemented such an algorithm, or could you provide us with references? Of course, a piece of C code would be ideal.

Thanks and regards,

--
Eric Meurville
Reply to
Eric Meurville

First, take logarithms of the data points, so your fit will be

ln y = ln A - k * x

The logarithm function is a PITA to compute well; the Maclaurin series converges painfully slowly, so a lookup table with interpolation is in order.

Now the fit problem is just a simple one-variable linear fit.
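
For concreteness, a minimal floating-point sketch of that log-linearized fit (my code, not a tested library routine; it assumes the x values are simply the sample indices 0..n-1):

#include <math.h>

/* Fit ln(y) = ln(A) - k*x by ordinary least squares over
   (i, ln(y[i])). Floating point for clarity; a fixed-point port
   would replace log() with the table lookup suggested above.
   Returns 0 on success, -1 on degenerate input. */
static int fit_exp(const unsigned short y[], int n,
                   double *A_out, double *k_out)
{
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    int i;
    for (i = 0; i < n; i++) {
        double ly = log((double)y[i]);     /* requires y[i] > 0 */
        sx  += i;
        sy  += ly;
        sxx += (double)i * i;
        sxy += i * ly;
    }
    {
        double d = n * sxx - sx * sx;
        if (d == 0.0)
            return -1;
        *k_out = -((n * sxy - sx * sy) / d);   /* slope = -k   */
        *A_out = exp((sy + *k_out * sx) / n);  /* icept = ln A */
    }
    return 0;
}

Note that this weights the log-domain residuals equally, which is exactly the caveat raised further down in the thread.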

HTH

--

Tauno Voipio
tauno voipio (at) iki fi
Reply to
Tauno Voipio

CORDIC is fast and slim. It uses a lookup table and is easily implemented with fixed-point numbers.

Vadim

Reply to
Vadim Borshchev

Since when? The recipe is normalize, square, normalize, square, normalize...

The first normalization gives the bits before the binary point, subsequent normalizations deliver one bit each (they have to shift either by 0 positions or by 1).
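
To make the recipe concrete, here is a sketch of mine delivering log2(x) in 5.11 fixed point exactly this way:

#include <stdint.h>

/* log2(x) in 5.11 fixed point by normalize/square/normalize.
   x must be non-zero. */
static uint16_t log2_fix(uint16_t x)
{
    uint32_t m = x;
    int ipart = 15;
    uint16_t frac = 0;
    int i;

    /* First normalization: shift m into [0x8000, 0x10000); the
       shift count yields the bits before the binary point. */
    while (m < 0x8000u) { m <<= 1; ipart--; }

    /* m/2^15 is now in [1,2). Each square-and-renormalize step
       delivers one fraction bit (a shift by 0 or by 1). */
    for (i = 0; i < 11; i++) {
        m = (m * m) >> 15;        /* square, staying in 1.15 format */
        frac <<= 1;
        if (m >= 0x10000u) {      /* value >= 2: fraction bit is 1 */
            m >>= 1;
            frac |= 1u;
        }
    }
    return (uint16_t)((ipart << 11) | frac);
}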

Well, that is not really a least squares fit then.

--
David Kastrup, Kriemhildstr. 15, 44793 Bochum
Reply to
David Kastrup

The solution is the same for all fixed-point calculations. Create a program that uses floating point and generates a table in fixed point, then use a fixed-point lookup into that table at run time.
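
A sketch of that approach, assuming a 257-entry table of ln(1 + i/256) in 5.11 format generated offline by a floating-point program:

#include <stdint.h>

/* Generated offline: ln_table[i] = round(log(1.0 + i/256.0) * 2048) */
extern const uint16_t ln_table[257];

#define LN2_Q11 1420                    /* round(ln(2) * 2048) */

static uint16_t ln_lookup(uint16_t x)   /* x > 0 */
{
    uint32_t m = x, lo, hi;
    unsigned idx, rem;
    int s = 0;

    while (m < 0x8000u) { m <<= 1; s++; }  /* normalize to [1,2) */

    idx = (unsigned)((m >> 7) & 0xFF);     /* top 8 fraction bits */
    rem = (unsigned)(m & 0x7F);            /* low 7 bits: interpolant */
    lo  = ln_table[idx];
    hi  = ln_table[idx + 1];

    /* ln(x) = ln(m/2^15) + (15 - s)*ln(2) */
    return (uint16_t)(lo + (((hi - lo) * rem) >> 7)
                      + (uint32_t)(15 - s) * LN2_Q11);
}

With 256 intervals, the interpolation error on [1,2) is a few parts in a million, well below one 5.11 LSB.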

Reply to
Scott Moore

And quickly run out of fixed-point dynamic range ...

That's correct - but it's also quite a PITA compared to the tabular method.

It depends on the initial data. It's true that the deviations will be weighted differently than in a true least-squares fit.

On the other hand, a pure least-squares fit for an exponential function will be an iterative process, where the simple log fit (as above) gives a very good starting point.

--

Tauno Voipio
tauno voipio (at) iki fi
Reply to
Tauno Voipio

What about "normalize" did you not understand?

--
David Kastrup, Kriemhildstr. 15, 44793 Bochum
Reply to
David Kastrup

I'm just guessing here, but are you interested in linearizing your data and applying a least-squares fit to the result? If so, there are a number of important considerations beyond just the log function.

A sampled array, possibly summed -- okay. What is the dynamic range of these integers, in practice? (A small range would possibly make the log() algorithm easier, but I don't expect an answer here that will make much practical difference.)

Knowing the MCU itself would help, by the way, rather than keeping it abstract. However, you can easily compute the log() of a 16-bit number in less than a millisecond on many of them.

Yes, I have. But before I waste a lot of time trying to discuss a wide span of information only some of which you really care about, I'd like to know:

(1) The smallest and largest 16-bit integer that must be dealt with. [Here, I assume, for example, that you do NOT need to deal with computing ln(0). But what other restrictions on range can you offer?]

(2) The desired fixed-point integer output format you want -- for example, ln(65535) = 11.09, so you may need a 5.11 format. But I don't know if you need to support 5 bits of integer result. For example, if your range doesn't go above 2980 then you may be able to use a 4.12 format. I assume you do not need a sign.

Basically: what is your input range, and what precision do you need on the output?

Jon

Reply to
Jonathan Kirwan

The downside is that this procedure typically weights goodness of fit to the small values much more strongly than the large ones. Combined with quantization, this can be a recipe for disaster.

It's even worse if the original values have some amount of random noise -- the noise could conceivably make one of the original data points zero or even negative.

I'd want to know a lot more about the problem domain before recommending such an unstable procedure.

- Tim

Reply to
Timothy Little

Hello Jonathan,

(1) The data to be fitted come from a Hall effect sensor from iC-Haus

formatting link
which delivers angular positions of a rotating magnet. The output A of this sensor is connected to the Input Capture module of a dsPIC30F from Microchip, which can measure the time between edges (every edge, every falling or rising edge, every 4th edge, or every 16th edge). Presently, the module delivers 64 durations (time between 2 consecutive edges) per rotor turn, and the values are unsigned integers between 192 and 57600 (16-bit timer, rotation speed between 1 and 300 Hz, Fosc = 7.3728 MHz x 16, timer prescaler = 8, and Hall sensor resolution = 64).
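
(To double-check those limits, assuming the usual dsPIC30F relation Fcy = Fosc/4: Fcy = 7.3728 MHz x 16 / 4 = 29.4912 MHz, and with the prescaler of 8 the timer ticks at 3.6864 MHz. At 300 Hz x 64 edges per turn, the edge period is 3686400/19200 = 192 counts; at 1 Hz it is 3686400/64 = 57600 counts, matching the range above.)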

The magnet's rotation is an exponential decay: after exciting it with an external rotating magnetic field up to 300 Hz, we cut the excitation, and the magnet, being immersed in a viscous solution, sees its rotation speed slow down exponentially until it stops.

(2)

A 5.11 format seems suitable, as the input range of the ln calculation is between 192 and 57600.

Regards,

Eric.

Reply to
Eric Meurville

So, it is to be fitted.

Is it then the case that:

1/x(t) = A * e^[-k*t] + C

where you are measuring the interval, x(t)? If so, I take it that the viscosity is 'k'. So:

1/x(t) = A * e^[-k*t]
   -- magically assume that C=0, for reasons I don't understand --
1/x(t) = A * e^[-k*t]
ln(1/x(t)) = -k*t + ln(A)
-ln(x(t)) = -k*t + ln(A)
ln(x(t)) = k*t - ln(A)

And where you will use a linear least squares fit, assuming that there is no error in your 't' ordinate and all the error is along the x(t) direction (the usual, very simple least squares technique, with obvious and simple partials to solve to yield the algorithm.)

Or am I just way off?

But what is C and how do you remove it, before taking ln()? If you do plan to use least squares in the log domain, how do you plan to remove this offset? Have you also considered how the measurement uncertainty itself changes as a function of time after casting the data into the log domain?

Okay.

I haven't looked at the dsPIC (assuming that this is the processor to which you plan to add your code.) But does it have a barrel shifter available?

Jon

Reply to
Jonathan Kirwan

It isn't hard to do the proper weighting, though I know of many implementations that don't. It is a little harder in fixed point, but with 16-bit logs you should be able to do the fit in 32-bit fixed point, though I haven't tried it.

If you know the uncertainties, you can include them in the weight calculation. For this, Bevington ("Data Reduction and Error Analysis for the Physical Sciences") is my favorite reference. I would probably just remove the negative or zero points, but you do have to watch for them.

-- glen

Reply to
glen herrmannsfeldt

My favorite reference for fixed point exp() and log() is Knuth's "Metafont: The Program". That is in Pascal, but there are C versions of Metafont, some done through machine source translation. In any case, the algorithm should be described well enough that you can write C code from it.

-- glen

Reply to
glen herrmannsfeldt

The effect can be relatively easily managed, in practice. If V_t is the signal and the noise is Sn_t, and it is the case that |Sn_t| << |V_t|, the distortion in the log domain can be weighted back out (more on this below).

Haven't read that reference. But it isn't uncommon to include an intentional bias in the electronics to keep all values positive. With appropriate design it can be measured accurately as well, so that the subtraction applied is accurate and doesn't drift over time.

Also, in any case, the exponential decay isn't taken (or shouldn't be) so long that the offset is actually neared or reached. There is also an optimal duration of sampling, a function of the expected tau, which arises when analyzing the stochastics. The repeatability of the measured tau is improved by either judicious a priori selection of this period or feedback that adjusts it on the basis of prior measurements, allowing it to settle after a few repeats.

Jon

Reply to
Jonathan Kirwan

Please, see my answers inserted.

Jonathan Kirwan wrote:

OK with that, except that actually k is the damping factor, which is proportional to viscosity. But this is a detail.

Reply to
eric.meurville

If I am reading my 25-year-old assembly code for log2(x) correctly, a perfectly usable levelled Chebyshev approximation on [1..2], which executes in 9 millisecs on a 2 MHz 8080, is:

t = (x - sqrt(2)) / (x + sqrt(2));
t2 = t * t;
ln2 = t * (a1 * t2 + a0) + 0.5;   /* ln2 here holds log2(x) */

with a1 = 0.9835 and a0 = 2.8852. The value of sqrt(2) is fairly well known. The result was at least 4 digits accurate, in an arithmetic system using 16 bit significands. The error is obviously zero at x == sqrt(2).

For fitting purposes the log base doesn't matter, and using base 2 the characteristic is formed by normalizing and counting shifts.
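
Pulled together as C (a sketch of mine; the double arithmetic here stands in for the 16-bit significand arithmetic of the 8080 original):

#include <stdint.h>

#define SQRT2 1.41421356
#define A1    0.9835
#define A0    2.8852

/* log2(x) in 5.11 fixed point: characteristic from normalization,
   mantissa from the levelled Chebyshev approximation above. x > 0. */
static uint16_t log2_cheb(uint16_t x)
{
    uint32_t m = x;
    int c = 15;
    double xm, t, t2, f;

    while (m < 0x8000u) { m <<= 1; c--; }   /* count shifts */

    xm = m / 32768.0;                       /* mantissa in [1,2) */
    t  = (xm - SQRT2) / (xm + SQRT2);
    t2 = t * t;
    f  = t * (A1 * t2 + A0) + 0.5;          /* log2 of the mantissa */

    return (uint16_t)((c << 11) + (int)(f * 2048.0 + 0.5));
}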

You have the magic numbers; a few minutes with a calculator will untangle any misconceptions. I am too lazy.

--
Chuck F (cbfalconer@yahoo.com) (cbfalconer@worldnet.att.net)
   Available for consulting/temporary embedded and systems.
     USE worldnet address!
Reply to
CBFalconer

Please do not use html or mime. Usenet is a text mechanism. Deleted as a security measure.

--
Chuck F (cbfalconer@yahoo.com) (cbfalconer@worldnet.att.net)
   Available for consulting/temporary embedded and systems.
     USE worldnet address!
Reply to
CBFalconer

Which one -- how to remove C, or the fact that your exponential tail data will incorrectly dominate your result unless you use a correcting weighting factor?

If you imagine that you are taking sampled measurements of some real underlying physical process, there is some error associated there. If you take the same measurement 10 times you will not get the exact same data set every time. The measured points contain added noise, which is unpredictable. In electronic systems, this noise is usually the aggregate of many individual Poisson events, which yields a Gaussian distribution.

For example, let's ignore this "count" stuff you are working with and consider a more traditional measurement of decaying volts. Say the noise has a standard deviation of 0.1 volt. In electronic systems, it is usually the case that this 0.1V deviation is the same whether you are measuring at a 10V signal level or at a 1V signal level. So when you sample a portion of the decay, as it falls from say 10V to 1V, your expected 1-sigma noise across all the samples is still +/-0.1V, whether the measurement was 10V, 5V, or 1V.

Now, when you take the ln() of your data set, the impact of this noise upon the resulting log-values will no longer be equal in percentage terms. A 0.1V deviation against 10V will not show the same deviation in the log domain as a 0.1V deviation against 1V. This fact can greatly impact the way a naive least-squares fit routine calculates the slope: it will incorrectly over-emphasize the data points at the tail of the curve, and they will dominate the calculated slope. I already posted to this thread, elsewhere, that a simple weighting factor, based on the square of the raw data, can be applied to the least-squares fitting routine in order to treat the variations in the log domain correctly.

Essentially, it amounts to this: constant uncertainty in the measurement domain acquires a signal dependence when placed into the log domain. If the uncertainty is constant in the measurement domain and is also small compared to the signal itself, then you can use a first-order Taylor expansion to approximate

ln( V_t +- Sn_t ) = ln( V_t ) +- ( Sn_t / V_t )

And here you can see that the relative noise in the log domain is scaled by (1/V_t) -- in other words, you need to adjust how you weight the error in each point, based on the measurement itself.
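
In code, the correction is one extra factor per sum. A floating-point sketch of mine, fitting ln(y) = b + m*x with weights w_i = y_i^2 (i.e. 1/sigma_i^2 in the log domain):

#include <math.h>

/* Weighted least-squares fit of ln(y) = b + m*x, with w = y^2 to
   undo the log-domain distortion described above. y[i] > 0. */
static void fit_weighted(const double x[], const double y[], int n,
                         double *m_out, double *b_out)
{
    double sw = 0, swx = 0, swy = 0, swxx = 0, swxy = 0;
    int i;
    for (i = 0; i < n; i++) {
        double w  = y[i] * y[i];    /* weight = 1/sigma^2 ~ V^2 */
        double ly = log(y[i]);
        sw   += w;
        swx  += w * x[i];
        swy  += w * ly;
        swxx += w * x[i] * x[i];
        swxy += w * x[i] * ly;
    }
    *m_out = (sw * swxy - swx * swy) / (sw * swxx - swx * swx);
    *b_out = (swy - *m_out * swx) / sw;
}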

Actually, closer thinking on a variety of issues can help. This is only a hand-waving pointer in one general direction. There are many other subtle issues to consider, such as selecting the right measurement window (you should easily realize that too little data will be bad and that too much data, mostly in the nearly useless tail, will similarly not help, and that this implies an optimal middle ground to seek), etc. But that's what I was referring to.

Jon

Reply to
Jonathan Kirwan

I think this idea is worth investigating by the OP. The fact that it has a barrel shifter means its floating point may be fast enough.

If not, the idea should be to avoid the floating-point package completely. Because this dsPIC has a barrel shifter and, most likely, a MAC, it provides incredible opportunities for speed in ln() code designed for it, avoiding the more general-purpose floating-point package. So if your suggestion isn't good enough in terms of speed or accuracy (I haven't looked at the errors in the least bits, nor evaluated their impact on the resulting least-squares fitting of the slope, to judge how important they may be), the door remains open for an all-integer evaluation.

Kind of:

1/y = A*e^(-k*t)
-log y = -k*t*log(e) + log(A)
log y = [k*log(e)]*t - log(A)

Note that the slope computed is k*log(e). I'm sure they can calibrate things either way, though. So with a different table relating viscosities to the slope, they get to the same place.

The basics of the algorithm I use is:

y = ln( x ) = ln( x*(2^s) ) - ln( (2^s) ) = ln( x*(2^s) ) - s * ln( 2 )

's' is a binary shift used to normalize x, so that the precision of the result is maintained. You store the ln( 2 ) in fixed format, which makes it easy to multiply by s using only integer operations.

The ln() function is then computed on a normalized value that is always guaranteed to be in the range [1,2). With fixed-point integer constants, an integer MAC, and a barrel shifter, it's very fast: only a few lines of code to handle the needed 5 constants beyond the ln(2) value (6 constants in all) and a similar number of MAC operations. On the dsPIC, I'm sure it will take MUCH less time than dinging around with generic floating-point operations, even though those would also be powered by the barrel shifter.
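
Jon's exact constants aren't given, but here is a structurally similar all-integer sketch. I have substituted the series 2u(1 + u^2/3 + u^4/5) with u = (m-1)/(m+1), which trades the MAC pipeline for a single division; its truncation error on [1,2) stays below half a 5.11 LSB:

#include <stdint.h>

#define LN2_Q15 22713                 /* round(ln(2) * 32768) */

/* ln(x) in 5.11 fixed point, integer-only. x > 0.
   Sketch only: series substituted for Jon's own constants. */
static uint16_t ln_5_11(uint16_t x)
{
    uint32_t m = x, u, u2, u4, poly, ln_m;
    int e = 15;                       /* e = floor(log2(x)) */

    while (m < 0x8000u) { m <<= 1; e--; }   /* m/2^15 in [1,2) */

    u    = ((m - 0x8000u) << 15) / (m + 0x8000u);   /* Q15, < 1/3 */
    u2   = (u * u) >> 15;                           /* u^2, Q15 */
    u4   = (u2 * u2) >> 15;                         /* u^4, Q15 */
    poly = 0x8000u + u2 / 3 + u4 / 5;               /* Q15 */
    ln_m = (2 * u * poly) >> 15;                    /* ln(m/2^15), Q15 */

    /* ln(x) = ln(mantissa) + e*ln(2); round from Q15 to Q11. */
    return (uint16_t)((ln_m + (uint32_t)e * LN2_Q15 + 8u) >> 4);
}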

One can use the exact same algorithm with the barrel shifter and MAC and just supply a different set of constants to get several different log bases, too.

But in any case, I think the OP might look at your example and see if it meets the need and is fast enough. It's less work to code, that is for sure.

Jon

Reply to
Jonathan Kirwan
