# Re: Turn a divide into a multiply

#### Do you have a question? Post it now! No Registration Necessary

Translate This Thread From English to

•  Subject
• Author
• Posted on

What are the DAC resolutions - 8 bits in the PCA, or more external ?
I can see the above eqn would have rounding errors, but if you tuned
both the step size, and the 1ms time, a better slope-fit would result.

Work out your numeric limits, and what the tolerances are, and you get
a handle on the number of bits of precision needed.

The 8051 gives 16 bit results in both MUL and DIV[REM], so you might
get enough from that with some carefull scaling.

Divide routines will give a remainder, but it is not always preserved.
I recall we changed our Mod51 libraries to keep the remainder intact,
usefull for formatted writes - allowed  N MOD 10 and N DIV 10 in a
single call.

-jg

Re: Turn a divide into a multiply

Have you considered a tweaked Bresenham algorithm?

Using the x axis as time, you could compute the next DAC reload value on the
fly every millisecond with this approach, removing the 'jaggies' and
eliminating the floating point entirely.

http://www.cs.helsinki.fi/group/goa/mallinnus/lines/bresenh.html
http://www.cs.unc.edu/~hoff/projects/comp235/bresline/perform0.html

Kelly

Re: Turn a divide into a multiply
Instead of dividing for every step, create multi-byte fractions and just

--
#include <standard.disclaimer>
_
Kevin D Quitt  USA 91387-4454         96.37% of all statistics are made up
We've slightly trimmed the long signature. Click to see the full one.
Re: Turn a divide into a multiply

timer
direction.

It's an external 16 bit SPI DAC (yes, it's a bloody \$18 part!) and we're
using the entire 16 bit range... unfortunately. ;-(

It will have a few rounding errors, however the DAC has 2-3 bits of
precision that we don't care about, so it gets absorbed. Timing is critical,
but it isn't *THAT* critical, and even a .8 fixed point step is within the
range of acceptability.

16.8 Is OK, actually, but 16.16 would be better. Know of a 32x8 divide? I
can just shift the 16 bit value left 16 bits, divide, and my low 16 bit is
my decimal. I just hope it's under 300usec...

Unfortunately, it's not high enough resolution by itself.

-->Neil

Re: Turn a divide into a multiply

You only need 16.0 and some rounding control as you sequence.

There is at least one 16 by 8 bit divide with remainder to be found on
www.8052.com (think it was in math51.zip).

Says it runs in 232us average, 276 worst case at 12MHz.

You can run a Bresenham algorithm on the slope of the remainder to
determine if your DAC (inc/dec)rement is the quotient or quotient +1.

Re: Turn a divide into a multiply

There would be no cumulative error. You need to understand the difference
between a Bresenham algorithm and a simple incremental algorithm using
fractions.

Re: Turn a divide into a multiply

Like at this

#include <stdlib>
#include <assert>
#include <iostream>

int main(int argc, char* argv[]) {

using namespace std;

assert(argc == 4);

int starty = atoi(argv[1]);
int endy = atoi(argv[2]);
int dx = atoi(argv[3]);

int dy = endy - starty;
int qy = dy / dx;
int ry = abs(dy % dx);
int oy = starty;

int d = 2 * ry - dx;
int d0 = 2 * ry;
int d1 = 2 * (ry - dx);
int o1 = qy + (dy > 0 ? 1 : -1);

for(int i = 1; i <= dx; i++) {
if(d < 0) {
d += d0;
oy += qy;
} else {
d += d1;
oy += o1;
}
cout << i << "\t"  << oy << endl;
}
return 0;
}

Re: Turn a divide into a multiply

The beauty of this algorithm is that you need an integer division only. The
fractional part is handled by the Bresenham-like error-accumulation part of
the loop. Why do you need 16.8 precision for this?

Andras Tantos

Re: Turn a divide into a multiply
says...

Not familiar w your processor but ... If you've got the memory space...
will a lookup table work?

FWIW
Chris.

Re: Turn a divide into a multiply
Ok, here is an answer : DAC ramping fro many value to any value in discrete
time steps, working in all cases, WITHOUT ANY DIVISION and with only
multiplications by two, integers only of course !!! Derived from the
Bresenham algorithm :

#include "stdafx.h"
#define abs(x) ((x)>=0 ? (x) : -(x))
#define sign(x) ((x)>=0 ? 1 : -1)

int main(int argc, char* argv[])
{
int xa;
int ya;
int xb;
int yb;

xa=0; //start time
xb10%;//stop time
ya10%0;//start dac value
yb23%40;//stop dac value

int dx,dy,x,y,s1,s2,e,temp,swap,i;
int xprec;

xprec=xa-1;

dy=abs(yb-ya);
dx=abs(xb-xa);
s1=sign(xb-xa);
s2=sign(yb-ya);
x=xa;
y=ya;
if (dy>dx)
{
temp=dx;
dx=dy;
dy=temp;
swap=1;
}
else swap=0;
e=2*dy-dx;
for(i=1;i<=dx;i++)
{
if (x!=xprec)
{
printf("DAC value at t=%3d : %3d\n",x,y);
xprec=x;
}
while (e>=0)
{
if (swap==1) x+=s1;
else y+=s2;
e-=2*dx;
}
if (swap==1) y+=s2;
else x+=s1;
e+=2*dy;
}
x++;
printf("DAC value at t=%3d : %3d\n",x,y);

return 0;
}

Exemples of output :

10 time steps, DAC from 100 to 2340 :
DAC value at t=  0 : 100
DAC value at t=  1 : 212
DAC value at t=  2 : 436
DAC value at t=  3 : 660
DAC value at t=  4 : 884
DAC value at t=  5 : 1108
DAC value at t=  6 : 1332
DAC value at t=  7 : 1556
DAC value at t=  8 : 1780
DAC value at t=  9 : 2004
DAC value at t= 10 : 2228
DAC value at t= 11 : 2340

10 time steps, DAC from 2340 to 2337
DAC value at t=  0 : 2340
DAC value at t=  1 : 2340
DAC value at t=  2 : 2339
DAC value at t=  3 : 2339
DAC value at t=  4 : 2339
DAC value at t=  5 : 2338
DAC value at t=  6 : 2338
DAC value at t=  7 : 2338
DAC value at t=  8 : 2338
DAC value at t=  9 : 2337
DAC value at t= 11 : 2337

Nice, isn't it ?

Robert Lacoste - ALCIOM : The mixed signals experts
http://www.alciom.com

snipped-for-privacy@corp.supernews.com...

killing
CPU

Re: Turn a divide into a multiply
On Fri, 1 Aug 2003 11:31:31 +0200, "Robert Lacoste"

Hmm.  Some notes:

(1)  Where did "t10%" go in the second group?

(2)  The 'while(e>=0)' can be replaced with 'if(e>=0)', as it
either takes place, or not, but a single subtraction is
always sufficient for 'e'.

(3)  The problem with the algorithm is that the for() loop will
continue over and over and over for long stretches before
(x!=xprec), when the DAC values are widely separated (which
may often be the case) as compared with the time tick.  The
loop continues until 'x' changes and this may be a while.

And worse, still, it will happen just about every time,
too.  Once you pay the price, you pay it again and again.

I don't think this will save the OP anything over the single
division at the beginning that he's stuck with, now.  The
looping is worse, I suspect, than the problem it is supposed to
solve.

This is why I didn't comment earlier about his problem.  I
recognized that this isn't just a simple hack of Bresenham, to
fix the slow timing.  Bresenham flips everything in the first 45
degrees, switching independent variables around.  This is great
for drawing pixels and it avoids the need for division because
of limiting the range to TAN(angle)<=1.  Figuring a faster
division algorithm or converting the division in that place to a
multiplication and paying for that at a different place where
the time can be afforded, might be a solution.  A gcd() step
with only add/subtract might also help a little, but probably
often not much at all.

Anyway, it *is* an interesting problem.

Jon

Re: Turn a divide into a multiply
On Fri, 01 Aug 2003 18:48:03 GMT, Jonathan Kirwan

(4)  At t=1 in your first group, the DAC output value is 212.
This is a step of 112.  At t=2, the DAC value is 436 --
a step of 224.  It's smoother, later on.  A similar thing
happens at the end, as well.  This is a natural consequence
of Bresenham, but it's a little odd for a DAC drive.

Re: Turn a divide into a multiply
On Fri, 1 Aug 2003 10:58:59 -0700, "Neil Bradley"

Neil, it's a modification of Bresenham to repress output while
the dependent variable remains unchanged.  Brute force and
probably what Kelly was hinting at.  It will loop once for each
DAC count it is skipping past.  And you will pay that loop for
each time step.  It will probably eat up your CPU time more than
the single division at the outset, unless your DAC values have a
rather limited range of change.

Jon

Re: Turn a divide into a multiply

Yeah, I noticed that when I coded it up. But using up cumulatively MORE time
over a period of time is OK. And it's really not that bad - about 10-20usec
each step. I just need faster response, so this is just what the doctor
ordered for my particular application.

-->Neil

Re: Turn a divide into a multiply
On Fri, 1 Aug 2003 16:08:45 -0700, "Neil Bradley"

Well, if I understand your "10-20us" statement, it's per
iteration.  If you are stepping from DAC20%0 to DAC30%00 in say
10 intervals, that about 2800/10 or 280 iterations per DAC
update event.  At 10-20us per, that is a LOT of time to chew up.

Just keep it in mind.

Jon

Re: Turn a divide into a multiply

time
10-20usec

It's using 30usec every millisecond since my timer resolution is set to 1ms.
So it may eat up a lot of CPU time, but it'll be over a long period.

-->Neil

Re: Turn a divide into a multiply
On Sat, 2 Aug 2003 01:33:36 -0700, "Neil Bradley"

Oh.  I guess I didn't understand.  I'm not sure I do, yet.

You have a 1ms timer event.  You need to go from a starting DAC
value to an ending DAC value over X milliseconds, where X is not
a constant.  Yes?  And that each millisecond, you post a new DAC
value to the DAC, on your way there.  And then I gathered that
the method you were using before is "bresenham-like."  So in
that case, your division would only be required exactly once,
not each millisecond step.  Now, I thought I understood this to
be a pain because even that once, taking place between two
millisecond events, ate up 1/2 your time and you couldn't afford
even that expense even if it was only once.

Now, let's say you want to proceed from 200 to 3000 in 10ms
using the brute force modification (which I'm surprised you
needed exposed to you in this way, since it was a straight
forward change) to Bresenham.  [I'm assuming that sometimes you
might want to transition in 10ms, this way, or at least that it
isn't out of the question.]  So now, your code decides to
proceed to calculate the next DAC value from the starting DAC
value.  This takes 140 iterations of the loop to get to this
first value, roughly.  (Going to the second value will take
about twice that, or 280 iterations of the loop.)  Are you
telling me that this takes 30us?

If so, that's great.  But I'm just a little curious.

Jon

Re: Turn a divide into a multiply
On Sun, 3 Aug 2003 00:01:57 -0700, "Neil Bradley"

Ah hah!!  Just as I suspected.  Okay.  All is clear, now.

Jon

Re: Turn a divide into a multiply
On Sun, 3 Aug 2003 00:01:57 -0700, "Neil Bradley"

670us on a 18MHz RD2 '51 running in X2 mode means 670*3 = 2010
instruction cycles.  That seems a little excessive.

Actually, it's rather standard to avoid all divisions.  This is
done by dividing up the unit-circle into 8 octants and folding.
That way, the slope is always less that one and the dependent
variable advances at a rate <= to the independent variable rate.
Division simply isn't needed in that case.  Never has been, so
if you've been seeing it, then it's probably not Bresenham.
Bresenham is based on recurrances and error accumulations are
shifted show as to avoid even a division by 2.

I think it's the 'for()' looping on the mostly false 'if
(x!=xprec)' that is doing it.  The while() should either execute
zero or one time.

Anyway, you should look more at your division routine.  Consider

Jon

Re: Turn a divide into a multiply
On Sun, 03 Aug 2003 08:50:21 GMT, Jonathan Kirwan

In assembler, of course.  That is, if you want to keep your old
algorithm.

Do you know how it's done?  Assume that your 32-bit numerator is
placed in two 16-bit words labeled 'a' for the high word and 'q'
for the low word.  Also, assume that 'd' is your 16-bit divisor.

Some of the semantics simply aren't present in C.

First off, you need to check for overflow.  This is done this
way, upon entering your division routine:

if (a >= d)
Overflow ();

This is an unsigned comparison, by the way.

If the upper half of your 32-bit value is at or above your
bits.  So that test quickly eliminates this consideration.  It
also pretty much excludes 'd' being 0, because with an unsigned
comparison, if d=0 then you are sure that 'a' must be at least =
to 'd', if not greater.

So, now that that is out of the way, you need to perform some
subtractions.  But since different processors handle their carry
in different ways for subtraction (for example, the MSP430 sets
or clears the carry in the opposite way that the x86 processor
does for subtractions and comparisons), it's handy to set up a
value which is the two's complement of 'd':

nd= 1 + ~d;

Now, in assembly on some processors, this might just be "neg."
In C, with unsigned ints, the exact way is to complement and
then increment, as shown above.

Now, loop just 16 times:

for (i= 0; i < 16; ++i) {

and the body where the good stuff gets done,

carry= LeftShift (&a, &q);
if (!carry)
else
if (carry) {
++q;
a= r;
}

Or, the above could be re-written:

if (LeftShift (&a, &q))
++q;
a= r;
} else if (Add (a, nd, &r)) {
++q;
a= r;
}

The LeftShift() function shifts the two parameter values,
assuming that the first is the high-order 16 bits and the second
is the low-order, shifting in a zero into the least significant
bit of the second parameter, and shifting out the most
significant bit of the first parameter into the return value.

the third and returning a '1' if a carry occurred.

...and then the end of loop:

}

In assembly, it's very easy to capture the carry out from the
left shift operation.  In C, not so easy.  Also, in assembly, it
is very easy to capture the carry out from the Add() operation
of (a+nd)-->r.

I simply cannot imagine why that is taking 670us on your CPU.
At 670us, this is about 42us per loop or 126 instruction cycles.

I count 38 cycles per loop, just doing a quick hand coding of
the main loop shown above, but in assembly code.  That means 608
cycles in the body and the loop conditional.  There is some set
up, I suppose.  But you could inline this code to avoid the
worst of it.  There's 3 cycles to init the loop counter.  I'm
sure there are some other cycles you could toss in there.  But
it should come out to about 210us per division operation, with
your 18MHz X2-type RD2 '51 device.

I'll post my code example if you want it.

Jon