#### Do you have a question? Post it now! No Registration Necessary

**posted on**

- Jim Granville

July 29, 2003, 10:29 pm

What are the DAC resolutions - 8 bits in the PCA, or more external ?

I can see the above eqn would have rounding errors, but if you tuned

both the step size, and the 1ms time, a better slope-fit would result.

Work out your numeric limits, and what the tolerances are, and you get

a handle on the number of bits of precision needed.

The 8051 gives 16 bit results in both MUL and DIV[REM], so you might

get enough from that with some carefull scaling.

Divide routines will give a remainder, but it is not always preserved.

I recall we changed our Mod51 libraries to keep the remainder intact,

usefull for formatted writes - allowed N MOD 10 and N DIV 10 in a

single call.

-jg

Re: Turn a divide into a multiply

Have you considered a tweaked Bresenham algorithm?

Using the x axis as time, you could compute the next DAC reload value on the

fly every millisecond with this approach, removing the 'jaggies' and

eliminating the floating point entirely.

http://www.cs.helsinki.fi/group/goa/mallinnus/lines/bresenh.html

http://www.cs.unc.edu/~hoff/projects/comp235/bresline/perform0.html

Kelly

Re: Turn a divide into a multiply

timer

direction.

It's an external 16 bit SPI DAC (yes, it's a bloody $18 part!) and we're

using the entire 16 bit range... unfortunately. ;-(

It will have a few rounding errors, however the DAC has 2-3 bits of

precision that we don't care about, so it gets absorbed. Timing is critical,

but it isn't

***THAT***critical, and even a .8 fixed point step is within the

range of acceptability.

16.8 Is OK, actually, but 16.16 would be better. Know of a 32x8 divide? I

can just shift the 16 bit value left 16 bits, divide, and my low 16 bit is

my decimal. I just hope it's under 300usec...

Unfortunately, it's not high enough resolution by itself.

-->Neil

Re: Turn a divide into a multiply

You only need 16.0 and some rounding control as you sequence.

There is at least one 16 by 8 bit divide with remainder to be found on

www.8052.com (think it was in math51.zip).

Says it runs in 232us average, 276 worst case at 12MHz.

You can run a Bresenham algorithm on the slope of the remainder to

determine if your DAC (inc/dec)rement is the quotient or quotient +1.

Re: Turn a divide into a multiply

Like at this

#include <stdlib>

#include <assert>

#include <iostream>

int main(int argc, char* argv[]) {

using namespace std;

assert(argc == 4);

int starty = atoi(argv[1]);

int endy = atoi(argv[2]);

int dx = atoi(argv[3]);

int dy = endy - starty;

int qy = dy / dx;

int ry = abs(dy % dx);

int oy = starty;

int d = 2 * ry - dx;

int d0 = 2 * ry;

int d1 = 2 * (ry - dx);

int o1 = qy + (dy > 0 ? 1 : -1);

for(int i = 1; i <= dx; i++) {

if(d < 0) {

d += d0;

oy += qy;

} else {

d += d1;

oy += o1;

}

cout << i << "\t" << oy << endl;

}

return 0;

}

Re: Turn a divide into a multiply

Ok, here is an answer : DAC ramping fro many value to any value in discrete

time steps, working in all cases, WITHOUT ANY DIVISION and with only

multiplications by two, integers only of course !!! Derived from the

Bresenham algorithm :

#include "stdafx.h"

#define abs(x) ((x)>=0 ? (x) : -(x))

#define sign(x) ((x)>=0 ? 1 : -1)

int main(int argc, char* argv[])

{

int xa;

int ya;

int xb;

int yb;

xa=0; //start time

xb10%;//stop time

ya10%0;//start dac value

yb23%40;//stop dac value

int dx,dy,x,y,s1,s2,e,temp,swap,i;

int xprec;

xprec=xa-1;

dy=abs(yb-ya);

dx=abs(xb-xa);

s1=sign(xb-xa);

s2=sign(yb-ya);

x=xa;

y=ya;

if (dy>dx)

{

temp=dx;

dx=dy;

dy=temp;

swap=1;

}

else swap=0;

e=2*dy-dx;

for(i=1;i<=dx;i++)

{

if (x!=xprec)

{

printf("DAC value at t=%3d : %3d\n",x,y);

xprec=x;

}

while (e>=0)

{

if (swap==1) x+=s1;

else y+=s2;

e-=2*dx;

}

if (swap==1) y+=s2;

else x+=s1;

e+=2*dy;

}

x++;

printf("DAC value at t=%3d : %3d\n",x,y);

return 0;

}

Exemples of output :

10 time steps, DAC from 100 to 2340 :

DAC value at t= 0 : 100

DAC value at t= 1 : 212

DAC value at t= 2 : 436

DAC value at t= 3 : 660

DAC value at t= 4 : 884

DAC value at t= 5 : 1108

DAC value at t= 6 : 1332

DAC value at t= 7 : 1556

DAC value at t= 8 : 1780

DAC value at t= 9 : 2004

DAC value at t= 10 : 2228

DAC value at t= 11 : 2340

10 time steps, DAC from 2340 to 2337

DAC value at t= 0 : 2340

DAC value at t= 1 : 2340

DAC value at t= 2 : 2339

DAC value at t= 3 : 2339

DAC value at t= 4 : 2339

DAC value at t= 5 : 2338

DAC value at t= 6 : 2338

DAC value at t= 7 : 2338

DAC value at t= 8 : 2338

DAC value at t= 9 : 2337

DAC value at t= 11 : 2337

Nice, isn't it ?

Robert Lacoste - ALCIOM : The mixed signals experts

http://www.alciom.com

snipped-for-privacy@corp.supernews.com...

killing

CPU

time steps, working in all cases, WITHOUT ANY DIVISION and with only

multiplications by two, integers only of course !!! Derived from the

Bresenham algorithm :

#include "stdafx.h"

#define abs(x) ((x)>=0 ? (x) : -(x))

#define sign(x) ((x)>=0 ? 1 : -1)

int main(int argc, char* argv[])

{

int xa;

int ya;

int xb;

int yb;

xa=0; //start time

xb10%;//stop time

ya10%0;//start dac value

yb23%40;//stop dac value

int dx,dy,x,y,s1,s2,e,temp,swap,i;

int xprec;

xprec=xa-1;

dy=abs(yb-ya);

dx=abs(xb-xa);

s1=sign(xb-xa);

s2=sign(yb-ya);

x=xa;

y=ya;

if (dy>dx)

{

temp=dx;

dx=dy;

dy=temp;

swap=1;

}

else swap=0;

e=2*dy-dx;

for(i=1;i<=dx;i++)

{

if (x!=xprec)

{

printf("DAC value at t=%3d : %3d\n",x,y);

xprec=x;

}

while (e>=0)

{

if (swap==1) x+=s1;

else y+=s2;

e-=2*dx;

}

if (swap==1) y+=s2;

else x+=s1;

e+=2*dy;

}

x++;

printf("DAC value at t=%3d : %3d\n",x,y);

return 0;

}

Exemples of output :

10 time steps, DAC from 100 to 2340 :

DAC value at t= 0 : 100

DAC value at t= 1 : 212

DAC value at t= 2 : 436

DAC value at t= 3 : 660

DAC value at t= 4 : 884

DAC value at t= 5 : 1108

DAC value at t= 6 : 1332

DAC value at t= 7 : 1556

DAC value at t= 8 : 1780

DAC value at t= 9 : 2004

DAC value at t= 10 : 2228

DAC value at t= 11 : 2340

10 time steps, DAC from 2340 to 2337

DAC value at t= 0 : 2340

DAC value at t= 1 : 2340

DAC value at t= 2 : 2339

DAC value at t= 3 : 2339

DAC value at t= 4 : 2339

DAC value at t= 5 : 2338

DAC value at t= 6 : 2338

DAC value at t= 7 : 2338

DAC value at t= 8 : 2338

DAC value at t= 9 : 2337

DAC value at t= 11 : 2337

Nice, isn't it ?

Robert Lacoste - ALCIOM : The mixed signals experts

http://www.alciom.com

snipped-for-privacy@corp.supernews.com...

killing

CPU

Re: Turn a divide into a multiply

On Fri, 1 Aug 2003 11:31:31 +0200, "Robert Lacoste"

Hmm. Some notes:

(1) Where did "t10%" go in the second group?

(2) The 'while(e>=0)' can be replaced with 'if(e>=0)', as it

either takes place, or not, but a single subtraction is

always sufficient for 'e'.

(3) The problem with the algorithm is that the for() loop will

continue over and over and over for long stretches before

(x!=xprec), when the DAC values are widely separated (which

may often be the case) as compared with the time tick. The

loop continues until 'x' changes and this may be a while.

And worse, still, it will happen just about every time,

too. Once you pay the price, you pay it again and again.

I don't think this will save the OP anything over the single

division at the beginning that he's stuck with, now. The

looping is worse, I suspect, than the problem it is supposed to

solve.

This is why I didn't comment earlier about his problem. I

recognized that this isn't just a simple hack of Bresenham, to

fix the slow timing. Bresenham flips everything in the first 45

degrees, switching independent variables around. This is great

for drawing pixels and it avoids the need for division because

of limiting the range to TAN(angle)<=1. Figuring a faster

division algorithm or converting the division in that place to a

multiplication and paying for that at a different place where

the time can be afforded, might be a solution. A gcd() step

with only add/subtract might also help a little, but probably

often not much at all.

Anyway, it

Jon

Hmm. Some notes:

(1) Where did "t10%" go in the second group?

(2) The 'while(e>=0)' can be replaced with 'if(e>=0)', as it

either takes place, or not, but a single subtraction is

always sufficient for 'e'.

(3) The problem with the algorithm is that the for() loop will

continue over and over and over for long stretches before

(x!=xprec), when the DAC values are widely separated (which

may often be the case) as compared with the time tick. The

loop continues until 'x' changes and this may be a while.

And worse, still, it will happen just about every time,

too. Once you pay the price, you pay it again and again.

I don't think this will save the OP anything over the single

division at the beginning that he's stuck with, now. The

looping is worse, I suspect, than the problem it is supposed to

solve.

This is why I didn't comment earlier about his problem. I

recognized that this isn't just a simple hack of Bresenham, to

fix the slow timing. Bresenham flips everything in the first 45

degrees, switching independent variables around. This is great

for drawing pixels and it avoids the need for division because

of limiting the range to TAN(angle)<=1. Figuring a faster

division algorithm or converting the division in that place to a

multiplication and paying for that at a different place where

the time can be afforded, might be a solution. A gcd() step

with only add/subtract might also help a little, but probably

often not much at all.

Anyway, it

***is***an interesting problem.Jon

Re: Turn a divide into a multiply

(4) At t=1 in your first group, the DAC output value is 212.

This is a step of 112. At t=2, the DAC value is 436 --

a step of 224. It's smoother, later on. A similar thing

happens at the end, as well. This is a natural consequence

of Bresenham, but it's a little odd for a DAC drive.

Re: Turn a divide into a multiply

On Fri, 1 Aug 2003 10:58:59 -0700, "Neil Bradley"

Neil, it's a modification of Bresenham to repress output while

the dependent variable remains unchanged. Brute force and

probably what Kelly was hinting at. It will loop once for each

DAC count it is skipping past. And you will pay that loop for

each time step. It will probably eat up your CPU time more than

the single division at the outset, unless your DAC values have a

rather limited range of change.

Jon

Neil, it's a modification of Bresenham to repress output while

the dependent variable remains unchanged. Brute force and

probably what Kelly was hinting at. It will loop once for each

DAC count it is skipping past. And you will pay that loop for

each time step. It will probably eat up your CPU time more than

the single division at the outset, unless your DAC values have a

rather limited range of change.

Jon

Re: Turn a divide into a multiply

Yeah, I noticed that when I coded it up. But using up cumulatively MORE time

over a period of time is OK. And it's really not that bad - about 10-20usec

each step. I just need faster response, so this is just what the doctor

ordered for my particular application.

-->Neil

Re: Turn a divide into a multiply

On Fri, 1 Aug 2003 16:08:45 -0700, "Neil Bradley"

Well, if I understand your "10-20us" statement, it's per

iteration. If you are stepping from DAC20%0 to DAC30%00 in say

10 intervals, that about 2800/10 or 280 iterations per DAC

update event. At 10-20us per, that is a LOT of time to chew up.

Just keep it in mind.

Jon

Well, if I understand your "10-20us" statement, it's per

iteration. If you are stepping from DAC20%0 to DAC30%00 in say

10 intervals, that about 2800/10 or 280 iterations per DAC

update event. At 10-20us per, that is a LOT of time to chew up.

Just keep it in mind.

Jon

Re: Turn a divide into a multiply

On Sat, 2 Aug 2003 01:33:36 -0700, "Neil Bradley"

Oh. I guess I didn't understand. I'm not sure I do, yet.

You have a 1ms timer event. You need to go from a starting DAC

value to an ending DAC value over X milliseconds, where X is not

a constant. Yes? And that each millisecond, you post a new DAC

value to the DAC, on your way there. And then I gathered that

the method you were using before is "bresenham-like." So in

that case, your division would only be required exactly once,

not each millisecond step. Now, I thought I understood this to

be a pain because even that once, taking place between two

millisecond events, ate up 1/2 your time and you couldn't afford

even that expense even if it was only once.

Now, let's say you want to proceed from 200 to 3000 in 10ms

using the brute force modification (which I'm surprised you

needed exposed to you in this way, since it was a straight

forward change) to Bresenham. [I'm assuming that sometimes you

might want to transition in 10ms, this way, or at least that it

isn't out of the question.] So now, your code decides to

proceed to calculate the next DAC value from the starting DAC

value. This takes 140 iterations of the loop to get to this

first value, roughly. (Going to the second value will take

about twice that, or 280 iterations of the loop.) Are you

telling me that this takes 30us?

If so, that's great. But I'm just a little curious.

Jon

Oh. I guess I didn't understand. I'm not sure I do, yet.

You have a 1ms timer event. You need to go from a starting DAC

value to an ending DAC value over X milliseconds, where X is not

a constant. Yes? And that each millisecond, you post a new DAC

value to the DAC, on your way there. And then I gathered that

the method you were using before is "bresenham-like." So in

that case, your division would only be required exactly once,

not each millisecond step. Now, I thought I understood this to

be a pain because even that once, taking place between two

millisecond events, ate up 1/2 your time and you couldn't afford

even that expense even if it was only once.

Now, let's say you want to proceed from 200 to 3000 in 10ms

using the brute force modification (which I'm surprised you

needed exposed to you in this way, since it was a straight

forward change) to Bresenham. [I'm assuming that sometimes you

might want to transition in 10ms, this way, or at least that it

isn't out of the question.] So now, your code decides to

proceed to calculate the next DAC value from the starting DAC

value. This takes 140 iterations of the loop to get to this

first value, roughly. (Going to the second value will take

about twice that, or 280 iterations of the loop.) Are you

telling me that this takes 30us?

If so, that's great. But I'm just a little curious.

Jon

Re: Turn a divide into a multiply

On Sun, 3 Aug 2003 00:01:57 -0700, "Neil Bradley"

670us on a 18MHz RD2 '51 running in X2 mode means 670*3 = 2010

instruction cycles. That seems a little excessive.

Actually, it's rather standard to avoid all divisions. This is

done by dividing up the unit-circle into 8 octants and folding.

That way, the slope is always less that one and the dependent

variable advances at a rate <= to the independent variable rate.

Division simply isn't needed in that case. Never has been, so

if you've been seeing it, then it's probably not Bresenham.

Bresenham is based on recurrances and error accumulations are

shifted show as to avoid even a division by 2.

I think it's the 'for()' looping on the mostly false 'if

(x!=xprec)' that is doing it. The while() should either execute

zero or one time.

Anyway, you should look more at your division routine. Consider

writing your own.

Jon

670us on a 18MHz RD2 '51 running in X2 mode means 670*3 = 2010

instruction cycles. That seems a little excessive.

Actually, it's rather standard to avoid all divisions. This is

done by dividing up the unit-circle into 8 octants and folding.

That way, the slope is always less that one and the dependent

variable advances at a rate <= to the independent variable rate.

Division simply isn't needed in that case. Never has been, so

if you've been seeing it, then it's probably not Bresenham.

Bresenham is based on recurrances and error accumulations are

shifted show as to avoid even a division by 2.

I think it's the 'for()' looping on the mostly false 'if

(x!=xprec)' that is doing it. The while() should either execute

zero or one time.

Anyway, you should look more at your division routine. Consider

writing your own.

Jon

Re: Turn a divide into a multiply

On Sun, 03 Aug 2003 08:50:21 GMT, Jonathan Kirwan

In assembler, of course. That is, if you want to keep your old

algorithm.

Do you know how it's done? Assume that your 32-bit numerator is

placed in two 16-bit words labeled 'a' for the high word and 'q'

for the low word. Also, assume that 'd' is your 16-bit divisor.

I'm going to talk about this as if you were writing in assembly.

Some of the semantics simply aren't present in C.

First off, you need to check for overflow. This is done this

way, upon entering your division routine:

if (a >= d)

Overflow ();

This is an unsigned comparison, by the way.

If the upper half of your 32-bit value is at or above your

divisor, your quotient will exceed your ability to store into 16

bits. So that test quickly eliminates this consideration. It

also pretty much excludes 'd' being 0, because with an unsigned

comparison, if d=0 then you are sure that 'a' must be at least =

to 'd', if not greater.

So, now that that is out of the way, you need to perform some

subtractions. But since different processors handle their carry

in different ways for subtraction (for example, the MSP430 sets

or clears the carry in the opposite way that the x86 processor

does for subtractions and comparisons), it's handy to set up a

value which is the two's complement of 'd':

nd= 1 + ~d;

Now, in assembly on some processors, this might just be "neg."

In C, with unsigned ints, the exact way is to complement and

then increment, as shown above.

Now, loop just 16 times:

for (i= 0; i < 16; ++i) {

and the body where the good stuff gets done,

carry= LeftShift (&a, &q);

if (!carry)

carry= Add (a, nd, &r);

else

Add (a, nd, &r);

if (carry) {

++q;

a= r;

}

Or, the above could be re-written:

if (LeftShift (&a, &q))

Add (a, nd, &r);

++q;

a= r;

} else if (Add (a, nd, &r)) {

++q;

a= r;

}

The LeftShift() function shifts the two parameter values,

assuming that the first is the high-order 16 bits and the second

is the low-order, shifting in a zero into the least significant

bit of the second parameter, and shifting out the most

significant bit of the first parameter into the return value.

The Add() function adds the first two values together, giving

the third and returning a '1' if a carry occurred.

...and then the end of loop:

}

That's about it.

In assembly, it's very easy to capture the carry out from the

left shift operation. In C, not so easy. Also, in assembly, it

is very easy to capture the carry out from the Add() operation

of (a+nd)-->r.

I simply cannot imagine why that is taking 670us on your CPU.

At 670us, this is about 42us per loop or 126 instruction cycles.

I count 38 cycles per loop, just doing a quick hand coding of

the main loop shown above, but in assembly code. That means 608

cycles in the body and the loop conditional. There is some set

up, I suppose. But you could inline this code to avoid the

worst of it. There's 3 cycles to init the loop counter. I'm

sure there are some other cycles you could toss in there. But

it should come out to about 210us per division operation, with

your 18MHz X2-type RD2 '51 device.

I'll post my code example if you want it.

Jon

In assembler, of course. That is, if you want to keep your old

algorithm.

Do you know how it's done? Assume that your 32-bit numerator is

placed in two 16-bit words labeled 'a' for the high word and 'q'

for the low word. Also, assume that 'd' is your 16-bit divisor.

I'm going to talk about this as if you were writing in assembly.

Some of the semantics simply aren't present in C.

First off, you need to check for overflow. This is done this

way, upon entering your division routine:

if (a >= d)

Overflow ();

This is an unsigned comparison, by the way.

If the upper half of your 32-bit value is at or above your

divisor, your quotient will exceed your ability to store into 16

bits. So that test quickly eliminates this consideration. It

also pretty much excludes 'd' being 0, because with an unsigned

comparison, if d=0 then you are sure that 'a' must be at least =

to 'd', if not greater.

So, now that that is out of the way, you need to perform some

subtractions. But since different processors handle their carry

in different ways for subtraction (for example, the MSP430 sets

or clears the carry in the opposite way that the x86 processor

does for subtractions and comparisons), it's handy to set up a

value which is the two's complement of 'd':

nd= 1 + ~d;

Now, in assembly on some processors, this might just be "neg."

In C, with unsigned ints, the exact way is to complement and

then increment, as shown above.

Now, loop just 16 times:

for (i= 0; i < 16; ++i) {

and the body where the good stuff gets done,

carry= LeftShift (&a, &q);

if (!carry)

carry= Add (a, nd, &r);

else

Add (a, nd, &r);

if (carry) {

++q;

a= r;

}

Or, the above could be re-written:

if (LeftShift (&a, &q))

Add (a, nd, &r);

++q;

a= r;

} else if (Add (a, nd, &r)) {

++q;

a= r;

}

The LeftShift() function shifts the two parameter values,

assuming that the first is the high-order 16 bits and the second

is the low-order, shifting in a zero into the least significant

bit of the second parameter, and shifting out the most

significant bit of the first parameter into the return value.

The Add() function adds the first two values together, giving

the third and returning a '1' if a carry occurred.

...and then the end of loop:

}

That's about it.

In assembly, it's very easy to capture the carry out from the

left shift operation. In C, not so easy. Also, in assembly, it

is very easy to capture the carry out from the Add() operation

of (a+nd)-->r.

I simply cannot imagine why that is taking 670us on your CPU.

At 670us, this is about 42us per loop or 126 instruction cycles.

I count 38 cycles per loop, just doing a quick hand coding of

the main loop shown above, but in assembly code. That means 608

cycles in the body and the loop conditional. There is some set

up, I suppose. But you could inline this code to avoid the

worst of it. There's 3 cycles to init the loop counter. I'm

sure there are some other cycles you could toss in there. But

it should come out to about 210us per division operation, with

your 18MHz X2-type RD2 '51 device.

I'll post my code example if you want it.

Jon

#### Site Timeline

- » Powering a LPC2114 ARM7
- — Next thread in » Embedded Programming

- » cs8900 not reciveing frames
- — Previous thread in » Embedded Programming

- » "Boost context" for task switching on embedded Linux on ARM
- — Newest thread in » Embedded Programming

- » Zeta flying capacitor
- — The site's Newest Thread. Posted in » Electronics Design