Re: Turn a divide into a multiply

- J
- Jim Granville
  
  Contact options for registered users
posted
20 years ago

Tue, Jul 29, 2003 10:29 PM

> I'm working on an Atmel AT89C51RD2 based circuit that will need to slide > between two arbitrary DAC values at a constant time. I have a 1msec timer > that decrements (or increments) the DAC counters in a particular direction. > Mathematically I know how to do it - > > abs(DACStart - DACEnd) / # of milliseconds = Amount to step each 1ms > > However I'm on an 18Mhz 6 cycle Atmel AT89C51RD2 , and worst case for a 32 > bit divide is somewhere in the 500usec range. I need it to be under 300usec > in order to keep up with an incoming data stream. So I'm trying to avoid > doing a divide and instead somehow use a multiply. The DAC Start and DAC End > values are completely arbitrary - they can be any value, so doing table > lookups is not a possibility. ;-(

What are the DAC resolutions - 8 bits in the PCA, or more external ? I can see the above eqn would have rounding errors, but if you tuned both the step size, and the 1ms time, a better slope-fit would result.

Work out your numeric limits, and what the tolerances are, and you get a handle on the number of bits of precision needed.

The 8051 gives 16 bit results in both MUL and DIV[REM], so you might get enough from that with some carefull scaling.

Try as I may, I haven't been able to come up with an algorithm that works. > Everything either scales wrong or my remainder gets truncated. I have a 16 > bit integer part and an 8 bit decimal (16.8 fixed point) for step values, > and need something similar. > > Is there any trick I'm missing? Or, does anyone know of a 24x8 divide > routine for the 8051 style of CPU that takes under 300usec worst case? Or a > 16x8 divide that gives me a remainder?

Divide routines will give a remainder, but it is not always preserved. I recall we changed our Mod51 libraries to keep the remainder intact, usefull for formatted writes - allowed N MOD 10 and N DIV 10 in a single call.

-jg

- K
- Kelly Hall
  
  Contact options for registered users
Vote on answer
posted
20 years ago

Tue, Jul 29, 2003 10:46 PM

Have you considered a tweaked Bresenham algorithm?

Using the x axis as time, you could compute the next DAC reload value on the fly every millisecond with this approach, removing the 'jaggies' and eliminating the floating point entirely.

formatting link

Kelly

- K
- Kevin D. Quitt
  
  Contact options for registered users
Vote on answer
posted
20 years ago

Tue, Jul 29, 2003 10:52 PM

Instead of dividing for every step, create multi-byte fractions and just add each step.

-- #include _ Kevin D Quitt USA 91387-4454 96.37% of all statistics are made up Per the FCA, this address may not be added to any commercial mail list

- N
- Neil Bradley
  
  Contact options for registered users
Vote on answer
posted
20 years ago

Tue, Jul 29, 2003 10:54 PM

timer

direction.

It's an external 16 bit SPI DAC (yes, it's a bloody $18 part!) and we're using the entire 16 bit range... unfortunately. ;-(

It will have a few rounding errors, however the DAC has 2-3 bits of precision that we don't care about, so it gets absorbed. Timing is critical, but it isn't *THAT* critical, and even a .8 fixed point step is within the range of acceptability.

16.8 Is OK, actually, but 16.16 would be better. Know of a 32x8 divide? I can just shift the 16 bit value left 16 bits, divide, and my low 16 bit is my decimal. I just hope it's under 300usec...

Unfortunately, it's not high enough resolution by itself.

-->Neil

- N
- nospam
  
  Contact options for registered users
Vote on answer
posted
20 years ago

Wed, Jul 30, 2003 1:14 AM

You only need 16.0 and some rounding control as you sequence.

There is at least one 16 by 8 bit divide with remainder to be found on

formatting link

(think it was in math51.zip).

Says it runs in 232us average, 276 worst case at 12MHz.

You can run a Bresenham algorithm on the slope of the remainder to determine if your DAC (inc/dec)rement is the quotient or quotient +1.

- N
- nospam
  
  Contact options for registered users
Vote on answer
posted
20 years ago

Wed, Jul 30, 2003 2:47 AM

There would be no cumulative error. You need to understand the difference between a Bresenham algorithm and a simple incremental algorithm using fractions.

- J
- John R. Strohm
  
  Contact options for registered users
Vote on answer
posted
20 years ago

Wed, Jul 30, 2003 4:03 AM

direction.

300usec

End

a

I'm not familiar with that processor. Have you considered unrolling the divide loop, to kill off the housekeeping and branches back and forth?

Do you HAVE to have a linear ramp, or would an exponential ramp work as well?

- N
- nospam
  
  Contact options for registered users
Vote on answer
posted
20 years ago

Wed, Jul 30, 2003 4:44 AM

Like at this

#include #include #include

int main(int argc, char* argv[]) {

using namespace std;

assert(argc == 4);

int starty = atoi(argv[1]); int endy = atoi(argv[2]); int dx = atoi(argv[3]);

int dy = endy - starty; int qy = dy / dx; int ry = abs(dy % dx); int oy = starty;

int d = 2 * ry - dx; int d0 = 2 * ry; int d1 = 2 * (ry - dx); int o1 = qy + (dy > 0 ? 1 : -1);

for(int i = 1; i

- E
- Everett M. Greene
  
  Contact options for registered users
Vote on answer
posted
20 years ago

Wed, Jul 30, 2003 3:19 PM

Yes. See the non-restoring divide algorithm provided by nospam in another posting.

- C
- Chris Quayle
  
  Contact options for registered users
Vote on answer
posted
20 years ago

Wed, Jul 30, 2003 5:03 PM

If you can reorganise the data, you may be able to use simple bit shifts to scale, multiply and divide by 2^n. I use this technique a lot to save compute time and is very fast in either C or asm...

Chris

--
Aerosystem Designs
------------------
Embedded System and Electronic Design and Development
Oxford. England. lightwork at aerosys dot co dot uk

- A
- Andras Tantos
  
  Contact options for registered users
Vote on answer
posted
20 years ago

Thu, Jul 31, 2003 3:08 PM

The beauty of this algorithm is that you need an integer division only. The fractional part is handled by the Bresenham-like error-accumulation part of the loop. Why do you need 16.8 precision for this?

Andras Tantos

- C
- Chris Hoffmann
  
  Contact options for registered users
Vote on answer
posted
20 years ago

Thu, Jul 31, 2003 11:59 PM

Not familiar w your processor but ... If you've got the memory space... will a lookup table work?

FWIW Chris.

- R
- Robert Lacoste
  
  Contact options for registered users
Vote on answer
posted
20 years ago

Fri, Aug 1, 2003 9:31 AM

Ok, here is an answer : DAC ramping fro many value to any value in discrete time steps, working in all cases, WITHOUT ANY DIVISION and with only multiplications by two, integers only of course !!! Derived from the Bresenham algorithm :

#include "stdafx.h" #define abs(x) ((x)>=0 ? (x) : -(x)) #define sign(x) ((x)>=0 ? 1 : -1)

int main(int argc, char* argv[]) { int xa; int ya; int xb; int yb;

xa=0; //start time xb=10;//stop time ya=100;//start dac value yb=2340;//stop dac value

int dx,dy,x,y,s1,s2,e,temp,swap,i; int xprec;

xprec=xa-1;

dy=abs(yb-ya); dx=abs(xb-xa); s1=sign(xb-xa); s2=sign(yb-ya); x=xa; y=ya; if (dy>dx) { temp=dx; dx=dy; dy=temp; swap=1; } else swap=0; e=2*dy-dx; for(i=1;i=0) { if (swap==1) x+=s1; else y+=s2; e-=2*dx; } if (swap==1) y+=s2; else x+=s1; e+=2*dy; } x++; printf("DAC value at t=%3d : %3d\n",x,y);

return 0; }

Exemples of output :

10 time steps, DAC from 100 to 2340 : DAC value at t= 0 : 100 DAC value at t= 1 : 212 DAC value at t= 2 : 436 DAC value at t= 3 : 660 DAC value at t= 4 : 884 DAC value at t= 5 : 1108 DAC value at t= 6 : 1332 DAC value at t= 7 : 1556 DAC value at t= 8 : 1780 DAC value at t= 9 : 2004 DAC value at t= 10 : 2228 DAC value at t= 11 : 2340

10 time steps, DAC from 2340 to 2337 DAC value at t= 0 : 2340 DAC value at t= 1 : 2340 DAC value at t= 2 : 2339 DAC value at t= 3 : 2339 DAC value at t= 4 : 2339 DAC value at t= 5 : 2338 DAC value at t= 6 : 2338 DAC value at t= 7 : 2338 DAC value at t= 8 : 2338 DAC value at t= 9 : 2337 DAC value at t= 11 : 2337

Nice, isn't it ?

Robert Lacoste - ALCIOM : The mixed signals experts

formatting link

"Neil Bradley" a écrit dans le message de news: snipped-for-privacy@corp.supernews.com...

killing

CPU

- J
- Jonathan Kirwan
  
  Contact options for registered users
Vote on answer
posted
20 years ago

Fri, Aug 1, 2003 6:48 PM

Hmm. Some notes:

(1) Where did "t=10" go in the second group?

(2) The 'while(e>=0)' can be replaced with 'if(e>=0)', as it either takes place, or not, but a single subtraction is always sufficient for 'e'.

(3) The problem with the algorithm is that the for() loop will continue over and over and over for long stretches before (x!=xprec), when the DAC values are widely separated (which may often be the case) as compared with the time tick. The loop continues until 'x' changes and this may be a while.

And worse, still, it will happen just about every time, too. Once you pay the price, you pay it again and again.

I don't think this will save the OP anything over the single division at the beginning that he's stuck with, now. The looping is worse, I suspect, than the problem it is supposed to solve.

This is why I didn't comment earlier about his problem. I recognized that this isn't just a simple hack of Bresenham, to fix the slow timing. Bresenham flips everything in the first 45 degrees, switching independent variables around. This is great for drawing pixels and it avoids the need for division because of limiting the range to TAN(angle)

- J
- Jonathan Kirwan
  
  Contact options for registered users
Vote on answer
posted
20 years ago

Fri, Aug 1, 2003 6:53 PM

Neil, it's a modification of Bresenham to repress output while the dependent variable remains unchanged. Brute force and probably what Kelly was hinting at. It will loop once for each DAC count it is skipping past. And you will pay that loop for each time step. It will probably eat up your CPU time more than the single division at the outset, unless your DAC values have a rather limited range of change.

Jon

- J
- Jonathan Kirwan
  
  Contact options for registered users
Vote on answer
posted
20 years ago

Fri, Aug 1, 2003 7:00 PM

(4) At t=1 in your first group, the DAC output value is 212. This is a step of 112. At t=2, the DAC value is 436 -- a step of 224. It's smoother, later on. A similar thing happens at the end, as well. This is a natural consequence of Bresenham, but it's a little odd for a DAC drive.

- N
- nospam
  
  Contact options for registered users
Vote on answer
posted
20 years ago

Fri, Aug 1, 2003 7:26 PM

:)

- N
- Neil Bradley
  
  Contact options for registered users
Vote on answer
posted
20 years ago

Fri, Aug 1, 2003 11:08 PM

Yeah, I noticed that when I coded it up. But using up cumulatively MORE time over a period of time is OK. And it's really not that bad - about 10-20usec each step. I just need faster response, so this is just what the doctor ordered for my particular application.

-->Neil

- J
- Jonathan Kirwan
  
  Contact options for registered users
Vote on answer
posted
20 years ago

Sat, Aug 2, 2003 12:04 AM

Well, if I understand your "10-20us" statement, it's per iteration. If you are stepping from DAC=200 to DAC=3000 in say

10 intervals, that about 2800/10 or 280 iterations per DAC update event. At 10-20us per, that is a LOT of time to chew up.

Just keep it in mind.

Jon

- N
- Neil Bradley
  
  Contact options for registered users
Vote on answer
posted
20 years ago

Sat, Aug 2, 2003 8:33 AM

time

10-20usec

It's using 30usec every millisecond since my timer resolution is set to 1ms. So it may eat up a lot of CPU time, but it'll be over a long period.

-->Neil