Re: Turn a divide into a multiply

Oh. I guess I didn't understand. I'm not sure I do, yet.

You have a 1ms timer event. You need to go from a starting DAC value to an ending DAC value over X milliseconds, where X is not a constant. Yes? And that each millisecond, you post a new DAC value to the DAC, on your way there. And then I gathered that the method you were using before is "bresenham-like." So in that case, your division would only be required exactly once, not each millisecond step. Now, I thought I understood this to be a pain because even that once, taking place between two millisecond events, ate up 1/2 your time and you couldn't afford even that expense even if it was only once.

Now, let's say you want to proceed from 200 to 3000 in 10ms using the brute force modification (which I'm surprised you needed exposed to you in this way, since it was a straight forward change) to Bresenham. [I'm assuming that sometimes you might want to transition in 10ms, this way, or at least that it isn't out of the question.] So now, your code decides to proceed to calculate the next DAC value from the starting DAC value. This takes 140 iterations of the loop to get to this first value, roughly. (Going to the second value will take about twice that, or 280 iterations of the loop.) Are you telling me that this takes 30us?

If so, that's great. But I'm just a little curious.

Jon

Reply to
Jonathan Kirwan
Loading thread data ...

Ah hah!! Just as I suspected. Okay. All is clear, now.

Jon

Reply to
Jonathan Kirwan

670us on a 18MHz RD2 '51 running in X2 mode means 670*3 = 2010 instruction cycles. That seems a little excessive.

Actually, it's rather standard to avoid all divisions. This is done by dividing up the unit-circle into 8 octants and folding. That way, the slope is always less that one and the dependent variable advances at a rate > might want to transition in 10ms, this way, or at least that it

I think it's the 'for()' looping on the mostly false 'if (x!=xprec)' that is doing it. The while() should either execute zero or one time.

Anyway, you should look more at your division routine. Consider writing your own.

Jon

Reply to
Jonathan Kirwan

This strikes me as an odd statement, since the whole point of Bresenham's algorithm (which comes (approximately) from an era when LSI meant having a whole four bit counter in one package) is that it avoids division, ie if it uses division, it's not Bresenham's

(and this thread is not the only place I've noticed this anomaly)

what is it evolving language? (I suppose you think a byte is 8 bits ;)

And as an aside/question, isn't there a term applied to these stepwise ratiometric integer calculations (other than Bresenham style or what ever)? (there is but it escapes me)

bogax

Reply to
bogax

Neil...

I've never used the 8051, but have synthesized division using multiplication. Can you specify number of bits for divisor and dividend for me?

Is there a multiply operation that will handle divisor * quotient as a single operation? What are timings for multiplication operations? What are maximum multiplier/multiplicand/product sizes?

Is there a hard time requirement for each division or is it necessary only to achive a sufficiently small average?

--
Morris Dovey
West Des Moines, Iowa USA
 Click to see the full signature
Reply to
Morris Dovey

Digital differential analysis and recurrences.

Jon

Reply to
Jonathan Kirwan

The OP doesn't know the difference between Bresenham and incremental algorithms. I told him he needed to about 20 posts ago in this subthread.

Reply to
nospam

In assembler, of course. That is, if you want to keep your old algorithm.

Do you know how it's done? Assume that your 32-bit numerator is placed in two 16-bit words labeled 'a' for the high word and 'q' for the low word. Also, assume that 'd' is your 16-bit divisor.

I'm going to talk about this as if you were writing in assembly. Some of the semantics simply aren't present in C.

First off, you need to check for overflow. This is done this way, upon entering your division routine:

if (a >= d) Overflow ();

This is an unsigned comparison, by the way.

If the upper half of your 32-bit value is at or above your divisor, your quotient will exceed your ability to store into 16 bits. So that test quickly eliminates this consideration. It also pretty much excludes 'd' being 0, because with an unsigned comparison, if d=0 then you are sure that 'a' must be at least = to 'd', if not greater.

So, now that that is out of the way, you need to perform some subtractions. But since different processors handle their carry in different ways for subtraction (for example, the MSP430 sets or clears the carry in the opposite way that the x86 processor does for subtractions and comparisons), it's handy to set up a value which is the two's complement of 'd':

nd= 1 + ~d;

Now, in assembly on some processors, this might just be "neg." In C, with unsigned ints, the exact way is to complement and then increment, as shown above.

Now, loop just 16 times:

for (i= 0; i < 16; ++i) {

and the body where the good stuff gets done,

carry= LeftShift (&a, &q); if (!carry) carry= Add (a, nd, &r); else Add (a, nd, &r); if (carry) { ++q; a= r; }

Or, the above could be re-written:

if (LeftShift (&a, &q)) Add (a, nd, &r); ++q; a= r; } else if (Add (a, nd, &r)) { ++q; a= r; }

The LeftShift() function shifts the two parameter values, assuming that the first is the high-order 16 bits and the second is the low-order, shifting in a zero into the least significant bit of the second parameter, and shifting out the most significant bit of the first parameter into the return value.

The Add() function adds the first two values together, giving the third and returning a '1' if a carry occurred.

...and then the end of loop:

}

That's about it.

In assembly, it's very easy to capture the carry out from the left shift operation. In C, not so easy. Also, in assembly, it is very easy to capture the carry out from the Add() operation of (a+nd)-->r.

I simply cannot imagine why that is taking 670us on your CPU. At 670us, this is about 42us per loop or 126 instruction cycles.

I count 38 cycles per loop, just doing a quick hand coding of the main loop shown above, but in assembly code. That means 608 cycles in the body and the loop conditional. There is some set up, I suppose. But you could inline this code to avoid the worst of it. There's 3 cycles to init the loop counter. I'm sure there are some other cycles you could toss in there. But it should come out to about 210us per division operation, with your 18MHz X2-type RD2 '51 device.

I'll post my code example if you want it.

Jon

Reply to
Jonathan Kirwan

Sorry, that should be 33 cycles per loop, not 38. So this means about 185us per 32x16 division operation, or thereabouts.

Jon

Reply to
Jonathan Kirwan

I jump in late, so maybe re-telling something one of the 40 posts before did, but

If # of milliseconds is not growing endlessly, why not have a table of 1/(# of ms) ? =>

step = abs(DACstart-DACend)*reztab[# of milliseconds]

Getting this table is kinda tricky as you must ignore overflows. But you can start and see what your compiler (gcc e.g. often uses reciprokal) gives you for certain values.

--
42Bastian
Do not email to bastian42@yahoo.com, it's a spam-only account :-)
 Click to see the full signature
Reply to
42Bastian Schick

Zapped it into a simulator tonight, shaved another cycle, and just tested the 32x16 divide routine using the algorithm I posted here in C. Applied a variety of inputs and it always beats 150us on your 18MHz system and always produces the right results. Typically runs in about 144us. Could be faster with a-priori information on your inputs, too.

Jon

Reply to
Jonathan Kirwan

ElectronDepot website is not affiliated with any of the manufacturers or service providers discussed here. All logos and trade names are the property of their respective owners.