Division by rotation on 8 bit machine

Hi All,

I am trying to perform 32 bit by 16 bit division on X51 based 8 bit controller. The result can be limited to 16 bits. I want algorithm for division by rotation which gets executed faster. Please send the idea or algorithm to do it. Thank you for the help.

Regards, Shailendra

Reply to
Mehta Shailendrakumar
Loading thread data ...

Have you ever learnt how to divide decimal numbers on paper with a pencil ? The binary division works exactly the same, except that it is even more simple.

Rene

--
Ing.Buero R.Tschaggelar - http://www.ibrtses.com
& commercial newsgroups - http://www.talkto.net
Reply to
Rene Tschaggelar

"Mehta Shailendrakumar" wrote in message news:d9e4gh$5t4$ snipped-for-privacy@ns2.fe.internet.bosch.com...

A truly amazing book I learned of from a colleague is:

Hacker's Delight ISBN 0-201-91465-4

formatting link

It has almost every algorithm you can think of for bit shifting, compressing, parity generation, math functions, etc.

I had to buy one as soon as I saw it.

Rufus

Reply to
Rufus V. Smith

I can tell you how, but I don't have a routine for the 8051 family written already.

Meaning, I assume:

32-bit ------ = 16-bit quotient 16-bit

So it is okay to completely throw away the remainder portion and, if the quotient would be larger than 16-bits then the returned value is allowed to be garbage? You will do your own verification that the result will fit in 16 bits? And all the values are to be treated as unsigned?

Faster than what? You already *have* code for this, aren't disclosing all the details about it (including its execution time in clocks), and are asking a clairvoyant to figure out a _faster_ one for you?

How about you provide some exact, specific details about your situation?

After you expose the details of your situation, perhaps.

Jon

Reply to
Jonathan Kirwan

go to

formatting link
and look at their 16 series math libraries. They are free.

"Mehta Shailendrakumar" wrote in message news:d9e4gh$5t4$ snipped-for-privacy@ns2.fe.internet.bosch.com...

Reply to
MrSmith

Hi,

Your assumptions are correct. Remainder is not needed. I want to write assembly routine. At present I am relying on Keil compiler to do this task for me. I feel writing assembly code would be faster. I have ported one 6502 routine

formatting link
But any other idea by you is welcome.

Thanks & regards, Shailendra

Reply to
Mehta Shailendrakumar

Okay.

Okay. It will be some good experience.

What kind of performance do you get? Why do you think you can write it faster? Do they use C code to perform such a calculation? Do they call other generic functions and paste together the results to get there? Or do they have a hand-coded, assembly routine in their library already?

It could be, if that isn't what Keil already has done for you. But since I don't use Keil, you will have to inform me about that detail.

There are several forms of performing such a calculation, some from the differences arriving from choosing between a restoring or non-restoring method, some from the differences of unrolling the loop. How much are you willing to take on, here? How much improved performance change is worth how much study and work on your part?

It takes time to explain the details of various combinations of these and you may not need or care about some of the more difficult aspects if you only get a small performance difference. Which might mean the time explaining won't be valuable.

I haven't checked that out, as I really don't have time in the near term. Much easier for me to just say in the first place. But I think you need to __understand__ how the process works, very well, if you plan to write excellent, well-performing code for this purpose. It won't be very good if all you are doing is porting the code of others.

Let's hear a little more about the above, first.

Jon

Reply to
Jonathan Kirwan

Please don't top-post. It looses all continuity, and encourages failure to snip. Please do maintain proper attributions.

The remainder is the natural byproduct of the division process. The C interface normally just discards it, apart from the 'div' and 'ldiv' function interfaces.

--
"If you want to post a followup via groups.google.com, don't use
 the broken "Reply" link at the bottom of the article.  Click on 
 "show options" at the top of the article, then click on the 
 "Reply" at the bottom of the article headers." - Keith Thompson
Reply to
CBFalconer

You may be right I think C does 32 by 32 as per the C spec. Try a post at

formatting link
or
formatting link

Reply to
Neil Kurzman

Well, it turns out that I've already written such a routine in 8051 assembly, a couple of years back, just for fun. I think it is on a different machine than what I have handy just now. In any case, it's better you try your hand at it -- it's a job worth doing once in your programming life. You can refer back to these posts of mine on this group:

formatting link
formatting link

(The above discussion was over a special type of 8051 that was clocked at 18MHz and would run 3 instruction cycles per microsecond. So be aware of that detail when you read timing estimates.)

Assuming the usual 12-clock/cycle 8051 style device and the usual timings for the instructions on such a chip, the simple routine was at a max of 29 cycles and a minimum of 24 cycles per loop. This wasn't a non-restoring version. Nor was it unrolled, at all. At 16 iterations (that is what it takes to get both quotient *and* remainder), this works out to something around 430-450 cycles to execute, on balance.

You can trim off another 30 cycles or so by unrolling the loop, once. I haven't played with the non-restoring version, so I can't predict if that will be good, bad, or indifferent on the 8051. Perhaps others know, already.

Since you may like to "port" routines, here's an example I wrote some time ago for the MSP430.

Keep in mind that the registers here are 16 bits in size.

Just to give some more examples for you, here's an example showing what an unrolled-once example of the above code looks like:

And here is an unrolled example, plus non-restoring:

These may provide some utility in your quest for an 8051 divide. You might also read the short answer I gave and listed in the google references provided at the start of this post. In addition, you should search for examples in VHDL, Verilog, and across the web from various programmers and educators who have written a variety of web pages on the subject.

Finally, I can drag up my old 8051 code -- with a little effort in finding the old disk. Some evidence of your interest in learning from the above and elsewhere might encourage me on that score.

Jon

Reply to
Jonathan Kirwan

Hello Mr. Jonathan,

Thank you for your reply. I need to port the routines given by you on 8051. Thanks again.

Regards, Shailendra

Reply to
Mehta Shailendrakumar

Programming this is actually far from easy. There are several subtleties involved, among them that you actually need 33 bits. Getting the most of the particular processor is another challenge.

Via m6809forth.html you can download a Forth for the 6809 that features a 32 by 16 bit division as the word UM/MOD. This was a substantial improvement over the implementation of Brad Rodriguez.

Forth implementations for 8051 likewise contain an implementation of UM/MOD. Because the '51 is a more limited processor, the implementation will be more difficult to understand.

Groetjes Albert

--

--
Albert van der Horst,Oranjestr 8,3511 RA UTRECHT,THE NETHERLANDS
Economic growth -- like all pyramid schemes -- ultimately falters.
albert@spenarnc.xs4all.nl http://home.hccnet.nl/a.w.m.van.der.horst
Reply to
Albert van der Horst

ElectronDepot website is not affiliated with any of the manufacturers or service providers discussed here. All logos and trade names are the property of their respective owners.