I am trying to perform 32 bit by 16 bit division on X51 based 8 bit controller. The result can be limited to 16 bits. I want algorithm for division by rotation which gets executed faster. Please send the idea or algorithm to do it. Thank you for the help.
Have you ever learnt how to divide decimal numbers on paper with a pencil ? The binary division works exactly the same, except that it is even more simple.
I can tell you how, but I don't have a routine for the 8051 family written already.
Meaning, I assume:
32-bit ------ = 16-bit quotient 16-bit
So it is okay to completely throw away the remainder portion and, if the quotient would be larger than 16-bits then the returned value is allowed to be garbage? You will do your own verification that the result will fit in 16 bits? And all the values are to be treated as unsigned?
Faster than what? You already *have* code for this, aren't disclosing all the details about it (including its execution time in clocks), and are asking a clairvoyant to figure out a _faster_ one for you?
How about you provide some exact, specific details about your situation?
After you expose the details of your situation, perhaps.
Your assumptions are correct. Remainder is not needed. I want to write assembly routine. At present I am relying on Keil compiler to do this task for me. I feel writing assembly code would be faster. I have ported one 6502 routine
What kind of performance do you get? Why do you think you can write it faster? Do they use C code to perform such a calculation? Do they call other generic functions and paste together the results to get there? Or do they have a hand-coded, assembly routine in their library already?
It could be, if that isn't what Keil already has done for you. But since I don't use Keil, you will have to inform me about that detail.
There are several forms of performing such a calculation, some from the differences arriving from choosing between a restoring or non-restoring method, some from the differences of unrolling the loop. How much are you willing to take on, here? How much improved performance change is worth how much study and work on your part?
It takes time to explain the details of various combinations of these and you may not need or care about some of the more difficult aspects if you only get a small performance difference. Which might mean the time explaining won't be valuable.
I haven't checked that out, as I really don't have time in the near term. Much easier for me to just say in the first place. But I think you need to __understand__ how the process works, very well, if you plan to write excellent, well-performing code for this purpose. It won't be very good if all you are doing is porting the code of others.
Please don't top-post. It looses all continuity, and encourages failure to snip. Please do maintain proper attributions.
The remainder is the natural byproduct of the division process. The C interface normally just discards it, apart from the 'div' and 'ldiv' function interfaces.
--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
Well, it turns out that I've already written such a routine in 8051 assembly, a couple of years back, just for fun. I think it is on a different machine than what I have handy just now. In any case, it's better you try your hand at it -- it's a job worth doing once in your programming life. You can refer back to these posts of mine on this group:
formatting link
formatting link
(The above discussion was over a special type of 8051 that was clocked at 18MHz and would run 3 instruction cycles per microsecond. So be aware of that detail when you read timing estimates.)
Assuming the usual 12-clock/cycle 8051 style device and the usual timings for the instructions on such a chip, the simple routine was at a max of 29 cycles and a minimum of 24 cycles per loop. This wasn't a non-restoring version. Nor was it unrolled, at all. At 16 iterations (that is what it takes to get both quotient *and* remainder), this works out to something around 430-450 cycles to execute, on balance.
You can trim off another 30 cycles or so by unrolling the loop, once. I haven't played with the non-restoring version, so I can't predict if that will be good, bad, or indifferent on the 8051. Perhaps others know, already.
Since you may like to "port" routines, here's an example I wrote some time ago for the MSP430.
Keep in mind that the registers here are 16 bits in size.
Just to give some more examples for you, here's an example showing what an unrolled-once example of the above code looks like:
And here is an unrolled example, plus non-restoring:
These may provide some utility in your quest for an 8051 divide. You might also read the short answer I gave and listed in the google references provided at the start of this post. In addition, you should search for examples in VHDL, Verilog, and across the web from various programmers and educators who have written a variety of web pages on the subject.
Finally, I can drag up my old 8051 code -- with a little effort in finding the old disk. Some evidence of your interest in learning from the above and elsewhere might encourage me on that score.
Programming this is actually far from easy. There are several subtleties involved, among them that you actually need 33 bits. Getting the most of the particular processor is another challenge.
Via m6809forth.html you can download a Forth for the 6809 that features a 32 by 16 bit division as the word UM/MOD. This was a substantial improvement over the implementation of Brad Rodriguez.
Forth implementations for 8051 likewise contain an implementation of UM/MOD. Because the '51 is a more limited processor, the implementation will be more difficult to understand.
Groetjes Albert
--
--
Albert van der Horst,Oranjestr 8,3511 RA UTRECHT,THE NETHERLANDS
Economic growth -- like all pyramid schemes -- ultimately falters.
albert@spenarnc.xs4all.nl http://home.hccnet.nl/a.w.m.van.der.horst
ElectronDepot website is not affiliated with any of the manufacturers or service providers discussed here.
All logos and trade names are the property of their respective owners.