I am looking for an algorithm to calculate a floating point divide for an FPGA. I don't think I need help with the details of working in floating point. At one time I worked on an array processor that did everything in floating point and was programmed in microcode. So I am at least conversant in the topic even if that was some 40 years ago.

There are a number of choices for doing a divide, but the one that is most clear to me and easiest to implement on the hardware I am designing seems to be the Newton-Raphson iterative method. I'm trying to understand how many iterations it will take. The formula for that in the Wikipedia article assumes an initial estimate for the reciprocal of the divisor: 48/17 - D * 32/17. This would require 2 iterations to gain 16 bits of accuracy or 3 iterations for IEEE 32 bit floating point numbers. My needs fall somewhere in that range, but I would like to minimize the number of iterations... ideally one.

I believe most implementations use a table lookup for the seed value of the reciprocal of the divisor. In the FPGA I am using, a convenient size would be 2k entries (11 bit address) of 18 bits. I'm wondering if this would be sufficiently more accurate than the above estimate so that one iteration would be sufficient. I'm not clear on how to calculate this. I know in general the formula converges rapidly with the error being squared, so I'm thinking an initial value that is good to some 11 bits would produce more than 20 bits for one turn of the Newton-Raphson crank (20 bits being a level of accuracy called "good enough" for this application). But I'd like to have something that verifies this rather than using a rule of thumb. At some point I will probably have to justify the correctness of the calculations.
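
One way to replace the rule of thumb with something checkable: writing the seed's relative error as e0 = 1 - D*X0, the step X1 = X0*(2 - D*X0) gives 1 - D*X1 = (1 - D*X0)^2, so the relative error is squared exactly (before any rounding in the datapath). The Python sketch below uses that to bound a 2k x 18 table in exact rational arithmetic. The table format (1 integer bit plus 17 fraction bits), the fill rule (reciprocal of each slice midpoint) and all function names are assumptions made for illustration, not anything stated above. Under those assumptions the seed should come out good to about 12 bits and one iteration to about 24 bits.

```python
import math
from fractions import Fraction

ADDR_BITS = 11               # 2k-entry table (from the post)
DATA_BITS = 18               # table word width (from the post)
FRAC_BITS = DATA_BITS - 1    # ASSUMPTION: 1 integer bit + 17 fraction bits


def build_table():
    """ASSUMED fill rule: reciprocal of each slice midpoint, rounded to the word.

    The divisor mantissa D is taken to lie in [0.5, 1), cut into 2**ADDR_BITS
    equal slices, one table entry per slice.
    """
    n = 1 << ADDR_BITS
    scale = 1 << FRAC_BITS
    table = []
    for i in range(n):
        mid = Fraction(1, 2) + Fraction(2 * i + 1, 4 * n)   # midpoint of slice i
        table.append(Fraction(round(scale / mid), scale))
    return table


def worst_seed_error(table):
    """Largest |1 - D*X0| over D in [0.5, 1).

    Inside one slice the error is linear in D, so checking both slice
    edges is enough to find the maximum.
    """
    n = len(table)
    worst = Fraction(0)
    for i, x0 in enumerate(table):
        lo = Fraction(1, 2) + Fraction(i, 2 * n)
        hi = Fraction(1, 2) + Fraction(i + 1, 2 * n)
        worst = max(worst, abs(1 - lo * x0), abs(1 - hi * x0))
    return worst


def bits(err):
    """Convert a relative error into 'bits of accuracy'."""
    return -math.log2(float(err))


def linear_seed_bits(iterations):
    """Bits after n iterations from the 48/17 - D*32/17 seed, using |e0| <= 1/17."""
    return (2 ** iterations) * math.log2(17)


e0 = worst_seed_error(build_table())
seed_bits = bits(e0)
one_step_bits = bits(e0 * e0)    # e1 = e0^2 in exact arithmetic

print(f"table seed: {seed_bits:.2f} bits, after one iteration: {one_step_bits:.2f} bits")
for n in (1, 2, 3):
    print(f"linear seed, {n} iteration(s): {linear_seed_bits(n):.2f} bits")
```

This only covers the mathematical error of the iteration. Rounding or truncation inside the multiplier path adds to it and has to be bounded separately.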

Because of the hardware facilities available, the calculations have intermediate mantissas of 18 or 36 bits with 28 bits stored when in memory/stack other than ToS which is the register in the ALU (>36 bits). I don't think 11 bits is quite as good as required, but I'm thinking one crank of the NR machine and I'm "good". However, I need to be able to show how good. Anyone know of good documentation on this matter?
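
To show "how good" with the intermediate widths taken into account, one option is a bit-accurate model of the single iteration in integer arithmetic. The sketch below is a guess at such a model, not a description of the actual datapath: it assumes a 28-bit divisor mantissa in [0.5, 1), an 18-bit seed with 17 fraction bits filled with the rounded reciprocal of each slice midpoint, and products that are truncated (not rounded) to a chosen number of fraction bits. The names `seed`, `nr_step` and `worst_bits` are made up for illustration. It samples divisors rather than sweeping all of them, so the result is an estimate and not a proof.

```python
import math
import random

ADDR_BITS = 11    # table address width (from the post)
SEED_FRAC = 17    # ASSUMPTION: 18-bit seed = 1 integer bit + 17 fraction bits
D_FRAC = 28       # ASSUMPTION: divisor mantissa in [0.5, 1) with 28 fraction bits
SLICE_SHIFT = D_FRAC - 1 - ADDR_BITS   # divisor bits below the table index


def seed(d_int):
    """Hypothetical table entry: rounded reciprocal of the slice midpoint."""
    half = 1 << (D_FRAC - 1)                       # integer code for D = 0.5
    index = (d_int - half) >> SLICE_SHIFT          # top 11 bits after the leading 1
    mid = half + (index << SLICE_SHIFT) + (1 << (SLICE_SHIFT - 1))
    return ((1 << (D_FRAC + SEED_FRAC)) + mid // 2) // mid


def nr_step(d_int, x0, width):
    """One X1 = X0 * (2 - D*X0) step; each product is truncated to `width` fraction bits."""
    t = (d_int * x0) >> (D_FRAC + SEED_FRAC - width)   # D*X0
    u = (2 << width) - t                               # 2 - D*X0
    return (x0 * u) >> SEED_FRAC                       # X1 with `width` fraction bits


def worst_bits(width, samples_per_slice=8, rng_seed=0):
    """Worst sampled accuracy of X1, in bits, over every table slice."""
    rng = random.Random(rng_seed)                  # fixed seed keeps runs repeatable
    one = 1 << (D_FRAC + width)                    # integer code for D*X1 = 1
    worst = 0.0
    for index in range(1 << ADDR_BITS):
        lo = (1 << (D_FRAC - 1)) + (index << SLICE_SHIFT)
        hi = lo + (1 << SLICE_SHIFT) - 1
        points = [lo, hi] + [rng.randint(lo, hi) for _ in range(samples_per_slice)]
        for d_int in points:
            x1 = nr_step(d_int, seed(d_int), width)
            worst = max(worst, abs(one - d_int * x1) / one)   # |1 - D*X1|
    return -math.log2(worst)


bits36 = worst_bits(36)
bits18 = worst_bits(18)
print(f"36-bit intermediates: {bits36:.2f} bits")
print(f"18-bit intermediates: {bits18:.2f} bits")
```

Under these assumptions the 36-bit path should stay close to the exact-arithmetic figure for one iteration, while truncating the products to 18 fraction bits should cap the result near the size of the truncation step, which would fall short of 20 bits. Swapping in the real widths and rounding behavior would be needed before using numbers like these as a justification.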

Oh, I guess I should mention later on I might be repeating this process for a square root calculation. A measurement using a sensor that is at least temporarily deprecated requires a square root. I managed to show the sensor was not capable of providing an accuracy at the low end that would produce meaningful results. But if they find a better sensor they may want to add that crappy sensor as an option and the square root will be back. No! The HORROR!
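
If the square root does come back, the same kind of error check appears to carry over. A common choice (an assumption here, since nothing above says which square root method would be used) is Newton-Raphson on the reciprocal square root, Y1 = Y0*(3 - S*Y0^2)/2, followed by sqrt(S) = S*Y1. Writing Y0 = (1 + e0)/sqrt(S), the algebra gives e1 = -1.5*e0^2 - 0.5*e0^3, so the error is still squared but picks up a factor of about 1.5. The floating point sketch below just checks that relation numerically for one hypothetical input and a seed error of 2^-12.

```python
import math


def rsqrt_step(s, y):
    """One Newton-Raphson step toward 1/sqrt(s): y * (3 - s*y*y) / 2."""
    return y * (3.0 - s * y * y) / 2.0


def rel_err(s, y):
    """Relative error of y as an estimate of 1/sqrt(s)."""
    return y * math.sqrt(s) - 1.0


s = 0.7                               # hypothetical mantissa value
eps0 = 2.0 ** -12                     # ASSUMED seed error, roughly a 2k-entry table
y0 = (1.0 + eps0) / math.sqrt(s)      # seed with exactly that relative error

y1 = rsqrt_step(s, y0)
eps1 = rel_err(s, y1)
predicted = -1.5 * eps0 ** 2 - 0.5 * eps0 ** 3   # error after one step, from the algebra
root = s * y1                                    # sqrt(s) recovered from 1/sqrt(s)
rsqrt_bits = -math.log2(abs(eps1))

print(f"measured e1 = {eps1:.3e}, predicted e1 = {predicted:.3e}")
print(f"accuracy after one step: {rsqrt_bits:.2f} bits, sqrt({s}) ~ {root:.9f}")
```

The extra factor of 1.5 costs a bit over half a bit compared with the divide, so a seed that is good enough for one divide iteration is only just about as good for one square root iteration.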

Whatever... It's just an algorithm...