Fixed Point vs. Floating Point

It doesn't take much numerical ill-conditioning before you see quite visible and evil effects from IEEE single-precision floats even in straightforward operations like relatively low-order poly evaluation.

Of course, if you understand the problem you'll just compute the condition number (the ratio of max to min singular values) or whatever.

Where time is involved, it's quite possible to require more precision than a 24-bit mantissa yields.
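
A quick illustration of the kind of thing that goes wrong (just a sketch of mine, nothing from the thread): keep elapsed seconds in a single-precision float and the one-second ticks vanish entirely once you pass 2^24 seconds, roughly 194 days.

#include <stdio.h>

int main(void)
{
    float t  = 16777216.0f;   /* 2^24 seconds, about 194 days */
    float t1 = t + 1.0f;      /* the 1-second tick is below the LSB of the mantissa */

    printf("%.1f %.1f equal=%d\n", t, t1, t == t1);   /* prints equal=1 */
    return 0;
}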

Best regards, Spehro Pefhany

--
"it's the network..."                          "The Journey is the reward"
speff@interlog.com             Info for manufacturers: http://www.trexon.com
Embedded software/hardware/analog  Info for designers:  http://www.speff.com
Reply to
Spehro Pefhany

Been there, done that, too old and cranky to ever do it again. Damn, but I hate division. Computing 1/x and multiplying makes more sense. Never dividing makes even more.
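
For the record, the usual divide-free trick is the Newton-Raphson reciprocal iteration: pick a rough seed for 1/x (from a small table or by fiddling the exponent), then refine it with multiplies only. A sketch, with the seed left as an assumption:

float recip(float x, float r0)      /* r0: rough guess at 1/x */
{
    float r = r0;
    r = r * (2.0f - x * r);         /* each pass roughly doubles the good bits */
    r = r * (2.0f - x * r);
    r = r * (2.0f - x * r);
    return r;
}

Every y/x then becomes y * recip(x, seed), and dividing repeatedly by the same x costs one multiply each time.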

And getting it efficient as well.

I program mostly bare assembly, and usually do all the math right on the spot, inline, with no official format or subroutines. It's all just numbers, right? Sorta like climbing Half Dome with just your bare hands.

I was just playing with the minimal amount of assembly code that will make a nice 4096-point sine table. Looks like maybe a dozen lines or so, but the rounding errors are annoying. Maybe I'll just stick with the 2048-word lookup table, which is accurate and fast.
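
The usual minimal generator is the two-term recurrence s[n+1] = 2*cos(w)*s[n] - s[n-1], one multiply per point. A C sketch of it (mine, not John's actual code) shows both why it is so short and where the rounding annoyance comes from:

#include <stdint.h>
#include <math.h>

#define N 4096

void make_sine_table(int16_t *table)
{
    const double pi = 3.14159265358979323846;

    /* Coefficient and state kept in Q30; with fewer guard bits the
       truncation error of each step feeds the next and the end of the
       table drifts visibly off a true sine. */
    int64_t c  = llround(2.0 * cos(2.0 * pi / N) * (1 << 30));
    int64_t s0 = 0;                                      /* sin(0) in Q30 */
    int64_t s1 = llround(sin(2.0 * pi / N) * (1 << 30)); /* sin(w) in Q30 */

    for (int n = 0; n < N; n++) {
        table[n] = (int16_t)((s0 * 32767) >> 30);   /* scale to signed 16 bits */
        int64_t s2 = ((c * s1) >> 30) - s0;         /* recurrence step, truncating */
        s0 = s1;
        s1 = s2;
    }
}

(Assumes the usual arithmetic right shift on signed values.)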

John

Reply to
John Larkin

[clip]

I wanted a small 256 sine lookup for an AVR. Being lazy, I pulled one from the web. It didn't look right on the scope or spectrum analyser, so I pulled another, which for some reason had slightly different values. That wasn't right either. Curious, I then pulled another three: all different, all crap! The annoyance was that they all came with their own righteous mathematical analyses. Eventually I had to do it myself. Being lazy is sometimes hard work.

Reply to
john

For small tables with short word lengths, there must be some subtle arrangement of rounding that nets the purest spectrum. Hurts me head to think about that one.

I suppose you could Fourier it and find the amplitude/phase of all the harmonics, one by one, and try to poke the inverse correction into the table at the sub-LSB level, sort of buried in the rounding noise. Hurts me head.
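
A hypothetical sketch of that correction pass (the number of harmonics to chase is a guess): measure each low harmonic of the rounded table by correlation, subtract it, re-round, and hope the sub-LSB nudges survive.

#include <stdint.h>
#include <math.h>

#define N      4096
#define NHARM    32    /* how many harmonics to try to null */

void nudge_table(int16_t *t)
{
    const double pi = 3.14159265358979323846;
    static double fixed[N];

    for (int n = 0; n < N; n++) fixed[n] = t[n];

    for (int k = 2; k <= NHARM; k++) {
        /* correlate against the k-th harmonic to get its amplitude/phase */
        double re = 0.0, im = 0.0;
        for (int n = 0; n < N; n++) {
            double w = 2.0 * pi * k * n / N;
            re += fixed[n] * cos(w);
            im += fixed[n] * sin(w);
        }
        re *= 2.0 / N;
        im *= 2.0 / N;

        /* subtract that harmonic from the still-real-valued table */
        for (int n = 0; n < N; n++) {
            double w = 2.0 * pi * k * n / N;
            fixed[n] -= re * cos(w) + im * sin(w);
        }
    }

    /* re-round; the correction rides at the sub-LSB level */
    for (int n = 0; n < N; n++) t[n] = (int16_t)lround(fixed[n]);
}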

John

Reply to
John Larkin

John Larkin wrote:

I needed fast approximations of Sin/Cos, accurate to 16 bits. Below is what I ended up with. The LUT was optimized for the minimax criterion.

//-----------------------------------
// Sin/Cos approximation (max_error ~ 3.7e-5, avg_error ~ 1e-5)
//-----------------------------------

// fixed-width typedefs assumed by the code below
#include <stdint.h>
typedef uint16_t u16;
typedef uint32_t u32;
typedef int16_t  s16;
typedef int32_t  s32;

static const u16 sin16_table[257] = {
    0, 402, 804, 1206, 1608, 2010, 2412, 2814, 3216, 3617, 4019, 4420, 4821, 5222, 5623, 6023,
    6424, 6824, 7223, 7623, 8022, 8421, 8820, 9218, 9616, 10014, 10411, 10808, 11204, 11600, 11996, 12391,
    12785, 13179, 13573, 13966, 14359, 14751, 15142, 15533, 15924, 16313, 16703, 17091, 17479, 17866, 18253, 18639,
    19024, 19408, 19792, 20175, 20557, 20939, 21319, 21699, 22078, 22456, 22834, 23210, 23586, 23960, 24334, 24707,
    25079, 25450, 25820, 26189, 26557, 26925, 27291, 27656, 28020, 28383, 28745, 29106, 29465, 29824, 30181, 30538,
    30893, 31247, 31600, 31952, 32302, 32651, 32999, 33346, 33692, 34036, 34379, 34721, 35061, 35400, 35738, 36074,
    36409, 36743, 37075, 37406, 37736, 38064, 38390, 38715, 39039, 39361, 39682, 40001, 40319, 40635, 40950, 41263,
    41575, 41885, 42194, 42500, 42806, 43109, 43411, 43712, 44011, 44308, 44603, 44897, 45189, 45479, 45768, 46055,
    46340, 46624, 46905, 47185, 47464, 47740, 48014, 48287, 48558, 48827, 49095, 49360, 49624, 49885, 50145, 50403,
    50659, 50913, 51166, 51416, 51664, 51911, 52155, 52398, 52638, 52877, 53113, 53348, 53580, 53811, 54039, 54266,
    54490, 54713, 54933, 55151, 55367, 55582, 55794, 56003, 56211, 56417, 56620, 56822, 57021, 57218, 57413, 57606,
    57797, 57985, 58171, 58356, 58537, 58717, 58895, 59070, 59243, 59414, 59582, 59749, 59913, 60075, 60234, 60391,
    60546, 60699, 60850, 60998, 61144, 61287, 61429, 61567, 61704, 61838, 61970, 62100, 62227, 62352, 62475, 62595,
    62713, 62829, 62942, 63053, 63161, 63267, 63371, 63472, 63571, 63668, 63762, 63853, 63943, 64030, 64114, 64196,
    64276, 64353, 64428, 64500, 64570, 64638, 64703, 64765, 64826, 64883, 64939, 64992, 65042, 65090, 65136, 65179,
    65219, 65258, 65293, 65327, 65357, 65386, 65412, 65435, 65456, 65475, 65491, 65504, 65515, 65524, 65530, 65534,
    65535
};

s16 Sin_16(u32 phase)
{
    u32 tmp_phase;

    /* mirror the second quarter of the half-cycle back onto the first */
    if (phase & 0x40000000)
        tmp_phase = 0x80000000 - (phase & 0x7FFFFFFF);
    else
        tmp_phase = phase & 0x7FFFFFFF;

    /* split into quarter-wave table index and interpolation fraction */
    u16 phase_hi = tmp_phase >> 22;
    u32 phase_lo = tmp_phase & 0x3FFFFF;

    /* linear interpolation between adjacent table entries */
    s32 s0 = sin16_table[phase_hi];
    s32 s1 = sin16_table[phase_hi + 1];
    s0 += ((s1 - s0) * phase_lo) >> 22;

    /* negative half-cycle */
    if (phase & 0x80000000)
        s0 = -s0;

    return (s16)(s0 >> 1);
}

s16 Cos_16(u32 phase)
{
    return Sin_16(phase + 0x40000000);
}
//-----------------------------------
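
For anyone dropping this in: the argument is a 32-bit phase in which one full cycle is 2^32 counts (which is why Cos_16 just adds 0x40000000, a quarter turn), and the return value is a signed Q15 sample. A minimal phase-accumulator caller, as my own sketch rather than part of Vladimir's post:

void fill_buffer(s16 *buf, int n, double f_out, double f_sample)
{
    static u32 phase = 0;
    /* tuning word: one full cycle is 2^32 counts of the accumulator */
    u32 step = (u32)(f_out / f_sample * 4294967296.0);

    for (int i = 0; i < n; i++) {
        buf[i] = Sin_16(phase);   /* roughly -32767..32767 */
        phase += step;            /* wraps naturally at 2^32 */
    }
}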

Vladimir Vassilevsky DSP and Mixed Signal Design Consultant

formatting link

Reply to
Vladimir Vassilevsky

What does "small 256 sine lookup" mean? 256 points? How many bits per point?

...Jim Thompson

--
| James E.Thompson, P.E.                           |    mens     |
| Analog Innovations, Inc.                         |     et      |
| Analog/Mixed-Signal ASIC's and Discrete Systems  |    manus    |
| Phoenix, Arizona  85048    Skype: Contacts Only  |             |
| Voice:(480)460-2350  Fax: Available upon request |  Brass Rat  |
| E-mail Icon at http://www.analog-innovations.com |    1962     |
             
         America: Land of the Free, Because of the Brave
Reply to
Jim Thompson

Sure, I do all my EM simulations in single precision because of the memory, more than the speed (which nowadays is usually quite comparable--transistors being free).

Polynomials are surprisingly bad if you evaluate them from the coefficients of powers of x, because of the creeping linear dependence between your basis functions. If you use Chebyshev polynomials instead, especially with Clenshaw's summation rule, it's much, much better conditioned and almost as fast as Horner's rule for powers of x. [You already use Horner's rule; it's P(x) = a + x*(b + x*(c + x*(...))).]
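
A sketch of what that looks like (the names are mine): Clenshaw's recurrence for a Chebyshev series f(x) = sum c[k]*T_k(x) on [-1, 1] costs the same multiply-adds as Horner, just on a far better-conditioned basis.

double clenshaw(const double *c, int n, double x)
{
    double b1 = 0.0, b2 = 0.0;          /* b[k+1], b[k+2] */
    for (int k = n; k >= 1; k--) {
        double b0 = c[k] + 2.0 * x * b1 - b2;
        b2 = b1;
        b1 = b0;
    }
    return c[0] + x * b1 - b2;          /* final step uses x, not 2x */
}

/* Horner's rule on ordinary power-series coefficients, for comparison:
   p(x) = a[0] + x*(a[1] + x*(a[2] + ...)) */
double horner(const double *a, int n, double x)
{
    double p = a[n];
    for (int k = n - 1; k >= 0; k--)
        p = a[k] + x * p;
    return p;
}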

Cheers,

Phil Hobbs

Reply to
Phil Hobbs

This is an excellent book for implementing many functions that need to be done on small (or otherwise resource-limited) devices:

formatting link
In many cases he has surprisingly clever ways of implementing functions that are smaller and faster than the "straightforward" implementation would be.

Reply to
Joel Koltner

Rational polynomials are nowhere near as bad. Are you familiar with Bjørn Gustavsen's "Vector Fitting" work?

formatting link
He has Matlab code on his web site, and the algorithm itself has shown up in commercial software, particularly packages aimed at fitting measured S-parameters. I believe that both AWR (Microwave Office) and Cascade Microtech were using it at some point, and it has surely found its way into other packages as well.

---Joel

Reply to
Joel Koltner

Yup, anything other than Horner's would be insane.

Best regards, Spehro Pefhany

--
"it's the network..."                          "The Journey is the reward"
speff@interlog.com             Info for manufacturers: http://www.trexon.com
Embedded software/hardware/analog  Info for designers:  http://www.speff.com
Reply to
Spehro Pefhany

Not wanting to get into the FPU yet, I wrote a sine/cosine table for my various assembly programming explorations: 0 to 359 degrees (plus the same range for cosine), signed integer, so it needs a shift after the multiply to get nearly correct values (SIN(angle) * 65534 is, what, 20 ppm shy of correct).

I generated it with QBasic, with some string processing to put in the spaces and "word" and lowercase hex conventions that I prefer. It compiles in MASM, probably among others.

Sine
    word 00h, 23ch, 478h, 6b3h, 8eeh, 0b28h, 0d61h, 0f99h, 11d0h
    word 1406h, 163ah, 186ch, 1a9dh, 1ccbh, 1ef7h, 2121h, 2348h, 256ch
    word 278eh, 29ach, 2bc7h, 2ddfh, 2ff3h, 3203h, 3410h, 3618h, 381ch
    word 3a1ch, 3c17h, 3e0eh, 3fffh, 41ech, 43d4h, 45b6h, 4793h, 496ah
    word 4b3ch, 4d08h, 4ecdh, 508dh, 5246h, 53f9h, 55a5h, 574bh, 58eah
    word 5a82h, 5c13h, 5d9ch, 5f1fh, 609ah, 620dh, 6379h, 64ddh, 6639h
    word 678dh, 68d9h, 6a1dh, 6b59h, 6c8ch, 6db7h, 6ed9h, 6ff3h, 7104h
    word 720ch, 730bh, 7401h, 74eeh, 75d2h, 76adh, 777fh, 7847h, 7906h
    word 79bbh, 7a67h, 7b0ah, 7ba2h, 7c32h, 7cb7h, 7d33h, 7da5h, 7e0dh
    word 7e6ch, 7ec0h, 7f0bh, 7f4bh, 7f82h, 7fafh, 7fd2h, 7febh, 7ffah
Cosine
    word 7fffh, 7ffah, 7febh, 7fd2h, 7fafh, 7f82h, 7f4bh, 7f0bh, 7ec0h
    word 7e6ch, 7e0dh, 7da5h, 7d33h, 7cb7h, 7c32h, 7ba2h, 7b0ah, 7a67h
    word 79bbh, 7906h, 7847h, 777fh, 76adh, 75d2h, 74eeh, 7401h, 730bh
    word 720ch, 7104h, 6ff3h, 6ed9h, 6db7h, 6c8ch, 6b59h, 6a1dh, 68d9h
(And so on . . .)

It makes a reasonable circle, which is good enough for me.

Generating tables in HLL is pretty typical business.
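
In that spirit, here is a C stand-in for such a generator (hypothetical, and it uses round-to-nearest, so it will not necessarily reproduce the exact values above) that prints the table in the "word ...h" style shown:

#include <stdio.h>
#include <math.h>

int main(void)
{
    const double pi = 3.14159265358979323846;

    for (int i = 0; i < 90; i++) {        /* first 90 degrees, as in the excerpt */
        long v = lround(32767.0 * sin(2.0 * pi * i / 360.0));
        char hex[16];
        snprintf(hex, sizeof hex, "%lxh", v);

        if (i % 9 == 0)
            printf("%sword ", i ? "\n" : "");   /* nine entries per line */
        else
            printf(", ");
        /* the assembler wants a leading 0 when the constant starts with a letter */
        printf("%s%s", (hex[0] > '9') ? "0" : "", hex);
    }
    printf("\n");
    return 0;
}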

Tim

--
Deep Friar: a very philosophical monk.
Website: http://webpages.charter.net/dawill/tmoranwms
Reply to
Tim Williams


A list of 256 values, each value 8 bits: a 256-point approximation of one complete sine cycle, rather than the accurate, 16-bit, quarter-cycle table shown by Vladimir.

Reply to
john

Mine looks like this, generated by PowerBasic, in decimal...

.SBTTL . SINEWAVE TABLE

; 2048 WORDS, HALF A SINE

SINTAB:

.NLIST

.WORD 0, 50, 101, 151, 201, 251, 302, 352
.WORD 402, 452, 503, 553, 603, 653, 704, 754
.WORD 804, 854, 905, 955, 1005, 1055, 1106, 1156
.WORD 1206, 1256, 1307, 1357, 1407, 1457, 1507, 1558
.WORD 1608, 1658, 1708, 1758, 1809, 1859, 1909, 1959
.WORD 2009, 2059, 2110, 2160, 2210, 2260, 2310, 2360
.WORD 2410, 2461, 2511, 2561, 2611, 2661, 2711, 2761
.WORD 2811, 2861, 2911, 2962, 3012, 3062, 3112, 3162
.WORD 3212, 3262, 3312, 3362, 3412, 3462, 3512, 3562
.WORD 3612, 3662, 3712, 3761, 3811, 3861, 3911, 3961
.WORD 4011, 4061, 4111, 4161, 4210, 4260, 4310, 4360
.WORD 4410, 4460, 4509, 4559, 4609, 4659, 4708, 4758
.WORD 4808, 4858, 4907, 4957, 5007, 5056, 5106, 5156

John

Reply to
John Larkin

Has anyone implemented (in hardware) Don Lancaster's Magic Sinewaves scheme?

...Jim Thompson

--
| James E.Thompson, P.E.                           |    mens     |
| Analog Innovations, Inc.                         |     et      |
| Analog/Mixed-Signal ASIC's and Discrete Systems  |    manus    |
| Phoenix, Arizona  85048    Skype: Contacts Only  |             |
| Voice:(480)460-2350  Fax: Available upon request |  Brass Rat  |
| E-mail Icon at http://www.analog-innovations.com |    1962     |
             
         America: Land of the Free, Because of the Brave
Reply to
Jim Thompson

Why not just delta-sigma?
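
For comparison, a first-order delta-sigma modulator is only a few lines. This sketch (my illustration, not a description of anyone's hardware) turns Q15 samples from a table like the ones above into a one-bit stream whose running average tracks the sine; the trade against something like Magic Sinewaves, as I understand it, is shaped wideband quantization noise versus a handful of exactly-nulled low harmonics.

#include <stdint.h>

/* one modulator step: integrate the error between the input and the
   fed-back full-scale output; the integrator's sign picks the next bit */
int ds_step(int16_t x, int32_t *integ)
{
    *integ += x;
    if (*integ >= 0) {
        *integ -= 32767;    /* output 1, feed back +full scale */
        return 1;
    }
    *integ += 32767;        /* output 0, feed back -full scale */
    return 0;
}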

John

Reply to
John Larkin

It depends on how you evaluate them--from coefficients or from a nice stable recurrence. Dividing one polynomial by another can put off the evil day by reducing the required order of numerator and denominator, but it doesn't affect the underlying ill-conditioning of the powers-of-x basis set.

Cheers

Phil Hobbs

Reply to
Phil Hobbs

Don claims some phenomenally low distortion.

...Jim Thompson

--
| James E.Thompson, P.E.                           |    mens     |
| Analog Innovations, Inc.                         |     et      |
| Analog/Mixed-Signal ASIC's and Discrete Systems  |    manus    |
| Phoenix, Arizona  85048    Skype: Contacts Only  |             |
| Voice:(480)460-2350  Fax: Available upon request |  Brass Rat  |
| E-mail Icon at http://www.analog-innovations.com |    1962     |
             
         America: Land of the Free, Because of the Brave
Reply to
Jim Thompson


Most of what is puzzling you is a lack of some fundamentals, from never having had to wade through both integer and floating point at the gate level, like I did. As always, for both integer and floating point, divide is the slowest. For multiply, Wallace trees can provide great acceleration; they just do not help all that much for divide. Perhaps you should start with the fundamental theory (gate and algorithm level) of computer arithmetic; it is not all that difficult. Then proceed to Booth's algorithm(s) and Wallace trees. Do integer first, then learn barrel shifters. They are crucial to decent FP add/subtract performance. At that point everything is just extensions and combinations of what you already know.
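
Since Booth's algorithm keeps coming up only by name, here is a small software illustration (my sketch, not anything from the thread) of what radix-2 Booth recoding does: a run of consecutive 1 bits in the multiplier costs one subtraction at its start and one addition just past its end. Hardware versions recode more bits per step and sum the partial products in a Wallace tree instead of a serial loop.

#include <stdint.h>

int32_t booth_mul16(int16_t a, int16_t b)
{
    uint32_t acc = 0;
    uint32_t m   = (uint32_t)(int32_t)a;    /* sign-extended multiplicand */
    int prev = 0;                           /* imaginary bit to the right of bit 0 */

    for (int i = 0; i < 16; i++) {
        int cur = (b >> i) & 1;
        if (cur == 1 && prev == 0) acc -= m << i;   /* start of a run of 1s */
        if (cur == 0 && prev == 1) acc += m << i;   /* end of a run of 1s */
        prev = cur;
    }
    /* a run reaching the sign bit is never "closed", which is exactly the
       negative weight of bit 15 in two's complement */
    return (int32_t)acc;                    /* equals (int32_t)a * b */
}

booth_mul16(-1234, 567) comes out the same as -1234 * 567.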

Reply to
JosephKK

Actually, both the 8087 and its successors and the 68881 and its successors, as well as later MIPS, SPARC, HP PA-RISC, and IBM POWER processors, all do correct and complete IEEE 754 floating point. Most of them cause interrupts on each denormal result.

Now, that is an implementation-efficiency issue.

For the x86 series the FPU was brought on-chip with the 486, and in the 68k series with the 68040. I think it was incorporated in MIPS with the R4000 series and later, and in the first SPARCs. Superscalar came later.

Correct description of denormals.

Reply to
JosephKK

Do you know of any processor that handles denormals in hardware, other than by flushing to zero? I haven't found one, and I don't think any exists. That's what I mean by "no hardware support".
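
A quick way to see what a given machine and compiler actually do (just a sketch; the answer also depends on FTZ/DAZ-style control bits and options like -ffast-math):

#include <stdio.h>
#include <float.h>

int main(void)
{
    volatile float x = FLT_MIN;     /* smallest normalized single */
    volatile float y = x / 2.0f;    /* denormal if supported, 0 if flushed */

    if (y > 0.0f)
        printf("denormals survive: %g\n", (double)y);
    else
        printf("flushed to zero\n");
    return 0;
}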

Cheers,

Phil Hobbs

Reply to
Phil Hobbs
