Unusual Floating-Point Format Remembered?

An extension to the IEEE 754 floating-point standard, currently being voted on by the IEEE, will include the definition of a decimal floating-point format.

This format has some interesting features. Decimal digits are usually compressed using the Densely Packed Decimal encoding developed by Mike Cowlishaw at IBM on the basis of the earlier Chen-Ho encoding. The three formats offered all provide a number of decimal digits of precision equal to 3n+1 for some n, so a five-bit field combines the one odd digit with the most significant portion of the exponent (for an exponent range of 3*2^k values, for some integer k) to keep the coding efficient. The format's most unusual, and potentially controversial, feature is the use it makes of unnormalized values.
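
To make the packing concrete, here is a minimal Python sketch of how such
a five-bit combination field unpacks. This follows my reading of the
draft's decimal interchange layout; the function and return-value naming
is my own:

def decode_combination(g):
    """Decode a 5-bit combination field (g is an integer, 0..31).

    Returns ('finite', two MSBs of the exponent, leading digit), or
    ('infinity',) / ('nan',) for the two reserved bit patterns.
    """
    g4, g3, g2, g1, g0 = ((g >> s) & 1 for s in (4, 3, 2, 1, 0))
    if (g4, g3) != (1, 1):
        # Leading digit 0..7 fits in three bits; exponent MSBs are g4 g3.
        return ("finite", g4 * 2 + g3, g2 * 4 + g1 * 2 + g0)
    if (g2, g1) != (1, 1):
        # Leading digit 8 or 9 needs one bit; exponent MSBs move to g2 g1.
        return ("finite", g2 * 2 + g1, 8 + g0)
    return ("infinity",) if g0 == 0 else ("nan",)

# 30 of the 32 codes are finite - 3 exponent-MSB values times 10 digits -
# which is why an exponent range of 3*2^k values packs with no waste.
assert sum(decode_combination(g)[0] == "finite" for g in range(32)) == 30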

One possible objection to a decimal floating-point format, if it were used in general for all the purposes for which floating-point is used, is that the precision of numbers in such a format can vary by a whole decimal digit, which is more than three times as large as a binary bit, the amount by which the precision varies in a binary format with a radix-2 exponent. (There were binary formats with radix-16 exponents, on the IBM System/360 and the Telefunken TR 440, and these were found objectionable, although the radix-8 exponents of the Atlas and the Burroughs B5500 were found tolerable.)

I devised a scheme, described along with the proposed new formats, on

formatting link

by which a special four-bit field would describe both the most significant digit of a decimal significand (or coefficient, or fraction, or, horrors, mantissa) and a least significant digit which would be restricted in the values which it could take.

MSD 1:       appended LSD can be 0, 2, 4, 6, or 8.
MSD 2 or 3:  appended LSD can be 0 or 5.
MSD 4 to 9:  appended LSD is always 0.

In this way, the distance between representable values increases in steps of factors of 2 or 2.5 instead of factors of 10, making decimal floating-point as "nice" as binary floating-point.
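
A quick way to see those steps is to enumerate a toy version of the
scheme. This little Python sketch is purely illustrative (mine, with one
ordinary digit between the MSD and the appended LSD):

def coefficients():
    """Representable 3-digit coefficients: MSD, one ordinary digit, and
    the restricted appended LSD from the list above."""
    for msd in range(1, 10):
        if msd == 1:
            lsds = (0, 2, 4, 6, 8)
        elif msd in (2, 3):
            lsds = (0, 5)
        else:
            lsds = (0,)
        for mid in range(10):
            for lsd in lsds:
                yield msd * 100 + mid * 10 + lsd

vals = list(coefficients())
gaps = sorted({b - a for a, b in zip(vals, vals[1:])})
print(gaps)   # [2, 5, 10]: spacing grows by 2.5x, then 2x, then 2x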

As another bonus, when you compare the precision of the field to its length in bits, you discover that I have managed to achieve the same benefit for decimal floating point as was obtained for binary floating-point by hiding the first bit of a normalized significand!

Well, I went on from there.

If one can, by this contrivance, make the exponent move in steps of 1/3rd of a digit instead of whole digits, why not try to make the exponent move in steps of about 1/3 of a bit, or 1/10 of a digit?

And so, on the next page,

formatting link

I note that if instead of appending one digit restricted to being either even or a multiple of five, I append values in a *six-digit* field, I can let the distance between representable points increase by gentle factors of 1.25 or 1.28.

But such a format is rather complicated. I go on to discuss using an even smaller factor with sexagesimal floating point... in a format more suited to an announcement *tomorrow*.

But I also mention how this scheme could be _simplified_ to a minimum.

Let's consider normalized binary floating-point numbers.

The first two bits of the significand (or mantissa) might be 10 or 11.

In the former case, let's append a fraction to the end of the significand that might be 0, 1/3 or 2/3... except, so that we can stay in binary, we'll go with either 5/16 or 11/16.

In the latter case, the choice is 0 or 1/2.

Then, the coding scheme effectively makes the precision of the number move in small jumps, as if the exponent were in units of *half* a bit instead of whole bits.
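
Here is a small Python sketch checking this (my own construction, with
three explicit significand bits): the mean gap between representable
values grows by alternating factors of about 1.5 and 1.33 - each roughly
the square root of 2, i.e. half a bit - instead of doubling at each
octave:

from fractions import Fraction

def region(lo, hi, ulp, fracs):
    """Representable points in [lo, hi): explicit significands one ulp
    apart, each with the appended sub-ulp fractions in fracs."""
    pts, sig = [], lo
    while sig < hi:
        pts.extend(sig + f * ulp for f in fracs)
        sig += ulp
    return pts

THIRDS = (Fraction(0), Fraction(5, 16), Fraction(11, 16))  # leading '10'
HALVES = (Fraction(0), Fraction(1, 2))                     # leading '11'

prev = None
for e in range(3):                       # three octaves, 3 explicit bits
    ulp = Fraction(1, 8) * 2 ** e
    mid = Fraction(3, 2) * 2 ** e
    for lo, hi, fr in ((2 ** e, mid, THIRDS), (mid, 2 ** (e + 1), HALVES)):
        avg = Fraction(hi - lo) / len(region(lo, hi, ulp, fr))
        if prev is not None:
            print(f"[{float(lo):5.2f}, {float(hi):5.2f}): mean gap "
                  f"{float(avg):.4f}, up x{float(avg / prev):.2f}")
        prev = avg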

But now a nagging feeling haunts me.

This sounds vaguely familiar - as if, instead of *inventing* this scheme, bizarre though it may sound to many, I just *remembered* it, say from the pages of an old issue of Electronics or Electronics Design magazine.

Does anyone here remember what I'm thinking of?

John Savard

Reply to
Quadibloc

Not at all sure why you keep writing this, John -- that's the way almost all existing decimal floating-point arithmetic works, and has done for several hundred years. It was, perhaps, controversial around 1202 in Europe (e.g., Liber Abaci, by Fibonacci) -- but since 1500 or so, that's the way arithmetic has been done, even in the late-adopting countries of Europe. :-)

mfc

Reply to
Mike Cowlishaw

LOL!

People designing floating-point units for computers, however, have somewhat different tastes from those using pencil and paper. As a result, while they never even considered using Roman numerals for that purpose, other aspects of how decimal arithmetic is done by hand have tended, as a general practice, to be modified in the floating-point units of computers.

John Savard

Reply to
Quadibloc

In article , "Quadibloc" writes: |>

|> One possible objection to a decimal floating-point format, if it were |> used in general for all the purposes for which floating-point is used, |> is that the precision of numbers in such a format can vary by a whole |> decimal digit, which is more than three times as large as a binary |> bit, the amount by which the precision varies in a binary format with |> a radix-2 exponent. (There were binary formats with radix-16 |> exponents, on the IBM System/360 and the Telefunken TR 440, and these |> were found objectionable, although the radix-8 exponents of the Atlas |> and the Burroughs B5500 were found tolerable.)

As I pointed out, this is at most a storm in a teacup. The reason that those formats were found objectionable was the truncation, and had nothing to do with the base. There is no critical reason not to use a base of 10 as against a base of 2. And I speak as someone with practical experience of writing and porting base-independent numerical code.

My arguments with Mike have nothing to do with saying that decimal is harmful IN ITSELF. They are that the way IEEE 754R decimal is being spun is misleading and often incorrect, and the consequences of this are harmful, verging on the catastrophic.

Regards, Nick Maclaren.

Reply to
Nick Maclaren

As far as I know, 'IEEE 754r decimal' is not being "spun", at all -- in fact people opposed to it seem to be far more noisy about it than those who are quietly using it.

Reply to
Mike Cowlishaw

In most cases, the truncation is a _consequence_ of the base.

I *have* come up with a way in which one could have a base-10 format whose truncation is (almost) as gentle as that of a binary floating-point format, though, so I know that an exception is possible.

I do not know if any controversy has, in fact, raged around the proposed DFP format, but I suspect that if there are people who find the format off-putting, most of them will be exercised not by subtle differences in numerical properties. Instead, when they notice the free use of unnormalized values - and then see the "ideal exponent" rules - they will realize that this isn't their grandfather's floating-point, and that something rather novel is being done here.

Some, seeing this bold and novel idea, will stand up and applaud. To judge by the z9 implementation, they will not feel threatened by it - if the new format runs much slower than binary floating-point, clearly, it isn't intended as a replacement, and so if it has poorer numeric properties for classic floating-point applications, that isn't really an issue. It's there to take computers into new applications.

But others will see this as just the beginning - and, certainly, if the transistors are worth it, hardware implementations of DFP having near-parity in speed with binary floating-point are possible. If DFP can do everything BFP can do, and then some, why not? And if that is an eventual possibility, DFP ought to be put under intense scrutiny right now, before it's too late!

Well, if there _is_ controversy, my humble contribution might just save DFP. Because what I worked out is a DFP that looks "just like" binary floating point. The storage efficiency is the same as BFP - I managed to simulate hiding the first bit! The truncation changes in steps nearly as small as those of BFP - good enough to answer your specific example, for which the threshold is radix-4.

The same hardware could handle both my trivial modification of the DFP proposal *and* the one proposed as the standard. So there would simply be a choice between format-centric DFP and numeric-centric DFP, and it would be safe to drop binary floating point, because numeric-centric DFP would preserve those of its good points perhaps not fully reflected in the current DFP proposal.

John Savard

Reply to
Quadibloc

In article , "Mike Cowlishaw" writes: |> > My arguments with Mike have nothing to do with saying that decimal |> > is harmful IN ITSELF. They are that the way IEEE 754R decimal is |> > being spun is misleading and often incorrect, and the consequences |> > of this are harmful, verging on the catastrophic. |> |> As far as I know, 'IEEE 754r decimal' is not being "spun", at all -- in fact |> people opposed to it seem to be far more noisy about it than those who are |> quietly using it.

I have tried to give you credit for being honest and fair, but it is getting increasingly hard. Your Web pages and postings spin the advantages of decimal, and so do a hell of a lot of things taken from them - the Python Decimal class documentation, for example.

All I have ever done is to point out the misleading claims, both ones of fact and ones of presentation, and the inconsistencies; and to point out that, in the opinion of all the people with extensive numerical and pedagogical experience that I know of, the value judgements used to justify decimal are mistaken.

To point out one egregious inconsistency, you have claimed that the use of 128-bit decimal for fixed-point calculations will be universal, that the cost of implementing decimal is only a little more than that of binary and that decimal is intended to replace binary. I can't recall whether you (or others) have also said that decimal is NOT intended to replace binary, but be in addition, in order to say that my points are irrelevant - but it has been said by its proponents!

Those CANNOT all be true at once.

1) The cost of implementing decimal is slightly more than the cost of implementing binary only in the case of the VERY heavyweight floating-point units produced by IBM. I don't know its cost in the other ones, but would expect it to be at least twice as much in the stripped-down ones favoured for embedded work. And that is not just development and sales cost, but power consumption for a given performance.

2) Where floating-point is a major factor, the use of 128-bit arithmetic doubles the cost for a given performance (or conversely), as the current bottleneck is memory access. Twice as much memory with twice the bandwidth costs twice as much and needs twice the power. You simply can't get round that.

3) If you don't use 128-bit arithmetic, there WILL be overflow problems with fixed-point. You can't enable trapping, because the exception that needs to be trapped MUST be ignored when you are using the arithmetic for floating-point. I won't go into the errors made by the IEEE 754(R) people about why language support is effectively absent, but that isn't going to change.

Regards, Nick Maclaren.

Reply to
Nick Maclaren

In article , snipped-for-privacy@cus.cam.ac.uk (Nick Maclaren) writes:
|>
|> All I have ever done is to point out the misleading claims, both ones
|> of fact and ones of presentation, and the inconsistencies; and to point
|> out that, in the opinion of all the people with extensive numerical and
|> pedagogical experience that I know of, the value judgements used to
|> justify decimal are mistaken.

Correction. Except for Professor Kahan.

I will repeat, ad tedium, that the universal opinion is that there is essentially no difference between binary and decimal floating-point for numerical work. The former is marginally better for 'serious' work, on many grounds, and the latter marginally more convenient for 'trivial' work.

Regards, Nick Maclaren.

Reply to
Nick Maclaren

I think you must be using the word 'spin' in a different way than I do :-). I point out facts, but that's not 'spin'. And I don't think I refer to the 'advantages' of decimal arithmetic anywhere...

I actually expected the 64-bit format to be the more popular. It was a surprise to me that people are standardizing on the 128-bit format. On the costs, the components needed are about 15%-20% more than binary, according to most references. That is actually quite significant when multiplied over a large number of cores.

I have never said that decimal is intended to replace binary. If your data are in binary, why convert to decimal?

I cannot imagine how it could be twice; a decimal adder, for example, is only slightly more costly than a binary one (it has to carry at 9 rather than 15). Multipliers are more complex, to be sure (but lots of people are doing new research on that). On the other hand, zeros and subnormal numbers are somewhat easier to handle in the 754r decimal formats because they are unnormalized and so zeros and subnormals are not a special case. And, of course, an all-decimal computer wouldn't need a binary integer unit :-).
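
As a digit-at-a-time sketch of that point (an illustration of mine, not
anyone's actual hardware), the entire difference is the carry-out test
and a +6 correction:

def bcd_add(x, y):
    """Add two BCD-encoded unsigned integers, one 4-bit digit at a time."""
    result, shift, carry = 0, 0, 0
    while x or y or carry:
        d = (x & 0xF) + (y & 0xF) + carry
        carry = 1 if d > 9 else 0     # decimal carries at 9, binary at 15
        if carry:
            d += 6                    # skip the six unused codes 10..15
        result |= (d & 0xF) << shift
        shift += 4
        x >>= 4
        y >>= 4
    return result

assert bcd_add(0x1999, 0x0001) == 0x2000    # 1999 + 1 = 2000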

Of course. The same is true of 128-bit ('quad') binary arithmetic, too. I rather assume most binary FP processing will migrate to that size, too, just as it migrated from single to double when the latter became widely available.

We are somewhat agreed on that :-). However, one can detect possible rounding even at the smaller sizes (with the effective loss of a digit of precision) by checking the MSD (which is conveniently in the top byte of the encoding). If it is still zero after a multiplication, for example, there was no rounding (except possibly for subnormal results, but when doing fixed-point arithmetic one is unlikely to be anywhere near the extremes of exponent range).
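
In software terms the check looks like this - a sketch of mine using
Python's Decimal class (which follows the same arithmetic), not anyone's
production code, assuming the 16-digit decimal64 precision:

from decimal import Decimal, getcontext

getcontext().prec = 16               # decimal64: 16 coefficient digits

def multiply_checking_msd(a, b):
    """Multiply, then report whether the result is certainly unrounded:
    if it occupies fewer than 16 digits, the top digit of the full
    16-digit coefficient is still zero, so no rounding occurred."""
    r = a * b
    return r, len(r.as_tuple().digits) < getcontext().prec

r, surely_exact = multiply_checking_msd(Decimal("12345678.90"),
                                        Decimal("4242.42"))
print(r, surely_exact)               # 15 digits used -> True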

mfc

Reply to
Mike Cowlishaw

In article , "Mike Cowlishaw" writes: |> > I have tried to give you credit for being honest and fair, but it is |> > getting increasingly hard. Your Web pages spin and postings spin the |> > advantages of decimal, and so do a hell of a lot of things taken from |> > them - the Python Decimal class documentation, for example. |> |> I think you must be using the word 'spin' in a different way than I do :-). |> I point out facts, but that's not 'spin'. And I don't think I refer to the |> 'advantages' of decimal arithmetic anywhere...

You do, frequently, which is reasonable. What you don't do is give fair weight to the disadvantages and, more importantly, you phrase your claims in ways in which the naive will assume that the advantages are more than they are.

|> I have never said that decimal is intended to replace binary. If your data
|> are in binary, why convert to decimal?

formatting link
Mike Cowlishaw wrote: ...

Yes, there was a smiley. But I remember similar remarks without them, perhaps in Email. And IEEE 754R makes decimal an ALTERNATIVE to binary, despite the claims that it is supposed to be an addition.

|> I cannot imagine how it could be twice; a decimal adder, for example, is
|> only slightly more costly than a binary one (it has to carry at 9 rather
|> than 15). Multipliers are more complex, to be sure (but lots of people are
|> doing new research on that). On the other hand, zeros and subnormal numbers
|> are somewhat easier to handle in the 754r decimal formats because they are
|> unnormalized and so zeros and subnormals are not a special case. And, of
|> course, an all-decimal computer wouldn't need a binary integer unit :-).

You can use the same multiplier and divider as for integers, and those units are MUCH larger than the adder; conversion to and from integers is vastly easier, and so on. The IEEE 754R format also forces the support of unnormalised numbers, which adds extra complexity even for comparison.

For reasons that you know perfectly well, there isn't the chance of a flea in a furnace of integers becoming decimal during either of our working lifetimes, or a couple of decades beyond.

|> Of course. The same is true of 128-bit ('quad') binary arithmetic, too. I
|> rather assume most binary FP processing will migrate to that size, too,
|> just as it migrated from single to double when the latter became widely
|> available.

Well, Kahan and others disagree, on very solid grounds. It may happen, but not in the foreseeable future.

|> We are somewhat agreed on that :-). However, one can detect possible
|> rounding even at the smaller sizes (with the effective loss of a digit of
|> precision) by checking the MSD (which is conveniently in the top byte of
|> the encoding). If it is still zero after a multiplication, for example,
|> there was no rounding (except possibly for subnormal results, but when
|> doing fixed-point arithmetic one is unlikely to be anywhere near the
|> extremes of exponent range).

If programming in assembler, yes, you can. You can ALMOST do that when using a high-level language as a pseudo-assembler, but I know of nobody skilled enough to achieve it for most important languages. It can be done for Java, for example, but not Fortran or the C-derived languages.

Regards, Nick Maclaren.

Reply to
Nick Maclaren

Were it not for the support of unnormalized numbers, the "ideal exponent" rules would not be possible, and without those rules, indeed, there would be "no difference" between decimal and binary, and less reason to add decimal.

In the event that chips start using real estate for really fast implementations of decimal floating-point, I suppose that a new architecture *might* be designed that tries to save transistors by dropping binary, on the theory that decimal can do anything binary can do, but more understandably to mere mortals. Since, as you've noted, that isn't _quite_ true in a few cases, I'm happy that I've shown how decimal can be made more like binary - should this day ever dawn.

It may also be noted that the use of DPD (as opposed to packed BCD) for the decimal format *already* makes the comparison of decimal floating-point numbers rather complex.

John Savard

Reply to
Quadibloc

OTOH, universal HW support for the 128-bit decimal fp would make it the default choice for all those algorithms where you suspect binary 64-bit (double) might not be good enough.

Personally, I would _much_ rather have 128-bit binary fp, but when faced with the choice between sw-emulated 128-bit binary and hw-implemented decimal, I would at the very least like to run my possibly marginal 64-bit algorithms a few times in decimal and compare the results. :-)

Terje

--
- 
"almost all programming can be viewed as an exercise in caching"
Reply to
Terje Mathisen

Why can't you do so in any language that allows operator overloading, i.e. C++?

Replace the stock operations with your own that checks all results?

Or is the problem that you cannot do it while maintaining useful performance?

Terje

--
- 
"almost all programming can be viewed as an exercise in caching"
Reply to
Terje Mathisen

Not really:

Any hw-supported DFP implementation will have to have a parallel chunk of gates that converts between 10-bit DPD and 12-bit BCD format.

Even if your adders and multipliers could accept DPD directly, you would still need the pack/unpack unit to handle scaling by 1 or 2 (mod 3) digits.

The DPD-to-BCD conversion is effectively a 9-bit lookup table with 11-bit results (the parity bit is a passthrough), so each instance of this table requires 512*11 = 5632 bits. Going the other way you need an 11-bit index into a table returning 9-bit results, so that is 2048*9 = 18432 bits.

Having 4 such tables and pipelining the accesses would handle conversion in a very small number of cycles, but it might be faster to create explicit logic to handle the conversions and save on lookup table space.

The obvious way to implement DFP would be to convert between DPD and BCD during all load/store operations, and expand the internal registers to work directly in BCD.

I.e. I'm not particularly worried about the conversion overheads.
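
For concreteness, here is a Python sketch of those tables, built from
Cowlishaw's published DPD mapping (my transcription; it generates only
the 1000 canonical declets, not the 24 redundant alternative encodings):

def bcd_to_dpd(d2, d1, d0):
    """Encode three decimal digits (0..9 each) as a 10-bit DPD declet."""
    a, b, c, d = [(d2 >> s) & 1 for s in (3, 2, 1, 0)]
    e, f, g, h = [(d1 >> s) & 1 for s in (3, 2, 1, 0)]
    i, j, k, m = [(d0 >> s) & 1 for s in (3, 2, 1, 0)]
    case = (a, e, i)                  # which digits are 8 or 9
    if   case == (0, 0, 0): bits = (b, c, d, f, g, h, 0, j, k, m)
    elif case == (0, 0, 1): bits = (b, c, d, f, g, h, 1, 0, 0, m)
    elif case == (0, 1, 0): bits = (b, c, d, j, k, h, 1, 0, 1, m)
    elif case == (1, 0, 0): bits = (j, k, d, f, g, h, 1, 1, 0, m)
    elif case == (1, 1, 0): bits = (j, k, d, 0, 0, h, 1, 1, 1, m)
    elif case == (1, 0, 1): bits = (f, g, d, 0, 1, h, 1, 1, 1, m)
    elif case == (0, 1, 1): bits = (b, c, d, 1, 0, h, 1, 1, 1, m)
    else:                   bits = (0, 0, d, 1, 1, h, 1, 1, 1, m)
    return sum(bit << (9 - n) for n, bit in enumerate(bits))

# Build both tables by brute force and check the parity observation.
BCD_TO_DPD = {(d2, d1, d0): bcd_to_dpd(d2, d1, d0)
              for d2 in range(10) for d1 in range(10) for d0 in range(10)}
DPD_TO_BCD = {v: k for k, v in BCD_TO_DPD.items()}

assert len(DPD_TO_BCD) == 1000       # all canonical declets are distinct
assert all(dpd & 1 == d0 & 1         # LSB passes through: 9-bit index
           for dpd, (_, _, d0) in DPD_TO_BCD.items())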

Terje

--
- 
"almost all programming can be viewed as an exercise in caching"
Reply to
Terje Mathisen

...

Do any of you involved in this discussion imagine that it has relevance to embedded, low-power systems? If not, please don't cross post to comp.dsp any more. I think we get the idea here, and it's interesting in general, but not in the details.

...

Jerry

--
Engineering is the art of making what you want from things you can get.
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
Reply to
Jerry Avins

In article , Terje Mathisen writes:
|> > If programming in assembler, yes, you can. You can ALMOST do that when
|> > using a high-level language as a pseudo-assembler, but I know of nobody
|> > skilled enough to achieve it for most important languages. It can be
|> > done for Java, for example, but not Fortran or the C-derived languages.
|>
|> Why can't you do so in any language that allows operator overloading,
|> i.e. C++?
|>
|> Replace the stock operations with your own that checks all results?

Because the problem is not with the operations - it is with the overall model, especially the memory and state ordering aspects. In C-derived languages, that is so ambiguous that nobody knows what the standard specifies; at least in Fortran, it is just plain unspecified.

To take a simple (!) example, setting an exception flag (and such flags are essential to any IEEE 754 model) is a side-effect, and therefore should not occur twice between sequence points (whose ordering is itself inscrutable beyond the understanding of mere mortals). The attempt to specify an exemption for them foundered before it got anywhere.

|> Or is the problem that you cannot do it while maintaining useful |> performance?

That is the second problem, and is why Java is such a dead duck where performance matters.

Regards, Nick Maclaren.

Reply to
Nick Maclaren

In article , Jerry Avins writes:
|> Terje Mathisen wrote:
|>
|> > Any hw-supported DFP implementation will have to have a parallel chunk
|> > of gates that converts between 10-bit DPD and 12-bit BCD format.
|>
|> Do any of you involved in this discussion imagine that it has relevance
|> to embedded, low-power systems? If not, please don't cross post to
|> comp.dsp any more. I think we get the idea here, and it's interesting in
|> general, but not in the details.

Regrettably, it does - and I have spent quite a lot of the past decade trying to argue the embedded, low-power corner to various people. I am handicapped in that by knowing only a little bit more about that area than the people I was talking to :-(

The point here is that the commoditisation of CPUs is increasingly leading to 'standard' ISAs being used in the embedded market, and an increasing amount of software that runs on such things is developed at least initially on 'general purpose' CPUs. That is why it is such a pity that the embedded, low-power people have not been more active in the languages and related standards activities.

Think to yourself. If you were told to redesign your software or firmware for a system with 128-bit decimal as the only hardware floating-point, while still maintaining performance and power, how would you do it? And it isn't implausible that you will be faced with that in a decade's time.

Regards, Nick Maclaren.

Reply to
Nick Maclaren

In software, on a Pentium M x86, the lookup (a 2-byte x 1024 table) is so fast (LEA) that I have not been able to reliably measure any difference between code using the lookup and code not using it (i.e., < 1%). In hardware it's 3 FO4 gate delays (or 2, if the inputs have complementary outputs, as is usually the case for CMOS). Whether the latter bumps one into a new cycle or not depends on the design (some processors have a 13 FO4 cycle, so 3 of those is significant; others have a much wider cycle).

'Me too'

mfc

Reply to
Mike Cowlishaw

Quadibloc wrote: (snip)

I don't believe this should be past tense. Machines still in production support IBM S/360 HFP, and I believe compilers still generate code for it. Linux/390 and its associated compilers use BFP, I believe, but most of the MVS and VM side still uses HFP.

Note, though, that if the results are printed in decimal, as they usually are, the variable precision effect comes out again. The advantage of DFP is that the precision change in internal format matches the precision change in external (human readable) form, where it doesn't in BFP or HFP. 32 bit DFP can reliably hold seven printable decimal digits, where 32 bit BFP cannot. As has been mentioned here, reliably generating those digits can be a challenge.
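
As a quick illustration of that last point (my example, not glen's): the
following two values each have seven significant decimal digits, yet they
land on the same 32-bit binary float, so at least one of them cannot
survive a round trip:

import struct

def to_f32(x):
    """Round a Python float to the nearest IEEE binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

a, b = 8.589973e9, 8.589974e9    # 7 significant digits, 1000 apart
print(to_f32(a) == to_f32(b))    # True: binary32 cannot separate them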

(snip)

-- glen

Reply to
glen herrmannsfeldt

I read comp.dsp fairly often, and hadn't thought of it as a place for embedded low-power systems.

I do believe that floating point is overused in DSP problems, where wider fixed point would probably be a better choice, but that is a different question.

As many people do use floating point for DSP, even if I don't agree with that choice, they should probably be interested in some of the problems with its use.

-- glen

Reply to
glen herrmannsfeldt
