Here is some new bad news, and I mean really bad news

Reply to
Phil Hobbs

FORTRAN wasn't *all* that bad. I also worked on speeding up a fluid in cell code for relativistic plasma beaming in radio galaxies back then. It helped a lot that I understood how to prevent the exception handling from going crazy when computing with denormalised numbers.

The problem as originally written was scaled in SI units as I recall and included many terms which were something vaguely like k*G/c^2. Benchmarking it showed that 80% of the time was spent in exception handling for some values that could be safely set to zero!

For a short while I played with RATFOR (by Brian Kernighan).

And for non-numerical stuff, BCPL (by Martin Richards), a predecessor of B which later gave rise to C. I still miss BCPL's tagged $( ... $) section brackets. Allegedly it was that language that started the canonical "HELLO WORLD" craze.

Its claim to fame was as an incredibly small portable compiler.

Mostly it was F77 on everything from the humble Z80 to a Cray-XMP. We always learnt something new porting "working" code to a new machine.

I also used some very early computer algebra packages around the same time: Camal and REDUCE. They could be made to output FORTRAN code too.

One thing about FORTRAN I don't miss is continuation cards. A mistake in one of the computer algebra programs resulted in VSOP82 getting the light travel time wrong near Jupiter - a flaw that was detected observationally ~1984 when the binary pulsar got close enough to Jupiter for this to matter.

--
Regards, 
Martin Brown
Reply to
Martin Brown

Been there.

Denormalized numbers are usually handled in microcode, which is slooooowwwwwww. My 3D EM simulator used to slow way down after a few minutes' run time, as a tiny leading field, due entirely to roundoff error, filled the space with denormals. Once I figured out the cause, all it took was a compile option to set denormals to zero.
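
In case anyone wants to reproduce it: on an x86/SSE machine the cure amounts to setting the flush-to-zero (FTZ) and denormals-are-zero (DAZ) control bits, which is roughly what options like gcc's -ffast-math arrange at program startup. A minimal sketch (not the actual simulator code):

/* Sketch: enabling FTZ and DAZ by hand with the SSE control-register
 * intrinsics; needs an SSE-capable x86 and the usual Intel intrinsics. */
#include <xmmintrin.h>   /* _MM_SET_FLUSH_ZERO_MODE */
#include <pmmintrin.h>   /* _MM_SET_DENORMALS_ZERO_MODE */
#include <stdio.h>

int main(void)
{
    /* Results that would underflow to a denormal are forced to zero... */
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    /* ...and denormal *inputs* are treated as zero as well. */
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);

    volatile float tiny = 1e-38f;       /* just above FLT_MIN */
    for (int i = 0; i < 20; i++)
        tiny *= 0.5f;                   /* would go subnormal; now snaps to zero */
    printf("%g\n", tiny);               /* prints 0 with FTZ/DAZ enabled */
    return 0;
}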

Talk about remote debugging!

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs 
Principal Consultant 
ElectroOptical Innovations LLC 
Optics, Electro-optics, Photonics, Analog Electronics 

160 North State Road #203 
Briarcliff Manor NY 10510 

hobbs at electrooptical dot net 
http://electrooptical.net
Reply to
Phil Hobbs

That is not a problem with Fortran; rather, the problem was that the hardware platforms behaved differently. The IEEE floating-point standard helped to clear up some of this mess.

Reply to
upsidedown

The denormal problem was _introduced_ by IEEE floating point, iirc.

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs 
Principal Consultant 
ElectroOptical Innovations LLC 
Optics, Electro-optics, Photonics, Analog Electronics 

160 North State Road #203 
Briarcliff Manor NY 10510 

hobbs at electrooptical dot net 
http://electrooptical.net
Reply to
Phil Hobbs

Denormals are intended as a way to handle underflow gracefully, rather than brutally setting all such numbers to zero like an oppressive white male. (But I repeat myself.)

Normally floats are implemented as a binary fraction (significand) and an exponent, which is usually base-2 but sometimes (as in the old IBM format) base-16.

If the base is 2, the leading bit of the significand is always a 1, unless the number is identically zero. Thus IEEE and various other base-2 formats take advantage of this free bit and don't bother storing it, which gives them an extra bit of precision (a factor of 2) in ordinary circumstances.

However, when the exponent reaches its most negative value, there's no room left. In order to make the accuracy of computations degrade gracefully in that situation, i.e. to keep such a number distinguishable from zero, the IEEE specification allows "denormalized numbers", i.e. those in which the leading bit of the significand is not 1.

The problem is that denormals are considered so rare that AFAIK FPU designers don't implement them in silicon, but rather in microcode. Hence the speed problem.
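
For the curious, gradual underflow is easy to watch from standard C; fpclassify() reports FP_SUBNORMAL once the exponent field is exhausted and the hidden bit is gone. A minimal sketch:

/* Sketch: divide the smallest normal double down until it goes subnormal. */
#include <stdio.h>
#include <math.h>
#include <float.h>

int main(void)
{
    double x = DBL_MIN;              /* smallest *normal* double, ~2.2e-308 */
    for (int i = 0; i < 4; i++) {
        printf("%.3e  %s\n", x,
               fpclassify(x) == FP_SUBNORMAL ? "subnormal" : "normal");
        x /= 16.0;                   /* keep dividing: gradual underflow */
    }
    return 0;
}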

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs 
Principal Consultant 
ElectroOptical Innovations LLC 
Optics, Electro-optics, Photonics, Analog Electronics 

160 North State Road #203 
Briarcliff Manor NY 10510 

hobbs at electrooptical dot net 
http://electrooptical.net
Reply to
Phil Hobbs

Yes and no. Current IBM mainframes have to support 5 different systems of floating point and IEEE is just one of them.

?-)

Reply to
josephkk

IEEE 854 formalized the terminology, but underflow was already being dealt with in DEC VAXes and likely IBM 370s.

?-)

Reply to
josephkk

Which decade are you talking about?

For instance, in the x87-based FP units the evaluation stack is 80 bits wide. To use a 32-bit float or 64-bit double in calculations, it is first pushed onto the 80-bit evaluation stack. Floating-point expressions are evaluated on the 80-bit stack, and the final result is popped off the stack and stored as a 32/64-bit IEEE value. I see no reason why the hidden bit would cause any problems in pushing or popping values to/from the stack.
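
If it helps, the extra headroom of the 80-bit format is easy to see through long double, which on typical x86 toolchains maps to the x87 extended format (an assumption; elsewhere long double can be something else entirely). A small sketch:

/* Sketch: 1 + DBL_EPSILON/2 rounds back to 1 in double, but is
 * representable in the 64-bit significand of the x87 extended format. */
#include <stdio.h>
#include <float.h>

int main(void)
{
    double      d  = 1.0 + DBL_EPSILON / 2;                /* rounds back to exactly 1.0 */
    long double ld = 1.0L + (long double)DBL_EPSILON / 2;  /* representable at 64-bit precision */

    printf("double      1 + eps/2 %s 1\n", d  == 1.0  ? "==" : "!=");
    printf("long double 1 + eps/2 %s 1\n", ld == 1.0L ? "==" : "!=");
    printf("LDBL_MANT_DIG = %d\n", LDBL_MANT_DIG);          /* 64 on the x87 stack */
    return 0;
}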

In more traditional architectures you will still have to have an extra bit, so that when there is an overflow in a float add/subtract, the mantissa can be shifted right and the exponent incremented.

Your description of denormals in microcode/software might apply to established manufacturers that adapted their existing FP processor boards for IEEE with only small alterations (e.g. changing the offset in the exponent). Adding the denormalization logic might have required more extensive PCB alterations, so it would be tempting to use emulation for denormals.

Reply to
upsidedown

No. It definitely predates IEEE 754. The IBM 360/370 series definitely had the denormalised FP handling problems in FORTRAN. Indeed ISTR that its 7-bit exponent range for REAL*4 was smaller than that for IEEE 754.

I think S/390 was the first IBM mainframe to offer IEEE 754 FP. I can't recall exactly what the CDC 7600 did, although since it had native fast 60-bit floating point it was less likely to fail on denormalised values.

--
Regards, 
Martin Brown
Reply to
Martin Brown

And by default the operating system would usually trap on a denorm operand and go through a tedious, long-winded recovery routine every time. You had to alter the control word to mask denorm traps out.

Big iron 70's and 80's.

It depends on whether or not you have masked the bit for generating an interrupt on encountering a denormalised operand.

The problem can still arise today if you have the wrong compiler options set and scale your problem unwisely. The denorm operand exception handler can end up taking the bulk of all the elapsed time.

--
Regards, 
Martin Brown
Reply to
Martin Brown

I first ran into such issues when doing polynomial approximations to functions too complicated to compute in real time. The resulting polynomials (or rational polynomials) were evaluated using 32-bit fixed-point arithmetic. Unity scaling of the polynomial variable x was essential.
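
Something along these lines (a toy sketch, not the original code): Horner evaluation in Q1.31 fixed point, where keeping |x| <= 1 is exactly what keeps every intermediate product in range.

#include <stdint.h>
#include <stdio.h>

#define Q31 2147483648.0                 /* scale factor for Q1.31 */

/* Multiply two Q1.31 numbers via a 64-bit intermediate product. */
static int32_t q31_mul(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * b) >> 31);
}

/* Horner evaluation of c[0] + c[1]*x + ... + c[n-1]*x^(n-1), all Q1.31. */
static int32_t poly_q31(const int32_t *c, int n, int32_t x)
{
    int32_t acc = c[n - 1];
    for (int i = n - 2; i >= 0; i--)
        acc = q31_mul(acc, x) + c[i];
    return acc;
}

int main(void)
{
    /* Toy polynomial bounded on |x| <= 1: f(x) = 0.5 + 0.25*x - 0.125*x^2 */
    int32_t c[] = { (int32_t)(0.5 * Q31), (int32_t)(0.25 * Q31),
                    (int32_t)(-0.125 * Q31) };
    int32_t x = (int32_t)(0.75 * Q31);
    printf("f(0.75) = %f\n", poly_q31(c, 3, x) / Q31);   /* ~0.617188 */
    return 0;
}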

It seems to me that the root cause of the denormalized numbers is trying to do everything in unscaled SI units. If one formulates the problem such that variable values are roughly centered on unity, in most problem domains the underflows and overflows will vanish.

If not, reformulation to operate in the log domain may be useful.
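
One common log-domain trick is log-sum-exp, which adds quantities whose raw values would underflow or go denormal. A sketch of the idea (assuming this is the sort of reformulation meant):

/* Sketch: log(exp(a) + exp(b)) computed without ever forming exp(a) or
 * exp(b), so values like e^-750 never have to exist as raw doubles. */
#include <stdio.h>
#include <math.h>

static double log_add(double a, double b)
{
    double m = fmax(a, b);
    return m + log1p(exp(-fabs(a - b)));
}

int main(void)
{
    double log_p = -750.0, log_q = -752.0;    /* exp() of these underflows */
    printf("naive:      %g\n", exp(log_p) + exp(log_q));   /* 0 (underflow) */
    printf("log domain: %g\n", log_add(log_p, log_q));     /* ~ -749.873 */
    return 0;
}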

Joe Gwinn

Reply to
Joe Gwinn

Around 2007, on a cluster of dual 12-core AMD Magny Cours boxes. It flew, except when it slowed down by 30 times due to a simulation space full of denormals. Finding the right compiler switch fixed it completely. It did the same on the dual Xeon box I had in my office.

That's only a worry during an actual FP operation, so it is easily handled in silicon. FP numbers stored in registers and main memory always use the free bit except for denormals and NaNs, AFAIK.

Nope, relatively recent hardware and good compilers.

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs 
Principal Consultant 
ElectroOptical Innovations LLC 
Optics, Electro-optics, Photonics, Analog Electronics 

160 North State Road #203 
Briarcliff Manor NY 10510 

hobbs at electrooptical dot net 
http://electrooptical.net
Reply to
Phil Hobbs

No, the issue is that in a FDTD code, when you turn on the source, the real fields propagate at approximately c. However, since you're iterating through all of main memory twice per full time step, the roundoff errors propagate superluminally and fill the domain with denormals until the real fields get there.

A combination of initializing the domain with small amounts of properly normalized noise, and the right compiler switches, fixed it.
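
The noise seeding might look something like this (a sketch, not the actual simulator): fill the field arrays with tiny but properly *normal* random values, far above FLT_MIN yet far below any physical field, so roundoff never drags a cell into the subnormal range before the real wavefront arrives.

#include <stdio.h>
#include <stdlib.h>
#include <float.h>

#define NCELLS 1000000

/* Fill one field component with uniform noise in [-amplitude, +amplitude]. */
static void seed_noise(float *field, int n, float amplitude)
{
    for (int i = 0; i < n; i++)
        field[i] = amplitude * (2.0f * (float)rand() / RAND_MAX - 1.0f);
}

int main(void)
{
    static float ez[NCELLS];               /* one field component */
    seed_noise(ez, NCELLS, 1e-20f);        /* 1e-20 >> FLT_MIN ~ 1.2e-38 */
    printf("ez[0] = %g\n", ez[0]);
    return 0;
}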

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs 
Principal Consultant 
ElectroOptical Innovations LLC 
Optics, Electro-optics, Photonics, Analog Electronics 

160 North State Road #203 
Briarcliff Manor NY 10510 

hobbs at electrooptical dot net 
http://electrooptical.net
Reply to
Phil Hobbs

The 360 FP format was also powers-of-16, iirc. That didn't hurt the average precision too much (since the 3 bits you lose on the swing (significand) you gain on the roundabouts (exponent)). You did lose the free bit, though, and it made the roundoff error horribly pattern-sensitive, so the LSB/sqrt(12) approximation was hopelessly wrong.

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs 
Principal Consultant 
ElectroOptical Innovations LLC 
Optics, Electro-optics, Photonics, Analog Electronics 

160 North State Road #203 
Briarcliff Manor NY 10510 

hobbs at electrooptical dot net 
http://electrooptical.net
Reply to
Phil Hobbs

[...]

Yeah, that's how god does it.

--

John Devereux
Reply to
John Devereux

A bit like the conjecture that if we can build a quantum computer in this universe with a serious number of qubits then it becomes a lot more likely that we are inside a higher level computer simulation.

--
Regards, 
Martin Brown
Reply to
Martin Brown

Base 16 is just a compaction of binary for easier readability. The real difference it made is that the exponent was base 16 as well, so the lead digit could be anything from 1 to F. They have also used the same hardware to implement BCD floating point since the 360 series.

Reply to
josephkk

[snip]

Ahh. Interesting. Never used or built FDTD code (read about it though), or anything faster than light.

Joe Gwinn

Reply to
Joe Gwinn

And an afterthought: Is this "properly normalized noise" shaped to look like vacuum fluctuations (where particles flicker in and out of existence, right at the edge of the uncertainty principle)?

Joe Gwinn

Reply to
Joe Gwinn
