Kids can do math better than x86 CPUs.

Hello,

In principle it should be possible to add, subtract, multiply, or divide a 1-billion-digit number by another 1-billion-digit number; even kids learn this in school, with the standard math tables and the standard algorithms for addition, subtraction, multiplication, and division.

These algorithms can be applied to numbers of arbitrary length. (For positional number systems.)

(I think these algorithms were invented thousands of years ago by the Greeks).

However, x86 CPUs today cannot do what kids, Greeks, or common people can do, and that is do math with arbitrarily long numbers.

x86 CPUs only do math with a limited, fixed number of digits/bits... which is quite disappointing.

Why is this? Sigh.

Some possibilities:

  1. Add new instructions which can multiply arbitrarily long numbers in memory.

(Example: multiply memory_pointer1, memory_length1, memory_pointer2, memory_length2, memory_pointer3, memory_length3) (something like that... it could even be a two-memory-location operation instead of three) (A * B = C)

or

  2. Add high-level chips which can use the low-level chips to do these more complex math algorithms...

The point is: this would save programmers from having to reinvent the old Greek math wheel, saving them time ;) and of course giving them an easier coding platform and more productivity :D
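For concreteness, a minimal C sketch of the schoolbook addition those tables teach, storing one decimal digit per byte, least significant digit first (the representation and the name add_digits are purely illustrative; real bignum code uses full machine words as "digits"):

    #include <stdio.h>

    /* Schoolbook addition of arbitrary-length numbers, exactly as
       taught in school: add column by column, carrying into the
       next column. */
    int add_digits(const char *a, int na, const char *b, int nb, char *sum)
    {
        int n = (na > nb) ? na : nb, carry = 0;
        for (int i = 0; i < n; i++) {
            int s = carry + (i < na ? a[i] : 0) + (i < nb ? b[i] : 0);
            sum[i] = s % 10;
            carry  = s / 10;
        }
        if (carry)
            sum[n++] = carry;
        return n;                       /* digit count of the result */
    }

    int main(void)
    {
        char a[] = {9, 9, 9};           /* 999, least significant first */
        char b[] = {2, 1};              /* 12 */
        char sum[5];
        int n = add_digits(a, 3, b, 2, sum);
        for (int i = n - 1; i >= 0; i--)
            printf("%d", sum[i]);       /* prints 1011 */
        printf("\n");
        return 0;
    }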

Just some thoughts.

Bye, Skybuck.

Reply to
Skybuck Flying

There are numerous CPUs out there that deal with arbitrary-precision arithmetic. Typically the instruction set was optimized for COBOL :-), which should give you a hint that you want to look at ISAs from the 60's and 70's to see examples.

In the real world there are very few needs for so many digits of precision.

But when the needs do arrive, only the stupid programmers re-invent the wheel. The others use commonly available algorithms already out there. The GNU multi-precision library and the C bignum library come to mind immediately.

And when the wheel has been re-invented, it's been re-invented for the better: for example there are better algorithms for multiplication, these algorithms borrowing ideas from FFT's. It should not be a surprise that a lot of computers have already been optimized for FFT performance :-). Google for "Schoenhage-Strassen algorithm".
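For instance, with GMP the whole exercise is a few lines (this is GMP's actual mpz API; only the example operands are made up):

    #include <gmp.h>
    #include <stdio.h>

    /* Multiply two integers far wider than any x86 register using
       GMP, the GNU multi-precision library. Build with: cc x.c -lgmp */
    int main(void)
    {
        mpz_t a, b, c;
        mpz_init_set_str(a, "123456789012345678901234567890", 10);
        mpz_init_set_str(b, "987654321098765432109876543210", 10);
        mpz_init(c);
        mpz_mul(c, a, b);               /* c = a * b, at any length */
        gmp_printf("%Zd\n", c);
        mpz_clear(a); mpz_clear(b); mpz_clear(c);
        return 0;
    }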

Tim.

Reply to
Tim Shoppa

And again, if and when you need an implementation of these algorithms, get an existing one, e.g., contained in OpenSSL, instead of re-doing it from scratch (and likely getting it wrong in the process).

Jan

Reply to
Jan Vorbrügge

You really should do a little googling before you go off on rants like this. There are lots of extended math libraries and applications around, both binary and decimal. Some will compute to arbitrary precision. This issue was addressed decades ago.

The most common math functions, 32-bit integer operations and floats out to 80 bits, are executed in x86 hardware. The oddball stuff, like arbitrary precision and decimal floats, is done as subroutines.

John

Reply to
John Larkin

Which is basically how humans would do it: digit by digit.

-Lasse

Reply to
langwadt

In article , "Tim Shoppa" writes: |>

|> In the real world there are very few needs for so many digits of |> precision. |> |> But when the needs do arrive, only the stupid programmers re-invent |> the wheel. The others use commonly available algorithms already out |> there. The GNU multi-precision library and the C bignum library come |> to mind immediately.

That's too strong. I am doing it at present, for good and sufficient reasons. I agree that your statement is true 99% of the time, perhaps more :-)

To the best of my knowledge, there is nothing that comes close to what I am currently doing, though there has been in the past. But I baulk at trying to FIND the old codes, though I would happily convert one of them if I could do so. Where would YOU look up things that dropped out of common practice in the 1960s?

|> And when the wheel has been re-invented, it's been re-invented for the
|> better: for example there are better algorithms for multiplication,
|> these algorithms borrowing ideas from FFT's. It should not be a
|> surprise that a lot of computers have already been optimized for FFT
|> performance :-). Google for "Schoenhage-Strassen algorithm".

Not at all. It has very often been invented worse. I despair at the way that computer scientists reinvent technologies that predate their subject - and then get the number of sides on the wheel wrong :-(

For example, how would YOU do division and modular multiplication, how do those packages do it, and are you quite sure that is the best way? Well, let's not go there - it would take all week :-)

Regards, Nick Maclaren.

Reply to
Nick Maclaren

Yeah. I posted a problem a while back, about needing a fast 64-bit binary-to-ascii conversion routine. Someone made a suggestion that helped solve the problem for me (divide by 1e9 and convert the chunks separately) and when I wrote the code I realized, for the first time, how "long division" actually works.
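A sketch of that trick, assuming a uint64_t input (the name u64_to_ascii is illustrative): each division by 1e9 peels off a nine-digit chunk, and the chunks are printed left to right:

    #include <stdio.h>
    #include <stdint.h>
    #include <inttypes.h>

    /* Convert a 64-bit value to decimal by splitting it into chunks
       of nine digits. 'out' must hold at least 21 bytes. */
    void u64_to_ascii(uint64_t v, char *out)
    {
        uint32_t chunks[3];
        int n = 0;
        do {
            chunks[n++] = (uint32_t)(v % 1000000000u);  /* low 9 digits */
            v /= 1000000000u;
        } while (v != 0);

        /* Leading chunk unpadded, the rest zero-padded to 9 digits. */
        out += sprintf(out, "%" PRIu32, chunks[--n]);
        while (n > 0)
            out += sprintf(out, "%09" PRIu32, chunks[--n]);
    }

    int main(void)
    {
        char buf[21];
        u64_to_ascii(18446744073709551615ull, buf);
        printf("%s\n", buf);            /* 18446744073709551615 */
        return 0;
    }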

John

Reply to
John Larkin

From "Dreaming in Code" : The worst thing about programmers is that they love to write code.

John

Reply to
John Larkin

They're back to teaching "casting out 9's" in the schools here. Except they gave it some different name to confuse grandparents ;-)
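For anyone who has forgotten the trick: a number is congruent to its digit sum mod 9, so the product of two numbers and the claimed result must agree mod 9. A minimal sketch (check_product and the figures are illustrative):

    #include <stdio.h>

    /* Casting out nines: a mismatch proves an arithmetic error; a
       match proves nothing, since errors that are multiples of 9
       (digit transpositions, for instance) slip through. */
    int check_product(unsigned long a, unsigned long b, unsigned long claimed)
    {
        return (a % 9) * (b % 9) % 9 == claimed % 9;
    }

    int main(void)
    {
        /* 12345 * 6789 is really 83810205 */
        printf("%d\n", check_product(12345, 6789, 83810205)); /* 1: passes */
        printf("%d\n", check_product(12345, 6789, 83810206)); /* 0: caught */
        return 0;
    }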

...Jim Thompson

--
|  James E.Thompson, P.E.                           |    mens     |
|  Analog Innovations, Inc.                         |     et      |
|  Analog/Mixed-Signal ASIC's and Discrete Systems   |    manus    |
|  Phoenix, Arizona            Voice:(480)460-2350  |             |
|  E-mail Address at Website     Fax:(480)460-2142  |  Brass Rat  |
|       http://www.analog-innovations.com           |    1962     |
             
I love to cook with wine.      Sometimes I even put it in the food.
Reply to
Jim Thompson

You could download the source for bc, which compiles on Linux, but could probably be ported:

formatting link

Good Luck! Rich

Reply to
Richard The Dreaded Libertaria

It's not only about reinventing the wheel, it's also about efficiency/speed/performance/ease of use even.

A CPU should be able to do it faster in hardware than a software library...

Bye, Skybuck.

Reply to
Skybuck Flying

Highest efficiency (in terms of programmer hours) is almost always achieved by pulling in an off-the-shelf library that you know works :-).

There are exceptions, and the market finds them: crypto accelerators for example. The market even finds things that are not truly exceptions!

Apply this for each of the bazillion things you might want your CPU to do, and you end up with a CPU with a bazillion instructions.

Tim.

Reply to
shoppa

What makes you think this?

Reply to
MitchAlsup

In article , "MitchAlsup" writes: |> On Mar 7, 1:39 pm, "Skybuck Flying" wrote: |> |> > A CPU should be able to do it faster in hardware then a software library... |> |> What makes you think this?

He is right, but not VERY right.

The standard ISAs provide poor primitives for the purpose. For multiple precision integer arithmetic, there are only a few extra primitives needed (e.g. NxN->2N multiplication, add with carry etc.) For what I am doing (software binary floating-point), the ISA and HLL primitives stink. The actual arithmetic isn't the problem, as it is the same as for integers, but the housekeeping, normalisation, rounding etc. are a right royal pain in the, er, neck.

My estimate from many years back, which is probably still true, is that a decent set of primitives would enable software floating-point to be 3-5 times slower than all-hardware (for a plausible precision). But current ones make it a LOT slower than that.

However, when one is talking large integer arithmetic (hundreds of digits), all one needs is a small number of extra primitives, and there is no difficulty in doing it in software. Even without them, it is very rarely more than 3 times slower than the best hardware could be.
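To make "a few extra primitives" concrete, a C sketch of limb-wise add-with-carry and an NxN->2N multiply, leaning on GCC/Clang's unsigned __int128 extension (the function names are illustrative):

    #include <stdio.h>
    #include <stdint.h>
    #include <inttypes.h>

    /* Add-with-carry across arrays of 64-bit limbs: the carry out of
       each limb feeds the next, just like column addition in school. */
    uint64_t mp_add(uint64_t *r, const uint64_t *a, const uint64_t *b, int n)
    {
        uint64_t carry = 0;
        for (int i = 0; i < n; i++) {
            uint64_t s = a[i] + carry;
            carry = (s < carry);        /* overflow of a[i] + carry */
            s += b[i];
            carry += (s < b[i]);        /* overflow of s + b[i]     */
            r[i] = s;
        }
        return carry;                   /* carry out of the top limb */
    }

    /* NxN -> 2N multiply of one limb pair, via the compiler's 128-bit
       type to capture the full double-width product. */
    void mul_64x64_128(uint64_t a, uint64_t b, uint64_t *lo, uint64_t *hi)
    {
        unsigned __int128 p = (unsigned __int128)a * b;
        *lo = (uint64_t)p;
        *hi = (uint64_t)(p >> 64);
    }

    int main(void)
    {
        uint64_t a[2] = { UINT64_MAX, 0 }, b[2] = { 1, 0 }, r[2];
        mp_add(r, a, b, 2);             /* (2^64 - 1) + 1 = 2^64 */
        printf("%" PRIu64 " %" PRIu64 "\n", r[1], r[0]);  /* 1 0 */
        return 0;
    }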

Regards, Nick Maclaren.

Reply to
Nick Maclaren

Only on the really slow implementations. While bignum addition and subtraction have to be done digit by digit (for some definition of digit), you do *not* want to do division and multiplication the way you learned them in school. Karatsuba multiplication (O(n**1.6) instead of O(n**2) for the classical algorithm) is useful for medium-sized numbers, and FFT-based multiplication (O(n ln(n) ln(ln(n)))) for large numbers. Division is usually performed by approximating a result and then refining it with an appropriate number of iterations of Newton-Raphson (which needs only multiplications).
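A sketch of the Newton-Raphson idea in plain double precision (real bignum code runs the same iteration in fixed point at growing precision; the 48/17 seed is the standard linear starting approximation for d in [0.5, 1)):

    #include <stdio.h>

    /* Newton-Raphson reciprocal: each step x <- x * (2 - d*x) uses
       only multiplies and a subtract, and roughly doubles the number
       of correct digits, which is how fast division reduces to fast
       multiplication. */
    double reciprocal(double d)                      /* d in [0.5, 1) */
    {
        double x = 48.0 / 17.0 - (32.0 / 17.0) * d;  /* linear seed */
        for (int i = 0; i < 4; i++)                  /* 4 steps > 53 bits */
            x = x * (2.0 - d * x);
        return x;
    }

    int main(void)
    {
        double d = 0.7;
        printf("%.17g vs %.17g\n", reciprocal(d), 1.0 / d);
        return 0;
    }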

Reply to
robertwessel2

But there are few problems that need more than 64 or 80 bit floating-point math, and fewer still that need it fast.

John

Reply to
John Larkin
[Please do not mail me a copy of your followup]

John Larkin spake the secret code thusly:

...which makes it perfect for reconfigurable computing.

As I understand it, that's what people are doing these days. Where previously you would have been required to develop a custom ASIC at great NRE cost, nowadays you can develop custom hardware using a relatively inexpensive FPGA peripheral card. Configuring the FPGA circuitry is very much like writing software when you use something like VHDL or Verilog.

In fact, you can even get these in a portable form factor -- there's an FPGA card you can buy that plugs into the Nintendo GameBoy Advance and it comes with a complete tool chain included. All for about $200 from Charmed Labs as the Xport 2.0 which has 150,000 gates.

--
"The Direct3D Graphics Pipeline" -- DirectX 9 draft available for download
      

        Legalize Adulthood!
Reply to
Richard

Also known as Schönhage-Strassen multiplication.

John Savard

Reply to
Quadibloc

That was me; I use that approach in all my binary arbitrary-precision libs.

:-)

With me it was the opposite, sort of: The _very_ first non-obligatory program I wrote was a (Fortran-2) implementation of arbitrary precision, written to calculate as many digits of pi as my student cpu-seconds budget would allow.

OTOH, I used to spend my high school physics classes playing around with Taylor series and arbitrary precision algorithms on paper, just because that allowed me to calculate with better precision than my big 5-digit log table book would allow. :-)

Terje

--
- 
"almost all programming can be viewed as an exercise in caching"
Reply to
Terje Mathisen

That is only true if the core building blocks take so short a time to run that the decoding time becomes a limiting factor.

With pipelined CPUs, lots of algorithms do in fact run faster in software than the corresponding high-level opcode, even on those CPUs which did bother to implement these CISCy operations.

(Look for VAX polynomial evaluation, or the x86 BCD and string opcodes.)

Terje

--
- 
"almost all programming can be viewed as an exercise in caching"
Reply to
Terje Mathisen
