Kids can do math better than x86 CPUs.

Hello,

In principle it should be possible to add, subtract, multiply, or divide a 1-billion-digit number by another 1-billion-digit number; even kids learn this in school, with the standard math tables and the standard algorithms for addition, subtraction, multiplication, and division.

These algorithms can be applied to numbers of arbitrary length. (For positional number systems.)

(I think these algorithms were invented thousands of years ago by the Greeks).

However, x86 CPUs today cannot do what kids, Greeks, or common people can do, and that is do math with arbitrarily long numbers.

x86 CPUs only do math with a limited, fixed number of digits/bits... which is quite disappointing.

Why is this? Sigh.

Some possibilities:

  1. Add new instructions which can multiply arbitrarily long numbers in memory.

(Example: multiply memory_pointer1, memory_length1, memory_pointer2, memory_length2, memory_pointer3, memory_length3) (something like that... it could even be a two-memory-location operation instead of three) (A * B = C)

or

  2. Add high-level chips which can use the low-level chips to do these more complex math algorithms...

The point is: this would save programmers from having to reinvent the old Greek math wheel, saving them time ;) and of course giving them an easier coding platform and more productivity :D
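For concreteness, a minimal C sketch of the schoolbook addition those tables teach, storing one decimal digit per byte, least significant digit first (the representation and the name add_digits are purely illustrative; real bignum code uses full machine words as "digits"):

    #include <stdio.h>

    /* Schoolbook addition of arbitrary-length numbers, exactly as
       taught in school: add column by column, carrying into the
       next column. */
    int add_digits(const char *a, int na, const char *b, int nb, char *sum)
    {
        int n = (na > nb) ? na : nb, carry = 0;
        for (int i = 0; i < n; i++) {
            int s = carry + (i < na ? a[i] : 0) + (i < nb ? b[i] : 0);
            sum[i] = s % 10;
            carry  = s / 10;
        }
        if (carry)
            sum[n++] = carry;
        return n;                       /* digit count of the result */
    }

    int main(void)
    {
        char a[] = {9, 9, 9};           /* 999, least significant first */
        char b[] = {2, 1};              /* 12 */
        char sum[5];
        int n = add_digits(a, 3, b, 2, sum);
        for (int i = n - 1; i >= 0; i--)
            printf("%d", sum[i]);       /* prints 1011 */
        printf("\n");
        return 0;
    }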

Just some thoughts.

Bye, Skybuck.

Reply to
Skybuck Flying

There are numerous CPUs out there that deal with arbitrary-precision arithmetic. Typically the instruction set was optimized for COBOL :-), which should give you a hint that you want to look at ISAs from the 60's and 70's to see examples.

In the real world there are very few needs for so many digits of precision.

But when the needs do arrive, only the stupid programmers re-invent the wheel. The others use commonly available algorithms already out there. The GNU multi-precision library and the C bignum library come to mind immediately.

And when the wheel has been re-invented, it's been re-invented for the better: for example there are better algorithms for multiplication, these algorithms borrowing ideas from FFT's. It should not be a surprise that a lot of computers have already been optimized for FFT performance :-). Google for "Schoenhage-Strassen algorithm".
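For instance, with GMP the whole exercise is a few lines (this is GMP's actual mpz API; only the example operands are made up):

    #include <gmp.h>
    #include <stdio.h>

    /* Multiply two integers far wider than any x86 register using
       GMP, the GNU multi-precision library. Build with: cc x.c -lgmp */
    int main(void)
    {
        mpz_t a, b, c;
        mpz_init_set_str(a, "123456789012345678901234567890", 10);
        mpz_init_set_str(b, "987654321098765432109876543210", 10);
        mpz_init(c);
        mpz_mul(c, a, b);               /* c = a * b, at any length */
        gmp_printf("%Zd\n", c);
        mpz_clear(a); mpz_clear(b); mpz_clear(c);
        return 0;
    }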

Tim.

Reply to
Tim Shoppa

And again, if and when you need an implementation of these algorithms, get an existing one, e.g., contained in OpenSSL, instead of re-doing it from scratch (and likely getting it wrong in the process).

Jan

Reply to
Jan Vorbrügge

You really should do a little googling before you go off on rants like this. There are lots of extended math libraries and applications around, both binary and decimal. Some will compute to arbitrary precision. This issue was addressed decades ago.

The most common math functions, 32-bit integer operations and floats out to 80 bits, are executed in x86 hardware. The oddball stuff, like arbitrary precision and decimal floats, is done as subroutines.

John

Reply to
John Larkin

Which is basically how humans would do it: digit by digit.

-Lasse

Reply to
langwadt

In article , "Tim Shoppa" writes: |>

|> In the real world there are very few needs for so many digits of |> precision. |> |> But when the needs do arrive, only the stupid programmers re-invent |> the wheel. The others use commonly available algorithms already out |> there. The GNU multi-precision library and the C bignum library come |> to mind immediately.

That's too strong. I am doing it at present, for good and sufficient reasons. I agree that your statement is true 99% of the time, perhaps more :-)

To the best of my knowledge, there is nothing that comes close to what I am currently doing, though there has been in the past. But I baulk at trying to FIND the old codes, though I would happily convert one of them if I could do so. Where would YOU look up things that dropped out of common practice in the 1960s?

|> And when the wheel has been re-invented, it's been re-invented for the
|> better: for example there are better algorithms for multiplication,
|> these algorithms borrowing ideas from FFT's. It should not be a
|> surprise that a lot of computers have already been optimized for FFT
|> performance :-). Google for "Schoenhage-Strassen algorithm".

Not at all. It has very often been invented worse. I despair at the way that computer scientists reinvent technologies that predate their subject - and then get the number of sides on the wheel wrong :-(

For example, how would YOU do division and modular multiplication, how do those packages do it, and are you quite sure that is the best way? Well, let's not go there - it would take all week :-)

Regards, Nick Maclaren.

Reply to
Nick Maclaren

Yeah. I posted a problem a while back, about needing a fast 64-bit binary-to-ascii conversion routine. Someone made a suggestion that helped solve the problem for me (divide by 1e9 and convert the chunks separately) and when I wrote the code I realized, for the first time, how "long division" actually works.
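A sketch of that trick, assuming a uint64_t input (the name u64_to_ascii is illustrative): each division by 1e9 peels off a nine-digit chunk, and the chunks are printed left to right:

    #include <stdio.h>
    #include <stdint.h>
    #include <inttypes.h>

    /* Convert a 64-bit value to decimal by splitting it into chunks
       of nine digits. 'out' must hold at least 21 bytes. */
    void u64_to_ascii(uint64_t v, char *out)
    {
        uint32_t chunks[3];
        int n = 0;
        do {
            chunks[n++] = (uint32_t)(v % 1000000000u);  /* low 9 digits */
            v /= 1000000000u;
        } while (v != 0);

        /* Leading chunk unpadded, the rest zero-padded to 9 digits. */
        out += sprintf(out, "%" PRIu32, chunks[--n]);
        while (n > 0)
            out += sprintf(out, "%09" PRIu32, chunks[--n]);
    }

    int main(void)
    {
        char buf[21];
        u64_to_ascii(18446744073709551615ull, buf);
        printf("%s\n", buf);            /* 18446744073709551615 */
        return 0;
    }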

John

Reply to
John Larkin

From "Dreaming in Code" : The worst thing about programmers is that they love to write code.

John

Reply to
John Larkin

They're back to teaching "casting out 9's" in the schools here. Except they gave it some different name to confuse grandparents ;-)
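For anyone who has forgotten the trick: a number is congruent to its digit sum mod 9, so the product of two numbers and the claimed result must agree mod 9. A minimal sketch (check_product and the figures are illustrative):

    #include <stdio.h>

    /* Casting out nines: a mismatch proves an arithmetic error; a
       match proves nothing, since errors that are multiples of 9
       (digit transpositions, for instance) slip through. */
    int check_product(unsigned long a, unsigned long b, unsigned long claimed)
    {
        return (a % 9) * (b % 9) % 9 == claimed % 9;
    }

    int main(void)
    {
        /* 12345 * 6789 is really 83810205 */
        printf("%d\n", check_product(12345, 6789, 83810205)); /* 1: passes */
        printf("%d\n", check_product(12345, 6789, 83810206)); /* 0: caught */
        return 0;
    }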

...Jim Thompson

--
|  James E.Thompson, P.E.                           |    mens     |
|  Analog Innovations, Inc.                         |     et      |
|  Analog/Mixed-Signal ASIC's and Discrete Systems   |    manus    |
|  Phoenix, Arizona            Voice:(480)460-2350  |             |
|  E-mail Address at Website     Fax:(480)460-2142  |  Brass Rat  |
|       http://www.analog-innovations.com           |    1962     |
             
I love to cook with wine.      Sometimes I even put it in the food.
Reply to
Jim Thompson

You could download the source for bc, which compiles on Linux, but could probably be ported:

formatting link

Good Luck! Rich

Reply to
Richard The Dreaded Libertaria

It's not only about reinventing the wheel, it's also about efficiency/speed/performance/ease of use even.

A CPU should be able to do it faster in hardware than a software library...

Bye, Skybuck.

Reply to
Skybuck Flying

Highest efficiency (in terms of programmer hours) is almost always achieved by pulling in an off-the-shelf library that you know works :-).

There are exceptions, and the market finds them: crypto accelerators for example. The market even finds things that are not truly exceptions!

Apply this for each of the bazillion things you might want your CPU to do, and you end up with a CPU with a bazillion instructions.

Tim.

Reply to
shoppa

What makes you think this?

Reply to
MitchAlsup

In article , "MitchAlsup" writes: |> On Mar 7, 1:39 pm, "Skybuck Flying" wrote: |> |> > A CPU should be able to do it faster in hardware then a software library... |> |> What makes you think this?

He is right, but not VERY right.

The standard ISAs provide poor primitives for the purpose. For multiple precision integer arithmetic, there are only a few extra primitives needed (e.g. NxN->2N multiplication, add with carry etc.) For what I am doing (software binary floating-point), the ISA and HLL primitives stink. The actual arithmetic isn't the problem, as it is the same as for integers, but the housekeeping, normalisation, rounding etc. are a right royal pain in the, er, neck.

My estimate from many years back, which is probably still true, is that a decent set of primitives would enable software floating-point to be 3-5 times slower than all-hardware (for a plausible precision). But current ones make it a LOT slower than that.

However, when one is talking large integer arithmetic (hundreds of digits), all one needs is a small number of extra primitives, and there is no difficulty in doing it in software. Even without them, it is very rarely more than 3 times slower than the best hardware could be.
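To make "a few extra primitives" concrete, a C sketch of limb-wise add-with-carry and an NxN->2N multiply, leaning on GCC/Clang's unsigned __int128 extension (the function names are illustrative):

    #include <stdio.h>
    #include <stdint.h>
    #include <inttypes.h>

    /* Add-with-carry across arrays of 64-bit limbs: the carry out of
       each limb feeds the next, just like column addition in school. */
    uint64_t mp_add(uint64_t *r, const uint64_t *a, const uint64_t *b, int n)
    {
        uint64_t carry = 0;
        for (int i = 0; i < n; i++) {
            uint64_t s = a[i] + carry;
            carry = (s < carry);        /* overflow of a[i] + carry */
            s += b[i];
            carry += (s < b[i]);        /* overflow of s + b[i]     */
            r[i] = s;
        }
        return carry;                   /* carry out of the top limb */
    }

    /* NxN -> 2N multiply of one limb pair, via the compiler's 128-bit
       type to capture the full double-width product. */
    void mul_64x64_128(uint64_t a, uint64_t b, uint64_t *lo, uint64_t *hi)
    {
        unsigned __int128 p = (unsigned __int128)a * b;
        *lo = (uint64_t)p;
        *hi = (uint64_t)(p >> 64);
    }

    int main(void)
    {
        uint64_t a[2] = { UINT64_MAX, 0 }, b[2] = { 1, 0 }, r[2];
        mp_add(r, a, b, 2);             /* (2^64 - 1) + 1 = 2^64 */
        printf("%" PRIu64 " %" PRIu64 "\n", r[1], r[0]);  /* 1 0 */
        return 0;
    }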

Regards, Nick Maclaren.

Reply to
Nick Maclaren

Only on the really slow implementations. While bignum addition and subtraction have to be done digit by digit (for some definition of digit), you do *not* want to do division and multiplication the way you learned them in school. Karatsuba multiplication (O(n**1.6) instead of O(n**2) for the classical algorithm) is useful for medium-sized numbers, and FFT-based multiplication (O(n ln(n) ln(ln(n)))) for large numbers. Division is usually performed by approximating a result and then refining it with an appropriate number of iterations of Newton-Raphson (which needs only multiplications).
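A sketch of the Newton-Raphson idea in plain double precision (real bignum code runs the same iteration in fixed point at growing precision; the 48/17 seed is the standard linear starting approximation for d in [0.5, 1)):

    #include <stdio.h>

    /* Newton-Raphson reciprocal: each step x <- x * (2 - d*x) uses
       only multiplies and a subtract, and roughly doubles the number
       of correct digits, which is how fast division reduces to fast
       multiplication. */
    double reciprocal(double d)                      /* d in [0.5, 1) */
    {
        double x = 48.0 / 17.0 - (32.0 / 17.0) * d;  /* linear seed */
        for (int i = 0; i < 4; i++)                  /* 4 steps > 53 bits */
            x = x * (2.0 - d * x);
        return x;
    }

    int main(void)
    {
        double d = 0.7;
        printf("%.17g vs %.17g\n", reciprocal(d), 1.0 / d);
        return 0;
    }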

Reply to
robertwessel2

But there are few problems that need more than 64 or 80 bit floating-point math, and fewer still that need it fast.

John

Reply to
John Larkin
[Please do not mail me a copy of your followup]

John Larkin spake the secret code thusly:

...which makes it perfect for reconfigurable computing.

As I understand it, that's what people are doing these days. Where previously you would have been required to develop a custom ASIC at great NRE cost, nowadays you can develop custom hardware using a relatively inexpensive FPGA peripheral card. Configuring the FPGA circuitry is very much like writing software when you use something like VHDL or Verilog.

In fact, you can even get these in a portable form factor -- there's an FPGA card you can buy that plugs into the Nintendo GameBoy Advance and it comes with a complete tool chain included. All for about $200 from Charmed Labs as the Xport 2.0 which has 150,000 gates.

--
"The Direct3D Graphics Pipeline" -- DirectX 9 draft available for download
      

        Legalize Adulthood!
Reply to
Richard

Also known as Schönhage-Strassen multiplication.

John Savard

Reply to
Quadibloc

That was me; I use that approach in all my binary arbitrary-precision libs.

:-)

With me it was the opposite, sort of: The _very_ first non-obligatory program I wrote was a (Fortran-2) implementation of arbitrary precision, written to calculate as many digits of pi as my student cpu-seconds budget would allow.

OTOH, I used to spend my high school physics classes playing around with Taylor series and arbitrary precision algorithms on paper, just because that allowed me to calculate with better precision than my big 5-digit log table book would allow. :-)

Terje

--
- 
"almost all programming can be viewed as an exercise in caching"
Reply to
Terje Mathisen

That is only true if the core building blocks take so short a time to run that the decoding time becomes a limiting factor.

With pipelined CPUs, lots of algorithms do in fact run faster in software than the corresponding high-level opcode, even on those CPUs which did bother to implement these CISCy operations.

(Look for VAX polynomial evaluation, or the x86 BCD and string opcodes.)

Terje

--
- 
"almost all programming can be viewed as an exercise in caching"
Reply to
Terje Mathisen
