Re: Intel details future Larrabee graphics chip

As the number of cores goes up, do the wattage requirements go up too?

Will we need a zillion watts of power soon?

Bye, Skybuck.

Skybuck Flying

Since the ATI Radeon™ HD 4800 series has 800 cores, you work it out.

--
Dirk

http://www.transcendence.me.uk/ - Transcendence UK
Dirk Bruere at NeoPax

Not necessarily, if the technology progresses and the clock rates are kept reasonable. And one can always throttle down the CPUs that aren't busy.

I saw suggestions of something like 60 cores, 240 threads in the reasonable future.

This has got to affect OS design.

John

John Larkin

Oops, 4 threads per core is 320 threads.

My XP is currently running 33 processes and maybe a couple dozen device drivers.

John

John Larkin

I can see it now... A mega-core GPU chip that can dedicate 1 core per-pixel.

lol.

They need to completely rethink their multi-threaded synchronization algorithms. I have a feeling that efficient distributed non-blocking algorithms, which are comfortable running under a very weak cache-coherency model, will be all the rage. Getting rid of atomic RMW or StoreLoad-style memory barriers is the first step.
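One commonly cited structure of this kind is a single-producer/single-consumer ring buffer, in which each index is written by exactly one side, so no atomic read-modify-write is needed. The following is only a structural sketch under stated assumptions: the class name `SpscRing` and its methods are hypothetical, and Python cannot express memory ordering at all. On real hardware the index updates would still need release/acquire ordering, though not a StoreLoad barrier.

```python
class SpscRing:
    """Toy single-producer/single-consumer queue (hypothetical names).

    The producer is the only writer of _tail and the consumer is the only
    writer of _head, so neither side needs an atomic read-modify-write.
    """

    def __init__(self, capacity):
        # One slot is kept empty so that "full" and "empty" look different.
        self._buf = [None] * (capacity + 1)
        self._head = 0  # next slot to read; written only by the consumer
        self._tail = 0  # next slot to write; written only by the producer

    def try_push(self, item):
        nxt = (self._tail + 1) % len(self._buf)
        if nxt == self._head:
            return False  # queue is full
        self._buf[self._tail] = item
        self._tail = nxt  # publish the slot only after it has been filled
        return True

    def try_pop(self):
        if self._head == self._tail:
            return (False, None)  # queue is empty
        item = self._buf[self._head]
        self._head = (self._head + 1) % len(self._buf)
        return (True, item)
```

With more than one producer or more than one consumer this scheme no longer works as written, which is part of why such algorithms have to be rethought per use case.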

Chris M. Thomasson

Why not? Probably configured as a systolic array


--
Dirk

http://www.transcendence.me.uk/ - Transcendence UK
Dirk Bruere at NeoPax

Just note that the 4870 needs TWO of those 6 pin power leads...

Rarius

Rarius

The 800 "cores" in ATI RV770 (Radeon 4800 series) are simple stream processors, and they are not comparable to the 16, 24, 32 or 48 cores that will be in Larrabee, just as they are not comparable to the 240 "cores" in Nvidia GeForce GTX 280. I'm not saying you didn't realize that; this is just for those who might not have.

NV55

Run one process per CPU. Run the OS kernel, and nothing else, on one CPU. Never context switch. Never swap. Never crash.

John

John Larkin

[...]

I meant to say:

/One/ bottleneck is the cache-coherency system.

Chris M. Thomasson

"Chris M. Thomasson" writes:
|> FWIW, I have a memory allocation algorithm which can scale because it's based
|> on per-thread/core/node heaps:
|>
|> AFAICT, there is absolutely no need for memory-allocation cores. Each thread
|> can have a private heap such that local allocations do not need any
|> synchronization.

Provided that you can live with the constraints of that approach. Most applications can, but not all.
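To make the quoted idea and its constraint concrete, here is a minimal sketch of a per-thread block pool. It is not the allocator referred to in the quote: `PerThreadPool`, `worker`, and the block sizes are hypothetical, and the sketch assumes a block is always freed by the thread that allocated it, which is exactly the kind of constraint some applications cannot live with.

```python
import threading


class PerThreadPool:
    """Toy fixed-size block pool with one private free list per thread."""

    def __init__(self, block_size=64, blocks_per_thread=4):
        self.block_size = block_size
        self.blocks_per_thread = blocks_per_thread
        self._local = threading.local()  # each thread sees its own attributes

    def _free_list(self):
        # Lazily build the calling thread's private heap on first use.
        if not hasattr(self._local, "blocks"):
            self._local.blocks = [
                bytearray(self.block_size) for _ in range(self.blocks_per_thread)
            ]
        return self._local.blocks

    def alloc(self):
        blocks = self._free_list()
        # No lock is taken: only the owning thread ever touches this list.
        return blocks.pop() if blocks else bytearray(self.block_size)

    def free(self, block):
        # Assumption: called from the same thread that allocated the block.
        self._free_list().append(block)


def worker(pool, results, idx):
    """Allocate, free, and allocate again; the block should be recycled."""
    first = pool.alloc()
    pool.free(first)
    second = pool.alloc()
    results[idx] = first is second
```

A block handed to another thread and freed there would simply migrate to that thread's list in this sketch; a real allocator needs some synchronized path for such remote frees.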

Regards, Nick Maclaren.

Nick Maclaren

On a sunny day (Thu, 07 Aug 2008 08:39:21 -0700) it happened John Larkin wrote:


I already have those, they run Linux. I grant you, though, that a badly behaving module can cause big problems. I just had to reboot a couple of times to get rid of 'vloopback' when I wanted to interface the Ethernet webcam with Flash Player. It works now, though not with the new Adobe Flash Player 10 beta for Linux... We will almost always be one step behind, I guess.

If I understood the Intel press release correctly, the API of Larrabee will be no different from that of a normal graphics card; that would be nice. They create the problem, let them write the software :-)

One can wonder how important speed really is for the consumer PC. Sure, HD video, and later maybe 4096 x something pixels, will take more speed. For normal HD, however, cheap chipsets already provide the power. For HD video editing the speed can probably never be high enough... but that is not only a graphics issue.

John, I dunno where it will go, but one thing I know: It Will Not Become Simpler :-)

There is a tendency to more and more complex structures in nature. With us at the top perhaps, little one cell organisms at the bottom, molecules, atoms, quarks, what not. Self organising in a way, the best configurations make it - in time - And what is time, we are but a dash in eternity.

Jan Panteltje

True, but they seem to be positioning Larrabee in the same tech segment as video cards. Which makes sense since a SIMD system is the easiest to program. If they want N general purpose cores doing general purpose computing the whole thing will bog down somewhere between 16 and 32. A lot of the R&D theory was done 30+ years ago.

Maybe they will try something radical, like an ancient data flow architecture, but I doubt it.

--
Dirk

http://www.transcendence.me.uk/ - Transcendence UK
Dirk Bruere at NeoPax


"General purpose" GPUs are not really general purpose, but they aren't doing graphics, either.

Robert.

Robert Myers

Actually, doing I/O or networking in a "main" CPU is a waste of resources. Any sane architecture (CDC 6600, mainframes) has a bunch of multi-threaded IO processors, which you program so that the main CPU spends little effort dealing with IO.

This works well even when you do virtualization. The main CPU sends a pointer to an IO processor program (a "high-level abstraction", not the device driver details) to the IO processor, which in turn runs the device driver to get the data in or out. In a VM, the VM monitor has to sanity-check the command and maybe rewrite it ("don't write to track 3 of disk 5, write it to the 16 sectors starting at sector 8819834 in disk 1, which is where the virtual volume of this VM sits").
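The check-and-rewrite step can be sketched roughly as follows. Everything here is hypothetical and chosen only to reproduce the numbers in the example: the names `VirtualWrite`, `PhysicalWrite`, `Extent` and `translate`, the assumption of 16 sectors per track, and the assumed base sector 8819786 of the virtual volume.

```python
from dataclasses import dataclass

SECTORS_PER_TRACK = 16  # assumption, picked to match the "16 sectors" above


@dataclass(frozen=True)
class VirtualWrite:
    disk: int   # disk number as seen by the guest
    track: int  # track number as seen by the guest


@dataclass(frozen=True)
class PhysicalWrite:
    disk: int
    first_sector: int
    sector_count: int


@dataclass(frozen=True)
class Extent:
    physical_disk: int  # real disk that holds the virtual volume
    base_sector: int    # first sector of the volume on that disk
    tracks: int         # size of the virtual volume in tracks


def translate(cmd, volume_map):
    """Sanity-check a guest command, then rewrite it for the real hardware."""
    extent = volume_map.get(cmd.disk)
    if extent is None:
        raise PermissionError(f"guest has no virtual disk {cmd.disk}")
    if not 0 <= cmd.track < extent.tracks:
        raise ValueError(f"track {cmd.track} is outside the virtual volume")
    first = extent.base_sector + cmd.track * SECTORS_PER_TRACK
    return PhysicalWrite(extent.physical_disk, first, SECTORS_PER_TRACK)


# Virtual disk 5 lives on physical disk 1, starting at an assumed base sector.
volume_map = {5: Extent(physical_disk=1, base_sector=8819786, tracks=100)}
rewritten = translate(VirtualWrite(disk=5, track=3), volume_map)
```

Here `rewritten` describes 16 sectors starting at sector 8819834 on disk 1, and the IO processor only ever sees the rewritten command.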

The fact that in PCs the main CPU is doing IO (even down to the level of writing to individual IO ports) is a consequence of saving CPUs - no money for an IO processor, the 8088 can do that itself just fine. Why we'll soon have 32 x86 cores, but still no IO processor is beyond what I can understand.

Basically all IO in a modern PC is sending fixed- or variable-sized packets over some sort of network - via SATA/SCSI, via USB, Firewire, or Ethernet, etc.

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
Bernd Paysan

On a sunny day (Fri, 08 Aug 2008 13:02:15 +0200) it happened Bernd Paysan wrote:

Do not forget that since the days of the 8088, with CPUs running at maybe 13 MHz, we now run at 3.4 GHz: 3400 / 13 = about 261 x faster, and faster still because of better architectures. This leaves plenty of time for a CPU to do normal IO.

And in fact the IO has always been hardware supported. For example, although you can poll a serial port bit by bit, there is a hardware shift register, and a hardware FIFO too. Although you can construct sectors for a floppy in software bit by bit, there is a floppy controller with write pre-compensation etc., all in hardware. Although you could do graphics in software, there is a graphics card with hardware acceleration. The first two are included in the chipset, maybe the graphics too. The same goes for Ethernet: it is a dedicated chip, or included in the chipset, taking the place of your 'IO processor'. Same thing for hard disks, and those may even have on-board encryption; all you have to do is specify a sector number and send the sector data.

So... no real need for a separate IO processor. In fact you will likely find a processor in all that dedicated hardware, or maybe an FPGA.

Jan Panteltje


That's the IBM "channel controller" concept: add complex, specialized DMA-based I/O controllers to take the load off the CPU. But if you have hundreds of CPUs, the strategy changes.

John

John Larkin

On a sunny day (Fri, 08 Aug 2008 07:40:53 -0700) it happened John Larkin wrote:

Ultimately you will have to move bytes, from one CPU to the other, or from dedicated IO to one CPU, and things have to happen at the right moment. Results will never be available before requests... It is a bit like Usenet (smile): there are many 'processors' (readers, posters, lurkers) here, some output some data at some time in response to some event, could be a question, and others read it later, much later perhaps. See the problem?

Watched the Olympic opening; I must say the Chinese make a beautiful event. It never got boring. The previous one was ugly and not worth looking at, but anyway, so many LEDs? And some projection! Seems they are ahead in many a field. Would you not be scared to death if you were a little girl hanging 25 meters above the floor from some steel cables... The Chinese are brave too :-)

Jan Panteltje

I think the trend is to have the cores surround a common shared cache; a little local memory (and cache, if the local memory is slower for some reason) per CPU wouldn't hurt.

Cache coherency is simple if you don't insist on flat-out maximum performance. What we should insist on is flat-out unbreakable systems, and buy better silicon to get the performance back if we need it.

I'm reading Showstopper!, the story of the development of NT. It's a great example of why we need a different way of thinking about OSes.

Silicon is going to make that happen, finally free us of the tyranny of CPU-as-precious-resource. A lot of programmers aren't going to like this.

John

John Larkin

On a sunny day (Fri, 08 Aug 2008 08:54:36 -0700) it happened John Larkin wrote:

John Lennon:

'You know I am a dreamer' .... ' And I hope you join us someday'

(well what I remember of it). You should REALLY try to program a Cell processor some day.

Dunno what you have against programmers; there are programmers who are amazingly clever with hardware resources. I dunno about NT and MS, but IIRC MS plucked programmers from unis and sort of brainwashed them then... the result we all know.

Jan Panteltje
