A dozen CPUs on a chip

It is well known how to do it right.

There are perfectly good OSes out there, and for that matter quite plausible virtual machine PC software that will let you run guest operating systems independently on the latest P4s. Hardware support for virtualised CPUs is present in the newer chips, so you can test your hypothesis.

OS/2 was very impressive for its day - it was perfectly possible to write a device driver to simulate the buffered 16550 serial IO chip on that. The world chose Windows glitz over reliability and performance...

Having a linear address space that anything can trample on is about the most dangerous architecture for security. Terminating a process the moment it tries to read or write something that it doesn't own makes for robustness and instills discipline in the programmers.
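As a rough illustration of that mechanism (a POSIX C sketch of my own, not anything from a real OS, and the bogus address is invented): a process that touches memory it doesn't own is trapped by the MMU and killed on the spot, instead of silently corrupting someone else's data.

#include <stdio.h>

int main(void)
{
    /* An address this process almost certainly does not own. */
    volatile int *p = (volatile int *)0xDEADBEEF;

    printf("about to write to memory we don't own...\n");
    *p = 42;    /* MMU traps the access; the OS delivers SIGSEGV and terminates us */
    printf("never reached\n");
    return 0;
}

Compile and run it and the process dies with a segmentation fault while everything else on the machine carries on untouched - which is exactly the discipline being argued for.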

That is a problem with the rush to market rather than a pure technical problem. Given the time and expense a robust OS for the PC was possible, but businesses decided they wanted whizzy graphics much more than robustness. First mover advantage is too great to get software right :(

You will still get nasty deadlock situations when these CPUs try to interact with the real world via peripherals. It is the supervisor that will be hard to write in this sort of approach. The same is true in the high-performance computing game: the highest priority task is the one which keeps all the other CPUs busy and loaded with data to work on.

No, not at all. Some of us do understand realtime and asynchronous events. I will grant you that a frightening number do not, and are programming in an environment like Windows where this can leave them dangerously exposed to unwelcome surprises.

Depends what you mean by this. One of the best programmers I have known trained as a classicist. Self-taught ones can be good, but they are often unaware of the importance of good algorithms and data structures.

A prototype electronics rat's nest of unreliable bodging is pretty obvious on the bench; unfortunately, with dodgy software it is harder for the lay observer to judge the quality from the outside.

Regards, Martin Brown

Reply to
Martin Brown

Apart from Windows ME (named after an illness?) I have found most Windows versions to be tolerable to varying degrees. USB support was very flakey early on though. I don't like Vista very much as XP runs faster on the same h/w, but some customers require me to support it.

I had a lot of bother getting Vista to work for me with WPA encryption, but the actual fault really lay with my unfortunate choice of hardware router. I did find a lot of disturbing latent errors in the driver distributions whilst researching the problem (e.g. identical binaries with different names, different binaries with the same name, and some major players distributing unsigned Vista install components).

I don't believe in luck. Computer programs are largely deterministic and if the crashes are reproducible you should be able to pin down the root cause. Fault finding methodologies are similar for both hardware and software.

Are you sure they are truly identical at a detailed physical hardware level? Most times I have seen a batch of "identical" HP PCs with some misbehaving it was because they were *not* truly identical. Different versions of firmware and driver revisions are the first thing to look for (also random differences in the HD or even for that matter the mouse could be critically significant). Dodgy mouse drivers are rare, but I have one that can easily lock up Netscrape if the thumbwheel is used.

Nastiest one I ever saw was due to a DMA timing error that caused occasional bytes to vanish from files. It looked exactly like hostile virus action, except that there was no malware to be found.

Regards, Martin Brown

Reply to
Martin Brown

Since your proposition is wrong, everything that comes from it is suspect, even if the logic is flawless, which it isn't.

Wrong again. I can tell you're not a computer architect.

If your application is that limited you're not likely to be using a general purpose processor, so the discussion is moot.

--
Keith
Reply to
krw


ONLY if the CPUs have tasks assigned when the system is designed. IOW, this may work for embedded processors but not general purpose computing.

No, it's really not new, rather rejected.

Now you sound like Dimbulb. ;-) Note that the Cell processor is essentially an embedded processor. The tasks it has been designed for are quite limited.

No, you have it backwards. Intel has been driven by hardware for the past thirty years. As many processors as we're likely to see, there will always be more tasks/threads.

Well, we can certainly agree there, though not likely on what. ;-)

--
Keith
Reply to
krw

On May 10, 10:50 am, Phil Hobbs wrote: [.....]

I think in many cases the illusion of a single large shared memory is not needed to implement the needed operations. If we can remove the need for it, we don't need the huge bandwidth on the interconnect.

If we assume a Harvard-like processor where the code space is never written to, the code space of each CPU can be independent during most of the run time. This means that no transaction of the slave CPU can ever require that the code space be brought back into sync.

The stack space and local variables of the routines running on the slave CPUs are private to the task. Only the defined inputs and outputs of the task really need to be shared with others. If we assume that memory is reasonably low cost, we can have a block of memory passed between processes to carry the information from task to task. This makes resyncing the caches a lot easier. The source CPU's cache "dirty" flags can be used to indicate what needs to be copied over.

If the source CPU is forbidden from overwriting the output data, then the transfer logic is reasonably simple. The source CPU's dirty flags just copy into the receiving CPU's "out of date" flags.
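A hypothetical C sketch of that hand-over (the structure, field names and sizes are all invented for illustration; real hardware would keep the flags in the cache controller rather than in the block itself):

#include <stdint.h>
#include <string.h>

#define LINES_PER_BLOCK 64          /* one flag bit per cache line */
#define LINE_BYTES      64

typedef struct {
    uint8_t  data[LINES_PER_BLOCK][LINE_BYTES];
    uint64_t dirty;                 /* producer side: lines written since hand-over  */
    uint64_t out_of_date;           /* consumer side: lines that must be (re)fetched */
} msg_block;

/* Producer writes a line and marks it dirty (the cache would do this itself). */
void block_write(msg_block *b, unsigned line, const void *src)
{
    memcpy(b->data[line], src, LINE_BYTES);
    b->dirty |= (uint64_t)1 << line;
}

/* Hand the block to the consumer: dirty flags become "out of date" flags, and
 * the producer must not write the block again until it gets it back. */
void block_hand_over(msg_block *b)
{
    b->out_of_date = b->dirty;
    b->dirty = 0;
}

Only the lines flagged dirty ever need to cross the interconnect, which is the point of the exercise.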

This sort of processor would mean that programmers would draw data flow diagrams rather than flow charts. It would also push towards thinking like you are coding in APL or Octave rather than Fortran or C. Lots of things would be done to arrays.

Reply to
MooseFET


I said it "trends in that direction", not that it was the ultimate architecture. But why is one PPC plus six simpler integer processors "quite limited"? It's obviously more general than the PPC alone.

I believe that's just what I said. They have just pushed an 8008 architecture - a dog when it was new - into nanometer silicon. Their attempts at cutting over to more modern architectures - iAPX 432, ARM, Itanic - have been expensive failures.

As many processors as we're likely to see,

Why? What would a desktop PC need with 1024 threads?

Something more like Cobol, where programmers are forced to deal with the application, rather than using the application as a platform to show off how tricky they can be. Something without pointers. Something that is impossible to crash.

John

Reply to
John Larkin

But the problem does exist. I suggest that different hardware, and an accompanying OS and set of rules, would make most of the nightmarish problems of modern PCs impossible. The Mac OS demonstrates that good programmers in a closed, disciplined environment *can* do a decent job using traditional CPU architectures, but the fact that they are an exception suggests that a more-hardware/less-software approach to managing resources would be better for everyone.

Engineers are, on average, better thinkers than programmers. Perhaps because engineers generally admire simplicity, whereas programmers admire complexity.

Intel and Microsoft still haven't come up with a way to keep from executing data and stacks!

John

Reply to
John Larkin

The nightmarish problems of modern PCs are an artefact of the "ship it and be damned" management culture in some software shops. And the very openness of the PC architecture which allows zillions of permutations of imperfectly written drivers.

It might be at least for thread management.

And they are so modest with it...

I count myself as an engineer in that respect. I don't think there are many programmers who admire unnecessary complexity. There are far too many that do not know enough about classical algorithms though.

Oh yes they have! Hardware assist and software to go with it: an implementation was included in XP SP2 as Data Execution Prevention, using the NX (no-execute) page protection mechanism. It isn't perfect - nothing ever is - but it is a big step forward. See

formatting link
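For anyone who wants to see it bite, here is a minimal Win32 C sketch of my own (assuming DEP is enforced for the process): a page allocated as ordinary read/write data cannot be jumped into until it is explicitly re-protected as executable.

#include <windows.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    unsigned char ret_stub[] = { 0xC3 };      /* a single x86 RET instruction */
    DWORD old;

    /* Writable data page - with DEP on it is NOT executable. */
    void *buf = VirtualAlloc(NULL, 4096, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    if (buf == NULL)
        return 1;
    memcpy(buf, ret_stub, sizeof ret_stub);

    /* ((void (*)(void))buf)();   <- with DEP enforced, this call raises an
                                     access violation (0xC0000005)           */

    VirtualProtect(buf, 4096, PAGE_EXECUTE_READ, &old);   /* mark it executable */
    ((void (*)(void))buf)();                              /* now the call is allowed */
    puts("stub executed from an explicitly executable page");
    return 0;
}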

Regards, Martin Brown

Reply to
Martin Brown

It wasn't bus bandwidth that bogged down most timeshare systems; it was CPU cycles, but more often it was disk thrashing caused by task swapping. Thrashing disappears as an issue when a single-user system has gigabytes of RAM and doesn't need to swap/page code or data to disk (unless the RAM is virtual and the apps are bloated, i.e. most Windows systems).

I get occasional messages from Windows informing me that it has to increase the size of my virtual memory paging file on disk. And I have 2G of RAM. Insane!

John

Reply to
John Larkin

Vista uses >1G out of 2G just sitting there.

Best regards, Spehro Pefhany

--
"it\'s the network..."                          "The Journey is the reward"
speff@interlog.com             Info for manufacturers: http://www.trexon.com
Embedded software/hardware/analog  Info for designers:  http://www.speff.com
Reply to
Spehro Pefhany

Bleh, I hate it already! Here's a perfect example from my own experience:

OPCODE_INC = 40h   ; INC AX
OPCODE_DEC = 48h   ; DEC AX

SET_X_INC textequ
SET_X_DEC textequ
(and for Y)

The code for the inner loop (containing label incdecx) is identical for all four directions (+/-X, +/-Y) except for this one opcode. So I have it change, instead of quadrupling the entire routine, or putting conditionals inside the loop (gack!).

I don't know just how often this is done, although I recall MS is fond of it for some purposes (obfuscated, self-modifying code appears in a few shady places, like for testing compatibility). Compressed executables, I would suppose, have to do this. Situations like this example are definitely an efficient place to use Von Neumann architecture.

Funny, it also occurs to me that, in protected mode x86 (the above runs in real mode 808x, BTW), you could put the code in a read-only segment, so writing to it would throw a page fault, thereby having the same effect.
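That is roughly what mainstream OSes do now. A rough POSIX C sketch of the modern analogue (names invented, and the actual opcode patch is left as a comment since it is machine-specific): code pages are mapped read+execute only, so an in-place patch faults unless the page is first made writable.

#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* The byte we would patch lives inside this function's code. */
static int step(int x) { return x + 1; }

int main(void)
{
    long psz = sysconf(_SC_PAGESIZE);
    uintptr_t page = (uintptr_t)step & ~(uintptr_t)(psz - 1);

    /* Without this call, writing into step()'s code raises SIGSEGV. */
    if (mprotect((void *)page, (size_t)psz, PROT_READ | PROT_WRITE | PROT_EXEC) != 0) {
        perror("mprotect");
        return 1;
    }

    /* ...patch an opcode here, as the INC/DEC trick does... */

    printf("step(1) = %d\n", step(1));
    return 0;
}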

Tim

-- Deep Fryer: A very philosophical monk. Website @

formatting link

Reply to
Tim Williams

Java? Well, it crashes, but it doesn't take down the entire computer.

Eww, Java...

Tim

-- Deep Fryer: A very philosophical monk. Website @

formatting link

Reply to
Tim Williams

The memory bandwidth isn't fixed--the more cores you have, the more front-side busses you put in, till you run out of space for solder bumps. Memory controllers (e.g. Northbridge) are easier to make than CPUs.

That's where we on-chip optics folk are aiming--getting _lots_ more off-chip bandwidth with much lower power.

Cheers,

Phil Hobbs

Reply to
Phil Hobbs

The new XMOS chip, developed by David May who designed the Inmos transputer, has just been announced:

formatting link

They've told me that working chips have been supplied to some customers, and that they will be available from Digi-Key in a few weeks. Development software will be available from Amazon.

Leon

Reply to
Leon

Given my recent experience with PLC software it seems reasonable to project that future revisions will need 1023 threads and the first TB for the drivers and copy protection.

Maybe Ada? *Ducks*.

Robert

Reply to
Robert Adsett

The IBM 360/370 had an easy solution for this. You could execute an instruction that was held in a register.

You can also simply have N copies of the routine. Since memory is cheap and the results would be a faster machine, I see it as worth the extra copies.
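A small C sketch of that "N copies" idea (all names invented; it stands in for Tim's four direction cases rather than reproducing his routine): stamp out the specialised loops with a macro and pick one through a function pointer before entering the hot loop.

#include <stddef.h>

#define MAKE_STEPPER(name, dx, dy)                               \
    static void name(int *x, int *y, size_t n)                   \
    {                                                            \
        while (n--) {                                            \
            *x += (dx);                                          \
            *y += (dy);                                          \
            /* per-pixel work common to all four directions */   \
        }                                                        \
    }

MAKE_STEPPER(step_px, +1,  0)   /* +X */
MAKE_STEPPER(step_nx, -1,  0)   /* -X */
MAKE_STEPPER(step_py,  0, +1)   /* +Y */
MAKE_STEPPER(step_ny,  0, -1)   /* -Y */

/* Indexed by direction, chosen once before entering the loop. */
static void (*const stepper[4])(int *, int *, size_t) = {
    step_px, step_nx, step_py, step_ny
};

The cost is four copies of the loop in cheap memory instead of one patched opcode, and the code space never has to be written to at all.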

When stupidity will serve as an explanation, you need look no further.

Those are decompressed once at load time. This could be done in the load process. I know that MS does it by building the decompression code into the executable itself (e.g. format.com), but that rather serves as evidence that it is a bad idea.

No, it really isn't. You are better off to have a Harvard that sees the code space of another processor as its data space. That way the decompress process can go much faster since you keep the advantages of the Harvard.

Except that this means that the data and code transactions still go on the same bus.

Reply to
MooseFET

Except on Windoz 98.

Reply to
MooseFET

On May 12, 8:12 am, John Larkin wrote: [....]

No nontrivial language can ever be proven to be impossible to crash.

Reply to
MooseFET

[....]

Intel came up with the 8051.

Reply to
MooseFET

But...but...but The 8051 is perfect, you can't execute data nor the contents of the stack.

Reply to
MooseFET
