Regarding Dual core processors

Dear Gurus, I am curious to know the advantages of dual-core processors from the end-user perspective. Recently I have been hearing more about this from Intel regarding their latest line of processors, and I purchased an IBM laptop which boasts an Intel dual-core processor. How is it advantageous compared to a normal single-core processor? When you say dual core, does that mean it contains two processors (pardon my ignorance!)? Is dual core common in the embedded world? What support should we expect from an OS while working on a dual-core processor? (Mainly I am trying to understand the OS-level modifications needed in case one has to move from a single-core to a dual-core processor.)

Regards, s.subbarayan

Reply to
ssubbarayan

Basically, yes. The Core Duo processors share a common Level 2 cache, but apart from that, they are two independent processors on a single die.

[IMHO, multicore processors are not upcoming because everybody wants them, but because it's the best that chip manufacturers can do to satisfy the perpetual demand for more performance: increasing performance by ramping up clock frequencies has worked for years, but won't work for much longer (for physical reasons, I understand). So, to increase throughput, the only remaining way is to do the work in parallel.]

Not yet. Chances are, though, that this will change soon. Intel, for example, are explicitly touting the Core Duo's comparatively good MIPS per watt.

The need for doing things in parallel has a strong impact on programming techniques: if you want a certain job to be done faster, you have to make it multi-threaded, and this typically goes far beyond a simple recompile. So tool support (e.g. auto-parallelizing compilers) is needed for this, and is becoming available just now. But the OS also needs some re-thinking. It is not sufficient to just distribute a given workload across multiple threads: the OS must know which threads are tightly interacting, and it must schedule them to run in parallel.
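
To make that concrete, here is a minimal sketch (a hypothetical example using plain POSIX threads, not something any compiler produces for you) of splitting one job across two workers. Even this trivial decomposition requires explicit thread creation, work partitioning and joining, none of which falls out of a simple recompile:

    #include <pthread.h>
    #include <stddef.h>

    #define N 1000000
    static double samples[N];

    struct slice { size_t begin, end; double sum; };

    /* Each worker sums its own half of the array. */
    static void *worker(void *arg)
    {
        struct slice *s = arg;
        s->sum = 0.0;
        for (size_t i = s->begin; i < s->end; i++)
            s->sum += samples[i];
        return NULL;
    }

    double parallel_sum(void)
    {
        struct slice lo = { 0, N / 2, 0.0 };
        struct slice hi = { N / 2, N, 0.0 };
        pthread_t t1, t2;

        pthread_create(&t1, NULL, worker, &lo);
        pthread_create(&t2, NULL, worker, &hi);

        /* The job is not finished until *both* workers are done. */
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return lo.sum + hi.sum;
    }

And creating the threads is the easy part: nothing here tells the OS that the two workers belong together and should run on separate cores at the same time, which is exactly the co-scheduling problem discussed below.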

In contrast to that, it seems that the current approach (if any) taken to multiprocessing in embedded systems OSes is one that has been adopted from the server world: just dump all threads into a single pool and select from that pool on a "best effort" basis, which means there is no guarantee for tightly interacting threads to be scheduled together ("co-scheduled"). This has worked well for servers because there, a) threads are typically independent of each other and b) there are always enough threads eagerly waiting to keep all CPUs busy.

However, for many embedded systems (especially real-time ones), these assumptions are not suitable: For example, if you want to speed up a system's worst case response to some event by use of a multiprocessor, you need to parallelize the responding program, i.e. you need to run multiple responding threads *and* the OS must make sure they are run truly (not: pseudo-)parallel, because the job won't be finished until *all* threads are done.

Cheers

Rob

--
Robert Kaiser                     email: rkaiser AT sysgo DOT com
SYSGO AG                          http://www.elinos.com
Reply to
Robert Kaiser

The Propeller being a case in point.

What's that recent dual-core PPC chip that uses 14 watts at 2 GHz? That sounds pretty interesting... way better MIPS/watt than anything Intel-ish. But it's nice to see they're learning... with results like my new MacBook... sweet :-)

Myself, I can't see your typical "MSP" (medium-skilled programmer, a polite derogation) ever learning to write multi-threaded code. Hell, most can't even write good single-threaded code! I think the multi-core approach will just make it quicker to run many single-threaded programs instead... and that's not as useful.

Instead of pursuing this avenue, we're going to have to produce yet more specialized silicon for specialized apps like gaming etc., and for the rest we're going to have to shift the focus back to writing efficient software. We need a new Moore's law to chase: trying to get more useful work done per MIP, instead of per watt. There's still several orders of magnitude to recover there, in my experience, and I'm in the software world :-).

Clifford Heath.

Reply to
Clifford Heath

Hmm, I see what you mean, but I don't quite share your pessimism. First, I believe that your typical embedded-systems programmer tends to be somewhat more skilful than your typical "MSP". Second, there are some (IMHO promising) attempts to let the tools do much of that multi-threaded programming under the hood, so programmers may actually get away with only superficial knowledge of multi-threaded programming.

But only time will tell...

Well, as I said, it has been quite useful in the server world, but for embedded/real-time use cases, it is not.

No argument about that.

But why should "your typical MSP" be up to that, if he can't figure out how to program multi-threaded? (After all, his kind is responsible for this mess...)

Cheers

Rob

--
Robert Kaiser                     email: rkaiser AT sysgo DOT com
SYSGO AG                          http://www.elinos.com
Klein-Winternheim / Germany

Reply to
Robert Kaiser

... snip ...

This is because embedded systems often have no OS, or require excessively intimate knowledge of whatever exists. One of the prime purposes of an OS is to isolate the awkward programming of I/O and other resource use, so that the programmer can regard his application as operating strictly sequentially. Similarly for multi-processing systems. A good OS will provide communication mechanisms between processes (or threads) that handle all those nuisances, and avoid the programmer needing to know anything much more than "enter_critical_section" and "end_critical_section", possibly with a few succinct parameters.
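
As a minimal sketch of what such a wrapper might look like (hypothetical names; here backed by a POSIX mutex, though an OS-less target might simply disable interrupts instead):

    #include <pthread.h>

    static pthread_mutex_t section_lock = PTHREAD_MUTEX_INITIALIZER;

    /* The application programmer only ever sees these two calls;
       what they do underneath is the OS's business. */
    void enter_critical_section(void) { pthread_mutex_lock(&section_lock); }
    void end_critical_section(void)   { pthread_mutex_unlock(&section_lock); }

Typical usage just brackets the shared-data access: enter_critical_section(); ...touch shared data...; end_critical_section();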

--
Chuck F (cbfalconer at maineline dot net)
   Available for consulting/temporary embedded and systems.
Reply to
CBFalconer

I met one of the DEC employees who implemented hardcore multi-threaded code as part of DCE. I liked his comment: "only 1% of programmers think they can write multi-threaded code, and of those, only 1% actually can". Most of the trouble comes from the 0.99%, of course. I can't honestly believe that CBFalconer has written programs that have many threads (not just interrupts) traversing and modifying complex data structures, and really thinks that ordinary programmers can learn to do that safely.

Yes, that can work - hide the threading inside libraries that expose a serialised interface, are written by folk who know what they're doing, and are tested to death.

It can be taught. At least, I've had some success in teaching it, simply by encouraging people to walk through the data flows and think laterally about exactly where the structure of the input mismatches the output (I'm talking enterprise software here). It's amazing still how often I look at a system that's performing poorly because the engineer has used a pessimistic implementation, chosen without thought of optimisation or data-flow minimization, and I spot a two-orders-of-magnitude speedup in the first thirty seconds by implementing a fast path. Several times, after implementing such a fix, it's happened a second time. One such program I worked on, which processes GB of highly structured data, has dropped its typical runtime from nearly a week to just several minutes. Nearly always, the final code line count is significantly reduced as well - this one dropped from 80,000 LOC to around 10,000.

Clifford Heath.

Reply to
Clifford Heath

... snip ...

The point is that a good OS, or even a good library, hides all those problems. The problem is to write that library in the first place. A good implementation of buffers will handle the producer/consumer interaction, and thus will do most of the heavy lifting. More specialized things will often need the concept of critical sections. By looking on such a unit as a monitor, you can avoid worrying about synchronization.
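
For illustration, a minimal sketch of such a buffer treated as a monitor (a hypothetical single-producer/single-consumer ring built on POSIX primitives): all the synchronization lives inside the unit, and callers never see a lock or condition variable:

    #include <pthread.h>

    #define BUF_SIZE 16

    static int buf[BUF_SIZE];
    static int head, tail, count;
    static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t not_full  = PTHREAD_COND_INITIALIZER;
    static pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;

    /* Producer side: blocks while the buffer is full. */
    void buf_put(int x)
    {
        pthread_mutex_lock(&m);
        while (count == BUF_SIZE)
            pthread_cond_wait(&not_full, &m);
        buf[head] = x;
        head = (head + 1) % BUF_SIZE;
        count++;
        pthread_cond_signal(&not_empty);
        pthread_mutex_unlock(&m);
    }

    /* Consumer side: blocks while the buffer is empty. */
    int buf_get(void)
    {
        pthread_mutex_lock(&m);
        while (count == 0)
            pthread_cond_wait(&not_empty, &m);
        int x = buf[tail];
        tail = (tail + 1) % BUF_SIZE;
        count--;
        pthread_cond_signal(&not_full);
        pthread_mutex_unlock(&m);
        return x;
    }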

An interrupt effectively just hands over control to a different thread or process. The only point of threads is that they make common memory easy, with the concomitant need for synchronization constructs, and less overhead.

--
Chuck F (cbfalconer at maineline dot net)
   Available for consulting/temporary embedded and systems.
Reply to
CBFalconer

Many posts in comp.arch.embedded could be used as a counter-argument to that claim ... ;)

Multithreaded programs can be extremely difficult to debug. The coding standards (written by experienced programmers) that I had to follow at a PPOE explicitly forbade using threads unless a special waiver was obtained from management.

Roberto Waltman

[ Please reply to the group, return address is invalid ]
Reply to
Roberto Waltman

Multitasking/multithreading is not that hard, but it requires being a bit more conservative in design. For instance, you have to think carefully about data ownership, i.e. who is allowed to update a certain item, etc.

Having complex data structures updated indiscriminately from multiple threads will sooner or later end in a catastrophe. Using multiple locks to control access is also very error-prone due to priority inversions, especially if the OS does not contain a priority-boost system.

For instance, grouping data into read-only data; data written only during startup but read-only once multithreading begins; and data owned and written by one thread but read by many, each in structures of their own, clarifies the design. Depending on OS overhead, using a special thread solely for managing a data structure is an option, at least for complex operations such as insertion and deletion. Using simpler data structures, and preferably variables with atomic access, will reduce the need for locks.
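
A minimal sketch of that single-writer idea (hypothetical, using C11 atomics): one owner thread publishes a value, any number of reader threads load it, and because the variable has atomic access no lock is needed:

    #include <stdatomic.h>
    #include <stdint.h>

    /* Owned and written by exactly one thread; read by many. */
    static _Atomic uint32_t latest_sample;

    /* Called only from the owner thread. */
    void publish_sample(uint32_t s)
    {
        atomic_store_explicit(&latest_sample, s, memory_order_release);
    }

    /* May be called from any reader thread. */
    uint32_t read_sample(void)
    {
        return atomic_load_explicit(&latest_sample, memory_order_acquire);
    }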

Paul

Reply to
Paul Keinanen

That's what I feared you thought, and in many applications it's completely wrong. The issue comes when you have a problem that's irreducibly like the OS in complexity, or worse. A commercial DBMS, for example, has a core which is of the order of four times as complex as the core of a good operating system, with a vast number of latches and various kinds of locks, of which any one thread of hundreds may need to hold hundreds at one time - and yet not allow unsafe code or deadlocks anywhere, for fear that a lawsuit over lost data will kill the company that authored it. The stakes are very high here!

Even in lower-stakes games, there are many examples where threading is used and lowers the reliability of programs, even though they're built on good libraries and good OSes. If the problem is always as simple as writing a good OS first, why is that?

Clifford Heath.

Reply to
Clifford Heath

and all the libraries you use have to be equally good, including proprietary third-party software.

Including those libraries you're not aware you're using. For instance, string reference counting in C++ needs to be thread-safe.
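
To see why, consider this sketch (hypothetical, C11): a plain increment of a shared reference count is a read-modify-write that two threads can interleave, silently losing an update, whereas an atomic increment cannot be torn:

    #include <stdatomic.h>

    static int plain_refcount;          /* unsafe when shared   */
    static atomic_int shared_refcount;  /* safe across threads  */

    void retain_unsafe(void)
    {
        /* Two threads can both read the old value, both add one,
           and one increment is lost. */
        plain_refcount++;
    }

    void retain_safe(void)
    {
        /* Performed as one indivisible operation. */
        atomic_fetch_add(&shared_refcount, 1);
    }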

And, of course, your compiler has to play along.

--
	mac the naïf
Reply to
Alex Colvin

I believe you are saying that a carefully crafted OS, library, or DBMS can have performance problems, and that shortcuts will need to be taken for one reason or another. Those shortcuts can be much more easily evaluated if started from a sound base. They will generally be the results of "can't happen" or "don't care" conditions. After a point you may arrive at "don't do that" cautions to the user, or inhibitions in the command interface. Somewhere along the line the artistry of the programmer appears.

In general, we do not expect to be dealing with full-fledged general-purpose databases in the embedded world.

Maybe you should evaluate the goodness of the library/OS, and then the accuracy of the application code?

--
Chuck F (cbfalconer at maineline dot net)
   Available for consulting/temporary embedded and systems.
Reply to
CBFalconer

Not really. Just that when you combine high-complexity data structures with a large number of threads, and you have high concurrency requirements, the threads cannot afford to hold global locks everywhere, but must "crab" about the data structures - and that's exceptionally difficult to get right. It's not a shortcut, and it's expected of sophisticated OSes and DBMSes.
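
For the curious, "crabbing" (hand-over-hand locking) looks roughly like this sketch (a hypothetical linked-list search with per-node POSIX mutexes): a thread holds at most two node locks at any moment and releases the lock behind it as it advances, so other threads can operate on parts of the structure it has already passed:

    #include <pthread.h>

    struct node {
        int key;
        struct node *next;
        pthread_mutex_t lock;
    };

    /* Hand-over-hand search: always lock the next node *before*
       releasing the current one, never the other way round.
       Assumes head is a dummy node and therefore non-NULL. */
    struct node *find(struct node *head, int key)
    {
        pthread_mutex_lock(&head->lock);
        struct node *cur = head;
        while (cur && cur->key != key) {
            struct node *nxt = cur->next;
            if (nxt)
                pthread_mutex_lock(&nxt->lock);
            pthread_mutex_unlock(&cur->lock);
            cur = nxt;
        }
        /* On success, cur is returned still locked; the caller
           must unlock it when finished. */
        return cur;
    }

Getting the ordering wrong in even one place invites deadlock or a dangling pointer, which is why this is so hard at DBMS scale.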

That was exactly my point. My original comment here was in response to the purported imminent introduction of Intel Core 2 Duo processors to the embedded world. If you need that many MIPS and can't just use an FPGA, it's likely that you have a situation that will use many threads. It's not a typical embedded scenario, but it was proposed as a future scenario. Embedded engineers are more careful, but because of that they tend to also be more conservative, so I can't see that sort of difficult multi-threading becoming widespread.

Clifford Heath.

Reply to
Clifford Heath

In article , ssubbarayan writes

Well, in most multi-tasking systems there is only one MCU, and you time-slice: so many milliseconds per task, to give the illusion of lots of things happening at once.

The idea with the PC dual-core, quad-core etc. is that you really can have several tasks running in parallel. It can give a speed increase.

In some niche areas, multiple cores have been in use for over a decade. It depends how you define multiple cores. Some DECT systems have dissimilar cores in them; others, like some of the PPC range, have similar cores.

Some embedded systems have multiple MCUs, but that is not multi-core.

Good question.... Until it moves more mainstream, I am not sure. Many (most?) embedded systems don't use an OS or need an RTOS.

There are more problems than re-entrancy here, and the problems will be different from those of distributed or multiprocessor programming.

I think a lot of it is going to be hidden deep inside the OS and libraries.

--
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
\/\/\/\/\ Chris Hills  Staffs  England     /\/\/\/\/
Reply to
Chris Hills

It is great fun to talk to Windows programmers who think that, because they have been "multi-tasking" and "threading" for years, multicore processors will just work. They will soon find out that having only one real processor covered up a lot of concurrency and synchronization bugs.
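
A classic sketch of such a bug (hypothetical, C11): busy-waiting on a plain flag. Under time-slicing on one processor this usually appears to work, but with two real cores the compiler may hoist the flag load out of the loop and the hardware may reorder the stores, so the reader can spin forever or see the data before it is written. A release/acquire pair fixes both:

    #include <stdatomic.h>

    static int payload;
    static int ready_plain;          /* buggy version */
    static atomic_int ready_atomic;  /* fixed version */

    /* Buggy: no ordering guarantees whatsoever. */
    void producer_buggy(void) { payload = 42; ready_plain = 1; }
    int  consumer_buggy(void) { while (!ready_plain) { } return payload; }

    /* Fixed: the release store orders the payload write before the
       flag, and the acquire load forces the reader to re-read it. */
    void producer_fixed(void)
    {
        payload = 42;
        atomic_store_explicit(&ready_atomic, 1, memory_order_release);
    }

    int consumer_fixed(void)
    {
        while (!atomic_load_explicit(&ready_atomic, memory_order_acquire)) { }
        return payload;
    }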

Reply to
Dennis

Motherboards with two or four x86 CPUs have been available since the 1990s, mainly intended for server applications.

Anyone serious about writing multithreaded applications would have been able to test for concurrency issues before any HyperThreading etc. CPUs arrived.

Paul

Reply to
Paul Keinanen

There is a difference between multiple CPUs/MCUs and multiple cores on the same chip.

--
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
\/\/\/\/\ Chris Hills  Staffs  England     /\/\/\/\/
Reply to
Chris Hills

That's a rather broad statement. With the possible exception of some very low-level initialization issues, however, there is no difference at the multi-threading level between an SMP implementation on multiple chips and one on multiple cores.

--
Michael N. Moran           (h) 770 516 7918
5009 Old Field Ct.         (c) 678 521 5460
Reply to
Michael N. Moran

In reference to motherboards (you snipped that bit of the post), there is a vast difference. It is only since multiple core chips have become common that SMP has reached the desktop - multiple sockets are very much the domain of servers and high-end workstations.

From a software viewpoint, you are of course correct - they are virtually identical, other than for performance issues.

Reply to
David Brown

Could you please be a bit more specific about the implications for the software design?

I have used multiprocessors based on 74Sxx TTL chips, ASICs, multiple Pentiums (I am writing this on such a machine) and multicore processors, and I have not seen a reason why I should program these differently.

I think it is important to realise that in any virtual-memory system, any code/data memory access can cause a page fault, i.e. cause a page to be loaded from disk (slowly); thus, there can be a process/thread switch between any two memory accesses.

Paul

Reply to
Paul Keinanen
