Fast embedded architectures.

There is some new computer architecture work being done. It is
interesting that the interrupt response time is as slow on today's
fast desktop processors as it was on 1 MHz 6502s. Caching and context
switches are expensive.

Embedded processors are interesting. Why are x86s running at
3,000,000,000 Hz and the 8051 is still running at 1 MIP (12 MHz)?
Besides the fact that sometimes that is good enough, there are some
new embedded CPU technologies being implemented. For instance, I think
it is the Xtensa processor that implements a pure move machine, which is
novel and interesting.

I am particularly interested in the Ubicom machines. They have an
"interesting" marketing plan to hide themselves from anyone who will
not promise 100K unit sales, but their technology is fascinating. I
personally am most interested in the single-chip 2022 processor, which
is a 120 MHz, 120 MIP super PIC programmable in GNU C, but very fast
and different. Recently they introduced the 250 MHz, 250 MIP 3023
processor, which implements eight hardware threads on-chip. It can hide
pipeline delays by substituting other threads, the same as the "big"
chips, but without the overheads and delays introduced by today's caches,
MMUs, etc. The 3023 can dedicate a thread to watching a particular
I/O pin or device, which makes context switches immediate. These are
not intended to be desktop machines, but in the more conservative
embedded space the implementations are unique.
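
A rough way to see why a dedicated hardware thread gives a bounded response time is sketched below. It assumes strict round-robin issue, one instruction per clock, across all hardware threads; that schedule and the function name are illustrative assumptions, not taken from the Ubicom datasheet, and the real scheduler may well differ.

```python
def worst_case_slot_wait_ns(clock_hz, hardware_threads):
    """Upper bound on the wait until one thread's next issue slot, in ns.

    Assumption (not from the datasheet): the threads are issued in strict
    round-robin order, one instruction per clock, so a thread that polls
    a pin gets a slot at least once every `hardware_threads` clocks.
    """
    cycle_ns = 1e9 / clock_hz
    return hardware_threads * cycle_ns


# Figures quoted in the post: 250 MHz clock, eight hardware threads.
wait = worst_case_slot_wait_ns(250e6, 8)
print(f"{wait:.0f} ns between slots for the pin-watching thread")
```

Under that assumption the polling thread never waits for a register save/restore or a cache refill, only for its next turn.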

I am not affiliated with Ubicom and I am far from expert on this new
chip, but if you want to get an architectural description (at least a
datasheet) for either the 2022 or 3023 processor, you can get it at:
http://www.silicondust.com/ubicom/forum/index.php

Regards, Steve



Re: Fast embedded architectures.



The x86s are general purpose CPUs while the 8051s are simple controller
CPUs. BTW, 8051's now run around 33 MIPS (Cygnal 8051s).

--
- Mark ->
--

Re: Fast embedded architectures.

Being simple is really no excuse for being slow. The 8051 is so byzantine
that it is very difficult to make it fast, and still keep it small. So is
the x86, but when you have a power budget of 100W, and a die size of a cm²,
you can implement tricks that aren't possible within an embedded
controller.

You can make fast and simple embedded architectures, provided you throw the
byzantine stuff overboard. Been there, done that. My b16
(http://www.b16-cpu.de/) is about the silicon size of a small 12-cycle 8051
(with microcode), but it's a single-cycle implementation, and the cycle
time is much shorter, too. As a bonus, you get 16-bit ALU width, since the
ALU really isn't a big block, even in a small microcontroller.

I don't have figures for a 12-cycle 8051 core at hand for modern processes,
but the 2-cycle 8051 from Inventra we used some time ago did get up to
16MHz in a 0.5µ TSMC process (8 MIPS), while the b16 could run at 100MHz in
the same process (100 MIPS) - using the same standard cell library*. MIPS
for such small and simple CPUs is pretty meaningless, especially for the
8051, which can do a few things efficiently (like moving constants to
SFRs), and the rest very clumsily. One b16 MHz is worth about half a 486 MHz
on the (very ancient) Byte sieve benchmark, i.e. a 100MHz b16 finds the first
1899 primes in the same time as a 50MHz 486. I don't have any data for the
8051 and Byte sieve at hand, but with the very limited and complicated
addressing into X-RAM, the 8051 has to be quite slow there.
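
The peak MIPS figures quoted in this thread all come from the same simple division, clock rate over clocks per instruction. The sketch below just redoes that arithmetic; the function name is made up for illustration, and "peak" here ignores multi-cycle instructions, wait states and everything else that makes MIPS a poor measure.

```python
def peak_mips(clock_mhz, clocks_per_instruction):
    """Peak instruction rate in MIPS for a fixed clocks-per-instruction core."""
    return clock_mhz / clocks_per_instruction


# (clock in MHz, clocks per instruction), figures as quoted in the thread.
cores = {
    "classic 12-clock 8051 at 12 MHz": (12, 12),
    "2-clock 8051 at 16 MHz": (16, 2),
    "single-cycle b16 at 100 MHz": (100, 1),
}

for name, (mhz, cpi) in cores.items():
    print(f"{name}: {peak_mips(mhz, cpi):g} MIPS peak")
```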

~100MHz in 0.5µ means about a GHz in 0.18µ (give or take, since in 0.18µ,
you can select between fast and low-power transistors), more if you are
willing to spend the time for a full-custom design. The b16 is inspired by
Chuck Moore's c18, which is a full-custom design, and expected to run at
2.4GHz in 0.18µ.

*) The 8051 from Inventra has multiple critical paths. One of the worst is
that it has program memory in the critical path. So the 40ns flash was one
of the limiting factors. You'd say that the b16 would suffer a lot from a
40ns flash, too, at 100MHz, but that's not the case. The b16 packs three and a
bit instructions into one 16-bit word, and therefore wouldn't be slowed down
much.
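
To put rough numbers on the footnote, here is a minimal sketch of the fetch limit imposed by a slow program memory. It assumes one memory word per flash access, no prefetch buffer or cache, and counts only the three full instruction slots per word; branches, literals and data accesses are ignored. The function names are illustrative only.

```python
def fetch_bound_mips(flash_access_ns, instructions_per_word):
    """Instruction rate the program memory can sustain, in MIPS.

    Assumes one word is delivered per flash access and that every
    instruction slot in the word is useful (an optimistic simplification).
    """
    words_per_microsecond = 1000.0 / flash_access_ns
    return words_per_microsecond * instructions_per_word


def sustained_mips(core_mhz, flash_access_ns, instructions_per_word):
    """Lower of the core's single-cycle peak and the fetch limit."""
    return min(core_mhz, fetch_bound_mips(flash_access_ns, instructions_per_word))


# 40 ns flash, 100 MHz single-cycle core.
packed = sustained_mips(100, 40, 3)    # three instructions per fetched word
unpacked = sustained_mips(100, 40, 1)  # one instruction per fetched word
print(f"packed: {packed:g} MIPS, unpacked: {unpacked:g} MIPS")
```

With these assumptions the packed encoding keeps most of the core's peak rate, while a one-instruction-per-fetch design would be held to the flash rate.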

--
Bernd Paysan
"If you want it done right, you have to do it yourself"

Re: Fast embedded architectures.
Some of the 8051 designs (in the form of Verilog / VHDL) can easily be
synthesized and implemented on 0.13um processes running at over 200MHz.
The question is how they can make money out of it.  With ATMEL pushing
out an ARM7 chip at US$3, it doesn't make sense to sell a 100MIPS 8051
which can cost US$10.

(http://www10.edacafe.com/nbc/articles/view_article.php?section=ICNews&articleid14%8406 )

Another factor is how you get your program ROM running at such
high speed.  In high-end chips the program runs from RAM.  If you want to
do the same in an 8051, you need to copy the program code from flash to
on-chip RAM, i.e. 32k flash + 32k on-chip SRAM for 32k of program code.
It can be done, but it is not very cost effective.

Joe

Re: Fast embedded architectures.



Do you really need to have your keyboard controller check to
see if you have pressed a key a billion times a second?  :)



Re: Fast embedded architectures.

  The 80C51 is available in 1 Cycle models, with BUS-Bandwidths of
100MHz, for 100 Mips peak.
  The x86s that claim 3GHz have lower Bus-Bandwidth.
( only a few nodes spin at 3GHz )



  That is a good idea, similar to the MultiThreading the bigger CPUs now
offer.

  Another good idea is Multiple-CPUs - even for small uC: for a while
http://www.quickcores.com/ had a system of many-80C51s but they seem
to have been offline for a while now ?

  This makes good sense, given the finite memory bandwidths.

-jg


Re: Fast embedded architectures.
On Wed, 10 Nov 2004 11:43:07 +1300, Jim Granville

What chip is that? Let's not confuse IP cores with actual single
chippers. Also, what is the external crystal/clock? The Ubicom IP2K has
a 50x PLL that takes an external 4.8 MHz crystal to a 120 MHz clock rate.


I agree, see my comments on interrupt latency. However, your 100 MIP
8051 uses what kind of memory? What is its external memory speed?

Some people (Guy) have asked why we need the speed. I think the
above sentence says you don't always.



My intent was to point at a very interesting architecture that allows
bit-banging I/O in software for many types of I/O and has a
serializer/deserializer for even faster serial I/O (10BASE-T Ethernet
and USB device AND host), all in a single-chip package. If a PIC or an
8051 is good enough, fine. But if it is possible to 'net enable
current embedded apps and do it in software, I think that is
revolutionary.
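
For readers who have not done it, "bit banging" means software drives the line level bit by bit instead of a hardware peripheral doing so. The sketch below is a generic illustration, not Ubicom code: it builds the line levels for one 8N1 UART frame and works out how many instructions a CPU has per bit at a given line rate. Both function names are made up for this example.

```python
def uart_8n1_levels(byte):
    """Line levels for one 8N1 UART frame: start bit, 8 data bits, stop bit.

    Data bits go out least-significant bit first, as is standard for
    asynchronous serial. A bit-banging routine would write each level to
    an output pin and wait one bit time between writes.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0..255")
    data_bits = [(byte >> i) & 1 for i in range(8)]
    return [0] + data_bits + [1]


def instructions_per_bit(instructions_per_second, bits_per_second):
    """How many instructions the CPU can spend on each bit on the wire."""
    return instructions_per_second / bits_per_second


print(uart_8n1_levels(0x41))              # frame for ASCII 'A'
print(instructions_per_bit(120e6, 10e6))  # 120 MIP CPU, 10 Mbit/s link
```

At 10 Mbit/s a 120 MIP part has only about a dozen instructions per bit, which is one way to read why the fastest interfaces get a hardware serializer/deserializer rather than pure software.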

Regards, Steve


Re: Fast embedded architectures.

Look at
http://www2.silabs.com/public/documents/marcom_doc/mcoll/Microcontrollers/en/mcu_product_selector_guide.xls

and yes, this IS an 'actual single chip device' (family) :)


FLASH


I'm not sure they can fetch code from external memory - why would you
want to, with 64K / 128K of Flash on chip ?

<snip>

  The concept of S/W solutions to what are traditionally hardware layers is
interesting, but the SX family (ex Scenix, now Ubicom) has not really
been an outstanding success.
  Their operating Icc was VERY HIGH, which is no real surprise if you try
to do a HW task using SW.
  The 3023 improves on that, with the new time-slice and more HW
support, but that device is not a merchant microcontroller.

  -jg


Re: Fast embedded architectures.

Have you ever tried designing a board with signals at that frequency?
There are many problems with high speed logic that make development
prohibitively expensive. So most people will attempt to get away with as
slow a device as possible, which results in little demand for such fast
controllers.

--

Wing

Re: Fast embedded architectures.

2 MIPS = 2 million instructions per second
1 MIPS = 1 million instructions per second
1 MIP = 1 million instructions per



Re: Fast embedded architectures.


Isn't the default "fortnight" if not specified?  :)



Re: Fast embedded architectures.

[...]

MIP = "Meaningless Indicator of Performance"  

MIPS = "Marketing's Idea of Processor Speed."

Regards,

                               -=Dave
--
Change is inevitable, progress is not.

Re: Fast embedded architectures.

MIPS == Meaningless Information Provided by Sales people


--
Michael N. Moran           (h) 770 516 7918
5009 Old Field Ct.         (c) 678 521 5460

Re: Fast embedded architectures.

Some of us remember when the unit was used by technical people,
and gave a crude measurement of processor speed.  For reasons that I
am completely unable to fathom, the other half of the act (MFlops)
is regarded as more meaningful.

Those of us who tried to get MAPS (Memory Access Per Second) into use
failed dismally, despite the fact that it has remained the most
important of such measures of speed.


Regards,
Nick Maclaren.

Re: Fast embedded architectures.
:>steven wrote:
:>>
:>>>Embedded processors are interesting. Why are x86s running at
:>>>3,000,000,000 Hz and the 8051 is still running at 1 MIP (12 MHz)?
:>>
:>> 1 MIPS = 1 million instructions per second
:>
:>MIPS == Meaningless Information Provided by Sales people
:
: Some of us remember when the unit was used by technical people,

[SNIP]

: Those of us who tried to get MAPS (Memory Access Per Second) into use
: failed dismally, despite the fact that it has remained the most
: important of such measures of speed.

Errr, why? For the low end you're right. Most 8-bit applications
are not ALU bound but IO bound. MCUs are the blue-collar workers.

But as soon as the task becomes computing intensive, you've stepped
right into the controversy between RISC and CISC. Where the point
of RISC is to reduce the number of addressing modes that instructions
are allowed to use (and thus, not the actual number of instructions.)

In return, Load and Store are allowed all addressing modes and then some.
This split between data movement and the actual computation is ideally
suited for caches, especially split I/D caches of Harvard architectures.
Hence, RISC computers are to be considered white-collar workers.

But as we all know, there are few clear borders in life. And as a
supposed white-collar engineer you often have to dirty your hands.
Likewise, a compromise between RISC and CISC is required.

--
  ******************************************************
  Never ever underestimate the power of human stupidity.

Re: Fast embedded architectures.

|>
|> : Those of us who tried to get MAPS (Memory Access Per Second) into use
|> : failed dismally, despite the fact that it has remained the most
|> : important of such measures of speed.
|>
|> Errr, why? For the low end you're right. Most 8-bit applications
|> are not ALU bound but IO bound. MCUs are the blue-collar workers.
|>
|> But as soon as the task becomes computing intensive, you've stepped
|> right into the controversy between RISC and CISC. Where the point
|> of RISC is to reduce the number of addressing modes that instructions
|> are allowed to use (and thus, not the actual number of instructions.)

That is true, and it means that MIPS will vary by about a factor of
three - that is slightly more than the variation 30 years ago, but
not much.  But ONLY a factor of three - and that is between VAX and
any of the modern RISCs.  What you may have missed is that MFLOPS
also varies by up to a factor of two between architectures.

But this has nothing to do with MAPS, which is Cinderella to the two
ugly sisters, MIPS and MFLOPS.  So far, I have failed dismally at
being the Fairy Godmother :-)

|> In return, Load and Store are allowed all addressing modes and then some.
|> This split between data movement and the actual computation is ideally
|> suited for caches, especially split I/D caches of Harvard architectures.
|> Hence, RISC computers are to be considered white-collar workers.

That was the beautiful theory, which held sway until it was beaten
up by a brutal gang of ugly facts.  No, that separation doesn't
help enough to make a significant difference, and that was known
before 1980 (i.e. BEFORE the RISC systems hit the public).

|> But as we all know, there are few clear borders in life. And as a
|> supposed white-collar engineer you often have to dirty your hands.
|> Likewise, a compromise between RISC and CISC is required.

That difference is more-or-less irrelevant.  It wasn't particularly
interesting in the late 1980s (the heyday of modern RISC) and is
even less relevant today.


Regards,
Nick Maclaren.

Re: Fast embedded architectures.


Because they're designed for vastly different purposes.  Because most
embedded systems can't afford to burn 100 W of electrical energy.  And
for gazillions of other reasons.  In other words, the question is
meaningless.

(And BTW, 8051s come at a good deal more than 1 MIPS these days).


"Novel and interesting" is usually beside the point.  Reliable and
working is what is needed in the world of embedded computing.

--
Hans-Bernhard Broeker ( snipped-for-privacy@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.

Re: Fast embedded architectures.

You can get ARMs running at 700MHz (Intel's XScale flavour of ARM) and using
100MHz SDRAM. Unfortunately, every last Hz of CPU and memory bandwidth is
eaten by doing something as simple as writing to CompactFlash at a measly
20MB/s (PXA processors). It's a shame Intel don't make the XScale core
available by itself.

On the subject of loads of MHz with little to see for it, there are
clockless CPUs starting to appear. There's a clockless ARM called Amulet and
a clockless 8051 called HT-80C51.



Re: Fast embedded architectures.

Sounds interesting; I'll look into that.


In the case of the PXA25x, it has a reasonably good core but the IO bus and
on-chip peripherals let it down badly.




Re: Fast embedded architectures.
The 8051 is cheap, low-tech equipment.
In addition, it's not specialized (it is general purpose).
That's what makes it useful.

Two years ago I saw 8051s at 100 MHz and above.
More commonly available are 30 - 60 MHz nowadays.
(Aren't most derivatives down to 4 cycles instead of 12 nowadays?)
If that's not enough, there are other µCs that are more high tech
(SHARC?, Atmel's ARMs begin where their 8051s end,
there are fast DSPs, ...).

So 8051 derivatives do get faster, but high speed is not always what is demanded.


Yours
martin.
