Fast embedded architectures.

There is some new computer architecture work being done. It is interesting that interrupt response time is as slow on today's fast desktop processors as it was on 1 MHz 6502s. Caching and context switches are expensive.

Embedded processors are interesting. Why are x86s running at 3,000,000,000 Hz while the 8051 is still running at 1 MIP (12 MHz)? Besides the fact that sometimes that is good enough, there are some new embedded CPU technologies being implemented. For instance, I think it is the Xtensa processor that implements a pure move machine, which is novel and interesting.
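As a rough sanity check on the 1 MIP figure: a classic 8051 takes 12 oscillator clocks per machine cycle, so at 12 MHz it peaks at one single-cycle instruction per microsecond. A minimal sketch of that arithmetic follows. The function name `peak_mips` is made up for illustration, and the result is a peak figure that ignores multi-cycle instructions and memory stalls.

```python
def peak_mips(clock_hz: float, clocks_per_instruction: float) -> float:
    """Peak millions of instructions per second for a core that needs a
    fixed number of clocks per instruction (an idealised assumption)."""
    return clock_hz / clocks_per_instruction / 1e6


# Classic 8051: 12 MHz clock, 12 clocks per single-cycle instruction.
print(peak_mips(12e6, 12))  # 1.0

# The same 12 MHz clock on a hypothetical one-clock-per-instruction core.
print(peak_mips(12e6, 1))   # 12.0
```

The point is only that the clock rate by itself says little; the clocks needed per instruction matter just as much.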

I am particularly interested in the Ubicom machines. They have an "interesting" marketing plan to hide themselves from anyone who will not promise 100K unit sales, but their technology is fascinating. I personally am most interested in the single-chip 2022 processor, which is a 120 MHz, 120 MIP super PIC, programmable in GNU C but very fast and different. Recently they introduced the 250 MHz, 250 MIP 3023 processor, which implements 8 hardware threads on chip. It can hide pipeline delays by substituting other threads, the same as the "big" chips, but without the overheads and delays introduced by today's caches, MMUs, etc. The 3023 can dedicate a thread to watching a particular I/O pin or device, which makes context switches immediate. These are not intended to be desktop machines, but in the more conservative embedded space the implementations are unique.

I am not affiliated with Ubicom and I am far from expert on this new chip, but if you want to get an architectural description (at least a datasheet) for either the 2022 or 3023 processor, you can get it at:

formatting link

Regards, Steve

Steve Calfee

Steve Calfee wrote in news: snipped-for-privacy@4ax.com:

The x86s are general purpose CPUs while the 8051s are simple controller CPUs. BTW, 8051's now run around 33 MIPS (Cygnal 8051s).

--
- Mark ->
--
Mark A. Odell

Some of the 8051 designs (in the form of Verilog / VHDL) can easily be synthesized and implemented on 0.13um processes running at over 200MHz. The question is how they can make money out of it. With ATMEL pushing out ARM7 chips at US$3, it doesn't make sense to sell a 100MIPS 8051 which can cost US$10.

formatting link

Another factor is how you can get your program ROM running at such a high speed. In high-end chips the program runs from RAM. If you want to do the same in an 8051, you need to copy the program code from flash to on-chip RAM, i.e. 32k flash + 32k on-chip SRAM for 32k of program code. It can be done, but it is not very cost effective.

Joe

Joe

Do you really need to have your keyboard controller check to see if you have pressed a key a billion times a second? :)

Guy Macon

The 80C51 is available in 1-cycle models, with bus bandwidths of 100MHz, for 100 MIPS peak. The x86s that claim 3GHz have lower bus bandwidth (only a few nodes spin at 3GHz).

That is a good idea, similar to the MultiThreading the bigger CPUs now offer.

Another good idea is Multiple-CPUs - even for small uC: for a while

formatting link
had a system of many-80C51s but they seem to have been offline for a while now ?

This makes good sense, given the finite memory bandwidths.

-jg

Jim Granville

What chip is that? Let's not confuse IP cores with actual single chippers. Also, what is the external crystal/clock? The Ubicom ip2k has a 50x PLL that takes an external 4.8 MHz crystal to a 120 MHz clock rate.

I agree, see my comments on interrupt latency. However, your 100 MIP 8051 uses what kind of memory? What is its external memory speed?

Some people (Guy) have asked why do we need the speed. I think the above sentence says you don't always.

My intent was to point at a very interesting architecture that allows doing bit-banging I/O in software for many types of I/O and has a serializer/deserializer for even faster serial I/O (10BASE-T Ethernet and USB device AND host), all in a single-chip package. If a PIC or an 8051 is good enough, fine. But if it is possible to 'net enable current embedded apps and do it in software, I think that is revolutionary.

Regards, Steve

Steve Calfee

Look at

formatting link

and yes, this IS an 'actual single chip device' (family) :)

FLASH

I'm not sure they can fetch code from external memory - why would you want to, with 64K / 128K of Flash on chip ?

The concept of S/W solutions to what are traditionally hardware layers is interesting, but the SX family (ex Scenix, now Ubicom) has not really been an outstanding success. Their operating Icc was VERY HIGH, which is no real surprise if you try to do a HW task using SW. The 3023 improves on that, with the new time-slice and more HW support, but that device is not a merchant microcontroller.

-jg

Jim Granville

Have you ever tried designing a board with signals at that frequency? There are many problems with high speed logic that make development prohibitively expensive. So most people will attempt to get away with as slow a device as possible, which results in little demand for such fast controllers.

--

Wing
A B

2 MIPS = 2 million instructions per second
1 MIPS = 1 million instructions per second
1 MIP = 1 million instructions per
steven

Isn't the default "fortnight" if not specified? :)

Guy Macon

Because they're designed for vastly different purposes. Because most embedded systems can't afford to burn 100 W of electrical energy. And for gazillions of other reasons. In other words, the question is meaningless.

(And BTW, 8051s come at a good deal more than 1 MIPS these days).

"Novel and interesting" is usually beside the point. Reliable and working is what is needed in the world of embedded computing.

--
Hans-Bernhard Broeker (broeker@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.
Hans-Bernhard Broeker
[...]

MIP = "Meaningless Indicator of Performance"

MIPS = "Marketing's Idea of Processor Speed."

Regards,

-=Dave

--
Change is inevitable, progress is not.
Dave Hansen

Being simple is really no excuse for being slow. The 8051 is so byzantine that it is very difficult to make it fast, and still keep it small. So is the x86, but when you have a power budget of 100W, and a die size of a cm², you can implement tricks that aren't possible within an embedded controller.

You can make fast and simple embedded architectures, provided you throw the byzantine stuff overboard. Been there, done that. My b16

formatting link
is about the silicon size of a small 12-cycle 8051 (with microcode), but it's a single-cycle implementation, and the cycle time is much shorter, too. As a bonus, you get 16 bit ALU width, since the ALU really isn't a big block, even in a small microcontroller.

I don't have figures for a 12-cycle 8051 core at hand for modern processes, but the 2-cycle 8051 from Inventra we used some time ago did get up to 16MHz in a 0.5µ TSMC process (8 MIPS), while the b16 could run at 100MHz in the same process (100 MIPS) - using the same standard cell library*. MIPS for such small and simple CPUs is pretty meaningless, especially for the 8051, which can do a few things efficiently (like moving constants to SFRs), and the rest very clumsily. One b16 MHz is worth about half a 486 MHz on the (very ancient) Byte sieve benchmark, i.e. a 100MHz b16 finds the first 1899 primes in the same time as a 50MHz 486. I don't have any data for the 8051 on the Byte sieve at hand, but with its very limited and complicated addressing into X-RAM, the 8051 has to be quite slow there.
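For readers who have not met it, the Byte sieve is a Sieve of Eratosthenes over the odd numbers only, and with the usual array size of 8190 it finds the 1899 primes mentioned above. A minimal Python sketch of the algorithm follows. The function name `byte_sieve` is just for illustration, and the sketch only shows what is being timed; it does not reproduce any of the measurements quoted here.

```python
SIZE = 8190  # array size used by the classic Byte sieve benchmark


def byte_sieve(size: int = SIZE) -> int:
    """Count the odd primes found by one pass of the Byte sieve.

    Index i of the flag array stands for the odd number 2*i + 3,
    so the even numbers (and the prime 2) are never represented.
    """
    flags = [True] * (size + 1)
    count = 0
    for i in range(size + 1):
        if flags[i]:
            prime = i + i + 3
            # Stepping the index by `prime` visits 3p, 5p, 7p, ...:
            # every odd multiple of the prime.
            k = i + prime
            while k <= size:
                flags[k] = False
                k += prime
            count += 1
    return count


print(byte_sieve())  # 1899
```

The inner loop is nothing but indexed stores into an 8 KB array, which is why the addressing modes available for that array dominate the result.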

~100MHz in 0.5µ means about a GHz in 0.18µ (give or take, since in 0.18µ, you can select between fast and low-power transistors), more if you are willing to spend the time on a full-custom design. The b16 is inspired by Chuck Moore's c18, which is a full-custom design, and expected to run at 2.4GHz in 0.18µ.

*) The 8051 from Inventra has multiple critical paths. One of the worst is that it has program memory in the critical path. So the 40ns flash was one of the limiting factors. You'd say that the b16 would suffer a lot from a 40ns flash, too, at 100MHz, but that's not the case. The b16 packs 3 and a bit instructions into one 16 bit word, and therefore wouldn't be slowed down much.
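A back-of-the-envelope sketch of that footnote, under loudly stated assumptions: the flash delivers one word every 40ns, each 16 bit word carries 3 instructions (rounding "3 and a bit" down), the code is straight-line, and the core never waits for anything else. The helper name `fetch_bound_mips` is hypothetical.

```python
def fetch_bound_mips(access_time_ns: float, instructions_per_fetch: float) -> float:
    """Upper bound on MIPS when program memory is the only bottleneck.

    Assumes one fetch per access time and a fixed number of
    instructions delivered by each fetch.
    """
    fetches_per_second = 1e9 / access_time_ns
    return fetches_per_second * instructions_per_fetch / 1e6


CORE_LIMIT_MIPS = 100.0  # single-cycle core at 100MHz, as quoted above

# Three instructions packed into each word fetched from a 40ns flash.
packed = min(CORE_LIMIT_MIPS, fetch_bound_mips(40, 3))
print(packed)    # 75.0

# Assumed comparison case: one instruction per fetch from the same flash.
unpacked = min(CORE_LIMIT_MIPS, fetch_bound_mips(40, 1))
print(unpacked)  # 25.0
```

Under these assumptions the 40ns flash caps the packed encoding at about 75 MIPS out of 100, while a core that needs a fetch for every instruction would be held to 25 MIPS.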
--
Bernd Paysan
"If you want it done right, you have to do it yourself"
Bernd Paysan

You can get ARMs running at 700MHz (Intel's XScale flavour of ARM) and using 100MHz SDRAM. Unfortunately, every last Hz of CPU and memory bandwidth is eaten by doing something as simple as writing to CompactFlash at a measly 20MB/s (PXA processors). It's a shame Intel don't make the XScale core available by itself.

On the subject of loads of MHz with little to see for it, there are clockless CPUs starting to appear. There's a clockless ARM called Amulet and a clockless 8051 called HT-80C51.

Tim Clacy

MIPS == Meaningless Information Provided by Sales people

--
Michael N. Moran           (h) 770 516 7918
5009 Old Field Ct.         (c) 678 521 5460
Michael N. Moran
8051 is cheap, low tech equipment. In addition it's not specialized (general purpose). That's what makes it useful.

Two years ago I saw 8051s at 100 MHz and above. More commonly available are 30 - 60 MHz nowadays. (Aren't most derivatives down to 4 cycles instead of 12 nowadays?) If that's not enough, there are other µCs that are more high tech. (SHARC?, Atmel's ARMs begin where their 8051s end, there are fast DSPs, ...)

So 8051 derivatives do get faster, but high speed is not always what is demanded.

Yours martin.

martin.lexen

Some of us remember when the unit was used by technical people, and gave a crude measurement of processor speed. For reasons that I am completely unable to fathom, the other half of the act (MFlops) is regarded as more meaningful.

Those of us who tried to get MAPS (Memory Access Per Second) into use failed dismally, despite the fact that it has remained the most important of such measures of speed.

Regards, Nick Maclaren.

Nick Maclaren

:>MIPS == Meaningless Information Provided by Sales people
:
: Some of us remember when the unit was used by technical people,

[SNIP]

: Those of us who tried to get MAPS (Memory Access Per Second) into use
: failed dismally, despite the fact that it has remained the most
: important of such measures of speed.

Errr, why? For the low end you're right. Most 8-bit applications are not ALU bound but IO bound. MCUs are the blue-collar workers.

But as soon as the task becomes computing intensive, you've stepped right into the controversy between RISC and CISC, where the point of RISC is to reduce the number of addressing modes that instructions are allowed to use (and thus, not the actual number of instructions).

In return, Load and Store are allowed all addressing modes and then some. This split between data movement and the actual computation is ideally suited for caches, especially split I/D caches of Harvard architectures. Hence, RISC computers are to be considered white-collar workers.

But as we all know, there are few clear borders in life. And as a supposed white-collar engineer you often have to dirty your hands. Likewise, a compromise between RISC and CISC is required.

--
  ******************************************************
  Never ever underestimate the power of human stupidity.
Geir Frode Raanes Sørensen

In article ,
|>
|> : Those of us who tried to get MAPS (Memory Access Per Second) into use
|> : failed dismally, despite the fact that it has remained the most
|> : important of such measures of speed.
|>
|> Errr, why? For the low end you're right. Most 8-bit applications
|> are not ALU bound but IO bound. MCUs are the blue-collar workers.
|>
|> But as soon as the task becomes computing intensive, you've stepped
|> right into the controversy between RISC and CISC, where the point
|> of RISC is to reduce the number of addressing modes that instructions
|> are allowed to use (and thus, not the actual number of instructions).

That is true, and it means that MIPS will vary by about a factor of three - that is slightly more than the variation 30 years ago, but not much. But ONLY a factor of three - and that is between VAX and any of the modern RISCs. What you may have missed is that MFLOPS also varies by up to a factor of two between architectures.

But this has nothing to do with MAPS, which is Cinderella to the two ugly sisters, MIPS and MFLOPS. So far, I have failed dismally at being the Fairy Godmother :-)

|> In return, Load and Store are allowed all addressing modes and then some.
|> This split between data movement and the actual computation is ideally
|> suited for caches, especially split I/D caches of Harvard architectures.
|> Hence, RISC computers are to be considered white-collar workers.

That was the beautiful theory, which held sway until it was beaten up by a brutal gang of ugly facts. No, that separation doesn't help enough to make a significant difference, and that was known before 1980 (i.e. BEFORE the RISC systems hit the public).

|> But as we all know, there are few clear borders in life. And as a
|> supposed white-collar engineer you often have to dirty your hands.
|> Likewise, a compromise between RISC and CISC is required.

That difference is more-or-less irrelevant. It wasn't particularly interesting in the late 1980s (the heyday of modern RISC) and is even less relevant today.

Regards, Nick Maclaren.

Nick Maclaren

Checked the date for my IP3023 and it was Oct 4, 2003 ;-) Multithreading is nice. Pity very few companies understand how useful it is. Did a study on that around 1996, but it did not result in anything at the company. I think it may have influenced Ubicom, though, since the study was presented to Bulent Celebi, who I believe became CEO of Ubicom. Would be nice to know if this really had any influence...

--
Best Regards,
Ulf Samuelsson   ulf@a-t-m-e-l.com
Ulf Samuelsson
