New embedded CPU architecture

Hi all,

I'm looking for ideas as to what features should be included in a new embedded CPU architecture. What are current microcontrollers missing? All suggestions gratefully received!

Cheers, JonB

Reply to
Jon Beniston

Wow...what a loaded question...and don't you mean what are current embedded CPUs missing?? There is a difference.

I don't think it's an issue of what they are missing; it's a matter of taking the best features from a variety of CPUs and ensuring they're covered as well as possible in a new architecture.

How much will you pay me for my consulting services??


Reply to

This is a fairly open-ended question. My response to this question would be - "What is the application?". The problem is that most embedded systems have a cost constraint, so having a micro with every conceivable peripheral on-board is unnecessary. The following is on my list:

  1. Rather than a single micro, a reusable micro core (or cores) that can be used in various variants or configurations of the microcontroller. It would be desirable for the core to offer code compatibility from 8 bits -> 16 bits -> 32 bits. Often in a project, it may be difficult to pick the right micro until near the end of the project, the "right" micro meaning one that is optimised for features, performance and cost. With reasonable code compatibility and scalability, this decision can be bound late, with development being carried out on a "superset" of the micro.
  2. An extensive selection of peripherals, e.g. serial, CAN, timers, DMA, ADC, DAC, watchdog, I2C.
  3. CISC or RISC? I personally believe it doesn't matter: what's important is the effective system MIPS, measured across the total system including external resources such as Flash & RAM. The current trend is for microcontrollers with larger on-board Flash & RAM, running at maximum bus speed. However, there is a cost trade-off, so it would be desirable to have a micro family with different sizes of on-board memory.


+====================================+ I hate junk email. Please direct any genuine email to: kenlee at
Reply to
Ken Lee


Reply to
Alex Gibson

That's good, because that's exactly what I've designed. As well as a configurable word width, there are many other configurable parameters (instruction set, number of registers, addressing modes, etc.).

Actually, I haven't bothered with an 8-bit config, as it turns out that a 16-bit implementation is no bigger than a lot of the 8-bit micros out there (6811, Z80s etc).

It will be available soon...


Reply to
Jon Beniston

32-bit, 1MB Flash, 1MB static RAM, running at 10uA, for less than $1. Oh yes, with the usual peripherals, perhaps in a TQFP64.


Reply to
Rene Tschaggelar

These are my suggestions:

1) it should have a scalable and compatible core through its versions (8/16/32 bit). The core should be designed in order to be fast, to save power and program memory storage space (one-word instructions ??)

2) it should have an on-chip debugger

3) it should offer a large set of peripherals across its versions

4) it should give the user the possibility of designing application-dependent "on-chip logic". The user could use logic gates, flip-flops, registers and so on to build their own logic in a part of the chip


Reply to
Alessandro Strazzero

"Alessandro Strazzero" wrote in message news:



Most important there should be a well working, stable ASM/C/C++ development suite. The core should allow for easy debugging by adding on-core debug support a la BDM or JTAG.

Don't forget documentation, code samples and so on.


- Rene

My mail address (wou will find out how it works): r @ . e s c n e h e 5

Reply to

"" wrote in message news:bl4hmc$t9$


Wou will find out what?

Frank Bemelman
Reply to
Frank Bemelman

Isn't RISC supposed to be a lot easier for a compiler to generate efficient code for? And you DO want to have a (at least C) compiler for it.



Reply to
Ben Bradley

Having a nicely sized general purpose register file is probably the key. Personally I have gone for a RISC architecture as it allows for simple, fast and efficient pipeline implementations. For the C compiler I have ported GCC.

Reply to
Jon Beniston

It's a matter of opinion whether a truckload of registers is a good thing for an embedded processor; in particular, it will slow event response time (unless you plan to have banked register sets). It's also a matter of opinion whether a pipeline is a good thing for embedded processors, since it doesn't help (in fact, it hinders) responding to events deterministically and efficiently.

An embedded processor should be simple and able to respond to multiple real-time events efficiently and deterministically; I would like to see an interrupt flag, uninterruptible instruction sequences, program-counter-relative instruction/data movement and a high speed serial connection (RS232, 1394) to a built-in hardware interpreter (so that a platform can be brought up with a terminal and with no other hardware or software). I would leave off the pipeline, branch-target buffer and caches, although some mechanism to move external data to and from fast local store under control of the developer would be extremely useful. I wouldn't bother with much computational support like floating point or SIMD (in fact I wouldn't bother with an ALU if the core was good enough), but I would think about a simple hardware and software interface to enable connections to external or on-chip application-specific functions for such work (a bus for multiple processor elements). Try to make it so that if the CPU isn't working, it doesn't use any power, without requiring messing about with frequency change sequences or practically unusable power modes [a 'wait(Mask events)' instruction for instance].


Reply to
Tim Clacy

There are a few things worth thinking about in a processor aimed at embedded systems. One is fast and deterministic interrupt response - aim to get the processor running interrupt code as fast as possible. That means that register sets should be small (although not too small :-) - I would estimate about 8 general-purpose registers, each of which can also function as a pointer, to be ideal for small processors. When you have a big register set, a lot of time and space is lost while stacking and restoring registers. Also make sure that any status or flag registers can be efficiently saved and restored - architectures like the COP8 where the status register also contains interrupt flags are a nightmare (when running with a 10 MHz crystal, the COP8 needs about 60us overhead for saving and restoring the processor status - and it only has 3 registers).

In general, it is worth looking at other chips and getting an idea of their particular good and bad qualities. For example, the AVR is a nice, elegant processor, but would be very much nicer with 16 registers at 16-bit rather than 32 registers at 8-bit, and its pointers should be more symmetrical.

If you want easy compiler support, keep the instruction set as orthogonal as possible. Also, keep a single address space - it makes programming (assembly programming, compiler writing, and C programming) *much* easier. Provide a single 64k address space for the 16-bit version - if the user needs more address space, they can use manual banking and address latches (for slow access) or the 32-bit version of your processor.

Add instructions that are useful for embedded systems. Bit manipulation is more important than in normal processors, so functions like barrel shifters, bit reversing, and-not, find-first-one, count-ones, and so on can be very useful. Arithmetic with cut-offs (especially a decrement that won't pass zero) could be very useful.
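
To make the list above concrete, here is a rough sketch in portable C of what some of these operations compute when done in software, which is the work a dedicated instruction would replace. The function names and the 16-bit width are assumptions chosen for illustration, and "find-first-one" is taken here to mean the index of the most significant set bit; a real instruction set might define it from the other end.

```c
#include <stdint.h>

/* Count the set bits in a 16-bit word ("count-ones"). */
unsigned count_ones16(uint16_t x)
{
    unsigned n = 0;
    while (x) {
        x &= (uint16_t)(x - 1);  /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Index of the most significant set bit, or -1 if x is zero
   (one possible reading of "find-first-one"). */
int find_first_one16(uint16_t x)
{
    int pos = -1;
    while (x) {
        x >>= 1;
        pos++;
    }
    return pos;
}

/* Reverse the bit order of a 16-bit word. */
uint16_t bit_reverse16(uint16_t x)
{
    uint16_t r = 0;
    for (int i = 0; i < 16; i++) {
        r = (uint16_t)((r << 1) | (x & 1u));
        x >>= 1;
    }
    return r;
}

/* And-not: keep the bits of a that are clear in mask. */
uint16_t and_not16(uint16_t a, uint16_t mask)
{
    return (uint16_t)(a & ~mask);
}

/* Decrement that stops at zero instead of wrapping to 0xFFFF. */
uint16_t dec_sat16(uint16_t x)
{
    return x ? (uint16_t)(x - 1) : 0;
}
```

Each of the loop-based helpers costs up to 16 iterations in software, which is the sort of overhead a single-cycle instruction would remove.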

Multi-tasking (or, more often, multi-threading) is very important in real-time systems. If you can put some basic primitives into the hardware, you can really speed up and simplify an OS. An instruction which temporarily disabled interrupts for, say, 4 instruction cycles would be an enormous benefit to synchronisation primitives - it would be safer than normal interrupt disabling, far faster than locking calls, and make it very easy to consistently access data shared between threads (or interrupt routines).
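
As a toy illustration of the "disable interrupts for N instructions" idea, the sketch below models the behaviour in plain C on a host machine. Everything here (the cpu_model struct, exec_holdoff, step) is hypothetical and invented for the example; it is not how any existing CPU exposes such a feature, only one way the semantics could be defined.

```c
#include <stdbool.h>

/* Hypothetical model of a core with an interrupt hold-off counter. */
typedef struct {
    unsigned holdoff;      /* instructions left during which interrupts are held off */
    bool     irq_pending;  /* an interrupt request is waiting */
    unsigned irqs_taken;   /* number of interrupts serviced so far */
} cpu_model;

/* The proposed instruction: hold off interrupts for the next n instructions. */
void exec_holdoff(cpu_model *cpu, unsigned n)
{
    cpu->holdoff = n;
}

/* Advance by one instruction slot. Returns true if a pending interrupt
   was taken in this slot instead of executing a normal instruction. */
bool step(cpu_model *cpu)
{
    if (cpu->irq_pending && cpu->holdoff == 0) {
        cpu->irq_pending = false;
        cpu->irqs_taken++;
        return true;
    }
    if (cpu->holdoff > 0) {
        cpu->holdoff--;    /* one protected instruction has executed */
    }
    return false;
}
```

In this model the hold-off expires by itself, so a forgotten "re-enable" cannot leave interrupts off indefinitely, which is the safety argument made above.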

Avoid caches - they are difficult to make efficiently, and take up a lot of space - use on-board RAM instead. But a short instruction pre-fetch cache can be useful, will let you do decoding in advance (and maybe even execution of straight branches), and will let you optimise tight "while (n--) { *p++ = *q++; }" loops without accessing program memory.
Reply to
David Brown

And a C compiler with decent optimisation would be a bonus, so that you could write more understandable and less error-prone C and still get the "tight loop" benefits. :-) Not that this particular example is all that bad.


Reply to
Peter Bushell



You are thinking of being able to write things like:

    int a[100];
    int i;
    for (i = 0; i < 99; i++) {
        a[i] = a[i + 1];
    }

instead of

    int a[100];
    int n, *p, *q;
    n = 99;
    p = &a[0];
    q = &a[1];
    while (n--) {
        *p = *q;
        p = q;
        q++;
    }

On some C compilers for some architectures, you need to do that sort of "manual optimisation" while looking closely at the generated assembly code if you want the most efficient run-time code. I fully agree with you that such things should be handled by the compiler - fortunately, the OP is intending to use gcc as his compiler, so this sort of thing is already handled by the gcc front-end. That's the great thing about using gcc as the basis for your port - you get top-class front-end optimisations for free before you even start writing the new back-end.
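
For anyone who wants to check that the two styles really do the same work, here is a small self-contained sketch that wraps each in a function. The function names and the size_t length parameter are assumptions added for the example; the loop bodies follow the indexed and pointer forms discussed above.

```c
#include <stddef.h>

/* Indexed form: shift every element down by one position. */
void shift_left_indexed(int *a, size_t len)
{
    for (size_t i = 0; i + 1 < len; i++) {
        a[i] = a[i + 1];
    }
}

/* Hand-"optimised" pointer form doing the same len - 1 copies. */
void shift_left_pointer(int *a, size_t len)
{
    if (len < 2) {
        return;             /* nothing to shift */
    }
    size_t n = len - 1;     /* number of elements to copy */
    int *p = a;             /* destination */
    const int *q = a + 1;   /* source, one element ahead */
    while (n--) {
        *p++ = *q++;
    }
}
```

Compiling both with optimisation enabled and comparing the generated assembly is an easy way to see whether a given compiler makes the manual version unnecessary.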

Reply to
David Brown

Sounds like homework to me...

First draw up a list of current microcontrollers. Then list what they have in common, what each has that is unique, and what market share each has.

List market share against release date of original core, and software maturity.

Next, define your target users, and what MIPS/CODE/RAM/PIN/Peripheral/Price numbers they will need.

Once you have CODE/RAM defined, you can derive opcodes to efficiently access those areas. Conversely, if you do not define CODE/RAM targets first, you cannot design efficient opcodes.

'Do efficient opcodes really matter' can be next semester's homework...


Reply to
Jim Granville

Nope, I have actually designed and implemented the CPU and toolchain already. Just looking for a few new ideas to make it stand (further) out from the crowd when I release it.

Say what you want, and you might just get it!

Cheers, Jon

Reply to
Jon Beniston




I can't imagine what you'd do with a CPU that couldn't do any logical or arithmetic operations.

  • No and/or/xor/not
  • No compare/test
  • No add/subtract/increment/decrement

Sounds pretty useless...

Reply to
Grant Edwards

How about direct support in hardware for multi-threading? Multiple program counters etc., with mechanisms by which specific interrupts can be allocated to a specific thread.

Regards Anton Erasmus

Reply to
Anton Erasmus
