$0.03 microcontroller

upsidedown 7 years ago

Depending on how you look at it, you could claim that it has 64 registers and no RAM. It is a quite orthogonal single-address architecture. Practically all single-operand instructions (inc/dec, shift/rotate, etc.) can operate either on the accumulator or equally well on any of the 64 "registers". For two-operand instructions (add/sub, and/or, etc.), either the source or the destination can be the memory "register".

Both Acc = Acc Op Memory and Memory = Acc Op Memory are valid.

Thus the accumulator is needed only for two operand instructions, but not for single operand instructions.
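A toy model of the two operand forms described above, sketched in Python. The class and method names are made up for illustration and are not real mnemonics for this chip; the model only assumes 8-bit wraparound and 64 bytes of memory.

```python
class SingleAddressCPU:
    """Toy accumulator machine with 64 bytes of on-chip memory (hypothetical model)."""

    def __init__(self):
        self.acc = 0
        self.mem = [0] * 64

    def add_to_acc(self, addr):
        # Acc = Acc + Memory
        self.acc = (self.acc + self.mem[addr]) & 0xFF

    def add_to_mem(self, addr):
        # Memory = Acc + Memory
        self.mem[addr] = (self.acc + self.mem[addr]) & 0xFF

    def inc_mem(self, addr):
        # Single-operand instruction applied directly to memory; acc is untouched
        self.mem[addr] = (self.mem[addr] + 1) & 0xFF


cpu = SingleAddressCPU()
cpu.mem[5] = 10
cpu.acc = 3
cpu.add_to_mem(5)   # mem[5] becomes 13, acc stays 3
cpu.inc_mem(5)      # mem[5] becomes 14, acc stays 3
cpu.add_to_acc(5)   # acc becomes 17
```

In this model the accumulator is only involved in the two-operand calls, which is the point made above.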

What is the difference between having 64 bytes of on-chip RAM and having 64 single-byte on-chip registers? The situation would be different with on-chip registers and off-chip RAM, because of the memory bottleneck.

Of course, there were odd architectures like the TI 9900, with a set of sixteen 16-bit general-purpose registers in RAM. The register set could be switched quickly on interrupts, but keeping it in RAM slowed down every general-purpose register access.

For a stack computer you need a pointer register, preferably with autoincrement/decrement support. This processor has indirect access and single-instruction increment or decrement support that does not disturb the accumulator. Thus it is not so bad after all for stack computing.

Niklas Holsti 7 years ago

The data-sheet describes the OTP program memory as "1KW", probably meaning 1024 instructions. The length of an instruction is not defined, as far as I could see.

The data-sheet mentions something they call "Mini-C".

I don't think that an interpreted Forth is feasible for this particular MCU. Where would the Forth program (= list of pointers to "words") be stored? I found no instructions for reading data from the OTP program memory, and the 64-byte RAM will not hold a non-trivial program together with the data for that program.

Moreover, there is no indirect jump instruction -- "jump to a computed address". The closest is "pcadd a", which can be used to implement a 256-entry case statement. You would be limited to a total of 256 words.

Moreover, each RAM-resident pointer to RAM uses 2 octets of RAM, giving a 16-bit RAM address, although for this MCU a 6-bit address would be enough. Apparently the same architecture has implementations with more RAM and 16-bit RAM addresses.

That said, one could perhaps implement a compiled Forth for this machine.

Niklas Holsti, Tidorum Ltd

David Brown 7 years ago

My point is that /this/ CPU is not a good match for Forth, though many other very cheap CPUs are. Whether or not you think that matches "CPUs like this should be programmed in Forth" depends on what you mean by "CPUs like this", and what you think the benefits of Forth are.

It has a single register, not unlike the "W" register in small PIC devices. Yes, I expect it is going to be slower than you would get from having a few more registers. But it is missing (AFAICS) auto-increment and decrement modes, and has only load/store operations with indirect access.

So if you have two 8-bit bytes x and y, then adding them as "x += y;" is:

    mov a, y;        // 1 clock
    add x, a;        // 1 clock

If you have a data stack pointer "dsp", and want a standard Forth "+" operation, you have:

    idxm a, dsp;     // 2 clock
    mov temp, a;     // 1 clock
    dec dsp;         // 1 clock
    idxm a, dsp;     // 2 clock
    add a, temp;     // 1 clock
    idxm dsp, a;     // 2 clock

That is 9 clocks instead of 2, and 6 instructions instead of 2.
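The comparison can be tallied mechanically. This is only a sketch in Python that adds up the per-instruction clock counts quoted in this post; the counts themselves are taken from the post and have not been checked against the data sheet.

```python
# Clock counts exactly as quoted in the post (not verified against the data sheet).
direct_add = [("mov a, y", 1), ("add x, a", 1)]
forth_plus = [
    ("idxm a, dsp", 2),
    ("mov temp, a", 1),
    ("dec dsp", 1),
    ("idxm a, dsp", 2),
    ("add a, temp", 1),
    ("idxm dsp, a", 2),
]


def cost(seq):
    """Return (instruction count, total clocks) for a list of (text, clocks) pairs."""
    return len(seq), sum(clocks for _, clocks in seq)


direct_cost = cost(direct_add)   # (2, 2)
forth_cost = cost(forth_plus)    # (6, 9)
print("x += y :", direct_cost)
print("Forth +:", forth_cost)
```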

Of course you could make a Forth compiler for the device - but you would have to make an optimising Forth compiler that avoids needing a data stack, just as you do on many other small microcontrollers (and just as a C compiler would do). This is /not/ a processor that fits well with Forth or that would give a clear translation from Forth to assembly, as is the case on some very small microcontrollers.

David Brown 7 years ago

Not quite, no. Only the first 16 memory addresses are directly accessible for most instructions, with the first 32 addresses being available for word-based instructions. So you could liken it to a device with 16 registers and indirect memory access to the rest of ram.

But you can't use the indirect memory accesses for any ALU instructions - only for loading or saving the accumulator. So all indirect accesses need to go via the accumulator - and if you want to operate on two indirect accesses (like adding the top two elements on the stack), you have to use another "register" address to store one element temporarily. Yes, it would be bad for stack computing.

Niklas Holsti 7 years ago

Ok, before anyone else notices, I admit I forgot about implementing an indirect jump by pushing the target address on the stack and executing a return instruction. That would work for this machine.

Niklas Holsti, Tidorum Ltd

Niklas Holsti 7 years ago

Except that one can only "push" the accumulator and flag registers, combined, and the flag register cannot be set directly, and has only 4 working bits.

What would work, as an indirect jump, is to set the Stack Pointer (sp) to point at a RAM word that contains the target address, and then execute a return. But then one has lost the actual Stack Pointer value.

Niklas Holsti, Tidorum Ltd

upsidedown 7 years ago

Really?

In the manual, the M.n notation is used for bit operations, where M is the byte address and n is the bit number within the byte. Restricting M to 4 bits makes sense, since n requires 3 bits, so the total address size for bit operations would be 7 bits.
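The 4 + 3 bit arithmetic can be written out as a small packing function. This is a sketch only: the function name and the field order (M in the high bits, n in the low bits) are assumptions for illustration, not something taken from the data sheet.

```python
def pack_bit_operand(m, n):
    """Pack a 4-bit byte address M and a 3-bit bit number n into one 7-bit field.

    The field order (M high, n low) is an assumption, not the documented encoding.
    """
    if not 0 <= m < 16:
        raise ValueError("M must fit in 4 bits")
    if not 0 <= n < 8:
        raise ValueError("n must fit in 3 bits")
    return (m << 3) | n


largest = pack_bit_operand(15, 7)
print(largest, largest.bit_length())
```

The largest operand, 15.7, packs to 127, which needs exactly 7 bits.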

I couldn't find a reference saying that the restriction on M also applies to byte access. Where is it?

upsidedown 7 years ago

Yes, I misread the data sheet. It is really 1 kW.

The nice feature of a Harvard architecture is that the data and instruction sizes can be different.

I have tried to locate the bit allocation of the various fields (opcode, address etc.) but no luck.

Paul Rubin 7 years ago

I think this chip is too small for traditional Forth implementation methods. Just 64 bytes of ram and no registers. If you have 16 bit cells and 8 levels of return and data stacks, half the ram is already used by the stacks.
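The RAM budget above works out as follows; this is just the arithmetic from the paragraph, written in Python, with variable names chosen for illustration.

```python
RAM_BYTES = 64     # total RAM on the chip
CELL_BYTES = 2     # 16-bit cells
STACK_DEPTH = 8    # levels per stack
NUM_STACKS = 2     # data stack and return stack

stack_bytes = CELL_BYTES * STACK_DEPTH * NUM_STACKS
remaining = RAM_BYTES - stack_bytes
print(stack_bytes, remaining)   # 32 32
```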

An F18 processor (GA144 node for those not familiar) has around 3x as much ram including the stacks, and it doesn't pretend to be a complete MCU (you usually split your application across multiple nodes). Plus it has that very efficient 5-bit instruction encoding. On the other hand, you have to use ram as program memory.

You might be able to concoct some usable Forth dialect compiled with an optimizing compiler and using 8-bit data when possible, but it doesn't seem that useful for a chip like this.

upsidedown 7 years ago

Just call a "Jumper" routine; the call pushes the return address on the stack. In "Jumper", read SP from the IO address space, indirectly modify the return address on the stack as needed, and execute a ret instruction. This causes a jump to the modified return address and also restores SP to its value before the call.
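The trick can be checked on a toy model. The Python sketch below is not the real chip: the stack growth direction, the byte order of the pushed address, and all addresses used are assumptions made for illustration.

```python
class ToyCPU:
    """Minimal model: RAM-resident call stack, 16-bit return addresses (assumed)."""

    def __init__(self):
        self.ram = [0] * 64
        self.sp = 0x20      # arbitrary even starting address
        self.pc = 0

    def call(self, target):
        ret_addr = self.pc + 1
        self.ram[self.sp] = ret_addr & 0xFF       # low byte first (assumption)
        self.ram[self.sp + 1] = ret_addr >> 8
        self.sp += 2                              # stack grows upward (assumption)
        self.pc = target

    def ret(self):
        self.sp -= 2
        self.pc = self.ram[self.sp] | (self.ram[self.sp + 1] << 8)


def jumper(cpu, target):
    """Overwrite the return address that 'call' just pushed, then return to it."""
    cpu.ram[cpu.sp - 2] = target & 0xFF
    cpu.ram[cpu.sp - 1] = target >> 8
    cpu.ret()


cpu = ToyCPU()
cpu.pc = 0x010
sp_before = cpu.sp
cpu.call(0x100)        # 0x100: hypothetical address of the Jumper routine
jumper(cpu, 0x2A5)     # 0x2A5: hypothetical computed jump target
print(hex(cpu.pc), cpu.sp == sp_before)
```

After the ret, the program counter holds the computed target and the stack pointer is back at its value before the call.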

Niklas Holsti 7 years ago

Right, that sounds possible. But wow what a circumlocution.

Niklas Holsti, Tidorum Ltd

Philipp Klaus Krause 7 years ago

On 13.10.2018 at 18:59, Niklas Holsti wrote:

It seems unclear to me which of acc and flags is pushed first. But if acc is pushed first, one could do

    pushaf;
    mov a, sp;
    inc a;
    mov sp, a;

to push any desired byte onto the stack.

Philipp

Philipp Klaus Krause 7 years ago

On 12.10.2018 at 22:45, snipped-for-privacy@downunder.com wrote:

People have tried before

formatting link

Apparently, even with access to the tools it is not obvious.

However, a Chinese manual contains these examples:

    5E0A  MOV A BB1
    1B21  COMP A #0x21
    2040  T0SN CF
    5C0B  MOV BB2 A
    C028  GOTO 0x28
    0030  WDRESET
    1F00  MOV A #0x0
    0082  MOV SP A
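One way to start looking for the field layout is to print these words in binary. The Python sketch below uses only the eight pairs listed above and checks one observation: in the three lines that carry a numeric operand, the low byte of the word equals that operand. It does not claim to know the actual opcode and address fields.

```python
# Opcode words and mnemonics copied from the listing above.
listing = [
    (0x5E0A, "MOV A BB1"),
    (0x1B21, "COMP A #0x21"),
    (0x2040, "T0SN CF"),
    (0x5C0B, "MOV BB2 A"),
    (0xC028, "GOTO 0x28"),
    (0x0030, "WDRESET"),
    (0x1F00, "MOV A #0x0"),
    (0x0082, "MOV SP A"),
]

for word, text in listing:
    print(f"{word:04X}  {word:016b}  {text}")

# Words whose mnemonic shows a numeric operand, mapped to that operand.
numeric_operands = {0x1B21: 0x21, 0xC028: 0x28, 0x1F00: 0x00}
low_byte_matches = all((w & 0xFF) == v for w, v in numeric_operands.items())
print("low byte equals numeric operand:", low_byte_matches)
```

Whether the remaining high bits split cleanly into an opcode field is left open; eight samples are too few to settle it.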

Philipp

Niklas Holsti 7 years ago

There's also a rule that the sp must always contain an even address, at least if interrupts are enabled, as I understand it.

Niklas Holsti, Tidorum Ltd

Paul Rubin 7 years ago

That's not minimal ;). More practically, the 3mm square package sounds like a WLCSP which I think requires specialized ($$$) board fab facilities (it can't be hand soldered or done with normal reflow processes). Part of the Padauk part's attraction is the 6-pin SOT23 package.

Here's a complete STM8 board for 0.77 USD shipped:

formatting link

It has 8k of program flash and 1k of ram and can run a resident Forth interpreter. I think they also make a SOIC-8 version of the cpu. I bought a few of those boards for around 0.50 each last year so I guess they have gotten a bit more expensive since then.

Tim 7 years ago

This is quite curious. I wonder

- Has anyone actually received the devices they ordered? The cheaper variants seem to be sold out.
- Any success in setting up a programmer?

gnuarm.deletethisbit 7 years ago

... chip to running Forth on this chip. How is that useful? There are many other chips that run very fast. So?

How fast are instructions that access memory? Most MCUs will perform register operations in a single cycle. Even though RAM may be on chip, it typically is not as fast as registers because it is usually not multiported. DSP chips are an exception, with dual- and even triple-ported on-chip RAM.

Yeah, I'm familiar with the 9900. In the 990 it worked well because the CPU was TTL and not so fast. Once the CPU was on a single chip, the external RAM was not really fast enough to keep up, and instruction timings were dominated by the memory.

... compared to other compilers, since there won't be a significant penalty for using a stack.

The stack in memory is usually a bottleneck because memory is typically slow, so optimizations would be done to keep operands in registers. In this chip no optimizations are possible, but likely it wouldn't be too bad as long as the stack operations are flexible enough. But then I don't think you said this CPU has the sort of addressing that allows an operand in memory to be used and popped off the stack in one opcode, as many higher-level CPUs do. So adding the two numbers on the stack would involve keeping the top of stack in the accumulator, adding the next item on the stack from memory to the accumulator, then another instruction to adjust the stack pointer, which is also in memory. So two instructions? How many clock cycles?

What happens when there is a change in the instruction pointer of the Forth virtual machine? Calling a new word would require saving the current value of the Forth IP on the return stack (separate from the data stack) and loading a new value into the Forth IP? This is a piece of code typically called "next". It varies a bit between indirect and direct threaded code. Then there is subroutine threaded code that just uses the CPU IP as the Forth IP, and each address is actually a CPU call instruction.
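To make the "next" mechanism concrete, here is a toy threaded-code interpreter in Python. It is a sketch of the general idea only (a Forth IP, a separate return stack, and a loop that fetches the next word), not how it would be coded on this chip; the word names DOUBLE and MAIN are invented for the example.

```python
data_stack = []
return_stack = []


def lit(n):
    """Build a primitive that pushes the constant n."""
    def push():
        data_stack.append(n)
    return push


def plus():
    b = data_stack.pop()
    a = data_stack.pop()
    data_stack.append(a + b)


def dup():
    data_stack.append(data_stack[-1])


DOUBLE = [dup, plus]                     # : DOUBLE dup + ;
MAIN = [lit(3), lit(4), plus, DOUBLE]    # 3 4 + DOUBLE


def run(thread):
    ip = (thread, 0)                     # Forth IP: (current definition, index)
    while True:
        code, i = ip
        if i == len(code):               # end of definition: pop the saved IP
            if not return_stack:
                return
            ip = return_stack.pop()
            continue
        word = code[i]
        ip = (code, i + 1)               # "next": advance the Forth IP
        if isinstance(word, list):       # colon definition: save IP, enter it
            return_stack.append(ip)
            ip = (word, 0)
        else:                            # primitive: just execute it
            word()


run(MAIN)
print(data_stack)   # [14]
```

With subroutine threading the loop disappears: each entry in a definition becomes a native call instruction and the CPU's own call/return stack plays the role of return_stack.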

Rick C.

gnuarm.deletethisbit 7 years ago

For programs on such a small MCU, 256 words is likely overkill. But you don't need the above features for Forth. Subroutine threading uses call and return instructions instead of an address list.

Yeah, I'm pretty sure it is too small for a resident Forth, so a host would be required and a Forth can be compiled and subroutine threaded.

Rick C.

gnuarm.deletethisbit 7 years ago

Keep the TOS in the accumulator and I think you end up with

    add a, x;        // 1 clock
    inc DSTKPTR;     // adjust stack pointer - 1 clock?

Does that work? Reading below, I guess not.

What does idxm do? Looks like an indirect load? Can this address mode be combined with any operations? Are operations limited in the addressing modes? This seems like a very, very simple CPU, but for the money, I guess I get it.

OK

Rick C.

Paul Rubin 7 years ago

Do you mean you want a Forth with 8-bit data cells? What about the cells on the return stack, if there is one?

Yes.

No. Just load or store.
