A Challenge for serialized processor design and implementation

For really frugal concepts we may have to go back further in history, perhaps further than anybody else on this ng remembers. In 1964 I was challenged to put a cash register design on a 15000-bit torsional delay line (and failed miserably...). In the later 'sixties, I worked at Monroe (part of Litton) on a drum-based business computer, with an extra short loop to store registers. Very few transistors. We also played with a torsional delay line. In 1970 my predecessor to the hp35 used a 100-bit word, organized as 25 serial 4-bit characters. For their very successful calculator, hp then modified this architecture to bit serial, to further reduce pin count. Those were the days of 16-pin packages and expensive gates and flip-flops...

Peter Alfke

Reply to
Peter Alfke

Well, an FPGA version wouldn't get anywhere near these kinds of numbers. But I do agree that MISC variations are a very good idea (I have done some myself and others have suggested some in this thread). A key MISC feature, the loading of several instructions in a single memory access, would be missing from a serial implementation, however.

As additional sources of inspiration I would like to mention the control computer for the Minuteman missile:

formatting link

and the Kenbak-1 (considered by many as the first personal computer since it sold for $700 in 1971):

formatting link
formatting link

The documentation for the Kenbak includes a detailed explanation of how it works as well as all the schematics. It only addressed 256 bytes (in a shift register - not RAM), but otherwise is a pretty neat design.

-- Jecel

Reply to
Jecel

That's interesting: the instruction set, addressing modes and even the mnemonics look a bit like those of the 6502.

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de
Reply to
Frank Buss

How about the LGP-30 or RPC-9000? 30-bit words, 4K store, multi-address, 16 instructions IIRC, and Algol, Dartmouth Basic, realtime control apps etc.

Michael

Reply to
msg

There's a similar instruction set (IIRC without message sending or much complexity devoted to multi-tasking) from the same era, which might be worth pursuing ... the Lilith, from ETH Zurich.

Like the Transputer it would be considerably larger than a bit-serial machine, but probably more powerful.

I wonder if you can still get a Modula-2 compiler for it...

- Brian

Reply to
Brian Drummond

There was also Hillis' Connection Machine from the 1980s. It was a massively parallel machine with 64k bit-serial processors, programmed in data parallel Lisp or C. While the processor was very simple, I recall the processor interconnection network was a bit more complex.

-- Steve

-- 3/21/08

Reply to
Steven Guccione

I had a fair amount of experience with a serial processor in college -- a donated Bendix G-15, with rotating drum for the main store and vacuum tube logic circuits. It was a clever bit serial design, with lsb first. Everything was clocked to the rotating drum. Left shift was done with a clock delay and right shift done with a "short line" that only had 55 bits instead of the usual 56 bits for a double word register.
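To make the bit-serial, LSB-first idea concrete, here is a minimal sketch in Python. It is not the G-15's actual logic: the function names and the 8-bit word are illustrative assumptions. It shows one full adder plus a single carry flip-flop handling a whole word, a left shift as a one-clock delay, and a right shift as a recirculating line that is one bit short.

```python
def to_bits(value, width):
    """Split an integer into a list of bits, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]

def from_bits(bits):
    """Reassemble an LSB-first bit list into an integer."""
    return sum(bit << i for i, bit in enumerate(bits))

def serial_add(a_bits, b_bits):
    """Add two equal-length LSB-first bit streams, one bit per clock."""
    carry = 0                      # the single carry flip-flop
    out = []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)  # sum bit of a one-bit full adder
        carry = (a & b) | (carry & (a ^ b))
    return out, carry

def delay_one_clock(bits):
    """Left shift: every bit emerges one clock late (top bit is lost)."""
    return [0] + bits[:-1]

def short_line(bits):
    """Right shift: a line one bit short returns every bit one clock early."""
    return bits[1:] + [0]

total, carry_out = serial_add(to_bits(13, 8), to_bits(29, 8))
print(from_bits(total), carry_out)
```

LSB-first order is what lets the carry ripple naturally from one clock to the next, which is why the whole adder needs only one flip-flop of state.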

The University of Colorado had an experimental optical computer that was bit serial for cost reasons. It used 23 gates, IIRC, which were very expensive.

--
Thad
Reply to
Thad Smith

I wonder what the optimization target is? Size? I am pretty sure that at some point the overhead of handling a serial datapath exceeds the benefits. E.g. I would assume that a 2 or 4 bit datapath does actually use fewer resources than a 1 bit datapath.

Or is the defining target the ability to execute a program from a serial memory device? In that case one may wonder how suboptimal it would really be to use an existing parallel design and add some par/serial conversion logic.

Reply to
referringto

(snip)

I suppose so for FPGA or other modern systems.

Years ago I heard about the PDP8/S, a bit serial PDP-8, though the S was also mentioned as meaning slow. In the days of individual transistors it might have made more of a difference.

Still, once you have the overhead of bit counting (or whatever unit) it probably doesn't matter much.

-- glen

Reply to
glen herrmannsfeldt

I would even go as far as saying that a 1 bit datapath does not even represent the clock speed optimum. The clock speed optimum is probably around 4-8 bits.

Reply to
referringto

Correct. The speed actually only needs to be faster than the memory :)

The 4-bit-wide, SPI-access Winbond part Antti referenced would make a 4 bit datapath natural for most opcodes. Special cases like MUL and DIV may choose to go to 1 bit for a size/speed trade-off.

-jg

Reply to
Jim Granville

One of the smallest languages is Forth. There are a couple of CPU engines designed with Forth in mind, and they are really tiny. Among others, Bernd Paysan has published FPGA code. The Novix chip (Forth chip design) could be emulated in less than 4k gates (insofar as I recollect). They were remarkably efficient: I have seen bit-banged video output with the Novix name (20 MHz processor).

This you would have to add based on

--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- like all pyramid schemes -- ultimately falters.
Reply to
Albert van der Horst

What's the smallest instruction set supported by an existing and available C compiler? Is there a C compiler available for any of the tiniest stack machines, or even for an OISC (one instruction set computer)?

Reply to
Ron N.

Another advantage would be that it could use serial peripherals. A 16 bit timer with a min prescaling of 16 would be very cheap to implement as a shift register.
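One reading of this, sketched in Python below: if the count lives in a 16-bit circulating shift register and is incremented by a single half adder, each increment takes 16 clocks, which is where a minimum prescale of 16 would come from. This is an assumption about the intended design, and the class and attribute names are hypothetical.

```python
class SerialTimer:
    """16-bit timer kept in a circulating shift register (illustrative sketch)."""
    WIDTH = 16

    def __init__(self):
        self.bits = [0] * self.WIDTH  # index 0 is the LSB, the next bit to shift out
        self.carry = 1                # carry flip-flop, preset to 1 so each pass adds 1
        self.phase = 0                # how many bits of the current pass are done

    def clock(self):
        bit = self.bits.pop(0)               # LSB-first bit leaves the register
        self.bits.append(bit ^ self.carry)   # half adder: sum bit recirculates
        self.carry = bit & self.carry        # half adder: carry to the next bit
        self.phase += 1
        if self.phase == self.WIDTH:         # one full circulation = one timer tick
            self.phase = 0
            self.carry = 1                   # preset the carry for the next tick

    def value(self):
        """Counter value; only meaningful between passes (phase == 0)."""
        return sum(bit << i for i, bit in enumerate(self.bits))

timer = SerialTimer()
for _ in range(16 * 5):   # 80 clocks are 5 complete circulations
    timer.clock()
print(timer.value())
```

The hardware cost in this model is the shift register itself plus one XOR, one AND and two small pieces of state, regardless of the timer width.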

--
Best Regards,
Ulf Samuelsson
Reply to
Ulf Samuelsson

I am not sure what the smallest instruction set is that a C compiler has been written for. We have written quite a few C compilers for processors with unusual instruction sets and limited resources.

The key is, "Can you write code to do +, ~, |, >>, and conditional branch on z or carry?"

I think that is all that is actually needed.
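As a rough illustration of why that short list goes a long way, the Python sketch below builds the other common C operators from +, ~, | and >> alone. The 16-bit word size and all function names are assumptions made for the example, and the `& MASK` inside the two primitives only models a fixed-width register in Python; it is not an extra machine operation.

```python
MASK = 0xFFFF  # assumed 16-bit machine word

# The primitives from the list above.
def add(a, b): return (a + b) & MASK
def inv(a):    return ~a & MASK
def or_(a, b): return a | b
def shr(a):    return a >> 1

# Everything below uses only the primitives.
def and_(a, b): return inv(or_(inv(a), inv(b)))           # De Morgan
def xor(a, b):  return and_(or_(a, b), inv(and_(a, b)))   # OR minus the overlap
def neg(a):     return add(inv(a), 1)                     # two's complement
def sub(a, b):  return add(a, neg(b))
def shl(a):     return add(a, a)                          # doubling is a left shift

print(hex(and_(0b1100, 0b1010)), hex(xor(0b1100, 0b1010)), hex(sub(5, 7)))
```

Comparisons then reduce to a subtraction followed by the conditional branch on zero or carry, and multi-bit shifts to repeated single-bit ones, which is where the "ugly" part comes in.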

Some of the implementation may be ugly but it can be done. A surprising number of the processors 20 years ago lacked some of the logic operators and had shifts limited to rotates.

Regards,

-- Walter Banks Byte Craft Limited Tel. (519) 888-6911

formatting link

Reply to
Walter Banks

Actually, only ONE instruction is needed:

SUBLEQ (SUbtract and Branch if Less than or EQual to zero) -

formatting link

or MOVe The Ultimate RISC -

formatting link

Everything else can be derived from this ... at "some cost" ;) It's analogous to the NAND gate, from which all other logic can be built...
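For readers who have not met SUBLEQ, here is a minimal interpreter sketch in Python. It uses the common three-word form (a, b, c), meaning "mem[b] -= mem[a]; if the result is <= 0, jump to c, otherwise fall through". Halting on a negative jump target is a widespread convention assumed here, and the memory layout of the demo program is made up for the example.

```python
def run_subleq(memory, pc=0, max_steps=10_000):
    """Run a SUBLEQ program; a negative program counter halts the machine."""
    mem = list(memory)            # work on a copy
    steps = 0
    while pc >= 0:
        if steps == max_steps:
            raise RuntimeError("step limit reached")
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]                       # the subtract
        pc = c if mem[b] <= 0 else pc + 3      # the branch if <= 0
        steps += 1
    return mem

# B = B + A, built from three SUBLEQs and a scratch cell Z.
A, B, Z = 9, 10, 11
program = [
    A, Z, 3,     # Z = Z - A, i.e. Z = -A
    Z, B, 6,     # B = B - Z, i.e. B = B + A
    Z, Z, -1,    # Z = 0, and 0 <= 0 so jump to -1 (halt)
    7, 35, 0,    # data cells: A = 7, B = 35, Z = 0
]
print(run_subleq(program)[B])
```

Note that even a plain addition costs three instructions and a scratch cell, which gives a feel for the "some cost" involved.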

Cheers Krzysztof

Reply to
Krzysztof Kepa

IIRC from first-year computer science, the requirements for a machine that will run any kind of code at all (a universal Turing machine) can be surprisingly simple.

formatting link

I recall having to write at least one program for a Turing machine:

formatting link

Anything more just has to do with efficiency, ease of use, and that sort of stuff.

Best regards, Spehro Pefhany

--
"it's the network..."                          "The Journey is the reward"
speff@interlog.com             Info for manufacturers: http://www.trexon.com
Reply to
Spehro Pefhany

We have written a compiler for the 12 bit PICs. The PC is accessible as a register on these parts. We have also written a compiler for a processor that didn't have an indirect jump and where the PC was not available to be modified.

Good point. Indirect accesses were missing from my list. On the COP8, all indirect accesses could only be executed in the current 256 byte page.

No, but an interesting optimization assignment for a compiler course. It would initiate some abstract thinking.

There are at least three move machines around where every operation is a move and there is some logic between some addresses to give it functionality.
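A toy model of the move-machine idea, sketched in Python: the only instruction is "move source address to destination address", and the functionality comes from logic sitting behind a few addresses. The address map and the two function units here are invented for illustration and do not describe GRI, MAXQ or any other real part.

```python
class MoveMachine:
    """Every instruction is a (source, destination) move (illustrative sketch)."""
    # Hypothetical address map: plain RAM below 0x80, function units above.
    OPERAND_A, OPERAND_B, SUM, NOR = 0x80, 0x81, 0x82, 0x83

    def __init__(self):
        self.ram = [0] * 0x80
        self.a = 0
        self.b = 0

    def read(self, addr):
        if addr == self.OPERAND_A:
            return self.a
        if addr == self.OPERAND_B:
            return self.b
        if addr == self.SUM:                 # adder output appears as a location
            return (self.a + self.b) & 0xFFFF
        if addr == self.NOR:                 # so does the logic unit output
            return ~(self.a | self.b) & 0xFFFF
        return self.ram[addr]

    def write(self, addr, value):
        if addr == self.OPERAND_A:
            self.a = value
        elif addr == self.OPERAND_B:
            self.b = value
        elif addr < 0x80:
            self.ram[addr] = value
        # writes to the read-only result addresses are ignored

    def run(self, program):
        for src, dst in program:
            self.write(dst, self.read(src))

m = MoveMachine()
m.ram[0], m.ram[1] = 40, 2
m.run([(0, m.OPERAND_A), (1, m.OPERAND_B), (m.SUM, 2)])  # ram[2] = ram[0] + ram[1]
print(m.ram[2])
```

One addition takes three moves in this model, which matches the observation that such processors are cheap to build but need a lot of code to do anything.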

The GRI Computer Corp built a minicomputer move machine that I first encountered in Mexico being used to do basic database work in local public health administration offices. All the code for these as far as I know was written in assembler. I wrote a terminal driver for this project. The processor was inexpensive to build but required a lot of code to do anything.

There was a startup project in Germany about 10 years ago that had a move machine very similar to the MAXQ. The German project was based on a thesis, and from various conversations all three of these evolved separately: GRI in the early '70s, the German move machine a few years before MAXQ. The MAXQ has been more of a commercial success than I would have guessed when I first heard about it.

Regards,

-- Walter Banks Byte Craft Limited Tel. (519) 888-6911

formatting link

Reply to
Walter Banks

I think there is some misrepresentation of the definition of RISC vs CISC. The term "reduced" in RISC refers not to the number of instructions, but rather to the reduction in the number of functions that a single instruction performs.

The idea of RISC is that "smart" compilers can rearrange these simpler (reduced) instructions to perform larger operations more efficiently.

The SUBLEQ instruction as described on the Wikipedia page is *very* CISC in that it performs loads, stores, arithmetic, condition code testing, and branching all in a single instruction.

--
Michael N. Moran           (h) 770 516 7918
5009 Old Field Ct.         (c) 678 521 5460
Reply to
Michael N. Moran

Walter,

With the SUBLEQ instruction you'd have to produce a result that is equivalent to SHR. It is doable, but not time efficient: it can be done "manually" by repeatedly subtracting and slowly building up a value equal to half of the shifted value (as SHR = DIV by 2). Not efficient at all, but this sort of processing is just a theoretical exercise, so don't expect the sophistication of a high-level asm instruction like SHR ;) A good book on computer arithmetic/architecture could help here...
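The "manual" halving can be sketched as follows, written in Python but restricted to the two things SUBLEQ offers: subtraction and a "less than or equal to zero" test. The function name is made up, and this is only one possible way to arrange the loop, assuming a non-negative input.

```python
def shr_by_subtraction(value):
    """Return value // 2 for value >= 0 using only subtraction and a <= 0 test."""
    half = 0
    remaining = value - 1          # bias by one so "<= 0" means "fewer than 2 left"
    while not (remaining <= 0):
        half = half - (-1)         # increment, expressed as a subtraction
        remaining = remaining - 2  # remove one pair from what is left
    return half

print(shr_by_subtraction(10), shr_by_subtraction(7))
```

The loop runs value/2 times, so the cost grows with the operand itself rather than with the word length, which is the inefficiency being described.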

With MOVe (UltimateRISC) it's easier, as you can hook a useful bunch of logic up to particular addresses and just throw the data between them. It even got its 5 minutes at FPL back in '98.

@inproceedings{738909,
  author = {Adam Donlin},
  title = {Self Modifying Circuitry - A Platform for Tractable Virtual Circuitry},
  booktitle = {FPL '98: Proceedings of the 8th International Workshop on Field-Programmable Logic and Applications, From FPGAs to Computing Paradigm},
  year = {1998},
  isbn = {3-540-64948-4},
  pages = {199--208},
  publisher = {Springer-Verlag},
  address = {London, UK},
}

Cheers Krzysztof

Reply to
Krzysztof Kepa
