The variable bit cpu

Was his name Turing?

Reply to
Richard Henry

[...]

Are you familiar with the instruction sets of any general-purpose microprocessors (x86, ARM, etc.)? If so, what changes to the instruction set would you make to address data in this format? Remember that memory buses do have to have a fixed (or at least maximum) width--a 16-bit parallel flash chip has 16 physical data pins, i.e. little pieces of metal that solder to the printed circuit board, one for each bit in the word. That is not something you can change at run time! Similarly, registers in a CPU must have a fixed length, or at least some maximum width; each bit in a register corresponds to a physical flip-flop and some other circuitry on the chip.

How do you think this would work with arithmetic operations, for example? The hardware to perform operations like addition and subtraction operates on many bits in parallel, for speed. That means that the chip designer must decide how large of a number the CPU can add in a single instruction (or in a single cycle at least). If he designs a 32 bit-wide adder, then the CPU will be able to add two 32-bit numbers very fast, probably in a single instruction cycle. If he chooses an 8 bit-wide adder, then the CPU would take much longer to do that, at least four cycles. However, the 8-bit adder consumes much less area on-chip than the 32-bit adder. How would you handle that with your 'variable bit' idea?
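The width tradeoff Jonathan describes can be sketched in software: a 32-bit addition built from four passes through an 8-bit adder, with the carry rippling between cycles. The function names here are mine, purely for illustration.

```python
def add8(a, b, carry_in):
    # One pass through a hypothetical 8-bit adder: (sum byte, carry out).
    total = a + b + carry_in
    return total & 0xFF, total >> 8

def add32_via_8bit_adder(x, y):
    # Add two 32-bit numbers a byte at a time, as a narrow-adder CPU would:
    # four cycles instead of one, in exchange for a quarter of the adder area.
    result, carry = 0, 0
    for i in range(4):
        byte_sum, carry = add8((x >> 8 * i) & 0xFF, (y >> 8 * i) & 0xFF, carry)
        result |= byte_sum << 8 * i
    return result  # final carry is discarded: wraps modulo 2**32, like hardware
```

The chip designer's choice is exactly this loop bound: a wider adder shrinks the cycle count but grows the silicon.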

Your format allows arbitrary-length fields, but it requires 2*n bits to store n bits of data. A fixed and predefined layout would only require n bits to store n bits of data. Can you think of a good compromise between those two extremes? Do you expect many small fields, or fewer large ones? What is the distribution? I think that you will find that almost any scheme is nearly optimal under some set of assumptions, but what kind of assumptions are reasonable? Regardless of the format that you choose, do you think that it is necessary to make changes to the instruction set to support that format? If so, what does it gain you?

The design of a new instruction set is a very specialized area. Most of the people who design new instruction sets have a very good idea of what kind of problems people use computers to solve, and what kind of instructions make life easy for the compiler (and what language someone will want to write a compiler for), and so on.

If you are interested in this field then maybe you would find it more rewarding to look at little problems. For example, your pocket calculator probably has some sort of custom microprocessor in it. What kind of instructions would you need if you only want to build a four-function pocket calculator, and nothing else? That is a much more manageable sort of thing to design, because the problem that you are trying to solve is much more constrained.

Jonathan

Reply to
Jonathan Westhues

I had similar ideas after I saw designers repeatedly believe that "disk drives could NEVER hold more than X", for some shortsighted X.

At least since CP/M decades ago, with its assumption that byte-sized numbers could hold the largest number of sectors we would ever see, we have seen limitations built into the numbering used for the geometry of disk layouts that have come back to bite us.

If there were arbitrarily extensible numbers used for this, perhaps not extended by a single bit at a time, maybe instead by a nibble, and all this numbering were buried within the drive and OS, then we might have avoided the 256 sector boundary, the (what was it, my mind is failing me, was it the) 850 meg boundary, the 1.6 gig boundary, the 137.5 gig boundary, the FAT 16 boundary, the... How many more of these were there that we have hit?

The problem is that I just don't learn from experience, even when I have the experience and tell myself that I won't fall for that one AGAIN... and again... and again... and again... AH, but this time, this time for sure I won't fall for that one. Just wait and see.

(but for CPU novelty, I want a complete Burroughs 5000 mainframe implemented in a single FPGA, with a DDRAM stick giving me silicon "tape drives" and thousands of times the original speed, running ALGOL and the original OS, and all neatly packaged in a regulation kitchen match box. Now that's novelty)

Reply to
Don Taylor

Hi,

I think I might have just invented the variable bit cpu :)

It works simply like this:

Each "data bit" has a "meta data bit".

The meta data bit describes if the bit is the ending bit of a possibly large structure/field.

The meta bits together form a sort of bit mask or bit pattern.

For example the idea is best seen when putting the data bits and meta bits below each other.

data bits: 01110101110101101010101
meta bits: 00000000100010001100001

In reality the data bit and meta bit are grouped together as a single entity which can be read into the cpu, since otherwise the cpu would not know where to start reading the data or meta bits. Now it simply starts with the first data + meta bit pair.

Because a cpu might need to know the length of the bit field up front the cpu/algorithm works simply as follows:

The cpu starts reading data and meta bits until it reaches a meta bit of 1.

All bits that form the variable bit field are now read and can be used etc.

The above example then looks like this:

data bits: 011101011#1010#1101#0#10101
meta bits: 000000001#0001#0001#1#00001

(The # sign is to indicate to you where the variable bit fields are.)

Notice how even single bit fields are possible.
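The scheme above can be sketched in a few lines of Python (the function names are mine, just to illustrate): a meta bit of 1 marks the last bit of each field, and the decoder simply reads until it sees one.

```python
def encode(fields):
    # Pair every data bit with a meta bit; meta = 1 marks the field's last bit.
    data, meta = [], []
    for bits in fields:                     # each field is a string of '0'/'1'
        data.append(bits)
        meta.append('0' * (len(bits) - 1) + '1')
    return ''.join(data), ''.join(meta)

def decode(data, meta):
    # Read data bits until a meta bit of 1 is seen; that closes one field.
    fields, start = [], 0
    for i, m in enumerate(meta):
        if m == '1':
            fields.append(data[start:i + 1])
            start = i + 1
    return fields

d, m = encode(['011101011', '1010', '1101', '0', '10101'])
# d == '01110101110101101010101'  (the data bits in the example above)
# m == '00000000100010001100001'  (the meta bits in the example above)
```

Note the cost this illustrates: n data bits always carry n meta bits along with them.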

The reason for the variable bit cpu with variable bit software is to save costs and to make computers/software even more powerful and useful ;)

For example:

Currently, fixed-bit software has to be re-written or modified, re-compiled, re-documented, re-distributed, re-installed, and re-configured when its fixed bit limit is reached and has to be increased, for example from 32 bit to 64 bit etc.

Examples are Windows XP going from 32 to 64 bit, and the Internet going from IPv4 to IPv6.

Bye, Skybuck.

Reply to
Skybuck Flying

In article , Jonathan Westhues wrote: [... variable bit length CPU ..]

You could deal with all the bits serially by, let's say, marking them on tape. The basic CPU could be basically a state machine that moves the tape forwards one, backwards one, marks a logical true, or marks a logical false depending on its state.

An old friend of mine had an idea like this a few years back.

--
--
kensmith@rahul.net   forging knowledge
Reply to
Ken Smith

What do you mean by serial? Surely you don't mean a bit at a time; memory throughputs of a few gigabits per second are not a big deal these days, but off-chip clocks of a few GHz most certainly are. Of course you can do block transfers of the width of your physical bus, as many as you need to get whatever bitstring you are after, but that is a nasty piece of hardware to design; and again, when you read that 600-bit field, where do you put it?

It could, but that defeats the purpose of having registers. The reason why a processor has registers, instead of just always working on external memory (or cache or ...) is that it is possible to make registers that can be accessed very, very quickly. It is not practical to make a memory of any size that can be accessed that quickly. A typical RISC processor can barely do anything with a piece of data without loading it in a register first. That has turned out to be a very practical design choice; witness the success of the ARM7, for example.

[...]

It really seems a lot like you are imagining that 1 bit moves every clock. That is good for academic examples (hence the jokes about the Turing machine; that is a very simple design for a computer that is often used for theoretical purposes), but it does not tend to make a practical design.

That doesn't make any sense to me at all. If you want to add a 50 bit number to anything using a 32-bit adder, then you have to add at least twice. Anyways, if you were only moving 1 bit every clock cycle, then you could just use a serial adder and you wouldn't care about this. You could build a machine with no registers at all, that always worked on external memory. It would be slow as hell, but you could. If it read the data one bit at a time then you could use one-bit arithmetic units too. At that point you have a one-bit computer, not a variable-bit computer....
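A bit-serial adder of the kind mentioned here is easy to sketch: a single full adder reused every cycle, least significant bit first, with one flip-flop's worth of carry state. The names are mine, for illustration only.

```python
def serial_add(a_bits, b_bits):
    # Bit-serial addition, LSB first: one full adder reused each 'cycle',
    # with a single stored carry bit between cycles.
    n = max(len(a_bits), len(b_bits))
    a = a_bits + [0] * (n - len(a_bits))    # zero-extend the shorter input
    b = b_bits + [0] * (n - len(b_bits))
    carry, out = 0, []
    for i in range(n):
        out.append(a[i] ^ b[i] ^ carry)     # sum bit of the full adder
        carry = (a[i] & b[i]) | (carry & (a[i] | b[i]))  # majority = carry out
    out.append(carry)                        # final carry-out bit
    return out
```

This is the "one-bit computer" end of the tradeoff: almost no adder hardware, but one cycle per bit of operand.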

So that you have multiple instructions executing at once, you mean? Of course big 'desktop' processors can and do schedule instructions on their own, so that they could do a bit of that. They are not very good at that, though. There are DSPs and other specialized processors that let you do the scheduling yourself; these work extremely well for the narrow application for which they were designed (lots of convolutions at once, or whatever). They are not very easy to program for anything else.

Let me put it this way: the most exciting new instruction set in 'recent history,' looking at commercial success and not weird academic projects that I don't know anything about, is probably the ARM. It does not even support unaligned 32-bit loads. That means that if you want to access a 32-bit variable at an address that is not evenly divisible by four, then you must write special software; you can't just load or store. I have not seen anyone upset about that. If you need it then you just specify __packed or whatever, and the compiler generates appropriate instructions.

I think that it is often a good idea to use variable-length data formats. I think that a variable-length register is insanity. I'm sure it's been done though. Does anyone know where?

[...]

There is nothing in the instruction set of any processor that I have ever seen that makes it impossible to implement a variable-length field. It is just a matter of whether the programmer chooses to. The programmers are working so many layers of abstraction above the instruction set that I doubt that it makes much of a difference to them.

[..]

I did not mean the order, although you might also be able to exploit that sometimes. If you expected many fields of just a few bits, then your scheme might be plausible. If you expected that all of your fields would be between 100 and 200 bits in length, then you might use a string of 8-bit (unsigned binary integer) lengths instead of your Turing machine style unary representation. That would consume 8 bits extra per field, instead of 100 to 200 bits extra per field. If you expected that the fields would usually be under 256 bits long but not always, then you could specify an 8-bit length, but say that length==255 means 'read another length octet, add 255 to it, and use that for the length.' People do things like this all the time, in software.
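The 'length==255 means read another octet' escape rule sketches out like this in Python (function names are mine, just for illustration):

```python
def encode_length(n):
    # Escaped length prefix: each 255 octet means 'add 255 and keep reading';
    # the first octet below 255 terminates the length.
    octets = []
    while n >= 255:
        octets.append(255)
        n -= 255
    octets.append(n)
    return octets

def decode_length(octets):
    # Returns (decoded length, number of octets consumed).
    i, n = 0, 0
    while octets[i] == 255:
        n += 255
        i += 1
    return n + octets[i], i + 1
```

Short fields pay one octet; only the rare long field pays more. Real formats (e.g. variable-length integers in many wire protocols) use the same idea in various shapes.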

Jonathan

Reply to
Jonathan Westhues

...

Not exactly the implementation you describe but the Inmos Transputer had a tag bit that allowed the processor to put together longer operands when the tag bit told it to append the next part to the operand being built.

Reply to
Don Taylor

A little bit.

Possibly few changes, since the cpu can detect the size of each variable bit field.

Assuming the CPU and the main memory are separated, the communication between these two could take place serially, as pointed out by others, thank you ;)

The CPU could act upon "virtual registers". These registers could be located in the main memory at certain locations and changed in location and/or size after or during boot.

The main memory chips might have extra logic to perform local memory copies inside the memory chip itself or maybe even other operations.

Alternatively the CPU could just have a few bits to do its thing.

Start simple ;) then look how to improve upon it... that might be the best way to go along :D

Assuming there are two variable bit fields that need to be added, the process is as follows:

I'll call these two fields: field A and field B.

Field A flows through the system via serial bit stream line 1.

Field B flows through the system via serial bit stream line 2.

The CPU, for example, or whatever you want to call it, scans both bit streams until the end bit is detected.

The CPU could use fixed-bit registers under the hood.

Suppose Field A is 4000 bits big and Field B is 50 bits big.

Suppose the CPU has 32-bit registers under the hood.

Only the last 32 bits of Field A and Field B will remain in the two registers after the scan is complete.

For this to work the registers need to use what I call a "push out" system.
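A rough sketch of that "push out" behaviour, under my own reading of it: the register shifts each incoming data bit in, the oldest bit falls off the far end, and scanning stops at the end-marker meta bit, so only the last `width` bits of the field survive. Names are mine, purely illustrative.

```python
def scan_field(stream, width=32):
    # stream is a sequence of (data_bit, meta_bit) pairs, first bit first.
    # A fixed-width register shifts data bits in; bits beyond `width`
    # are 'pushed out' (masked off). meta_bit == 1 ends the field.
    reg, length = 0, 0
    for data_bit, meta_bit in stream:
        reg = ((reg << 1) | data_bit) & ((1 << width) - 1)  # push out top bit
        length += 1
        if meta_bit == 1:
            break
    return reg, length
```

For a 4000-bit Field A scanned into a 32-bit register, only the final 32 bits remain, which is exactly the truncation the post describes.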

Dynamic compression, like Huffman, but sometimes that might even produce more bits ;)

Good but maybe a tricky question ;)

Current software has little choice but to choose small fields :D

I expect fields to grow when needed, allowing a never before seen flexibility in communication, storage, processing etc ;)

By distribution do you mean many little fields followed by a big field, etc.?

Why is this a relevant question? I assume maybe for some optimizations?

Maybe not :) see above ;)

Incredible flexibility of binary software which impacts storage, communication, processing etc.

As I grow older and gain more (programming) experience, I find myself more interested in what happens in lower layers :)

For different reasons... understanding/solving/exploiting/modifying performance bottlenecks, inflexibility, vulnerabilities, system workings etc ;)

Bye, Skybuck.

Reply to
Skybuck Flying

First alternative would be static Huffman compression and separating the data and meta bits into bytes.

Actually with static Huffman compression there are 9 possibilities:

meta bits: compressed encoding:

1: 0000 0000  0000
2: 0000 0001  0001
3: 0000 0010  0010
4: 0000 0100  0011
5: 0000 1000  0100
6: 0001 0000  0101
7: 0010 0000  0110
8: 0100 0000  0111
9: 1000 0000  1000

So only 4 bits are required for the meta bits by using static Huffman compression.

So that's "only" 1.5 times as many bits needed ;)
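The table above amounts to encoding the position of the (at most one) end-marker bit in each meta byte. A sketch, under the assumption that at most one marker bit is set per byte, which is what makes a 4-bit code sufficient; function names are mine:

```python
def compress_meta(meta_byte):
    # 4-bit code: 0 = no end marker in this byte, k = marker at bit
    # position k (1 = least significant). Assumes at most one bit set.
    return meta_byte.bit_length()

def expand_meta(code):
    # Inverse mapping: rebuild the one-hot meta byte from the 4-bit code.
    return 0 if code == 0 else 1 << (code - 1)
```

With more than one field ending inside the same byte, this code cannot represent the meta byte, so the scheme only works if fields are at least a byte apart (or the byte is split).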

Second alternative would be only applying the meta bits to a length prefix field.

The length prefix field indicates the length of the next field which is the data.

Bitstream: 1100011010101001110011100010101000
Metadata: 0001 001 0001

This would mean:

length 12 - 12 data bits, length 6 - 6 data bits, length 5 - 5 data bits:

Bitstream: 1100#011010101001#110#011100#0101#01000
Metadata:  0001#            #001#      #0001#

The meta data bits could be grouped together into a separate bitstream

or

The meta data bits could be interleaved with the length data

one bit of meta data, one bit of length data, one bit of meta data, one bit of length data, etc.

This last option would be good for communication protocols where otherwise packets might get lost and important length data might get lost ;)

The final interleaved bit stream would look like:

Interleaved Bitstream: 10100001#011010101001#101001#011100#00100011#01000

45 bits in total.

11 bits for length data

11 bits for meta data
23 bits for data

Let's see how many bits are required for, say, the value 2 million:

255 - 8 bits
512
1 k
2 k
4 k
8 k
16 k
32 k
64 k - another 8 bits
128 k
256 k
512 k
1 m
2 m - another 5 bits

That's indeed 21 bits... ;) every 10 bits is 1000 decimal ;)

To store the length value 21:

1
2
4
8
16
32 - 6 bits are needed

So using this scheme another 6 bits for meta data is a total of 12 overhead bits.

So 12 overhead bits vs 21 data bits.

There is a clear saving/reduction there ;) :)

With Huffman, about 12 overhead bits would be needed as well: 4 bits for each byte.
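The comparison can be checked with a quick calculation. Note my own rounding here: I take the length field as ceil(log2(n+1)) bits plus the same again in meta bits, which gives slightly smaller numbers than the hand count above (10 rather than 12 overhead bits for a 21-bit value), but the same 40-bit figure for a megabit field.

```python
import math

def unary_overhead(n_data_bits):
    # Original scheme: one meta bit per data bit.
    return n_data_bits

def prefix_overhead(n_data_bits):
    # Length-prefix scheme: a binary length field, plus meta bits
    # covering only the length field itself.
    length_bits = max(1, math.ceil(math.log2(n_data_bits + 1)))
    return 2 * length_bits

print(unary_overhead(21), prefix_overhead(21))        # 21 vs 10
print(unary_overhead(10**6), prefix_overhead(10**6))  # 1000000 vs 40
```

The crossover is immediate: once a field is longer than a few bits, the length prefix wins, and for large fields it wins by orders of magnitude.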

Now let's take a bigggggg data type.

Let's say 1 megabit. 1 million bits of data.

20 bits for a length field with another 20 bits for a meta data field.

Now only 40 bits are required for a data field which is 1 million bits long.

Huffman would require half, which is 0.5 million overhead bits. My original idea would require 1 million overhead bits.

This new encoding

prefix length + prefix length marker + data would only require 40 overhead bits ;)

And is still as flexible as any encoding ;) :)

Bye, Skybuck.

Reply to
Skybuck Flying

Actually since there are now two different encodings possible

A type field and type marker could be added in front ;)

This type field indicates which encoding was used ;)

So that's 2 bits overhead for the type field + type marker ;)

The encoder can now choose which encoding it would like to use ;)

Bye, Skybuck :D

Reply to
Skybuck Flying

Yes, something like that. I'll try to explain a bit further:

Think of the register as a sliding register.

It can slide over the big bit field in both directions.

Since the meta bits go from left to right

The sliding register has to move to the right first to find the end bit.

Data: 01010110101010111010101
Meta: 00000000000000000000001

Let's assume an 8-bit register.

It starts here:

Data: #0101#0110101011101010101
Meta: #0000#0000000000000000001

The register has to slide to the right

Data: 01010110#1010#11101010101
Meta: 00000000#0000#00000000001

Until it finds the 1 bit meta marker which indicates the end of the field.

Data: 0101011010101110101#0101#
Meta: 0000000000000000000#0001#

Now that the cpu/register has found the end of the field... which is actually the start of the value... since the least significant bit is at the right.

Now the cpu can start adding the field to the other field... so now the register has to move back to the left:

Data: 010101101010111#0101#0101
Meta: 000000000000000#0000#0001

Until it again reaches the far left

Data: #0101#0110101011101010101
Meta: #0000#0000000000000000001

Though the fun part is that this might have a problem...

Going left is easy since the end bit is at the right..

But going right might be a problem since there is no end marker to the left except when there is a previous field located there.

If the begin address of the memory is found this wouldn't be a problem.

But assuming that the variable bit field could be located anywhere in memory this could be a problem ;)

A kinda dirty solution might be the following:

Right at the start of the operation the first meta bit is set to 1

Data: #0101#0110101011101010101
Meta: #1000#0000000000000000001

After the addition/operation has completed this one is changed back to zero ;)

There is one exception.

If the cpu has detected that the first meta bit was already set at the start of the operation, this indicates a 1-bit data field.

In this case after the operation has completed the first meta bit should not be reset to zero.
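A sketch of that "dirty solution" as I read it, with the meta bits as a list of ints and names of my own invention: plant a temporary marker in the first meta bit, scan right for the real end marker, and restore the planted bit afterwards unless the field really was one bit.

```python
def field_bounds(meta, start):
    # 'Dirty solution': temporarily set the first meta bit so a sliding
    # register can later find the left edge of the field again.
    single_bit = (meta[start] == 1)   # exception case: a 1-bit data field
    meta[start] = 1                   # plant the temporary left-edge marker
    end = start
    if not single_bit:
        end += 1
        while meta[end] == 0:         # slide right to the real end marker
            end += 1
        meta[start] = 0               # restore: it was only our marker
    return start, end
```

Note the caveat from the post carries over: if `start` is not really the first bit of a field, the planted marker corrupts a neighbouring field's metadata while the operation is in flight.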

Bye, Skybuck.

Reply to
Skybuck Flying

I think I heard of him, the idea seems turing a bell.

Reply to
Clifford Heath

Ah, forgot that one.

I had a PDP-8. When I wasn't using it and some poor guy at Caterpillar Tractor had theirs fail and they desperately needed to get it working, I agreed to sell them mine on the condition that they not just scrap it.

A few years later they went bankrupt. What I should have done was have an agreement that I would loan it to them, and when they weren't using 8's anymore they would quietly pack all of the stuff up and ship it back to me.
Reply to
Don Taylor

In article , Don Taylor wrote: [..disks vs numbers..]

The disk's surface has to have groups of bits written onto it. So long as we have to refer to sectors on a disk we are doomed, because at some point the format of the disk will have to change when the numbers or groups of numbers get too big to fit in a sector.

IIRC, you missed the early 64 sectors per track one.

[...]

Why not a PDP-8? It was a RISC machine before anyone heard of the term. You could fairly easily code an entire PDP-8 in VHDL in under a day.

--
--
kensmith@rahul.net   forging knowledge
Reply to
Ken Smith

In article , mc wrote: [...]

The Intel iAPX 432 chip set was an example of this method. The variables were all given "object numbers". The object number was used to look up the nature of the variable in a table as well as its physical address.

They made a machine that would throw an exception if you tried to add characters or whatnot. It also prevented you from addressing memory that lapped over two variables. This was held out as a great improvement in the safety of software.

The product failed because it was way too slow and power hungry. You needed at least two many-legged glowing red chips to make a system, and the 286 would outperform it.
--
--
kensmith@rahul.net   forging knowledge
Reply to
Ken Smith

The bus could transfer 32 bits at a gulp and just ignore any bits beyond the marker. This would mean that 32 bit transfers would happen nearly back to back if the variable was more than 32 bits long.

--
--
kensmith@rahul.net   forging knowledge
Reply to
Ken Smith

I keep my old 432 databooks on my bookshelf to haul out when the Intel rep starts making glowing promises of the future.

Reply to
Richard Henry

...

I was doing compilers and interpreters at the time the 432 was being done. After it was cancelled I talked with one of the people on the team. He said that their first version of the compiler would validate references to memory, sometimes again and again, in the execution of a single statement. He claimed that if they had been given the chance that the next version of the compiler was slated to optimize away all the redundant validations and significantly increase the speed of the code.

So I question a bit some of the claims why the project was cancelled.

And it was the first attempt at that processor, somewhat like the 8086 was a first attempt. The 286 then benefitted from doing a second (or third, if you count the 80186) turn and cranking up the speed.

And you might consider the truly horrible code that was generated by the first few generations of 80x86 compilers. It was truly awful, big, slow, and compiler writers were all talking about these mythical huge gains that the Intel x86 architecture was supposed to have given them.

I came very very close to owning a complete 432 development system after it was all over. I questioned whether I should have done that or not.

Reply to
Don Taylor

In article , Richard Henry wrote: [...]

Unfortunately mine went away in a witless clean up.

The series-4 development system was also a great charmer. I think it is what put Intel out of the development system market. In some ways it is a shame.

The MDS-800 was their first 8080 development system. It ran an OS called ISIS-II. ISIS-II was years ahead of its time as a small systems OS. From that point forwards, every new system they introduced was better than the next.

--
--
kensmith@rahul.net   forging knowledge
Reply to
Ken Smith

In article , Don Taylor wrote: [...]

They still had the hardware requirement of making a fetch from the table to get the address before you could access the variable. IIRC, if you accessed a variable twice in a row, you could save on the table lookups, but if you were in a loop processing 4 variables, you ended up with 4 extra memory reads per loop.

We were not looking at a specific compiler but at the instruction set timing itself when we concluded that there was no good reason to go with the 432. We could almost get the throughput we needed with a 186 and were sure to with a 286.

[...]

I'd certainly count the 186 even though they were both planned at about the same point, they followed one another into the market.

I, however, was doing the 8086 in assembly. Did you know that the 8086 is not as fast as the 8051 for many of its instructions? The 8086 required a trip through the ALU to address a variable or do a jump or call instruction. The result was a fairly large number (22, IIRC) of clock cycles to do a jump. If you wanted to do things like memory moves and block I/O, you were best off adding an 8089 to the project. The 8089 was basically a second micro with a much simpler instruction set targeted at doing I/O quickly.

BTW: The last time I looked at compiled 8086 code, I was impressed that they had greatly improved things. Today, it looks like hand coding would only gain a factor of 2 or maybe 3 on the throughput, and perhaps not enough decrease in size to bother with.

I'm sure that at some point one will be worth money.

--
--
kensmith@rahul.net   forging knowledge
Reply to
Ken Smith
