CPU design

To implement the higher-level protocols of my TCP/IP stack for the Spartan 3E starter kit, I plan to use a CPU, because I think this needs fewer gates than pure VHDL. The instruction set can be limited, because I'm happy to trade more instructions per task for fewer gates, and it doesn't need to be fast, so I can design a very orthogonal CPU, which may need even fewer gates. The first draft:

[link]

It is a kind of 68000 clone, but much simpler. What do you think of it? Any ideas for reducing the instruction set even further, without the drawback of needing more instructions for a given task?

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de

Why not use PicoBlaze, which is freely available? Or MicroBlaze, if you need more speed?

Peter Alfke, from home.


Peter Alfke wrote:

To Peter,

I can answer:

1) PicoBlaze is too small
2) MicroBlaze is not free

As far as I understand his reasons, the OP really is going to try to build a full SoC with a DDR memory controller and Ethernet!

Sure, it would be WAY CHEAPER to just use MicroBlaze! Cheaper in terms of money, that is: the time and effort needed to build anything comparable to what you can achieve with EDK and a few mouse clicks definitely costs more than 495 USD, unless your personal time doesn't count at all.

To Frank,

I was wondering what you are up to.

Well, doing a 16-bit CPU doesn't make much sense; a small 32-bit RISC isn't much larger. You could also use OpenFire and add Wishbone interfaces, which makes more sense than trying to do it all from scratch, unless you just want to do everything yourself (and that is your goal, rather than achieving the best result with the least effort).

BTW, on 16 bit: I was looking at ColdFire, and there is no ColdFire FPGA clone yet, but that might make sense (kind of a 68000, but more RISC-like, with a 16-bit instruction bus).

Antti


PicoBlaze looks a bit like my idea:

[link]

But it has more instructions and is not as orthogonal as my CPU, so I think I can synthesize my CPU with fewer gates. Using memory instead of registers means that it is slower than PicoBlaze, but this is no problem for me. Maybe the main reason, though, is that it is fun to design CPUs :-)

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de

There are a number of RISC cores at opencores.org. The full-featured one (complete with a GNU toolchain) is the OpenRISC 1000, but that may be way more than you need.

Do you want a processor you can simply instantiate, or are you willing to tweak so you get the features you want? If so, you could take one of the less ambitious cores and adjust the instruction set to optimise it for your application.

Cheers

PeteS


Also look at the Lattice Mico8 and PacoBlaze. The CPU is the easy part; finding a compiler and debug software will be harder. You also need to match the memory interface to the core: once you go beyond block RAM, these CPUs get less elegant.

One idea that appeals to me is to optimise an FPGA CPU to run from the fast serial flash devices that are very cheap and small, with negligible pin cost. That probably means 16-bit opcodes (down from the 18 bits allowed by block RAM) and 32-bit registers, with plenty of size-extended opcodes and skip opcodes. The next step would be a way to load and lock a BRAM or two with interrupt and speed-critical code.

-jg

Jim Granville

To me it looks too large :-)

Yes, and mainly for learning VHDL, so using finished products doesn't help me and is not as much fun as doing it all by myself.

The first use case for this CPU will be executing programs from block RAM to access all the hardware on the Spartan 3E starter kit. 32 bits are not needed for this, but I'll use generics for the bit width, because 32 bits may be more useful when using more than 64 kB of memory or for more complicated tasks.

Do you have a link? Searching for OpenFire on Google returns only ads for fireplaces :-)

ColdFire looks interesting, but even more complicated than PicoBlaze, with all the old 68000 instructions, like traps.

It looks like all these CPUs use registers. I know that calculations are faster with registers than in memory, and that opcodes may be a bit smaller when registers are the source or destination, but are there any other drawbacks to using no registers? I really like my idea of using only memory.

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de

A few years ago I made a programmable DMA controller for a system that had no interrupt capability (too many sources), so all I/O needed to be polled and packed into larger chunks of data. It has a few instructions: read, mask AND, mask OR, conditional jump, and write to the host. It is smart enough to deal with an E1 chip. As I write this, the controller is at work in thousands of cards...

--
Reply to nico@nctdevpuntnl (punt=.)
Find businesses and shops at www.adresboekje.nl
Nico Coesel
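
A sequencer with the kind of instruction set described in the post above can be modeled in a few lines. The following Python sketch only illustrates the idea and is not the actual design: the opcode names (`READ`, `AND`, `OR`, `JZ`, `WRITE`), the single 8-bit accumulator, and the `read_port` callback are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

Instr = Tuple[str, int]  # (opcode, operand); hypothetical encoding


def run(program: List[Instr], read_port: Callable[[int], int],
        max_steps: int = 1000) -> List[int]:
    """Run a tiny polling program and return the bytes written to the host."""
    acc = 0                          # single 8-bit accumulator (an assumption)
    pc = 0                           # program counter
    host: List[int] = []
    for _ in range(max_steps):       # guard against endless polling loops
        if pc >= len(program):
            break
        op, arg = program[pc]
        pc += 1
        if op == "READ":             # read an I/O port into the accumulator
            acc = read_port(arg) & 0xFF
        elif op == "AND":            # mask AND
            acc &= arg
        elif op == "OR":             # mask OR
            acc = (acc | arg) & 0xFF
        elif op == "JZ":             # conditional jump, taken if acc == 0
            if acc == 0:
                pc = arg
        elif op == "WRITE":          # hand the accumulator to the host
            host.append(acc)
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return host


# Poll status port 0 until bit 0 is set, then forward data port 1 to the host.
POLL_AND_FORWARD: List[Instr] = [
    ("READ", 0),
    ("AND", 0x01),
    ("JZ", 0),
    ("READ", 1),
    ("WRITE", 0),
]
```

With only a conditional jump and two mask operations, polling a status bit and forwarding a data byte already fits in five instructions, which suggests why such a controller can stay very small.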

[link] for ISA

jacko

jacko wrote:

ALL project folders are EMPTY!! :(

Antti


Look at a uC like the Zilog Z8: it has registers, but also a register frame pointer. So you use 4 opcode bits, but can shift that window across all memory. The Intel 196 used a 256-byte register file.

Registers are used to keep the opcodes smaller, but you are right that it is a simple trade off. PICs use only RAM and accumulator.

A drawback of registers is that the step up from register to memory can give quite a code-size hit, and with FPGA BRAM there is no speed penalty for a memory block much larger than most uC register files.

Some processors allowed a split register frame, so you could (e.g.) half-shift the register map, giving 8 scratch registers and 8 register parameters in a procedure call. To me this is a very good idea, but it seems not to be widely used; it would map very well onto FPGA BRAM.

-jg

Jim Granville
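
The sliding register window described in the post above is easy to model. The Python sketch below assumes a 256-entry memory, a 16-register window addressed by 4 opcode bits, and a half-window shift of 8 on call and return. The class and method names are made up for the illustration and do not correspond to any particular Z8 or 196 feature.

```python
class RegisterWindow:
    """Registers r0..r15 are just a movable window into ordinary memory."""

    def __init__(self, mem_size: int = 256, window: int = 16) -> None:
        self.mem = [0] * mem_size    # e.g. one block RAM
        self.window = window         # registers visible at once
        self.rp = 0                  # register frame pointer

    def _addr(self, n: int) -> int:
        if not 0 <= n < self.window:
            raise IndexError(f"register r{n} is outside the window")
        return (self.rp + n) % len(self.mem)

    def read(self, n: int) -> int:
        return self.mem[self._addr(n)]

    def write(self, n: int, value: int) -> None:
        self.mem[self._addr(n)] = value & 0xFF

    def call(self) -> None:
        # Half-shift: the caller's r8..r15 become the callee's r0..r7
        # (parameters); the callee's r8..r15 are fresh scratch registers.
        self.rp = (self.rp + self.window // 2) % len(self.mem)

    def ret(self) -> None:
        self.rp = (self.rp - self.window // 2) % len(self.mem)
```

A caller that stores a parameter with `write(8, 42)` and then executes `call()` lets the callee see it as `read(0)`, without any copying; only the frame pointer moves.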

Ethernet is not included in the MicroBlaze price, but is $1500 extra. (Still a lot cheaper than doing it yourself if you place any reasonable value on your time, and if you don't want to learn from it.)

OpenCores has several CPUs that include full gcc support.

I am pushing around pieces of OpenCores to try to get a CPU + Ethernet + DDR + application-specific system together on Wishbone. I am being stymied by only just learning VHDL and knowing no Verilog at all, and by not being able to understand the various error messages that ISE spits out. It would be nice if there were an assembled system that I could take apart and modify.

(I found a more restricted system under OpenCores as rs232_syscon that includes a PIC microcontroller and a serial-port-device that lets you prod at the Wishbone bus, which might help.)

--
David M. Palmer  dmpalmer@email.com (formerly @clark.net, @ematic.com)

i'm only one person, and so i am doing the documentation first

cheers.

feel free to contribute.

jacko

I wonder if there is some more scientific study about this. When I'm trying to write a typical piece of code, it looks like registers are really better:

; swap 6 byte source and destination MACs
        .base = 0x1000
p1:     .dw 0
p2:     .dw 0
tmp:    .db 0
        move #5, p1
        move #11, p2
loop:   move.b (p1), tmp
        move.b (p2), (p1)
        move.b tmp, (p2)
        sub.b p2, #1
        sub.b p1, #1
        bcc.b loop

40 bytes with my instruction set.
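
For reference, the task being coded here (exchanging the 6-byte destination and source MAC addresses at the start of an Ethernet frame) can be written as a small behavioral model. This Python sketch is not part of the proposed instruction set; the function name `swap_macs` is an assumption, and the loop simply counts down from offset 5 the way the two pointers do in the listing.

```python
def swap_macs(frame: bytearray) -> None:
    """Exchange destination (bytes 0-5) and source (bytes 6-11) MACs in place."""
    if len(frame) < 12:
        raise ValueError("frame too short for two MAC addresses")
    for i in range(5, -1, -1):       # count down like the p1/p2 pointers
        frame[i], frame[i + 6] = frame[i + 6], frame[i]
```

Any candidate instruction set only has to reproduce this behavior, so the model can serve as a check when comparing hand-written listings.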

The same with something like a 68000 instruction set:

        move #5, a0
loop:   move.b $0(a0), d0
        xchg.b $6(a0), d0
        move.b d0, $0(a0)-    ; register indirect with displacement and post-dec
        bcc.b loop

12 bytes, if I need 2 bytes per instruction for the larger range of addressing modes with registers. How many logic gates do I need to support registers? Maybe not too many, if I can design it without too many special cases.

I don't need it, but for a really fast CPU something like MIPS should work:

[link]

Every instruction, including its arguments, is 32 bits. When reading from 32-bit block RAM, this should be really fast. How much memory does a program need?

        xor $1, $1, $1
        addi $1, $0, #6
loop:   lb $3, ($1)
        sb $3, ($2)
        addi $1, $1, #1
        addi $2, $2, #1
        xori $1, $4, #6
        bne loop

32 bytes (but maybe shorter; I don't know MIPS assembler very well).

And something like the good old 6502:

        ldx #6
loop:   lda 0, x
        tay
        lda 6, x
        sta 0, x
        tya
        sta 6, x
        decx
        bcc loop

13 bytes.
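
To put the four figures side by side, here is a small Python sketch that normalizes them to the smallest listing. The byte counts are taken exactly as quoted in this post and have not been re-measured; the labels and function names are assumptions for the illustration. The helper `fixed_width_size` just restates the arithmetic for a fixed-width encoding: the MIPS-style listing has eight instructions of 4 bytes each.

```python
# Code sizes in bytes, exactly as quoted in the post (not re-measured).
SIZES = {
    "memory-only draft": 40,
    "68000-like": 12,
    "MIPS-like": 32,
    "6502": 13,
}


def relative_sizes(sizes):
    """Return each size as a multiple of the smallest one."""
    smallest = min(sizes.values())
    return {name: size / smallest for name, size in sizes.items()}


def fixed_width_size(instruction_count, bytes_per_instruction=4):
    """Code size when every instruction has the same width."""
    return instruction_count * bytes_per_instruction


if __name__ == "__main__":
    ratios = relative_sizes(SIZES)
    for name in sorted(SIZES, key=SIZES.get):
        print(f"{name:18s} {SIZES[name]:3d} bytes  x{ratios[name]:.2f}")
```

On these numbers the memory-only draft needs a bit more than three times the code of the 68000-style version for this one loop, which is the trade-off against gate count being weighed here.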

Maybe a CPU like MIPS, with fixed 32-bit instructions, but as easy to write assembler for as the 68000, would be a good idea?

Forth looks interesting, too:

[link]

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de

Before comparing ISAs for code size, you need to look into the actual FPGA area needed for each ISA. A full 68000 would most likely fill up most of your Spartan-3E device, and that is without the Ethernet MAC. Also, the Ethernet MAC is usually much larger than the CPU. For the OpenCores IP, just add up the sizes of the CPU and the Ethernet MAC before you make any decision.

Unless you want to spend hours and hours hand-assembling the TCP/IP stack code, you need to find a CPU that has a full C compiler.

If your Ethernet doesn't need to run at maximum speed, I would pick the Ethernet Lite core from Xilinx, since it is much smaller than most available cores. If you need maximum Ethernet performance, then you need the full Ethernet MAC, which is much larger than the Lite version. You will need a TCP/IP stack, and lwIP is most likely the best choice. For the CPU, a MicroBlaze will most likely be the smallest choice for you.

The decision also depends on how you value your time and how much money you want to spend on this. If the interesting part is creating this solution without any time limits, then you should create most of it from scratch. If the interesting part is using the solution, then I would spend money to speed up the development.

Göran Bilski


Frank Buss wrote:

[link]

Search for OpenFire in Google Groups (comp.arch.fpga) and you get it as the first hit.

Antti


Maybe the compiler has to be taken into account as well. Although very nice hand-written code is possible, most code is generated by a compiler. For a new instruction set it is possible to modify gcc, but I am not sure about the efficiency.

/Wayne


Adjusting the instruction set to the problem domain is a good idea. I'll try to write the functions, first, maybe using domain specific instructions (like a block copy command), and then I'll implement the core for it.

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de