Home grown CPU core legal?

Followup to: By author: "Glen Herrmannsfeldt" In newsgroup: comp.arch.fpga

The PDP-11 is still very much a CISC architecture... I think it would require a lot more logic than necessary.

Below are my design notes for my hacked-up architecture, currently called "NanoRISC."

I have no way to know how this is turning out. My current goal is to make sure it implements in < 1000 LEs on Cyclone, without using blockRAM for the register file. Fundamentally it's a personal research hack project.

-hpa

NanoRISC goals
- Minimal hardware consumption
- Technology independent
- Free licensing

-> 16-bit addressing, data width, instruction word

-> Single issue in-order RISC

-> Short pipeline (probably 3 stages)

-> Deterministic timing (1 cycle/insn, taken branch 2 cycles?)

-> Separate ports for I and D to take advantage of dual-port RAM
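To make the deterministic-timing goal concrete, here is a minimal sketch, assuming one cycle per instruction and two cycles for a taken branch (the notes mark the branch figure with a question mark, so it is tentative). The function name and the trace format are invented for this illustration.

```python
def cycle_count(trace, taken_branch_cycles=2):
    """Count cycles for an instruction trace.

    trace: iterable of booleans, True where the instruction is a taken branch.
    Every other instruction is assumed to cost exactly one cycle.
    """
    cycles = 0
    for is_taken_branch in trace:
        cycles += taken_branch_cycles if is_taken_branch else 1
    return cycles


# A ten-instruction loop body ending in a taken branch, run 100 times.
loop_body = [False] * 9 + [True]
print(cycle_count(loop_body * 100))  # 1100
```

With timing this regular, the cost of a loop can be worked out by hand from the listing, which is the point of the goal.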

0000 NNNN NNNN NNNN - IMM (supplies upper 12 bits of q or Is field)
0001 0000 SSSS DDDD - JMP Rd,Rs (PC
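The two encodings that survive in the post can be decoded with a few shifts and masks. This is only a sketch: the rest of the opcode map is cut off, so everything else decodes as unknown here, and `full_immediate` assumes the instruction following an IMM prefix contributes the low 4 bits of a 16-bit value, which the post implies but does not spell out. All function names are made up for the example.

```python
def decode(word):
    """Decode one 16-bit instruction word into a (mnemonic, fields) pair."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("instruction words are 16 bits")
    if word >> 12 == 0b0000:
        # 0000 NNNN NNNN NNNN: IMM prefix carrying 12 immediate bits
        return ("IMM", {"n": word & 0x0FFF})
    if word >> 8 == 0b0001_0000:
        # 0001 0000 SSSS DDDD: JMP Rd,Rs
        return ("JMP", {"rs": (word >> 4) & 0xF, "rd": word & 0xF})
    # The remaining encodings are not given in the post.
    return ("UNKNOWN", {})


def full_immediate(imm_n, low4):
    """ASSUMPTION: IMM supplies bits 15..4, the next instruction bits 3..0."""
    return ((imm_n & 0x0FFF) << 4) | (low4 & 0xF)


print(decode(0x0ABC))   # IMM prefix with n = 0xABC
print(decode(0x1023))   # JMP with rs = 2, rd = 3
```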
H. Peter Anvin

Aren't there already several open source FPGA CPUs available? Anyone have a few links handy?

--

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX
rickman

You should try

formatting link

Erez.

Erez Birenzwig

optional

Peter Alfke

In doing a 'clean slate' small FPGA core, there is merit in choosing an opcode width that matches the FPGA Block RAM / multiplier widths (e.g., I've seen 9-bit opcodes used). Did you look at that?

-jg

Jim Granville

Followup to: By author: Peter Alfke In newsgroup: comp.arch.fpga

Since my affiliation is "neither" (I just happen to own a Cyclone board since that was the biggest FPGA I could get with free tools) I guess it's more of a :-| than either of those :^)

Unless Xilinx's tools are complete crap, which I'd find unlikely, I would expect the tools to infer LUT-RAMs for the register file if synthesized for a Xilinx part. It's all part of "no vendor lock-in."

Also, this is mostly a project I'm doing for fun. If it happens to be useful at some point in the future, so much the better, if not, I've still achieved my goal of grokking FPGA synthesis better.

-hpa

--
 at work,  in private!
If you send me mail in HTML format I will assume it's spam.
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64
H. Peter Anvin

Or

formatting link

Petter

--
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

Petter Gustad

Followup to: By author: snipped-for-privacy@designtools.co.nz In newsgroup: comp.arch.fpga

Some vendors have 9/18-bit blockRAMs, some don't. I'm trying to be as generic as possible. It also makes it easier to port tools like gas/binutils/gcc.

-hpa

H. Peter Anvin

& Off chip memory is also easier....

As FPGAs get ever cheaper and Block RAM gets larger, and factoring in relative speeds, there is scope to define a CPU that takes a coarse approach to cache, one that:

- reserves a BlockRAM (or two) for code: SW interrupt loops and cache-locked code. This gives very fast responses, and lowers RFI and total power (minimum off-chip bus/external memory activity).

- uses another BlockRAM as a code cache, where the CPU is allowed to pause while it loads from slower memory. Dual-port RAM would allow a FIFO-style load. External memory could be WORD, BYTE or even serial ( FPGA_Stamp :)

- uses the other BlockRAMs as standard data RAMs, including fast context register switching for interrupts / parameter passing.

Design ends up with a single CPU, but two distinct areas of FAST and SLOW code and data.
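A toy model may help picture the FAST/SLOW split described above. It is a sketch under invented assumptions: the region sizes, the per-word miss cost, and the class and method names are all placeholders, and the FIFO-style overlapped load that a dual-port RAM would allow is deliberately ignored, so a miss stalls for the whole block.

```python
class CoarseFetchModel:
    """Toy fetch-latency model: a locked fast region plus a one-block code cache."""

    def __init__(self, fast_words=512, block_words=512, miss_cycles_per_word=4):
        self.fast_words = fast_words              # locked region: [0, fast_words)
        self.block_words = block_words            # size of the single cached block
        self.miss_cycles_per_word = miss_cycles_per_word
        self.cached_block = None                  # index of the block currently held

    def fetch_cycles(self, addr):
        if addr < self.fast_words:
            return 1                              # locked code never stalls
        block = addr // self.block_words
        if block == self.cached_block:
            return 1                              # hit in the code-cache BlockRAM
        # Miss: pause the CPU while the whole block streams in from slow memory.
        self.cached_block = block
        return 1 + self.block_words * self.miss_cycles_per_word


model = CoarseFetchModel()
print(model.fetch_cycles(10))     # 1: inside the locked region
print(model.fetch_cycles(2000))   # 2049: first touch of a slow block
print(model.fetch_cycles(2001))   # 1: same block is now cached
```

The interrupt path stays at one cycle per fetch no matter what the slow region is doing, which is the response-time argument made above.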

Does anyone know of work on FPGA cores with this hardware focus?

- jg

Jim Granville

Isn't some form of BlockRAM a de facto standard on all 'consider for new design' FPGAs, so not using it would restrict your options?

-jg

Jim Granville

So are hardware multipliers these days. I believe all the latest chips have them as well as multi-standard IOs.

rickman

I am convinced that a generic version would inevitably be inferior in performance and/or price, compared to the dedicated one. I know that Ken and Göran used many Xilinx-specific features when they designed PicoBlaze and MicroBlaze. And I assume that the Altera guys were operating in a comparable way when they designed Nios. The generic ones will be the "worst of both worlds", unless you really believe in clairvoyant synthesis.

Peter Alfke, Xilinx

Peter Alfke

But there's nothing like rolling your own for fun/educational purposes and then having something useful at the end of it all. I think it would also be easier to add new features and enhancements to your own design, since the code was developed in your own way of thinking and coding style. In the end, maybe it will just be "yet another RISC core" on opencores.org, but at least it's yours and you'll have a good understanding of its capabilities...even if all it can do is blink an LED!

BP

Bruce P.

Followup to: By author: snipped-for-privacy@designtools.co.nz In newsgroup: comp.arch.fpga

Some form thereof, yes, but I tend to run out of blockram a lot faster than I run out of LUTs. Note that I'm not saying you couldn't use it; I'm saying I want to be at < 1000 LE without using blockram. About 300-400 of that would be replaceable with a blockram.

-hpa

H. Peter Anvin

Followup to: By author: Peter Alfke In newsgroup: comp.arch.fpga

Of course. But it would have the advantage that it could run on either.

-hpa

H. Peter Anvin

"Clairvoyant Synthesis", now that sounds like a good product! Is there a startup somewhere working on that?

I can see product announcements touting the new FPGA CS that eliminates the need for product planners, designers and even testing as it would already know that the design was ready for production! No specs to write, no coding to compile and even simulation could be skipped. Just think of the design you want and out pops a bit file. Boy, what will they think of next?

rickman

When I was in school we worked on a paper design of a microcoded processor as a teaching tool. We had homework on it and had to design new features on our exams. I even had a question about it on my Master's comprehensive exam. I approached my professor about designing a simulation of it to run on the Univac mainframe for the undergrad students to learn from. But I guess I was ahead of my time, as he did not see the value in that. Or maybe he had the foresight to see the complications it might create :)

I guess this is a pretty common thing at universities now. All they have to do is get you an FPGA design package with a simulator.

rickman

Hello Jim,

For the most part, I agree with your legal analysis regarding cloning cores. The primary issues have to do with whether or not an architecture is protected by patent and if not, how you market or use the resulting clone so as not to infringe existing copyrights, trademarks, trade dress or confidentiality/license agreements.

But in my view, this is not the main issue with cloning an existing architecture. The main issue has to do with how you are going to debug it once you get it into an FPGA wherein it is completely embedded with no address or data lines coming out.

Accordingly, I'd like to take this opportunity to simply state that I've posted developmental versions of both my 8051 and 6805 microcontroller cores for free downloading at

formatting link

The 8051 is in original Verilog RTL format and includes on-chip JTAG real-time monitoring and debug logic, including a 144-channel trace buffer.

I've successfully synthesized them using Synplify and Quartus II web edition.

If anyone would like to start a thread about how the JTAG real-time monitor works, I'd be happy to engage.

Regards,

Jerry D. Harthcock
QuickCores

p.s., I also have the 9-bit RISC which I'd be happy to post if any

Jerry D. Harthcock

Both Brand A and Brand X have midsize (8-16 bit wide + parity, with

as well.

Thus it is safe to have a parameterized cache and register file which instantiate the correct size memories as part of your design, and still remain vendor neutral. You WANT to use these devices for both register file and memory.

The thing that Brand A is missing is the SRL16/LUT-as-RAM feature, which gives very small memories (16-64x1b). On the other hand, Brand X's BlockRAMs (midsized memories) are all the same size, while Brand A's memories come in different sizes.
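One way to picture a parameterized, vendor-neutral memory choice is a small selection function driven by per-target capabilities. This is a hedged sketch, not anyone's actual flow: the `Target` record, `pick_memory`, and every block size below are placeholders rather than datasheet figures. Only the 64-deep limit for LUT RAM and the "one size versus several sizes" contrast come from the post. Whether small memories should go to LUT RAM at all is a policy choice, so it is left as a flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    name: str
    has_lut_ram: bool        # small distributed memories available?
    lut_ram_max_depth: int   # deepest memory worth building from LUT RAM
    block_ram_bits: tuple    # block sizes on offer, in bits (placeholders)


def pick_memory(target, depth, width, allow_lut_ram=True):
    """Return a label for the smallest primitive that holds depth x width bits."""
    bits = depth * width
    if allow_lut_ram and target.has_lut_ram and depth <= target.lut_ram_max_depth:
        return "lut_ram"
    fitting = [b for b in sorted(target.block_ram_bits) if b >= bits]
    if fitting:
        return f"block_ram_{fitting[0]}"
    return "multiple_block_rams"


# Placeholder capability tables, for illustration only.
BRAND_X = Target("brand_x", True, 64, (16384,))
BRAND_A = Target("brand_a", False, 0, (512, 4096))

print(pick_memory(BRAND_X, 16, 16))                       # lut_ram
print(pick_memory(BRAND_X, 16, 16, allow_lut_ram=False))  # block_ram_16384
print(pick_memory(BRAND_A, 16, 16))                       # block_ram_512
```

The design using the memory only states depth and width; the per-target table decides what gets instantiated.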

--
Nicholas C. Weaver                                 nweaver@cs.berkeley.edu
Nicholas C. Weaver

I think generic will be inferior, but not THAT inferior, given the register files and caches can and should be done in the "everyone has" BlockRAMs.

But in order to make it generic, these structures will probably need target-specific parameters and options (dual ported or not, size range) which are instantiated.

Also, the other big disadvantage of the generic version is going to be a lack of placement. Placement is good for 10-30% performance increases.

Nicholas C. Weaver
