What's Your Favorite Processor on an FPGA?

I have been working on processor designs for FPGAs for quite a while. I have looked at the uBlaze, the picoBlaze, the NIOS, two from Lattice, and any number of open-source processors. Many of the open-source designs were stack processors, since they tend to be small and efficient in an FPGA. The J1 is one I had pretty much missed until lately. It is fast and small, and it looks like it wasn't too hard to design (although looks may be deceptive); I'm impressed. There are also the b16 from Bernd Paysan, the uCore, the ZPU, and many others.
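Part of why stack processors stay small in an FPGA is that the core needs little more than two stacks and a small decoder, with no register file. A rough sketch in Python (not the actual J1 instruction set; the opcode names here are invented for illustration):

```python
# Minimal stack-machine sketch: a data stack, a return stack, and a
# handful of operations. Opcodes are hypothetical, not the J1's encoding.

def run(program):
    """Execute a list of (op, arg) tuples; return the final data stack."""
    ds, rs, pc = [], [], 0          # data stack, return stack, program counter
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "lit":             # push a literal
            ds.append(arg)
        elif op == "add":           # replace top two items with their sum
            b, a = ds.pop(), ds.pop()
            ds.append(a + b)
        elif op == "dup":           # duplicate top of stack
            ds.append(ds[-1])
        elif op == "call":          # save return address, jump
            rs.append(pc)
            pc = arg
        elif op == "ret":           # return to caller
            pc = rs.pop()
    return ds

# 2 + 3, then duplicate the result
print(run([("lit", 2), ("lit", 3), ("add", None), ("dup", None)]))  # [5, 5]
```

A real J1-class core adds memory access, conditional branches, and combined ALU/stack-move encodings, but the decode stays about this simple, which is why it synthesizes so small.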

Lately I have been looking at a hybrid approach that combines register-style addressing with a stack CPU, in order to access parameters on the stack directly. It looks interesting.

Anyone else here doing processor designs on FPGAs?

--

Rick

Sounds like something where you'd get more responses on comp.arch.fpga.

Are you cross-posting?

--
Bill Sloman, Sydney

My guys have been ragging me for years to do designs that have soft-core CPUs in FPGAs, but I've been able to convince them (well, I am the boss) that they haven't made sense so far. They use up too many FPGA resources to make a mediocre, hard-to-use CPU. So we've been using separate ARM processors, and using a bunch of pins to get the CPU bus into the FPGA, usually with an async static-RAM sort of interface.

There's supposed to be a Cyclone coming soon, with dual hard-core ARM processors and enough dedicated program RAM to run useful apps. When that's real, we may go that way. That will save pins and speed up the CPU-to-FPGA logic handshake.

If the programs get too big for the on-chip sram, I guess the fix would be external DRAM with CPU cache. There goes the pin savings. At that point, an external ARM starts to look good again.

--

John Larkin                  Highland Technology Inc 
www.highlandtechnology.com   jlarkin at highlandtechnology dot com    

Precision electronic instrumentation 
Picosecond-resolution Digital Delay and Pulse generators 
Custom timing and laser controllers 
Photonics and fiberoptic TTL data links 
VME  analog, thermocouple, LVDT, synchro, tachometer 
Multichannel arbitrary waveform generators


The choice of an internal vs. an external CPU is a systems design decision. If you need so much memory that external memory is warranted, then I guess an external CPU is warranted. But that all depends on your app. Are you running an OS? If so, why?

The sort of stuff I typically do doesn't need a USB or Ethernet interface, both great reasons to use an ARM... free, working software that comes with an OS like Linux. (By free I mean you don't have to spend all that time writing or debugging a TCP/IP stack, etc.)

But there are times when an internal CPU works even for high-level interfaces. In fact, the J1 was written because its designers needed a processor to stream video over Ethernet and the uBlaze wasn't so great at it.

I get the impression your projects are about other things than the FPGA/CPU you use, and cost/size really aren't so important. Then you have less reason to squeeze on size, power, and unit cost, and more reason to minimize development cost. If so, that only makes sense.

My next project will be similar in hardware requirements to a digital watch, but with more processing...

--

Rick


FPGA RAM is expensive compared to the SRAM or flash that comes on a small ARM, like an LPC1754. Something serious, like an LPC3250, has stuff like hardware vector floating point and runs 32-bit instructions at 260 MHz. Both ARMs have UARTs, timers, ADCs, DACs, and Ethernet, for $4 and $7 respectively.

We generally run bare-metal, a central state machine and some ISR stuff. I've written three RTOSs in the past but haven't really needed one lately.

Yeah, we use the GCC compilers. Stuff like Ethernet and USB stacks are available and work without much hassle. I don't know what the tool chains are like for the soft cores.

We do a fair amount of "computing", stuff like signal filtering, calibrations with flash cal tables, serial and Ethernet communications, sometimes driving LEDs and LCDs. There have been a minority of apps simple enough to use a MicroBlaze, and I didn't think that acquiring/learning/archiving another whole tool chain was worth it for those few apps, what with an LPC1754 costing $4.

Sometimes you can just do the computing "in hardware" in the FPGA and not even need a procedural language. So the use case gets even smaller.

I am looking forward to having a serious ARM or two (or, say, 16) inside an FPGA. With enough CPUs, you don't need an RTOS.

--

John Larkin


That is not a useful way to look at RAM unless you are talking about buying a larger chip than you otherwise need, just to get more RAM. That is like saying the routing in an FPGA is "expensive" compared to the PCB. It is there as part of the device; use it or it goes to waste.

If you need Ethernet, then Ethernet is useful. But adding Ethernet to an FPGA is no big deal. Likewise for nearly any peripheral.

No point in discussing this very much. Every system has its own requirements. If external ARMs are what works for you, great!

What do you do for the networking code? If you write your own, you are typically doing a lot of work for naught, unless you have special requirements.


So you are using networking code, but no OS?

The soft cores I work with don't bother with that sort of stuff. The apps are much smaller and don't need that level of complexity. In fact, that is what they are all about, getting rid of unneeded complexity.

Ethernet comms can be a hunk of code, but the rest of what you describe is pretty simple stuff. I'm not sure there is even a need for a processor. Lots of designers are just so used to doing everything in software they think it is simple.

Actually, I think everything you listed above is simple enough for a uBlaze. What is the issue with that?

I find HDL to be the "simple" way to do stuff like I/O and serial comms, even signal processing. In fact, my bread and butter is a product with signal processing in an FPGA, not because of speed; it is just an audio app. But the FPGA *had* to be there. An MCU would just be a waste of board space, of which this design has very little.

Xilinx has that now you know. What do they call it, Z-something? Zync maybe?

How about 144 processors running at 100's of MIPS each? Enough processing power that you can devote one to a serial port, one to an SPI port, one to flash a couple of LEDs and still have 140 left over. Check out the GreenArrays GA144. Around $14 the last time I asked. You won't like the development system though. It is the processor equivalent of an FPGA. I call it a FPPA, Field Programmable Processor Array. It can be *very* low power too if you let the nodes idle when they aren't doing anything.
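The node-per-peripheral idea above can be caricatured in software (the names here are invented, and the GA144 actually programs its nodes in a Forth dialect, not Python): each peripheral gets its own tiny processor running a trivial loop, and the nodes cooperate by passing messages rather than by an RTOS scheduling them.

```python
# Each "node" is a coroutine that idles until a message arrives -- a loose
# analogy for a GA144 node sleeping until its neighbor writes to it.

def led_node(outbox):
    state = 0
    while True:
        msg = yield                 # node idles here until sent a message
        if msg == "toggle":
            state ^= 1
            outbox.append(("led", state))

def uart_node(outbox):
    while True:
        ch = yield                  # idle until a character arrives
        outbox.append(("uart_tx", ch))

outbox = []
led = led_node(outbox); next(led)   # prime each coroutine to its first yield
uart = uart_node(outbox); next(uart)

led.send("toggle")
uart.send("A")
led.send("toggle")
print(outbox)   # [('led', 1), ('uart_tx', 'A'), ('led', 0)]
```

The point of the analogy: with a dedicated node per device, each loop stays a few lines long and there is no scheduler, which is also why idle nodes can draw almost no power.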

--

Rick


We got an Ethernet stack somewhere. It's flag driven into the central state machine. If there's an input buffer full, we get a flag, and the state machine processes it when it gets around to it. It may build an outgoing message and queue that into the stack, which runs mostly at interrupt level. A typical system would parse incoming commands and generate replies. It's awfully simple. We usually share the whole parser/executor/reply generator code among multiple ports concurrently, like USB and Ethernet and serial; the buffers and flags all look alike.
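That flag-driven structure can be sketched in a few lines of Python (the port names and helpers here are hypothetical, not anyone's actual code): the interrupt-level stack fills a buffer and raises a flag, and the central loop polls the flags and runs one shared parser for every port.

```python
from collections import deque

class Port:
    """One communications port; buffers and flags look alike for all ports."""
    def __init__(self, name):
        self.name = name
        self.rx_full = False        # flag raised by the interrupt-level stack
        self.rx_buf = b""
        self.tx_queue = deque()     # outgoing messages for the stack to send

def parse_and_reply(msg):
    """Shared parser/executor/reply generator -- trivial for this sketch."""
    if msg == b"*IDN?":
        return b"EXAMPLE,INSTRUMENT,0,1.0"
    return b"ERR"

def poll_once(ports):
    """One pass of the central state machine over all ports."""
    for p in ports:
        if p.rx_full:               # message waiting? handle it when we get to it
            p.tx_queue.append(parse_and_reply(p.rx_buf))
            p.rx_full = False

eth, ser = Port("ethernet"), Port("serial")
eth.rx_buf, eth.rx_full = b"*IDN?", True
poll_once([eth, ser])
print(eth.tx_queue.popleft())       # b'EXAMPLE,INSTRUMENT,0,1.0'
```

Because `poll_once` never blocks and the parser takes a buffer rather than a device, the same command code serves USB, Ethernet, and serial concurrently, which is the sharing described above.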


Right.


Imagine a 16-channel thermocouple simulator.


My ideal computer would have one CPU that's just the OS, and a hundred or so assignable cores, one for each device, file system, program, or thread. The OS would be a few thousand lines of code, if that. With serious hardware protection, it would be totally virus/trojan/crash immune.

--

John Larkin


Something like this:

formatting link
?

--

bitrex


ZYNQ. There is a rather low-cost eval board named ZedBoard (
formatting link
$395 ) which comes with Linux pre-installed on an SD card. The Zynq chip on board contains a hard dual-core Cortex-A9 and ~1M gates' worth of 7th-generation logic.

Regards, Mikko


A soft core is a fun thing to do, but otherwise I see no use. Except for a very few special applications, a standalone processor is better than an FPGA soft core on every point, especially the price.

Vladimir Vassilevsky DSP and Mixed Signal Designs

formatting link


Utter nonsense.

N cores means N^2 interfaces, with the associated version hell. As for protection, that is not a technical problem; it is a paradigm problem.

Vladimir Vassilevsky



Better than having all N processes in the same memory space. Much better.

The architecture that I propose could have absolute hardware sandboxing of any process, even drivers and things that can do DMA. Each processor would have memory management - loaded by the OS - that knows the difference between code, data, and stack. The OS would be tiny, absolutely protected, known reliable. Just note all the things that Wintel did wrong, and don't do that.
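As an illustrative model of that kind of sandboxing (all names and numbers here are invented, not a real MMU design): the OS loads a region table into each core's memory management unit marking ranges as code, data, or stack, and any access of the wrong kind, such as executing data, faults instead of proceeding.

```python
class ProtectionFault(Exception):
    pass

class CoreMMU:
    """Per-core region table, loaded by the OS, that knows the difference
    between code, data, and stack."""
    def __init__(self, regions):
        # regions: list of (base, limit, kind) tuples
        self.regions = regions

    def check(self, addr, access):
        """access is 'execute', 'read', or 'write'; returns the region kind."""
        for base, limit, kind in self.regions:
            if base <= addr < limit:
                if access == "execute" and kind != "code":
                    raise ProtectionFault("executing %s at %#x" % (kind, addr))
                if access == "write" and kind == "code":
                    raise ProtectionFault("writing code at %#x" % addr)
                return kind
        raise ProtectionFault("unmapped address %#x" % addr)

mmu = CoreMMU([(0x0000, 0x4000, "code"),
               (0x4000, 0x8000, "data"),
               (0x8000, 0x9000, "stack")])
print(mmu.check(0x0100, "execute"))   # allowed: fetching from a code region
try:
    mmu.check(0x4100, "execute")      # faults: don't execute data
except ProtectionFault as e:
    print("fault:", e)
```

The "absolute" part of the claim comes from the table being loaded only by the OS core; the sandboxed processor has no instruction that can modify it.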

--

John Larkin

The annoying thing is the CPU-to-FPGA interface. It takes a lot of FPGA pins and it tends to be async and slow. It would be great to have an industry-standard LVDS-type fast serial interface, with hooks like shared memory, but transparent and easy to use.

Something like ARM internal to an FPGA could have a synchronous, maybe shared memory, interface into one of those SOPC type virtual bus structures without wasting FPGA pins.

--

John Larkin

Everyone is entitled to their opinion, but this is *far* from fact. The CPUs in my designs have so far been *free* in recurring cost. They fit in a small part of the lowest-priced device I can find.

Most people think of large, complex code that requires lots of RAM and big, fast external CPUs. I think in terms of small, internal processors that run fast in a very small code space. So they fit inside an FPGA very easily, likely not much bigger than the state machines John talks about.

BTW, have you looked at any of the soft cores? The J1 is pretty amazing in terms of just basic simplicity, and fast too, at 100 MHz. They talk about the source being just 200 lines of Verilog. I don't know how many LUTs the design is, but from the block diagram I expect it is not very big. I'm not sure I can improve on it in any significant way.

--

Rick


...and end up making new and innovative mistakes (just channeling Murphy here).

--

Ralph Barone


Xilinx Zynq, a Cortex-A9 with an FPGA on the side.

-Lasse



DEC wrote operating systems (TOPS10, VMS, RSTS) that ran for months between power failures, time-sharing multiple, sometimes hostile, users. We are now in the dark ages of computing, overwhelmed by bloat and slop and complexity. No wonder people are buying tablets. DEC understood things that Intel and Microsoft never really got, like: don't execute data.

--

John Larkin


We gave up on Xilinx a few years ago: great silicon, horrendous software tools. Altera is somewhat less horrendous.

--

John Larkin


You really should stick to things you understand. Every Intel processor since the 80286 has included protection mechanisms to prevent the execution of data. But they have to be used properly... Blame Microsoft and all the other software vendors, but don't blame Intel. Actually, this is an issue, like so many, that is determined by the marketplace. When users put value on these features and spend their money accordingly, the market will respond. So don't buy Windows anymore if you don't like it. Microsoft will either respond or go out of business. But that's not going to happen. People just like to complain about MS while they continue giving them their money.

--

Rick

I was responsible for some VMS-11 systems, and I forgot to boot the system every summer, when no-one was around. Booting the system the next year, everyone was happy :-).


PDP-11/RSX-11M+ (early 1970s) had separate I/D spaces; VAX/VMS (mid-70s) had executable program sections.
--

upsidedown
