Tiny CPUs for Slow Logic

gnuarm.deletethisbit 7 years ago

I'm not designing anything at the moment. I'm trying to discuss an idea. Have you never brainstormed techniques and methods?

Rick C.

Tom Gardner 7 years ago

Many, many times; professionally, often in formal settings.

Generating ideas is relatively easy.

Refining an idea and defining it in sufficient detail that it can be assessed is a more difficult task. Very few ideas survive.

It is /always/ up to the "champion" (of an idea or product) to be able to convincingly explain the advantages and acknowledge the disadvantages.

gnuarm.deletethisbit 7 years ago

So I guess we need to find a champion.

Rick C.

oldben 6 years ago

Where do you find the memory for the program and data? On the FPGA, external, or floating on a cloud? Oldben

Rick C 6 years ago

don't need to happen at high speed. Simple CPUs can be built into an FPGA using a very small footprint much like the ALU blocks. There are stack based processors that are very small, smaller than even a few kB of memory.

be interested? Or is a C compiler mandatory even for processors running very small programs?

ago on an I/O board for an array processor which had its own assembler. It was very simple and easy to use, but very much not a high level language. This would have a language that was high level, just not C, rather something extensible and simple to use and potentially interactive.

This is a bit of an old thread. I don't recall having anything in mind. I started out just trying to consider what might be useful, but I really don't recall. All the CPUs I've designed had local memory.

Rick C. - Get 2,000 miles of free Supercharging - Tesla referral code - https://ts.la/richard11209

jim.brakefield 6 years ago

Missed this thread back in March. My interest was/is in treating EDIF as the machine language. If you can simulate all the logic in under, say, 50 usec, that's faster than human reaction times and suitable for controllers. So at 200 MHz that's 10K instructions, or ~10K logic gates per cycle.

So an FPGA uP design using one block RAM and under 200 LUTs is sufficient. Duplicate the uP if you have more logic than this. 200 LUTs is much less than $1.
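The budget arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name and the `cycles_per_instruction` parameter are made up for illustration, and the "~10K gates" figure assumes roughly one instruction per emulated gate on a CPU that retires one instruction per clock.

```python
def instructions_per_pass(clock_hz, deadline_s, cycles_per_instruction=1.0):
    """Instructions that fit into one full evaluation pass of the logic.

    cycles_per_instruction is an assumed figure, not something stated in the thread.
    """
    return int(round(clock_hz * deadline_s / cycles_per_instruction))


# 200 MHz clock, 50 microsecond deadline per pass over all the logic
budget = instructions_per_pass(200e6, 50e-6)
print(budget)  # 10000
```

If the soft CPU needs more than one clock per instruction, or more than one instruction per gate, the gate count per pass shrinks proportionally.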

Jim Brakefield

HT-Lab 6 years ago

Are you trying to emulate a small FPGA on a microcontroller? This sounds overly complicated, especially as an EDIF file is normally full of complex (and not always fully documented) primitives, which will be a real pain to simulate.

Perhaps I am wrong and this is a brilliant idea? I would be interested to hear some more,

Regards, Hans

Jon Elson 6 years ago

Gee, emulating a small CPU on an FPGA might be a lot better way to go. I have used smallish FPGAs to do some jobs where typically a microcontroller would be used, and they worked pretty well.

I've also used Xilinx CPLDs from the 9500 and CoolRunner II family for simple small logic needs, and they have done quite well. These cost just a couple $ in small quantity. The smallest, 9536XL is just over $1 in single quantity at Digi-Key.

Jon

Rick C 6 years ago

When it comes to the Xilinx CoolRunner II parts, only the small ones are cheap. The larger ones get very expensive for what they can do.

Rick C. - Get 1,000 miles of free Supercharging - Tesla referral code - https://ts.la/richard11209

jim.brakefield 6 years ago

My experience with EDIF was the output of VHDL/Verilog compilers for FPGAs. EDIF output was lots of simple gates and black boxes for block RAM. The FPGA vendors then write a mapper, placer and router into their silicon. So for the application with low duty cycle gates it's more efficient to emulate the gates via a small CPU and its single block RAM. For a while Xilinx supported a similar approach via HDL to "C" and run on their ARM or PPC hard cores.

Now EDIF is just a bunch of black boxes, simple gates or as complex as desired, wired together.

There are applications, such as industrial control, that run the control logic at kilohertz rates. Very low duty cycle for FPGA LUTs. So one is trading speed for density.

As a side note, ASIC logic simulators have some of the same issues; however, one wants to run the ASIC simulation as fast as possible, essentially in the megahertz range.

In summary, there is a range of applications that do logic "simulation" over a wide range of cycle rates, from millisecond human reaction times all the way up to "as fast as possible". Would argue that there need to be tool chains that support the six orders of magnitude range of logic cycle rates. In particular, not much attention goes to the low end of cycle rates, which currently is supported by real-time embedded tools.

Jim Brakefield

Rick C 6 years ago

I get what you are saying. But why would anyone first design something in HDL only to have it compiled and then simulated in a CPU? Why would such low bandwidth processing not be coded in a sequential language conventionally used on CPUs, like C? Skip the hassle of compiling in an HDL tool and then importing to a simulator running on the target CPU? Where is the advantage exactly?

I will say that if you use the language output from the place and route tools you will get something more like LUTs which are likely to simulate faster than individual gates. Remember that unless you have some very tiny amount of logic that can be implemented in some sort of immense look up table, every connection between gates is a signal that will need to be scheduled to "run" when the inputs change. Fewer entities means less scheduling... maybe.
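To illustrate why a LUT-level netlist might be cheaper to simulate than a gate-level one, here is a minimal Python sketch in which a k-input LUT is a single indexed lookup into a truth-table integer. The names `lut_eval` and `build_init`, and the example function, are hypothetical and chosen only for this illustration; no vendor format is implied.

```python
from typing import Callable, Sequence


def lut_eval(init: int, inputs: Sequence[int]) -> int:
    """Evaluate a k-input LUT. `init` holds the 2**k truth-table bits;
    inputs[0] is the least significant bit of the table index."""
    index = 0
    for bit_pos, value in enumerate(inputs):
        index |= (value & 1) << bit_pos
    return (init >> index) & 1


def build_init(func: Callable[..., int], k: int) -> int:
    """Build the truth-table integer for `func` by enumerating all 2**k inputs."""
    init = 0
    for index in range(1 << k):
        bits = [(index >> i) & 1 for i in range(k)]
        if func(*bits):
            init |= 1 << index
    return init


# Three gates, (a AND b) XOR (c OR d), collapse into one 16-bit table.
LUT4_INIT = build_init(lambda a, b, c, d: (a & b) ^ (c | d), 4)
print(lut_eval(LUT4_INIT, [1, 1, 0, 0]))  # 1
```

Here three gate evaluations and two intermediate signals become one lookup with no intermediate signals to schedule, which is the "fewer entities" effect in miniature.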

Rick C. + Get 1,000 miles of free Supercharging + Tesla referral code - https://ts.la/richard11209

jim.brakefield 6 years ago

|>But why would anyone first design something in HDL only to have it compiled and then simulated in a CPU?

Was thinking of EDIF as a universal assembly language.
Parallel processing via multiple interconnected processors.
Hard real-time: easy to determine worst case delay = # of instructions executed per cycle.
Very few processors are under $0.10.

|>every connection between gates is a signal that will need to be scheduled to "run" when the inputs change

Was thinking in terms of synchronous simulation where each "gate" is evaluated only once per clock cycle.
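One possible reading of "each gate is evaluated only once per clock cycle" is a levelized (cycle-based) evaluation: sort the combinational gates so each one comes after its drivers, evaluate them once in that order, then clock all flip-flops together. The Python sketch below assumes that reading; the netlist layout, the `levelize` and `step` names, and the 2-bit counter are invented for illustration and are not taken from EDIF or from the thread.

```python
from graphlib import TopologicalSorter

OPS = {
    "and": lambda a, b: a & b,
    "xor": lambda a, b: a ^ b,
}

# Hypothetical netlist: gate name -> (operation, input signal names).
# "d1" is listed before its driver "c" on purpose, so ordering matters.
GATES = {
    "d1": ("xor", ("q1", "c")),
    "d0": ("xor", ("q0", "en")),
    "c": ("and", ("q0", "en")),
}
FLOPS = {"q0": "d0", "q1": "d1"}  # flip-flop output -> its D input


def levelize(gates):
    """Order gates so every gate is evaluated after the gates that drive it."""
    deps = {name: ins for name, (_, ins) in gates.items()}
    return [n for n in TopologicalSorter(deps).static_order() if n in gates]


def step(gates, order, flops, state, inputs):
    """One clock cycle: evaluate each gate exactly once, then clock the flops."""
    values = {**inputs, **state}
    for name in order:
        op, ins = gates[name]
        values[name] = OPS[op](*(values[i] for i in ins))
    next_state = {q: values[d] for q, d in flops.items()}
    return values, next_state


ORDER = levelize(GATES)
state = {"q0": 0, "q1": 0}
counts = []
for _ in range(4):
    _, state = step(GATES, ORDER, FLOPS, state, {"en": 1})
    counts.append(state["q0"] + 2 * state["q1"])
print(counts)  # [1, 2, 3, 0]
```

The fixed order also makes the worst case delay easy to count, since every cycle runs the same list of gates. The approach only works when the combinational logic has no loops: flip-flop outputs are treated as sources, and `TopologicalSorter` raises `CycleError` if a gate depends on itself through other gates.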

Rick C 6 years ago

I'm pretty sure that does not exist. Race conditions exist in simulations if you interconnect gates as you are describing. VHDL handles this by introducing delta delays which are treated like small delays, but no time ticks off the clock, just deltas. Then each gate can have a delta delay associated with it. This in turn requires that each signal (gate output) be evaluated each time any of the inputs change. Because of the unit delays there can be multiple changes at different delta delays.

Otherwise the input to a FF must be written as an expression with defined rules of order of evaluation.

Or do you have a method of assuring the consistency of evaluation of signals through gates?
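The delta-delay behaviour described above can be sketched in a few lines of Python. This is a deliberately simplified model, not VHDL's full simulation cycle: every gate is assumed to have exactly one delta of delay, and the `settle` function, the netlist layout and the example circuit are hypothetical.

```python
def settle(gates, values, changed, max_deltas=1000):
    """Propagate signal changes in delta cycles; simulated time never advances.

    gates:   name -> (function, input signal names)
    values:  current value of every signal (updated in place)
    changed: set of signals that have just changed
    Returns (number of delta cycles, evaluations per gate).
    """
    fanout = {}
    for name, (_, ins) in gates.items():
        for src in ins:
            fanout.setdefault(src, set()).add(name)

    evals = {name: 0 for name in gates}
    deltas = 0
    while changed:
        to_eval = set()
        for sig in changed:
            to_eval |= fanout.get(sig, set())
        if not to_eval:
            break
        if deltas == max_deltas:
            raise RuntimeError("signals did not settle (combinational loop?)")
        deltas += 1
        # All gates in one delta see the values from the previous delta.
        new = {}
        for g in to_eval:
            func, ins = gates[g]
            new[g] = func(*(values[i] for i in ins))
            evals[g] += 1
        changed = {g for g, v in new.items() if v != values[g]}
        values.update(new)
    return deltas, evals


# y = a XOR NOT(NOT a) is 0 once settled, but "a" reaches y along two paths
# of different depth, so y is evaluated twice after a single change of "a".
gates = {
    "n1": (lambda a: a ^ 1, ("a",)),
    "n2": (lambda n1: n1 ^ 1, ("n1",)),
    "y": (lambda a, n2: a ^ n2, ("a", "n2")),
}
values = {"a": 0, "n1": 1, "n2": 0, "y": 0}
values["a"] = 1
deltas, evals = settle(gates, values, {"a"})
print(deltas, evals["y"], values["y"])  # 3 2 0
```

In this model `y` briefly becomes 1 in the first delta and returns to 0 in the third, which is the "multiple changes at different delta delays" effect: the final value is consistent, but the gate cannot be evaluated just once unless the evaluation order is fixed in advance.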

Rick C. -- Get 1,000 miles of free Supercharging -- Tesla referral code - https://ts.la/richard11209
