Dual-stack (Forth) processors

I want to bring my knowledge about Forth processors up to date, so I'm posting some questions.

Who is currently selling Forth processors?

What happened to forthchip.com?

Is there a community that is actively involved in discussing and/or developing FPGA-based Forth chips, or more generally, stack machines?

Has anyone done any substantial DSP work in Forth? Are there libraries of code available?

How about hardware Forth implementations that include dedicated DSP hardware?

Thanks in advance!

--
Davka

You will get a lot of replies from at least one of the groups you posted to. But from what I can tell, there is only a small collection of Forth chips or cores that have been done. The effort is not mainstream and so it is not cohesive in any way that I can see.

If you want to reach into the past, HP used to make a minicomputer that was stack oriented. I don't know anything about the design other than it was in the days of LSI rather than VLSI. Geeze, we must be working on SuperUltraLSI by now!

Again, I think you will find that Forth is very much not mainstream for DSP. In general, DSP does not favor any typical processing architecture. That is why they design chips just for DSP. If you want to do DSP, then I suggest that you learn about DSP. If you want to use Forth, then do that. But I would not expect to see Forth be a significant benefit when doing DSP.

That would not be hard to do in an FPGA. Or you can run Forth on a DSP chip. The latter might gain you more benefit, depending on your DSP application. Some DSP apps are much better done on an FPGA. It depends on whether you can make use of multiple MAC units or if just the typical one or two found in a DSP chip will do.

--

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX

Our motion control system runs a subset of Forth in a PLD. It is a very simple 16-bit, two-stack, Harvard-architecture RISC processor that uses fewer than 128 macro cells. Execution speed with the slowest parts is 25 MIPS code, 6 MIPS Forth. It has 128 spare macro cells that run at 50 MHz for customer options such as encoders (6 @ 10 MHz), step and direction (4 axes @ 1 MHz), data capture (100M samples/sec), PWMs, etc.

Do you want the Forth to supervise a set of DSPs, or be a DSP?

jrh

Interesting. What PLD and speed spec give the above numbers? You should probably watch for the Altera MAX II family when it is released.

Jim Granville

MPE is selling a VHDL clone of the RTX2000 for use in FPGAs.

Stephen

-- Stephen Pelc, snipped-for-privacy@INVALID.mpeltd.demon.co.uk MicroProcessor Engineering Ltd - More Real, Less Time

133 Hill Lane, Southampton SO15 5AF, England tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691 web:
formatting link
- free VFX Forth downloads

I want the Forth to direct the operation of a multiply-and-accumulate module, and to have access to a fast complex multiply.
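
For readers unfamiliar with the operation being asked about: a complex multiply reduces to four real multiplies whose products are summed in pairs, which is why it maps naturally onto a multiply-and-accumulate unit. A minimal Python sketch follows; the function names `mac` and `complex_multiply` are illustrative only and do not correspond to any particular Forth chip or library.

```python
def mac(acc, x, y):
    """One multiply-and-accumulate step: returns acc + x * y."""
    return acc + x * y


def complex_multiply(a_re, a_im, b_re, b_im):
    """Compute (a_re + j*a_im) * (b_re + j*b_im) with four MAC steps.

    Hypothetical helper for illustration; a hardware MAC unit would
    keep `re` and `im` in its accumulator between steps.
    """
    re = mac(0, a_re, b_re)      # a_re*b_re
    re = mac(re, -a_im, b_im)    # ... - a_im*b_im
    im = mac(0, a_re, b_im)      # a_re*b_im
    im = mac(im, a_im, b_re)     # ... + a_im*b_re
    return re, im


print(complex_multiply(1, 2, 3, 4))  # (1+2j)*(3+4j) -> (-5, 10)
```

A three-multiply variant exists as well, trading one multiply for extra additions; whether that is a win depends on how many multipliers the target hardware offers.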

-Davka

[...]

Have you seen

formatting link
?

I used to have a rather soft spot for the Harris RTX processors, which is more than I can say for FORTH which I regard as an invention of the devil ;-)

-- Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how VHDL * Verilog * SystemC * Perl * Tcl/Tk * Verification * Project Services

Doulos Ltd. Church Hatch, 22 Market Place, Ringwood, Hampshire, BH24 1AW, UK Tel: +44 (0)1425 471223 mail: snipped-for-privacy@doulos.com Fax: +44 (0)1425 471573 Web:

formatting link

The contents of this message may contain personal views which are not the views of Doulos Ltd., unless specifically stated.

De gustibus non disputandum est.

--
Julian V. Noble
Professor Emeritus of Physics
jvn@lessspamformother.virginia.edu
    ^^^^^^^^^^^^^^^^^^
http://galileo.phys.virginia.edu/~jvn/

   "God is not willing to do everything and thereby take away
    our free will and that share of glory that rightfully belongs
    to us."  -- N. Machiavelli, "The Prince".

An RTX2000 clone core is available from us in VHDL for FPGAs. A C compiler is also available. The CPU runs at 20 MIPS in a Xilinx Spartan. This is twice as fast as the original Harris (Intersil) part, with an interrupt latency of 200 ns before starting useful work.

Stephen

Stephen Pelc

I just read the online RTX2010 manual. Does the RTX2000 also have the multiply-and-accumulate logic?

Do people buy these chips nowadays for DSP?

Davka

Dr. Ting has a few thousand MuP21 and MuP21h VLSI chips that date back to 1994 and 1995. He was always charging about what they cost him to make, but you might be able to get a deal on them now that they are getting rather old. He also still has some stock of RTX parts and kits.

His latest projects include P8, P16, P24, P32, and P64. He has a nice development board with a P32 that uses about 75% of the FPGA on the board, so there is room for adding custom instructions or custom I/O hardware to the design. The board also has RAM and flash, a color LCD interface and LCD, and PC software for development. I believe that board is about $300 and has a 400 MHz part.

Patriot has various models of their chip ranging from 100 to 350 MHz or so. There is a family of tiny 4 MHz, 4-bit-bus, 16-bit Forth chips manufactured in Europe. These and other Forth chips are listed on my Forth chips page referenced in another post.

There are mailing lists but the hardware list has been silent for a long time. There are discussions sometimes in #forth or #FIGUK chat rooms, even in c.l.f from time to time, but mostly people talk about Forth software not hardware.

You might also consider that there are Forth systems that run on DSP hardware. These are not Forth chips per se but might meet your needs.

The Harris RTX 2001 had the one-cycle multiply-accumulator. Many FPGAs can support the inclusion of single-cycle multiply-accumulate circuits, and some can hold quite a lot of them, as you probably know. The P32 does 32x32->64 and 64/32, but with multiply and divide steps. With a larger FPGA, specialized DSP circuits or coprocessors can be added without too much trouble.

I can't say too much at this time about our current work in custom VLSI Forth processors and they are not available for public sale anyway. Best Wishes

Jeff Fox

The Java Virtual Machine is stack based. There are some projects to build a 'real' Java machine. You can find more information about a solution in an FPGA (with VHDL source) at:

formatting link

It has been successfully implemented in Altera ACEX 1K50, Cyclone (EP1C6) and Xilinx Spartan2 devices.

Martin

Martin Schoeberl

No.

The main use of the RTX2xxx was in rad-hard satellite applications. The key feature of the family is interrupt response.

Stephen

Stephen Pelc

The following link may help:

formatting link

A synthesizable 16-bit CPU with development tools, FORTH compiler and CPU for FPGA

with best regards,

Peter Seng

SENG digitale Systeme GmbH
Im Bruckwasen 35
D 73037 Göppingen
Germany
tel +7161-75245
fax +7161-72965
eMail snipped-for-privacy@seng.de
net

formatting link

"Davka" wrote in message news:T%XXb.70$ snipped-for-privacy@news.uswest.net...

The 2000 and 2001 were the same except for the lower price and power of the 2000 and the single-cycle multiply on the 2001 (I hope I didn't reverse the 2000/2001 models). The 2010 was popular for aerospace applications because of the rad-hard model, despite the very high price tag.

Harris switched production from the 2000/2001 to the 80C286 because the CMOS 286 was popular for laptops at the time and had a larger market. That gives you an idea of how long ago Harris moved on from RTX-2000/2001 production. They continued to produce the 2010 for a long time in limited quantity, but it is no longer in production either. The RTX was done in standard cell technology, but it was so long ago that the original patents have expired, so one can make knock-offs today.

Best Wishes

Jeff Fox

It would be interesting to see results for a version that cached the top of the stack and used a more realistic memory interface.

--
	Sander

+++ Out of cheese error +++
Sander Vesik

Like

formatting link
?

Britt Snodgrass

Hello Sander,

In this design the stack is cached in a multi-level hierarchy:

TOS and TOS-1 are implemented as registers A and B. The next level of the stack is local memory, connected as follows: the data input is connected to A and B, and the memory output to the input of register B. Every arithmetic/logical operation is performed with A and B as sources and A as destination. All load operations (local variables, internal registers, external memory and peripherals) place the loaded value in A, so no write-back pipeline stage is necessary. A is also the source for store operations. Register B is never accessed directly: it is read as an implicit operand or for stack spill on push instructions, and written during stack spill and fill. This configuration allows the following operations in a single pipeline stage:

- ALU operation
- write back the result
- fill from or spill to the stack memory

The dataflow for an ALU operation is:

    A op B    => A
    stack[sp] => B
    sp-1      => sp

For a 'load' operation:

    data => A
    A    => B
    B    => stack[sp+1]
    sp+1 => sp

An instruction (except the nop type) needs either read or write access to the stack RAM. Access to local variables, which also reside in the stack, needs simultaneous read and write access. As an example, ld0 loads the memory word pointed to by vp onto TOS:

    stack[vp+0] => A
    A           => B
    B           => stack[sp+1]
    sp+1        => sp
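
The register transfers above can be modelled in a few lines of Python. This is only a behavioural sketch under assumptions: the class name `StackCore`, the RAM size, and the initial `sp` and `vp` values are invented for illustration. The real design performs the transfers of one instruction in parallel in hardware, whereas the model orders the assignments so that each old value is read before it is overwritten.

```python
import operator


class StackCore:
    """Toy model of a stack cached in registers A (TOS) and B (TOS-1)."""

    def __init__(self, ram_words=64, vp=32):
        self.a = 0                    # register A: top of stack
        self.b = 0                    # register B: next on stack
        self.stack = [0] * ram_words  # on-chip stack RAM (assumed size)
        self.sp = 0                   # index of the word that refills B
        self.vp = vp                  # base of local variables (assumed value)

    def alu(self, op):
        # A op B => A, stack[sp] => B, sp-1 => sp
        self.a = op(self.a, self.b)
        self.b = self.stack[self.sp]  # fill B from the stack RAM
        self.sp -= 1

    def load(self, data):
        # data => A, A => B, B => stack[sp+1], sp+1 => sp
        self.stack[self.sp + 1] = self.b  # spill old B to the stack RAM
        self.b = self.a
        self.a = data
        self.sp += 1

    def ld0(self):
        # One RAM read (stack[vp+0]) and one RAM write (the spill of B).
        self.load(self.stack[self.vp + 0])


core = StackCore()
for value in (1, 2, 3):
    core.load(value)
core.alu(operator.add)  # 3 + 2
core.alu(operator.add)  # 5 + 1
print(core.a)           # 6
```

Note that `alu` only reads the stack RAM and `load` only writes it, while `ld0` needs one read and one write, which is the access pattern the post describes.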

This configuration fits perfectly with the block RAMs with one read and one write port that are common in FPGAs. A standard RISC CPU needs three data ports (two read and one write) to implement the register file in a RAM, and usually one more pipeline stage for the ALU result to avoid adding the memory access time to the ALU delay time. For single-cycle execution you also need a lot of muxes for data forwarding.

In summary: in my opinion a stack architecture is a perfect choice for the limited hardware resources in an FPGA.

About the 'more realistic memory interface':

I don't see the problem. The main memory interface is a separate block, and currently there are three implementations for different boards: a low-cost version with slow 8-bit RAM, a 32-bit interface for fast asynchronous RAM, and a 16-bit interface that Ed Anuff added for the Xilinx version on a BurchED board. Feel free to implement your interface of choice (SDRAM, ...).

Sorry for the long mail, but I could not resist 'defending' my design ;-)

Martin

Martin Schoeberl

Corrections:

The RTX 2000 had two 16-bit, 256-element-deep stacks (Return & Data), a 2-4 cycle interrupt response, and a bit-multiply instruction which required 16 cycles to do a full 16-bit general-purpose multiply.
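
A 16-cycle multiply of this kind is typically a shift-and-add loop that consumes one multiplier bit per step. The sketch below shows the general technique in Python; it is not the RTX 2000's actual multiply-step instruction, whose exact semantics are not given here, and the function name is hypothetical.

```python
def multiply_16_steps(x, y):
    """Unsigned 16x16 -> 32-bit multiply, one multiplier bit per step.

    Generic shift-and-add illustration; each loop iteration stands in
    for one multiply-step cycle on a machine without a hardware multiplier.
    """
    if not (0 <= x < 1 << 16 and 0 <= y < 1 << 16):
        raise ValueError("operands must fit in 16 bits")
    product = 0
    for step in range(16):
        if (y >> step) & 1:        # examine bit `step` of the multiplier
            product += x << step   # add the shifted multiplicand
    return product


print(multiply_16_steps(1234, 5678) == 1234 * 5678)  # True
```

A single-cycle hardware multiplier replaces all sixteen iterations with one operation, which is the difference between the 2000 and the 2010 described in this post.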

The RTX 2010 had all of the above, plus a hardware multiply/accumulator and barrel shifter. It could do one-cycle 16-bit multiplies, 16-bit multiply-accumulates, and a one-cycle 32-bit barrel shift. This was the version Harris/Intersil used to make rad-hard parts, and which NASA and APL (Applied Physics Lab in Columbia, MD) bought and used in their space missions.

The RTX 2001 was a watered-down version of the RTX 2000. It had only 64-element-deep stacks and the multi-cycle 16-bit multiplies. It was originally supposed to be a cheaper/faster version of the 2000, but like a Celeron versus a Pentium, why buy the neutered version when you can have the real thing for about the same amount of money? Plus, reducing the stacks from 256 to 64 elements really reduced the chip's ability to perform multi-processing and process stack switching.

I used the RTX 2000 and RTX 2010 extensively when I worked for NASA at Goddard Space Flight Center in Greenbelt, MD (1979-1994).

I hope this sets the history straight on the different versions.

Jabari Zakiya

Corrections:

The RTX 2000 had two 16-bit, 256-element-deep stacks (Return & Data), a 2-4 cycle interrupt response time, and a bit-multiply instruction which could perform a complete general-purpose multiply in 16 cycles. It was rated at 8 MHz (but it could easily run at 10 MHz [which meant it took a 20 MHz clock], at least at room temperature).

The RTX 2010 had all of the above, plus a one-cycle hardware 16-bit multiply, a one-cycle 16-bit multiply/accumulate, and a one-cycle 32-bit barrel shift. This was the version that Harris/Intersil based the rad-hard version upon, which NASA and APL (Applied Physics Lab in Columbia, MD) used for their space missions. They both still have a stash left, the last that I heard.

The RTX 2001 was a watered-down version which was basically the 2000, but with only 64-element-deep stacks. It was intended (according to Harris) to be a cheaper/faster alternative to the 2000, but like the Celeron vs the Pentium, if you can get the real thing at basically the same price, why use the neutered version? Plus, the reduction of the stacks from 256 elements to 64 elements greatly reduced the ability to do multi-tasking and stack switching.

I used the RTX 2000/2010 extensively when I worked at NASA GSFC (Goddard Space Flight Center in Greenbelt, MD) from 1979-1994.

I hope this helps set the history straight with regards to the differences between the RTX versions. Too bad Harris didn't know how to market them.

Jabari Zakiya
