Application processor with fast parallel I/O

I'm looking for a Cortex-A class processor that has reasonably quick parallel I/O that might be hooked up to an FPGA. I'm aware of the existing Zynq and Altera SoC FPGAs, but looking for something different.

By 'parallel I/O' I mean ideally a memory interface - either bidirectional eg 64-bits or separate 32-bit tx and 32-bit rx. GPIO with data valid signals/strobes is another possibility. By 'quick' I mean hundreds of MHz, ideally with low latency.

I know there are things like the TI PRU in eg the OMAP family, but they seem to have a limited number of pins (16 bit tx/rx).

Can anyone suggest anything else in this space?

Thanks Theo

Reply to
Theo Markettos

I think you are not going to find anything other than the memory interface. What's wrong with that? I assume you are referring to a processor that runs in the GHz range, but even then it would be hard for it to push data out of parallel I/O at "hundreds of MHz". Since this would be a very atypical use of parallel I/Os, I can't imagine a chip maker who would put the I/O pins on the fast local bus. Rather they are typically connected through a slower bus for the peripherals.

Can I ask why you don't want to use high speed serial I/Os (which are intended for this) or a combined FPGA/CPU chip? Is using the memory interface too obvious or is there a reason to not use that?

--

Rick C
Reply to
rickman

I don't have a problem with using a memory interface, just would rather avoid something like PCIe - I don't have a transceiver interface to receive it. I just wasn't aware of anything that exported a 'simple' high speed memory interface.

The reason for not using a combined FPGA/CPU chip is that I'll be using a dev board rather than making my own board (buying serious FPGAs in small quantities isn't fun and I'd rather not have to do the DDR3/etc layout). On the other side of my bridge CPLD/FPGA is a 1.5V parallel interface: most ARM FPGA dev boards hardwire their pins to something other than 1.5v, so I can't simply use the FPGA on the combined chip. I'm also physically constrained which rules out a lot of dev boards.

Theo

Reply to
Theo Markettos

Have you looked at data sheets? The last time I looked it seemed like there were ones out there that came with built-in SDRAM interfaces. That may not be ideal ('cuz you'd have to make your FPGA pretend to be SDRAM), but it should support the bandwidth you'd want.

I have to admit I didn't look hard -- I wanted a microcontroller with a Cortex A core; I basically stopped looking when I saw that I'd need to deal with external memory and whatnot if I wanted that core.

--
Tim Wescott 
Wescott Design Services 
Reply to
Tim Wescott

Here. Poof -- first one I looked at. Has high speed "conventional" memory interfaces as well as SDRAM interfaces. Some shopping should get you some hits.

--
Tim Wescott 
Wescott Design Services 
Reply to
Tim Wescott

Hi Tim, not having fpga experience (though I have designed plenty of logic, used cpld-s, written a logic compiler for cpld-s and another for GAL-s back in the day etc.) I wonder if it would not be easier to use PCIe than to pretend to be DDR RAM. I have seen fpga-s (soon I may have to use one with DDR etc. if I want to have a display controller which I seem to do...) advertising PCI-e which looks somehow "readily available" to the customer, perhaps this could be a way? I have also seen them having DDR but it seems to be the wrong way around for this task (not for mine with the display though).

Dimiter

------------------------------------------------------ Dimiter Popoff, TGI


Reply to
Dimiter_Popoff

I did think that, and should have commented. One of the features of life in a world with big FPGAs is that you can just go out and buy IP to do things like talk to SDRAM or PCI. The last time that I was involved in such a project the PCI in question was "plain old", but there have to be PCI-e cores out there for sale.

"Reverse" SDRAM would be more rare, but the actual core should be simpler (and answers the OP's question more closely).

Were it me, I'd certainly leave PCI-e on the table until I'd done some design studies.

--
Tim Wescott 
Wescott Design Services 
Reply to
Tim Wescott

PCIe is relatively easy to use, but does involve a much more heavyweight stack at both ends - needs a full PCIe root complex and an FPGA that supports high speed transceivers. Everything ends up much larger/more expensive - both for the FPGA and for the CPU. Many ARM application processor SoCs don't have PCIe, for instance, which means you end up having an Atom-class SoC or larger.

I hadn't thought about pretending to be SDRAM - will think about that. Which CPU did you find, Tim? I suspect finding a dev board that exposes SDRAM pins but wires DDR3 internally is going to be tricky.

Theo

Reply to
Theo Markettos

In this specific case:

- I have no FPGA transceivers I can interface to.
- Space constrains me to a physically small FPGA with a limited number of pins.
- PCIe latency (~500ns) is too high for small messages.
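
To put a number on the latency point, here is a rough sketch in Python. It assumes the worst case where only one message is in flight at a time, so each message pays the full ~500ns before the next one starts; a link that pipelines requests would do better. The function name and the message sizes are purely illustrative.

```python
def latency_bound_throughput(message_bytes: int, latency_s: float) -> float:
    """Bytes per second when each message must complete before the next is issued.

    Assumption: no pipelining, so the per-message latency is paid in full every time.
    """
    if latency_s <= 0:
        raise ValueError("latency must be positive")
    return message_bytes / latency_s


PCIE_LATENCY_S = 500e-9  # the ~500 ns figure quoted above

# Illustrative message sizes only; the thread does not specify them.
for size in (4, 8, 64):
    rate = latency_bound_throughput(size, PCIE_LATENCY_S)
    print(f"{size:3d}-byte messages: about {rate / 1e6:.0f} MB/s")
```

Under that assumption a 4-byte message stream is limited by latency rather than link speed, which is why a multi-gigabit serial link can still be a poor fit for small messages.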

Theo

Reply to
Theo Markettos

I have used and programmed PCI a few times and assuming that PCIe is very similar from a programmer's point of view I'd say you can do it without too much pain. If your two sides will be aware of each other (i.e. not discover what is there on the bus, things allowed to do, set address range etc.) you can do it quite easily, you will only need the bus handshakes. And if you restrict them to a known size (say, 32 bit) it becomes even easier.

Should be doable, they do put 2 DIMMS per board after all (still so?). But you are likely to get yourself into a nightmare of problems, trial and error etc. - whereas the PCIe link will just work unless you abuse it too much (not so long ago I switched from ATA to SATA - well, the SATA part were just 4 signal wires and connecting them in a decent way it just worked for me).

You might want to look at the Freescale (now NXP) QorIQ parts, like the t1040 or the t1042, large and not really cheap things but I think they had smaller and cheaper there, too.

Dimiter

------------------------------------------------------ Dimiter Popoff, TGI


Reply to
Dimiter_Popoff

What I will probably do one of these days is a display controller doing up to and including 4k video. Just the bitmap -> serial stream, hdmi and probably raw display module lvds. From what I have seen on the first page of the datasheets if not even before that there are fpgas which have DDR, PCIe and even lvds drivers (but I think the latter are not fast enough, yet to check on that). The idea is the DDR to hold the display bitmap, the processor to write it via PCIe (will also read it but rarely if at all). No fancy stuff with maintaining windows etc. involved, the processors nowadays are fast enough to do all that (and I am not making a 3d gaming console). But let me see when (if...) I'll manage to get around to that....

Dimiter

------------------------------------------------------ Dimiter Popoff, TGI


Reply to
Dimiter_Popoff

I don't quite understand what you are looking for. You seem to be saying you want an off the shelf board? I don't know how you would interconnect an ARM board and an FPGA board at that high speed.

--

Rick C
Reply to
rickman

PCIe is completely different at the electrical level: you need 2.5G, 5G or 8Gbps high speed serial transceivers. The lower tiers of FPGAs don't support such transceivers, so you're into the hundreds of dollars territory already. Then these fancy FPGAs only come in 500 pin BGAs... because they're fancy. That means you can't hand solder them, you have to contract that out. Then the 500 pin BGAs need an 8-layer PCB stack to get the ball escapes. And so on - the costs and complexity keep rising.

I was kinda hoping to do this with an off-the-shelf SOM and a $10 CPLD/bargain bucket FPGA on a 4 layer board...

It's not quite that simple.

It turns out that NAND or NOR flash interfaces are a better bet than SDRAM because a lot more of the protocol stuff is left to software, while on SDRAM you have to dodge the controller helpfully reordering accesses and inserting refresh cycles. However a lot of SOMs don't export them, and when they do it's only 8 or 16 bit. 16 bits at ~100MHz is something, I suppose.
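
For scale, a quick Python sketch of the peak figures involved. The 16-bit at 100MHz case is the one mentioned above; the 32-bit at 200MHz case is only an assumed stand-in for the "hundreds of MHz" in the original question, and the function name is made up for illustration. These are raw peak numbers with no protocol overhead.

```python
def peak_bandwidth_mb_s(width_bits: int, clock_mhz: float, transfers_per_clock: int = 1) -> float:
    """Raw peak bandwidth of a parallel bus in MB/s, ignoring all protocol overhead."""
    return width_bits / 8 * clock_mhz * transfers_per_clock


# The flash-style interface discussed above: 16 bits at ~100 MHz.
flash_bus = peak_bandwidth_mb_s(16, 100)

# Assumed target: 32 bits one way at 200 MHz (a guess at "hundreds of MHz").
target_bus = peak_bandwidth_mb_s(32, 200)

print(f"16-bit @ 100 MHz: {flash_bus:.0f} MB/s")
print(f"32-bit @ 200 MHz: {target_bus:.0f} MB/s")
print(f"ratio: {target_bus / flash_bus:.1f}x")
```

So on these assumptions the exported flash bus gives a quarter of the hoped-for one-way rate.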

The retransmission masks a vast multitude of sins... (people don't realise how terrible their $1 SATA cables are until you start measuring their packet loss and crosstalk)

Wow, that's a family where marketing really need to get a grip - 32 bit, 64 bit, Power, ARM11, Cortex A, low power, baseband, server - throw them all into the same brand so the customer is thoroughly confused. I'm not sure I see anything distinctive there, but I'm mostly baffled by how they've organised them.

Theo

Reply to
Theo Markettos

Off the shelf SOM (probably Cortex A but something non-ARM in that rough landscape is possible)
  to
Custom carrier board with CPLD/small FPGA (space constrained)
  to
1.5V parallel TX/RX interface

The question is about picking a suitable SoC family to live on the SOM (lack of SOM availability being a secondary but relevant issue).

Theo

Reply to
Theo Markettos

I don't know what to tell you. I would contact the makers of the Zedboard or one of the equivalent Altera based boards and ask about setting the I/O voltage to 1.5 volts. The Zedboard seems to have provision to set the voltage on at least two banks of I/Os although 1.5 volts is not indicated. I expect a simple part change will get you 1.5 in place of 1.8 volts. Call the maker... I think this problem will be easier than getting the throughput between two boards that you need.

--

Rick C
Reply to
rickman

That shouldn't be true. I know some time back Lattice came out with low end FPGAs with SERDES and I thought X and A had to follow suit.

It has been a long time since you could hand solder any FPGA other than possibly the 144 pin TQFP which is a pretty large package.

Check out the Lattice FPGAs. If you are designing your own FPGA board, why do you care if the FPGA is $10 or $50? That cost will be swamped by the cost of making a board.

Flash interfaces don't run at 100's of MHz.

Motorola (now Freescale... I mean NXP) has always been terrible at making the differences and similarities clear in their processor product lines. I think that partly comes from targeting the really large customers where they get *lots* of support to explain just what parts will suit their needs.

--

Rick C
Reply to
rickman

I couldn't find where you asked, but here's the processor I found:

formatting link

I'm absolutely not saying "use this one" -- it's just the first one I found, and it had a conventional memory bus. I suspect that if you look around you'll find something better.

I looked at the general-purpose memory interface and it doesn't look fast enough for you -- it's calling out 100MHz or 50MHz clock on a 16-bit wide bus, and I'm not sure when you can drive it at 100MHz.

DDR, OTOH, will go up to a 200MHz clock (with 400MHz data rate) -- that's why I was suggesting it, if things are simple enough on the FPGA side.

--
Tim Wescott 
Control systems, embedded software and circuit design 
Reply to
Tim Wescott

That "faster DDR clock means faster data" assertion assumes that you have big chunks to send -- it'll probably be slower to send single words, but you'll gain a lot if you can use burst mode.
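
The burst-versus-single-word tradeoff can be sketched in Python. The 400 MT/s and 100 MT/s rates come from the figures above, but the fixed per-access overhead of 20 transfer slots is a made-up illustrative number (standing in for things like row activation and read latency), not a value from any datasheet, and the simple bus is assumed to have no overhead at all.

```python
def effective_rate_mt_s(data_rate_mt_s: float, burst_len: int, overhead_slots: int) -> float:
    """Average useful transfers per microsecond for one access.

    Each access pays `overhead_slots` idle transfer slots, then moves
    `burst_len` words back to back.
    """
    return data_rate_mt_s * burst_len / (burst_len + overhead_slots)


DDR_RATE = 400      # MT/s, from the 200 MHz clock with 400 MHz data rate
SIMPLE_RATE = 100   # MT/s, the 100 MHz general-purpose bus
OVERHEAD = 20       # hypothetical per-access overhead, illustration only

simple = effective_rate_mt_s(SIMPLE_RATE, 1, 0)
for burst in (1, 8, 64):
    ddr = effective_rate_mt_s(DDR_RATE, burst, OVERHEAD)
    verdict = "faster" if ddr > simple else "slower"
    print(f"burst of {burst:2d}: DDR {ddr:6.1f} MT/s, {verdict} than the simple bus at {simple:.0f} MT/s")
```

With these assumed numbers single-word accesses come out slower on DDR than on the plain bus, and the advantage only appears once the bursts are several words long.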

--
Tim Wescott 
Control systems, embedded software and circuit design 
Reply to
Tim Wescott

on Microzed the Vcco is in the connector so you can set it to 1.5V if you need

-Lasse

Reply to
lasselangwadtchristensen

> > I'm looking for a Cortex-A class processor that has reasonably quick
> > parallel I/O that might be hooked up to an FPGA. I'm aware of the
> > existing Zynq and Altera SoC FPGAs, but looking for something different.

formatting link

The TI part has a DRAM and a separate 2nd memory port with 7 chip selects. A lot better than a single memory port: DDR timing much different from FPGA IO port capabilities. With 7 chip selects one can have distinct FPGA read and write pins with tri-states on the read pins. Again, helps with timing in my experience.

According to the literature the Xilinx tools completely handle the internal interfaces within the Zynq part and Vivado/ISE gives you your timing pass/fail. I'd go with a SOC FPGA in a minute, given the timing troubles we had with a distinct ARM chip.

Reply to
jim.brakefield
