Info on FPGA routing algorithms?

I've been reading papers about routing in island-style FPGAs. Most cite Xilinx architectures, though I've looked at a few papers about H-tree networks. Often, there is a simplified model being used. There is mention (kind of dated) that commercial routers use derivatives of maze routing, with some more recent mention of channel routing. Is there some papers that can give a good idea of how the real industry software does global and detailed routing, what algorithms are actually used? What is the typical lag time between the advent of certain approaches in conference/journal papers versus uptake in commercial routers? I'm kind of curious how much I can should trust the papers as an indication of actual practice. As well, I am still in rummaging mode, and have yet to rummage into a paper that shows how the switches in the switch boxes are actually explored to get detailed routing, given a non-full crossbar. I've looked at

Wu &Tsukiyama et al.: Graph analysis of 2D FPGA routing

but I'm hoping to rummage into something more applied.

Thanks.

Fred

--
Fred Ma
Dept. of Electronics, Carleton University
 Click to see the full signature
Reply to
Fred Ma
Loading thread data ...

Hi Fred,

These days, the best academic routers do combined global + detailed routing using negototiated congestion. The base algorithm employed is known as Pathfinder -- see "Placement and Routing Tools for the Triptych FPGA" by C. Ebeling, L. McMurchie, S.A. Hauck, and S. Burns. A widely used tool is VPR (by Vaughn Betz), which performs clustering, followed by placement (via simulated annealing), and then timing-driven negotiated-congestion based routing. It uses an improved form of the Pathfinder algorithm, but what exact improvements it adds I forget, since VPR was commercialized and I no longer remember what was in the original academic version! I *think* that VPR (or some improved version of it) is still the best academic router, quickly providing small channel widths and good timing.

There is a book "Architecture and CAD for Deep-Submicron FPGAs" by Betz, Rose, and Marquardt that covers the algortihms employed in VPR and T-VPACK (a timing-driven clusterer by Marquardt). It also covers the architecture of island-style FPGAs and various experiments on the detailed routing design. And there are a slew of papers at FPGA, ICCAD, FPL and other conferences on FPGA routing. Go to

formatting link
to see information on the "purple book" plus a list of papers on VPR by Vaughn -- the refernces in these papers should serve as good starting point.

Regards,

Paul Leventis Altera Corp.

Reply to
Paul Leventis (at home)

thanx for the info, was looking for the same information ...

thx

routing

C.

VPR

to

point.

Reply to
Yttrium

Thanks, Paul,

I did in fact encounter Vaughn Betz's papers on VPR. There is pretty intuitive description of the VPR, and Pathfinder is described as an iterative application of shortest path. I'm still gnawing breadth-first on promising references from web-found papers. I'll look more closely at Pathfinder to see if it details how the switchbox settings are determined. Thanks for the pointer to Vaughn's book.

If anyone can comment on how relatively wide-spring are the various algorithms, both in academia and industry, that would be helpful. Papers often reference other papers, but don't actually indicate which algorithms are used by which commercial tools, and how prevalent are the various commercial tools. Vaughn's website says Right Track is now part of Altera, so maybe Altera's own tools may start using ideas from VPR. I have yet to come across information about what was the algorithm prior to this. What about Xilinx? Is it the case that they don't like to disclose the actual inner workings of their native tools? How common is it to use 3rd party tools for place-and-route rather than those from the vendor? In general, information about which commercial tool (vendors) use which algorithms seems much rarer than just the algorithms presented in isolation.

Fred

--
Fred Ma
Dept. of Electronics, Carleton University
 Click to see the full signature
Reply to
Fred Ma

Hi Fred,

switchbox

I'm not quite sure what you mean by this. If you're referring to how the topology of the switchbox is determined (an architectural decision), there are some other papers I can point you to.

From a routing algorithm perspective, it is really quite simple. You represent the entire chip as a graph where each node represents a routing resource -- a block input, block output, or routing wire. Directed edges are placed between these nodes to represent the presence of a (programmable) connection from one resource to another. So a swich box that is not a complete cross-bar is encoded in this graph as edges. For each net, the router starts at the source block output, traverses the edges from that node, and assigns a desirability/cost to each of the nodes seen and places them in a heap. It then removes the best node from the heap and repeats from there until it hits the desired destination (a block input). This is known as an a-star or best-first traversal, and clearly all the work/intelligence lies in how you cost the nodes (including a prediction of future cost), how you guide the router to correct congestion, etc. For a multi-terminal net, the routing thus far is placed on the heap and the router expands from there. There are optimizations that can be done to reduce the amount of re-expansion of the wavefront, and other techniques for quickly routing high-fanout nets.

Global routing (not worrying about precise wires and switch box settings) is done using the same algorithm, except that rather than nodes representing one wire, you represent a group of wires with a node of capacity = channel width. A well-written global + detailed router does not require a global route and performs fast enough that there really isn't any point to look at global routing (in FPGAs) anymore.

We don't like to tell each other what we do in our tools for obvious reasons :-) VPR served as the basis for Right Track CAD's work, and Right Track provided Altera with enhanced place & route results for their FLEX10K products in MaxPlus II. What the algorithms were before or what they've become since is not a matter of public record.

Regards,

Paul Leventis Altera Corp.

Reply to
Paul Leventis (at home)

Do any placement algorithms try to make regular structures? For datapaths, for example? This is how a human would do it, and might give excellent results in some cases.

Although, you certainly want to architect FPGAs with enough routing resources so that regular structures are not needed.

--
/*  jhallen@world.std.com (192.74.137.5) */               /* Joseph H. Allen */
int a[1817];main(z,p,q,r){for(p=80;q+p-80;p-=2*a[p])for(z=9;z--;)q=3&(r=time(0)
 Click to see the full signature
Reply to
Joseph H Allen

1 On optimum switch box designs for 2-D FPGAs Hongbing Fan; Jiping Liu; Yu-Liang Wu; Chak-Chung Cheung; Design Automation Conference, 2001. Proceedings , 18-22 June 2001 Pages:203 - 208 2 General models and a reduction design technique for FPGA switch box designs Hongbing Fan; Jiping Liu; Yu-Liang Wu; Computers, IEEE Transactions on , Volume: 52 , Issue: 1 , Jan. 2003 Pages:21 - 30 3 Not necessarily more switches more routability [sic.] Yu-Liang Wu; Chang, D.; Marek-Sadowska, M.; Tsukiyama, S.; Design Automation Conference 1997. Proceedings of the ASP-DAC '97. Asia and South Pacific , 28-31 Jan. 1997 Pages:579 - 584 4 The effect of switch box flexibility on routability of field programmable gate arrays Rose, J.; Brown, S.; Custom Integrated Circuits Conference, 1990., Proceedings of the IEEE 1990 , 13-16 May 1990 Pages:27.5/1 - 27.5/4 5 General models for optimum arbitrary-dimension FPGA switch box designs Hongbing Fan; Jiping Liu; Yu-Liang Wu; Computer Aided Design, 2000. ICCAD-2000. IEEE/ACM International Conference on , 5-9 Nov. 2000 Pages:93 - 98 6 Graph based analysis of 2-D FPGA routing Yu-Liang Wu; Tsukiyama, S.; Marek-Sadowska, M.; Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on , Volume: 15 , Issue: 1 , Jan. 1996 Pages:33 - 44 7 On improving FPGA routability applying multi-level switch boxes Jiping Liu; Hongbing Fan; Yu-Liang Wu; Design Automation Conference, 2003. Proceedings of the ASP-DAC 2003. Asia and South Pacific , 21-24 Jan. 2003 Pages:366 - 369 8 On computational complexity of a detailed routing problem in two dimensional FPGAs Yu-Liang Wu; Shuji Tsukiyama; Malgorzata Marek-Sadowska; VLSI, 1994. 'Design Automation of High Performance VLSI Systems'. GLSV '94, Proceedings., Fourth Great Lakes Symposium on , 4-5 March 1994 Pages:70 - 75 9 On optimal hyperuniversal and rearrangeable switch box designs Hongbing Fan; Jiping Liu; Yu-Liang Wu; Chak-Chung Cheung; Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on , Volume: 22 , Issue: 12 , Dec. 2003 Pages:1637 - 1649 10 A global routing model for universal switch box design Hongbing Fan; Jiping Liu; Yu-Liang Wu; Electronics, Circuits and Systems, 2000. ICECS 2000. The 7th IEEE International Conference on , Volume: 1 , 17-20 Dec. 2000 Pages:78 - 81 vol.1
Reply to
Tom Seim

I sort of got lost with the heap, but don't worry, I was just getting a rough idea. If needed, I will look up A* (I have papers on it). The representation for wires/switches above matches that in Pathfinder, and I noticed that Betz's VPR also has tricks to cut down some work in restarting the wave front for multiterminal nets. The use of arcs to represent switches seems to get rid of the division between detailed and global routing.

Yes, I was noticing that in recent papers.

Aaawwww. OK. Thanks. I understanding why there is a paucity of such information.

Fred

--
Fred Ma
Dept. of Electronics, Carleton University
 Click to see the full signature
Reply to
Fred Ma

Thanks for the references, Tom. I looked briefly at the abstracts of a few, and they are quite relevant. I will look at them in more detail.

Fred

--
Fred Ma
Dept. of Electronics, Carleton University
 Click to see the full signature
Reply to
Fred Ma

Joseph, I'm no expert in the area, but seem to recall seeing this. Look up the authors of C compilers for Garp, I think one of them works on a fast linear placement algorithm. It might be from arraying bit slices. I expect the granularity of one's slice to be highly dependent on the target platforms array architecture.

That's the constant balance, it seems. Deciding ow to focused to make your application domain, which determines the suite of applications (and kinds of operations) you need to support, which helps you avoid excessive flexibility in the array architecture. It's pretty vague, but it seems to be the general motivation for reconfigurable logic.

Fred

--
Fred Ma
Dept. of Electronics, Carleton University
 Click to see the full signature
Reply to
Fred Ma

ElectronDepot website is not affiliated with any of the manufacturers or service providers discussed here. All logos and trade names are the property of their respective owners.