Graphics card accelerator vs. FPGA: which is better for the following task?

Question

Dear all,

I guess this is a ray-tracing problem, but I need to do this task at as high a speed/throughput as possible. Here is my problem:

Suppose I am given 25 rays and a 3D cube, and all parameters of these rays and the cube are given. I need to compute the length of the segment of each ray that intersects the cube, as fast as possible. If a ray falls completely outside the cube, the output is 0; otherwise it is the length.

I heard there are some very good graphics cards with accelerators, and I heard about bus bandwidth as high as 500MHz. I am not sure if they have good acceleration functions for doing my task?

I am also thinking of doing this using an FPGA hooked up to an Intel PC running Linux. I don't know the details, but I guess it uses PCI or another bus to interact with the CPU and serves as a coprocessor.

I want to know which method is better?

Considering that after solving this throughput problem, the next bottleneck will be the 1GB of memory that I need, I wonder if a graphics card has 1GB of cache/memory inside it? Since a lot of the time it needs to do triple buffering, I guess it should have a large, high-speed memory, right?

I also don't know what the maximum processing speed of a high-end graphics card is compared with a high-end FPGA implementation.

Can anybody give me some comments/suggestions/advice/hints/pointers on this?

Thanks a lot,
-Walalal

Andras Tantos · Accepted Answer

Hi!

- Are the edges of the cube parallel to the coordinate axes?
- Is there anything special about the cube (size, orientation, rotation, location, etc.)?
- In what format are the rays and the cube defined?

Regarding the graphics accelerator: possibly not. First, you would have to get data *back* from the accelerator, which is something they are not designed for. As someone said to me once, they operate in a 'write and forget' mode. Second, none of the accelerators I know of do ray-tracing. However, your question is not a complete ray-trace problem, so you might be able to tweak the functions of an accelerator to give you your answer.

Regarding the FPGA on a bus: that's a possibility. You can find PCI FPGA prototyping cards for this purpose.

BTW, if you need to process 1GB of data (assume that's the total amount of traffic), you would need at least 7.75 seconds just to transfer the data over a 33MHz PCI bus, not counting other PCI traffic and other issues. If that's too slow, you would need a) 66MHz b) 64bit c) PCI-X bus and of course a PC that supports...
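The transfer-time estimate above can be checked with a few lines of arithmetic. This is only a sketch of the peak theoretical rate, assuming one transfer per clock and no protocol overhead; the function name `transfer_seconds` and the table of bus variants are illustrative choices, not something from the thread. Depending on whether "1GB" is taken as 10^9 or 2^30 bytes, the 33MHz/32-bit case comes out at roughly 7.5 to 8 seconds, in line with the figure quoted.

```python
def transfer_seconds(total_bytes, clock_hz, bus_width_bits):
    """Best-case time to move total_bytes over a parallel bus.

    Assumes one full-width transfer on every clock cycle, which real
    PCI traffic will not sustain, so treat the result as a lower bound.
    """
    bytes_per_second = clock_hz * bus_width_bits / 8
    return total_bytes / bytes_per_second


ONE_GB = 10**9  # assumption: decimal gigabyte

# Nominal clock rates and widths for the bus options mentioned above.
BUS_VARIANTS = [
    ("PCI 33MHz / 32bit", 33_000_000, 32),
    ("PCI 66MHz / 32bit", 66_000_000, 32),
    ("PCI 66MHz / 64bit", 66_000_000, 64),
    ("PCI-X 133MHz / 64bit", 133_000_000, 64),
]

for label, clock_hz, width in BUS_VARIANTS:
    seconds = transfer_seconds(ONE_GB, clock_hz, width)
    print(f"{label:22s} {seconds:6.2f} s")
```

Doubling the clock or the width halves the lower bound, so the 66MHz/64bit option is four times faster than plain PCI on paper.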

walala · Answer

Hi, Andras,

Thank you very much for your answer!

I guess the first thing I need to make clear to myself is what the essence of this problem is. Is it a ray-tracing problem or a collision detection problem?

I need to identify the name of the problem first; then I can go out and search for similar application cases.

Can you help me with that?

Thanks a lot,
-Walala

Andras Tantos · Answer

Hi!

I would think your problem is an intersection problem, but only you can find out the true nature of your problem.

Andras

Kolja Sulimma · Answer

Let the cube be given by three normal vectors n1 to n3 and six points p1 to p6 on the six planes. (Actually, you can use the same point multiple times.) Assume your rays start at the origin and are given by a vector r of length 1. Then the intersection with the first plane happens at a distance d1 = n1.p1/(n1.r) = n1.p1 * (1/(n1.r)).

Then you order the planes according to d. If the ray does not cross the three front planes first, the cube is missed, otherwise the difference between the fourth and the third distance is the length of the intersection.

So for each ray you get three divisions, four multiplications and a couple of min/max cells. (Many more optimizations are possible due to symmetries.)

With integers you should be able to do that in a small Spartan-III in a pipeline a lot faster than you can get data into the chip.

With floating point numbers it should be still very fast in an FPGA, but the design gets a lot more complicated and larger.
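As a software reference for the plane-distance approach described above, here is a minimal Python sketch. Instead of sorting all six distances, it keeps a running max of the three near distances and a running min of the three far distances, which gives the same "fourth minus third" result. The function names, the pairing of `plane_points[2*i]` and `plane_points[2*i+1]` with `normals[i]`, the handling of rays parallel to a plane pair, and the clamping to distances >= 0 (so nothing behind the ray origin is counted) are my assumptions, not part of the description above.

```python
import math


def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def chord_length(r, normals, plane_points):
    """Length of the ray segment inside the cube, or 0.0 on a miss.

    r            -- unit direction of a ray starting at the origin
    normals      -- the three face normals n1..n3
    plane_points -- six points; points 2*i and 2*i+1 lie on the two
                    planes that share normals[i] (assumed layout)
    """
    t_enter, t_exit = 0.0, math.inf  # assumption: only count t >= 0
    for i, n in enumerate(normals):
        a = dot(n, plane_points[2 * i])
        b = dot(n, plane_points[2 * i + 1])
        denom = dot(n, r)
        if denom == 0.0:
            # Ray runs parallel to this plane pair: it can only hit if
            # the origin (where n.x = 0) lies between the two planes.
            if not (min(a, b) <= 0.0 <= max(a, b)):
                return 0.0
            continue
        inv = 1.0 / denom  # one division per plane pair
        d1, d2 = a * inv, b * inv
        t_enter = max(t_enter, min(d1, d2))  # latest entry so far
        t_exit = min(t_exit, max(d1, d2))    # earliest exit so far
    return max(0.0, t_exit - t_enter)
```

For an axis-aligned cube spanning x from 1 to 3 and y, z from -1 to 1, a ray along +x gives `chord_length((1, 0, 0), ...)` equal to 2.0, and a ray along +y gives 0.0. The max/min updates correspond to the min/max cells of a hardware pipeline, one stage per plane pair.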

Have fun,

Kolja Sulimma

Roger Larsson · Answer

Graphics cards use AGP, now at x8; an AGP x8 interface gives 2.1 GB/sec of bandwidth. Since this is MUCH faster than the PCI bus, they handle huge amounts of data a lot better.

BUT suppose someone builds an AGP x8 board with a fast FPGA. Then you have suddenly reduced the data transfer bottleneck.

Suppose you use a Xilinx Pro, or add a PowerPC as an option together with monitor out circuits. Then you will get quite an interesting board that could be used to run the X server for Unix/Linux :-)

But you can still buy PCI graphic boards...

/RogerL

walala · Answer

Thanks a lot, Kolja,

Very informative... I need to digest your answer.

-Walala
