It isn't; that's the point. /If/ you can write in a high-ish level language like CUDA or OpenCL and achieve your goal with commodity hardware you can buy in any city, why would you want to use an FPGA? Why would you want to worry about synthesis, meeting timing, state machines, and debugging with a logic analyser? Maybe your problem doesn't fit the GPU model and is better suited to an FPGA, but you really ought to stop and think first.
It's not the HDL as a language per se (though Verilog's lax syntax makes it easy to introduce bugs: mistype an identifier and you silently get a new one-bit wire rather than a compile error), it's that the abstraction isn't high enough to make useful progress. If somebody wants to experiment with architecture they should be able to do that without having to manage all the underlying complexity, and then go back later and refine the code for performance. But this is awkward in (e.g.) Verilog unless you have a very good test suite - it's too easy to introduce control-flow bugs.
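To make the control-flow point concrete, here's a sketch of a classic trap (toy code, not from any real design): an always block that's meant to be combinational but doesn't assign its output on every path.

    // Toy example of a Verilog control-flow trap. This block is
    // meant to be pure combinational logic, but 'grant' is not
    // assigned on every path, so synthesis infers a latch and the
    // circuit quietly remembers state you never asked for.
    module arbiter (
        input  wire       req0, req1,
        output reg  [1:0] grant
    );
        always @* begin
            if (req0)
                grant = 2'b01;
            else if (req1)
                grant = 2'b10;
            // missing: else grant = 2'b00;
            // legal Verilog, so the bug sails through compilation
        end
    endmodule

It's legal code, so at best you get a synthesis warning buried among thousands of others, and the design misbehaves in ways a testbench may never hit.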
I've read the code of a web browser written in assembler. That's an example where the abstraction was not high enough: hand management of registers and bit-twiddling of memory made it impossible to keep control of the complexity, and development eventually ground to a halt. Likewise Verilog gives you ultimate control, and that's not always what you want when you're just evaluating ideas.
All I'm saying is that Verilog/VHDL offer too low a level of abstraction for architectural exploration. I'm not saying all HDLs/HLS tools are bad, just that you need to pick the right language.
Agreed. Some problems are relatively simple, heavily-parallel compute, and especially if they can be easily pipelined they fit an FPGA nicely. Likewise, if they have Gbps of external I/O (or I/O that isn't in a PC-friendly format), an FPGA will leave a GPU standing.
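To illustrate what 'easily pipelined' means, a toy sketch (made-up widths, overflow ignored): split y = a*b + c into register stages and the FPGA delivers one result every clock once the pipeline fills.

    // Toy pipelined multiply-accumulate: y = (a * b) + c,
    // split into two register stages. Latency is two clocks,
    // but throughput is one result per clock - exactly the
    // shape of problem an FPGA is good at.
    module pipelined_mac (
        input  wire        clk,
        input  wire [15:0] a, b,
        input  wire [31:0] c,
        output reg  [31:0] y
    );
        reg [31:0] prod_q;  // stage 1: registered product
        reg [31:0] c_q;     // c delayed one cycle to stay aligned

        always @(posedge clk) begin
            prod_q <= a * b;         // stage 1: multiply
            c_q    <= c;
            y      <= prod_q + c_q;  // stage 2: add (overflow ignored)
        end
    endmodule

Each stage does a little work every clock; the throughput comes from the pipeline depth, not from a high clock rate.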
However if they need heavy floating point, as a lot of scientific compute does, that starts eating up area rapidly. If they're memory-bound, you're up against the limits of DDR3, which has far less bandwidth than GDDR5. And if you want to do iterative development, multi-hour FPGA synthesis runs are not conducive to it.
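Back-of-envelope, to put numbers on the memory point (assuming a single DDR3-1600 channel against a mid-range GDDR5 card with a 256-bit bus at 6 GT/s, both peak figures): DDR3-1600 gives 1600 MT/s x 8 bytes = 12.8 GB/s per 64-bit channel, while the GDDR5 card gives 6 GT/s x 32 bytes = 192 GB/s. That's more than an order of magnitude before anyone has optimised anything.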
Horses for courses and all that. My point is that you should first do the work to see how your algorithm best fits the technologies available to you (CPU, GPU, FPGA), and refactor it to suit: you may get substantially more performance by reshaping the algorithm for a given technology than by jumping straight in with a naive implementation. Only then implement it, and be prepared to (repeatedly) refactor your architecture again in the light of that experience.
I'm not saying 'FPGA bad, GPU good'; I'm saying that implementing an FPGA design for scientific compute is a lot of work, so you need a clear reason for doing it. Doing it 'to make my Matlab go faster' is not a good enough reason, because there are far less painful ways to achieve that.
Theo