Help with a face recognition system

Hi all,

I am a Computer Science student in my final year. My graduation project is to build a face recognition system (based on Principal Component Analysis and possibly Artificial Neural Networks) on an FPGA. Since software is the focus of our college study, hardware is not quite my domain of expertise. I have been doing some reading on the subject, but of course it's nothing in comparison to years of experience. So, I was wondering if you can provide me with some guidance on the following points:

- Which would be the better approach: implementing such a system in HDL, or using a soft microprocessor, which, if I understand correctly, would make it possible to implement the system in assembly or even C? What about mixing them, which I think is referred to as a hardware/software co-design approach; would that be too hard to accomplish? What would its advantage be over either of the two approaches?

- If the use of a microprocessor is suggested, what are the recommendations for the type of microprocessor or the specific implementation?

- I already went and bought an XESS XSA-200 prototyping board, which carries a Spartan-II XC2S200-5FG256 (200k gates) FPGA. After spending the last couple of months with it, I realize it might be a bit low-end. The question is, would it be possible to fit the probably-complex image processing system on it, or is it possible that I would reach a point where I can't fit my design on it no matter what?

I realize this is a relatively long post, so I'd be grateful for any answers to any part of it.

Thank you for reading this far, and thanks in advance for any replies.

Best Regards, Islam Ossama

4th year, Computer Science Dept. Faculty of Computer and Information, Helwan University, Cairo, Egypt.
Reply to
Islam Ossama

Islam, if I were you, I would first explore the best algorithm, then see whether it can be implemented on a (any!) microprocessor with reasonable performance. If the microprocessor implementation is too slow, I would look at a way to speed it up with an FPGA, and I would use an existing board of reasonable performance. The Xilinx University Program has a better board, based on Virtex-II Pro, that is surprisingly inexpensive for universities. I would contact the Xilinx University Program about details.


Reply to
Peter Alfke

Peter,

First of all, thanks for replying :-).

We (team of 5) spent the last semester exploring and evaluating the different algorithms. We came to the conclusion that PCA would be the best compromise between complexity and accuracy, with the recommendation that Neural Networks be added into it if possible, to increase the accuracy even further.

We did evaluate different implementations of the algorithm, some on MATLAB and some running natively, and the performance was as we expected: not suitable for real-time, especially for large face databases.

This only confirmed what was suggested at the beginning of our research, which was the main reason we wanted to explore an FPGA implementation to make up for the performance shortcomings while maintaining (or better, increasing) the level of accuracy.

Right now we have in fact divided up the work between us: 3 of my team are working on the algorithm (it being the main focus of the project), one is working on the neural network, and I am working on the FPGA.

What remains to be determined is the kind of design that the algorithm will be implemented with: solely VHDL, a soft microprocessor with asm/C/C++ code, or some combination of the two. And, of course, whether the hardware currently at hand would support that design.

Also, after sending my earlier post, I stumbled upon a project on opencores.org called Java Optimized Processor

formatting link
which is essentially a soft processor for Java bytecode. I was wondering how good/robust/flexible it is, and whether someone here has actually used it in any way on actual hardware. I might try to implement it and download it to my FPGA, if I can get it to compile on Webpack 9.1i without trouble.

I did check out the Xilinx University Program. The "Virtex-II Pro Development System" seems like an extreme overkill in our case, since the use of FPGA isn't standard curriculum in our faculty; we are mainly doing this as a unique, single-case approach. It seems like an excellent choice for an engineering faculty, though. Unfortunately, I don't think our faculty (Computer Science) would be willing to make such a purchase based on a single case requirement, especially taking into account the relatively high currency exchange rate (1 USD ~= 5.7 EGP).

Thanks again for your response, and I hope I haven't bored you with my long reply...

Best Regards, Islam Ossama

Reply to
Islam Ossama

That would seem to be the key issue. FPGA should be able to scan one face quite quickly, but trawling for a match becomes a data-scan problem.

Just how large are these databases ?

-jg

Reply to
Jim Granville
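
To put rough numbers on the data-scan concern: if each enrolled face is stored as K PCA coefficients, a brute-force match against N faces costs about N x K multiply-accumulates (MACs) per query. The sketch below turns a MAC budget into a database-size ceiling. The function name and all the figures in the usage line (100 million MACs per second, 50 coefficients per face, 25 queries per second) are illustrative assumptions, not values from this thread, and memory bandwidth is ignored.

```python
def max_database_size(macs_per_second, coeffs_per_face, queries_per_second):
    """Largest number of enrolled faces a brute-force scan can cover.

    Assumes one multiply-accumulate per stored coefficient per query and
    ignores memory bandwidth, which may well be the real limit.
    """
    macs_per_query = macs_per_second // queries_per_second
    return macs_per_query // coeffs_per_face


# Hypothetical budget: 100 million MACs/s, 50 coefficients, 25 queries/s.
print(max_database_size(100_000_000, 50, 25))  # prints 80000
```

Doubling the number of parallel MAC units doubles the ceiling, which is one way to see why the matching stage is a natural candidate for FPGA fabric.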

We (Illiac 6 research group at the University of Illinois) are currently working on porting one of the more popular face recognition programs for an application that is going to run on our "Communications Supercomputer", which involves a Virtex-II Pro FPGA. There is no benefit from implementing a processor in the FPGA and then running the existing C code on it, because there is no way it will come close to the performance of a processor in an ASIC. Your only option for using the FPGA for speedup is going to the roots of the algorithm(s), finding parallelism, and writing HDL code to exploit that parallelism. That is the phase we are currently in. Remember that an algorithm that may not be the best in a sequential environment may shine in a highly parallel environment, so it's best to look at all algorithms for possible parallel structure.

---Matthew Hicks

Reply to
Matthew Hicks
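
As an illustration of where the parallelism sits in a PCA-based recognizer, here is a minimal sketch assuming the usual eigenface-style formulation (the thread does not spell out the exact variant): subtract a mean face, project onto each eigenvector, then pick the nearest stored weight vector. The function names and the toy 4-pixel data are hypothetical. Each projection coefficient is an independent dot product, and each database comparison is independent of the others, so both loops could be unrolled into parallel hardware.

```python
def project(face, mean_face, eigenfaces):
    """Project a flattened face image onto the PCA basis.

    Every output coefficient is its own dot product, so all of them
    could be computed side by side.
    """
    centered = [p - m for p, m in zip(face, mean_face)]
    return [sum(c * e for c, e in zip(centered, ef)) for ef in eigenfaces]


def nearest(weights, database):
    """Return (label, squared distance) of the closest enrolled face."""
    best_label, best_dist = None, float("inf")
    for label, stored in database.items():
        dist = sum((a - b) ** 2 for a, b in zip(weights, stored))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist


# Toy data: 4-pixel "images" and a 2-vector basis, for illustration only.
mean_face = [1, 1, 1, 1]
eigenfaces = [[1, 0, 0, 0], [0, 1, 0, 0]]
database = {"alice": [2, 1], "bob": [0, 0]}

weights = project([3, 2, 1, 1], mean_face, eigenfaces)
print(weights)                     # prints [2, 1]
print(nearest(weights, database))  # prints ('alice', 0)
```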

Matthew,

Parallelism was also a factor in choosing PCA for implementation on FPGA, and Composite-PCA can even increase that parallelism.

I think you made a very good point about the speed of a C implementation, which means we'll probably take your suggestion and do it entirely in VHDL. The part of the team working on the algorithm is already breaking it down into parallel parts; hopefully this will make the algorithm really "shine", as we definitely need it to. Also, for the sake of comparison, I'm thinking we can implement the same algorithm on a standard PC with threading, run it at real-time priority, and compare the results to see what was gained through the FPGA implementation. I'm sure the results would be interesting either way.

And in response to Jim's question, this is more of a research project; so, naturally, we want it to support as large a database as possible. The plan is to keep testing it with databases of increasing size until we get it to reach the maximum size possible without breaking the real-time requirement.

Thanks all for your responses...

Best Regards, Islam Ossama

Reply to
Islam Ossama
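
For the PC-side comparison mentioned above, one possible shape for a threaded matcher is to split the database into chunks, find the best match in each chunk on its own worker, then reduce the partial results. This is only a sketch: the function names and the toy data are hypothetical, and in CPython the global interpreter lock means threads will not actually speed up pure-Python arithmetic, so a real benchmark would need native code or processes. The structure (independent workers plus a final minimum) is the same one a bank of hardware comparators would follow.

```python
from concurrent.futures import ThreadPoolExecutor


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_in_chunk(query, chunk):
    """Return (distance, label) of the closest entry in one chunk."""
    return min((squared_distance(query, w), label) for label, w in chunk)


def parallel_match(query, database, workers=4):
    """Match against a non-empty list of (label, weights) pairs."""
    # Interleaved split; drop empty chunks when workers > len(database).
    chunks = [database[i::workers] for i in range(workers)]
    chunks = [c for c in chunks if c]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(lambda c: best_in_chunk(query, c), chunks))
    dist, label = min(partial)  # final reduction over per-chunk winners
    return label, dist


# Toy database of 2-coefficient weight vectors, for illustration only.
faces = [("a", [0, 0]), ("b", [5, 5]), ("c", [1, 1]), ("d", [9, 9]), ("e", [2, 2])]
print(parallel_match([1, 1], faces))  # prints ('c', 0)
```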

I think you should deliberately use a board that is an "overkill", so that you can stay away from any limited resources, and concentrate on the job at hand. Regarding cost: Universities can buy this board for less than $500, which is a fantastic bargain... Good luck with your project. Peter Alfke

Reply to
Peter Alfke

Ok :-). I'll be sure to look into it. I was kinda set back by the retail number, which would be closer to 10 grand in local currency. I'll contact the university on Monday and see if I can work this out. Thanks!

Reply to
Islam Ossama

You could opt for a hybrid solution... do all the massively parallelizable things with FPGA fabric (after all, this is what FPGAs are all about when applied to high-speed processing) and do the more sequential/supervisory/etc. stuff on a CPU, preferably one of the PPC405 cores present in V2P and 4VFX FPGAs. These real on-chip CPU cores will provide far better performance than any soft CPU you can possibly come up with; the only caveat is that you will have at most two such CPUs available.

Reply to
Daniel S.

Thanks for the suggestion, I'm seriously taking it into consideration. I have already contacted the local Xilinx supplier and am working out the details of getting the XUP board.

I just hope I can live up to the level of this project. All this hardware stuff is new to me, and I'm kinda starting to long for the comfort and warmth of software implementations and having the OS take care of all the dirty details for me. I guess that's why the idea of using the PPC processors is attractive to me, though I'd still have to take care of some low-level details myself (unless I load a tiny Linux kernel on one or both of the processors, maybe? hmmm, it'll definitely take some careful (re)thinking).

Well, thanks again to everyone, your responses have all been extremely helpful.

Best Regards, Islam Ossama

Reply to
Islam Ossama

Honestly, I think that if you have no hardware experience, this will be quite a challenge. Not to say that it's impossible, but you're certainly going to have a few sleepless nights...

In your case, I would rather suggest an all-software solution. One idea would be to use a PS3 and harness the power of the Cell processor (you can run Linux on it, I believe). The issue is to parallelize your algorithm enough to harness the power of the 9 cores (a similar problem to the one you would face with an FPGA solution). If one PS3 is not enough, maybe you could use 2, or 4... You could build a small cluster for not too much money.

My 2 ¢.

Patrick

Reply to
Patrick Dubois

PS3? I'm sure you haven't tried to code for the PS3 before. Even IBM officials admit that PS3 coding is a very painful experience. In the same vein, I would recommend you look into programming on a video card. NVIDIA teaches a new course here that gives an idea of how to take advantage of the massively parallel nature of the video card architecture. The idea of using the video card in this manner is still relatively new (there was a Stanford project a few years ago that worked on this issue), so the software support is still limited, but probably still better than working with the PS3. Also, if you baulk at shelling out the money for an FPGA dev board, a PS3 is beyond your price range. But you may be able to get some high-end SLI video cards if your school teaches a graphics course. Actually, I bet the video card solution would be much better than any other proposed solution.

Good luck,

---Matthew Hicks

Reply to
Matthew Hicks


Really? That's too bad. The Cell seems like a nice processor. I met a guy recently at a conference who was considering using the Cell for a hyperspectral imaging application. I thought it was a good idea, but I was not aware that it is difficult to program. But programming FPGAs is not easy either...

Good idea. One guy in my lab did a summer project to do FFTs in Matlab on a vid card:

formatting link

You might also want to check out these links:

formatting link
formatting link

If you still insist on using FPGAs, maybe you could consider using a tool like System Generator. There seems to be an interesting presentation titled "Introduction to the DSP Video Starter Kit and Video Co-processing Kit" here:

formatting link

Patrick

Reply to
Patrick Dubois

On a sunny day (2 Apr 2007 14:45:45 -0700) it happened "Patrick Dubois" wrote:

PS3 has only 1 power processor and _6_ SPE cores.

formatting link

And it sucks 200W if fully loaded.

Reply to
Jan Panteltje

Nope, 1 central PPC core and 8 Synergistic Processor Units:

formatting link

Reply to
Patrick Dubois

On a sunny day (3 Apr 2007 08:21:24 -0700) it happened "Patrick Dubois" wrote:

Nope, in the PS3 only 6 are available.

Reply to
Jan Panteltje

Alright, I don't want to argue about this but I think we can fairly say that the info on the web is not clear... Just for fun, here's a link directly from Sony with the PS3 Cell specs :)

formatting link

Reply to
Patrick Dubois

On a sunny day (4 Apr 2007 05:43:42 -0700) it happened "Patrick Dubois" wrote:

OK, all well and good, but here are some facts: the PS3 runs Linux under a 'hypervisor'. The hypervisor limits access to whatever Sony pleases to allow access to. One SPE is in use for the PS3 graphics, and there is no way you can touch it from Linux. The story goes that IBM had yield problems, so Sony settled for chips with one working core less. That leaves 6 available from Linux. The Wikipedia article is up to date and quite correct. The version of Linux that runs on the PS3 is Yellow Dog Linux. If you are in Europe, there is a special C'T magazine release out with YD Linux for PS3, including some of the IBM development tools:

formatting link

All I can tell you now.

Reply to
Jan Panteltje


Ok, thanks for the clarification.

Reply to
Patrick Dubois
