FPGA based database searching

Hello,

I've been searching the internet for days now and still I'm not sure about what I am trying to do. Okay now, I've got a software implementation in ANSI-C for a complex database searching. The database is a proprietary format where I am saving data, which has to be given as a result, depending on the input data. Problem is, the software implementation is far to slow.

Now I am looking for alternative ways for a faster implementation and came across the idea to implement the whole database searching as electronic circuit in an FPGA. The database is of course far too big to save it inside an FPGA, e.g. the BlockRAMs of a Spartan3. Therefore external memory has to be used, slowing down the throughput. Target is a database searching of

262144 elements with 16 bit each in maximum 220 ms.

Does anybody have experience with FPGA based database handling or similar tasks? Your urgent response will be highly appreciated!

Regards

Norman Bollmann

Reply to
Norman Bollmann
Loading thread data ...

How slow is the software solution? Are you sure it can't be done on a faster PC? Have you tried searching in parallel? i.e. 64-bits at a time. That would mean you need to search 64k entries in 220ms. That's

3us per entry. You can execute quite a few instructions on a PC in that amount of time. A dataset of that size will fit in to your cache too.

Jon

Reply to
Jon Beniston

Hi Norman, If I understand correctly, you want to do one 16 bit compare per 800ns. That's easy to do with an FPGA and some memory. You can go maybe 3 orders of magnitude faster than that with (say) a 64 bit DDR2 external memory. The FPGA companies and others sell development boards with memory on them. HTH., Syms.

Reply to
Symon

In which case the algorithm is probably bad. Modern database technology is pretty darned good, and is largely limited by storage access speed. I think you should start by looking at better ways to index your database, rather than going for more brute-force speed.

Do you really mean this? Only half a megabyte of data? That's easily small enough to fit into main memory of any reasonable computer. At 220ms per 256K words you have nearly a microsecond to work on each word - enough time for dozens or even hundreds of machine instructions. Are your numbers wrong? If not, there's something sadly wrong with your software.

I'm sure it can be done, and I'm vaguely aware of hearing of people doing such things, but you haven't yet made a case for needing it. FPGAs can provide impressive speedup for some types of sorting and indexing tasks, but it is often harder to decide how to keep the datapath busy than it is to decide what to do with the data when you've got it inside the FPGA fabric. Database problems tend to have patterns of address activity that fly around all over the place, as a strong function of the data itself, under control of complicated algorithms. That's something that software tends to be much better at than hardware.

--
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services

Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan.bromley@MYCOMPANY.com
http://www.MYCOMPANY.com

The contents of this message may contain personal views which 
are not the views of Doulos Ltd., unless specifically stated.
Reply to
Jonathan Bromley

Is the concept of "Content addressable memory" known to you?

formatting link

Reply to
RCIngham

Get a faster server, load linux. FPGA's are OK for data filtering or statistics. If you need "complex database searching" you need a computer.

-- Mike Treseler

Reply to
Mike Treseler

There are only 65536 possible values of your 16 bit value. What is the output of your algorithm? Found y/n? Then use a lookup table! I doubt an FPGA is the right answer to your problem. On a modern CPU you've got thousands of cycles for each element -- assuming you even have to consider all of them for any given key.

--
Ben Jackson AD7GD

http://www.ben.com/
Reply to
Ben Jackson

The bottleneck for most searches has to do with how many compares can be done per unit time, which usually boils down to how fast data can be brought into the search. Unless you have a significantly faster mechanism to access the data than the CPU, you probably won't be able to search it any faster. That's assuming you have the undivided attention of the CPU. If the CPU is busy doing other things while your FPGA could be doing the searching, that might improve overall performance.

Block ram usually won't help much, since you can only access one or two addresses at a time. If you had a multi-word key, then pulling the data into a multi-way structure (flops or replicated block rams) such that the different parts of the key could be compared to multiple data words in parallel, then you'd be getting somewhere.

Andy

Reply to
Andy

If you are searching 16 bit datums then your search space is limited to

65536 different datums. Why would you want to store more than this ? A hashtable comes to mind, for instance.

What is your searching algorithm ? Why is it that complex ? Can you post a description ?

Taking well-designed software as example, your 220 ms time budget is enough for Postgresql to perform about 3000-5000 simple SQL queries returning 1 row with btree index lookup on a table with millions of rows including network overhead (as long as it doesn't need to hit the disk), in 220 ms Xapian can make multiple full text phrase searches with ranking on a multi gigabyte text dataset. These are per core of course.

So, I suggest reviewing your search algorithm ;)

Reply to
PFC

What type of searches are you performing? What other operations are required? There are data structures that do an identity lookup in constant time and interval searches in logarithmic time. A CPU should do that in less than a microsecond.

Kolja Sulimma

Reply to
Kolja Sulimma

Hi Robert, Do you have any examples of any 'proper' CAM devices that I can research? I know that Altera's ESBs can be used as small CAMs, and that Xilinx have an app. note design that can do single cycle reads and multi-cycle writes using a BlockRAM. It would appear that Micron have dumped their 2Mb 'Harmony' device long ago. I found various dead links to products, but nothign active. Thanks, Syms.

Reply to
Symon

Useful CAM structures don't fit well in FPGAs and the vendors have quit trying. Some sort of hash table may be a better fit.

formatting link

-- Mike Treseler

Reply to
Mike Treseler

I suggest you optimize your software algorithm. 262144 elements fits in the cache of any modern PC CPU. You'll have a hard time creating an FPGA design that will perform better.

--
Programmeren in Almere?
E-mail naar nico@nctdevpuntnl (punt=.)
Reply to
Nico Coesel

Hi Mike, Right, but are there any custom CAM devices out there? Thanks, Syms.

Reply to
Symon

Most have dropped out. Check idt and netlogic.

Reply to
Mike Treseler

Looks like they're called "network search engines" these days. Thanks mate!

formatting link
formatting link

Reply to
Symon

And Google can search the entire freaking Internet :-)

What he said.

Describe problem more please.

G.

Reply to
Gavin Scott

Or systolic array if you are doing many such searches at the same time.

formatting link

-- glen

Reply to
glen herrmannsfeldt

Thank you very much for your feedback, so far!

Oh my.... I'm sorry. That was definitely a spelling mistake: it's supposed to be "26290144 elements". Sorry 'bout that!

Regards

Norman Bollmann

Reply to
Norman Bollmann

d

Even with 26M elements to search, as long as they're only 16bits, that's still only 65k choices. A hash table based thing as someone else pointed out would be a good option.

Can you give some details about the search you need to perform?

Chris

Reply to
Chris Maryan

ElectronDepot website is not affiliated with any of the manufacturers or service providers discussed here. All logos and trade names are the property of their respective owners.