Video Blob Analysis on FPGAs

B

Brad Smallridge 22 years ago

I have been doing some investigation into blob analysis algorithms and would like to know if there are any designs that have been renedered into FPGAs. Thanks for your help.

Brad Smallridge AIVISION.COM

415-661-8068

Vote

B

Brad Smallridge 22 years ago

Still looking for some suggestions.

Vote

M

Mike Treseler 22 years ago

Consider google.com

formatting link

-- Mike Treseler

Vote

B

Brad Smallridge 22 years ago

What I did for Google was this:

FPGA OR ASIC "blob analysis"

which gave about 165 hits, but none of them seem to point me in the direction of any source code or even to any theoretical discussions. If you think I should try another search then please do suggest something.

Someone was king enough to email me personally and suggested looking for some articals by Sumitomo, which produced a blob analysis chip some years ago. I don't know if that chip still exist and I can't find the articles mentioned. I added Sumitomo to the search above and received only three hits, none of them talked about blob analysis.

Vote

C

Chuck McManis 22 years ago

Are you trying to what the CMU CAM does? Have you read the CMU papers? Blob analysis doesn't mean anything to me so I'm guessing here, but perhaps you meant center-of-mass type things?

The image processing stuff is generally DSP based as the algoritms are to delineate an area using a standard convolver algorithm and then to track its "center of mass" using standard graphics techniques for finding the center of a polygon. (See Eberly's FAQ for that code).

Off the top of my head I'm guessing you could do some clever stuff whereby you keep an array of pixels in a the block ram of an FPGA, and use some form of iso-bar type system to connect regions with similar pixels. I'm not sure what the logic would look like, something like

for (pixel = 2048 downto 0) begin if ( NEIGHBOR(0,pixel) or NEIGHBOR(1, pixel) or NEIGHBOR(2, pixel) or NEIGHBOR(3, pixel) or ... NEIGHBOR(7, pixel) ) then group[pixel]

Vote

E

Erik Widding 22 years ago

Brad,

We are doing some work in this area for a robotics application. When we first started our investigation of the literature, we found that the existing implementations did not meet with our requirements for implementation in an FPGA.

In our system we require that the pixels be presented to any image processing engine only once, or we will get killed on memory bandwidth. Most of the existing literature details algorithms that are stack based that iterate through the image in a pattern that can not be known a priori. Additionally, these algorithms tend to require a variable amount of time, depending on the number of blobs that are identified, and the shapes of these blobs.

A run length encoding approach, which I have seen detailed in the literature, is a good starting point for the requirements that we had, especially since we could ignore non-convex blobs in our application. Though, it should be noted, a linescan blob finder can find non- convex blobs, as long as it is capable of merging two or more blobs that are found to overlap later in the image.

For any discussion to go forward, it would be helpful to understand what your requirements are. If this is a commercial application, please feel free to contact me offline, as we may be able to adapt an existing core from our library.

If this is to be a public exercise, please offer the group an idea as to the following: performance requirement application maximum number of objects that you wish to find in a given image whether you need to just find objects, or track across images do you need to find non-convex blobs?

Blobfinding can be done in guaranteed time in a linescan fpga implementation, the details of the implementation are highly dependent upon the answers to the above questions.

Regards, Erik Widding.

Birger Engineering, Inc. -------------------------------- 617.695.9233 100 Boylston St #1070; Boston, MA 02116 -------- http://www.birger.com

Vote

B

Brad Smallridge 22 years ago

Hey Chuck,

Always good to hear from fellow roboticist.

I have read some CMU documentation but nothing about blob analysis. I had believed that the CMU can only track one blob whereas I need to collect data about lots of blobs. Center of gravity is good. Long axis, area, etc. If you suggest I will look deeper into the CMU docs and web sites. Thanks.

What is iso-bar?

On one of your other post you are comparing a std_logic_vector to another. I suggest you repost and state what libraries you are using. I remember a difficulty in the original VHDL spec that didn't allow for this that may have been patched up by a library. VHDL has been a work in progress for many years with most improvements added via libraries.

Brad AiVision.com

Blob

its

form

sure

function

Vote

B

Brad Smallridge 22 years ago

Hi Erik,

Another roboticist. Thanks so much.

You are exactly right. I want an algorithm that is in sync with the pixel clock. This is generally how I think about and build all my algorithms. And so I would generally expect to keep a several lines of binarized data in the FPGA but not an entire frame. Or perhaps we could use line scan cameras and not be thinking in terms of frames at all.

I could imagine having up to 100 blobs on any line and I don't see why I should limit myself there. The maximum number of active blobs would be one half the horizontal resolution.

I have done some armchair thinking about how to keep and combine blobs if they should merge in susbsequent lines. This may be done by keeping start and stop points of the various blobs in a RAM LUT. But maybe this isn't the best approach?

I saw mention of run length encoding in a web page by Wintress. I do not know what advantage that encoding has over a fifo that would give you the pixel threshold on every pixel clock. I do know that some camera manufacturers like Basler make cameras that threshold and run length encode their outputs presumably to cut down on cable bandwidth.

I am not sure what you mean by the term non-convex. To a newbie that is concave but you don't use the word concave so perhaps this has a different meaning.

I have also come across 4 neighbor or 8 neighbor terminology which I assume represent the rules for making a connection, based on a tic-tac-toe arrangement of pixels. I would suppose that the 8 neighbor approach would be a tad more robust but may lead to narrowly connected blobs with only a point in common. Whis is better?

I do not need to "track across images" if you are talking about some sort of temporal tracking from one frame to the next.

On my web searching I have tried to refine my search to just source code by using unique VHDL keywords like downto and inout. However I didn't find anything with this search: downto inout blob. Any suggestions to better search techniques or Verilog keywords will be appreciated.

Do you want to share with the group what Birger does?

Brad Smallridge AiVision.com

Vote

C

Chuck McManis 22 years ago

That is basically true, but I expect it was a limitation of the compute cycles on the scenix processor more than anything else.

Comes from the weather service, imagine that you have a "cloud" of pixels that you characterise by color (or some other attribute, intensity perhaps). Now all of the pixels that fall in the desired "range" get mapped with a line that defines the edge of the group (where a neighbor is not in the range) On a weather map you use pressure ranges, on a topographic map you use altitude, but the effect is the same, you define a blob by creating a closed line around all pixels that are "in range" This algorithm can convert raw video into something with "blob" semantics (a number of closed polygons), note however you can still get polygons within polygons (donuts).

Back in the day at the Image Processing Institute at USC a couple of the grad students were using techniques like this to help generate maps from arial photographs. ETAK picked up the technology and did some of the first digital maps (actually DARPA used it first, but ETAK was the commercial application).

Until you posted I hadn't really thought about building pixel engines in FPGAs for image processing type applications. (Which is silly considering I did some simulation work on Intel's 82786 Graphics chip which had a built in BLIT engine). The thing I can't visualize is how you could get the information out of the FPGA fast enough to act on it.

This is one of those times where leaving it for a year or so and coming back to it has given me the opportunity to use better tools. Xilinx ships the Coolrunner II in a 56 ball grid array that is just 6 mm x 6 mm (yes about

1/4" on a side) and it occurred to me that you could use that to build a kick-arse PWM unit that was tuned to robotic motor applications. Mounting it on an 14 pin DIP carrier with appropriate 5v tolerance stuff should make it possible for non-SMT capable hobbiests to use it. Then with the left over gates I was thinking quadarature decoder/counter and it seems like a nice little bundle.

So I did a quick layout in latest WebPack which insists on making inputs and outputs to VHDL modules Std_logic_vectors but my counter code was using unsigned.

--Chuck

Vote

J

Jonathan Bromley 22 years ago

hi Brad,

yeah, it's the robot vision community you need to talk with. A few years ago I did a blob analysis module that needed only one pass over the pixels. It used a kind of runlength encoding, although in fact you can do the runlength stuff on-the-fly quite easily. My technique coped correctly with arbitrary blob topologies, BUT at the expense of building a fairly complicated tree as the data structure. It was VERY quick (I got a 30x speedup by recoding someone else's blob finder, and mine was more general!). The application was a bit messy - identifying fish on a conveyor belt, so that the de-heading machine could pick them up by the correct end. We used a line-scan camera, with conveyor motion providing the other direction of scan, but the technique is just as good on a 2-dimensional image.

The basic algorithm is easy to map onto FPGA because it simply looks once at each pixel, but the problem is that it needs a very variable amount of working storage and FPGAs generally aren't very good at that sort of thing. It's easy only if you can plan in advance what features you want to extract - centre of gravity? centre of gravity of the hull (outermost enclosing blob, including all its holes and any blobs within those holes)? centre of gravity of convex hull (quite a bit harder)? bounding box? It's also VERY helpful if you can provide a nice clean margin to the picture, so that there's an unambiguous "background" area extending to the limits of the useful image. If any blobs overlap the image edges, you need to work out what that means and how to handle it.

I'll try to look out the old papers I did on it, although I think the C code is long gone - it wasn't mine to keep. You can guess the vintage because it ran on one of the embedded AMD29K processors (29500?????)...

Nice to know that people are still messing with that kind of stuff.

Jonathan Bromley, Consultant DOULOS - Developing Design Know-how VHDL, Verilog, SystemC, Perl, Tcl/Tk, Verification, Project Services Doulos Ltd. Church Hatch, 22 Market Place, Ringwood, BH24 1AW, UK Tel: +44 (0)1425 471223 mail:jonathan.bromley@doulos.com Fax: +44 (0)1425 471573 Web: http://www.doulos.com The contents of this message may contain personal views which are not the views of Doulos Ltd., unless specifically stated.

Vote

E

Erik Widding 22 years ago

We identified the option that you described, and a second approach that involved keeping track of an ID for each blob that was in process and storing with each pixel in a line the blob ID that the pixel corresponded to. We did this as we had to be able to both fill the voids in blobs to get an accurate centroid of the entire blob, and also find an accurate mass and centroid for the voids in the blobs as well. We are basically looking for black circular blobs, that contain six white circular voids of varying mass. By decoding the locations of the voids, we identify one of 729 possible marks. This is used for localization of a robot, as an interim step to scene based localization.

Exactly. In some vision applications, very few features need to be found, and this can drastically reduce the bandwidth.

The term evolved from the notion that blobs are inherently convex. A non-convex blob has a concave feature somewhere on its boundary.

Which mode of connection you choose is entirely application dependent. Four-connected blobs are sufficient for convex objects, eight-connected are more important if the feature size to pixel size ratio is quite small.

I do not suspect you will find any source code, but please don't use my skepticism as a reason not to look.

Actually, our involvement with robotics has been limited to the development of a vision system for a robot. Birger is primarily involved in the development of vision and compression algorithms on FPGAs. The vision system that we have devloped has the following characteristics: PC/104+ form Xilinx XC2VP20-FG676 FPGA (up to a XC2VP40 will fit) 64MB DDR SDRAM (32 bit datapath) Xilinx SystemAceCF VGA LCD interface w/HV inverter for backlight Stereo Audio CODEC Nine VGA camera inputs (one color, eight mono) MJPEG compression, and color converion for color camera Stereo disparity search on four QVGA camera pairs

Features that we are currently working on: Dewarp - 8 x VGA input, 8 x QVGA output Blobfinder/FiducialFinder MJPEG decompression

Everything runs at 15FPS. Many of the engines run at 30+FPS. Due to power considerations, we developed the VGA cameras from scratch, which burn approximately 200mW each, and are approximately 20mm on a side. Is this hardware/IP that other robotcs developers would find useful? This has been developed for a government agency, and we are investigating how we might productize it.

Regards, Erik Widding.

Birger Engineering, Inc. -------------------------------- 617.695.9233

100 Boylston St #1070; Boston, MA 02116 --------

formatting link

Vote

Video Blob Analysis on FPGAs

Join the Discussion

Didn't find your answer?