It very much depends on the object and the application.
Hu's Moment Invariants?
Covered in all the better books on image processing, or in slightly more detail in:
Mukundan, Ramakrishnan: "Moment Functions in Image Analysis", World Scientific, 1998
But most of the interesting stuff is in journals, as usual. Moment invariants were used for tracking the ping-pong ball in Andersson, "A Robot Ping-Pong Player", MIT Press, 1988. That machine used a custom chip to calculate the moments. A similar frontend was later implemented in 1990 by ETH Zürich on an early XC3090 Xilinx FPGA; probably 512x512 pixels, 8-bit greyscale, 50 pictures/second. Apart from the "moment generator", which can be an FPGA, a microprocessor is needed for the arithmetic part. Moments in some form or other were used in "robot vision" in the '80s for objects with simple shapes. Some people claim Hu's Moment Invariants are usable for simple character recognition.
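To make the idea concrete, here is a minimal sketch of the first two of Hu's seven invariants computed on a tiny binary image. Function names and the test blob are my own; this is illustrative, not the custom-chip pipeline described above.

```python
# Sketch: first two of Hu's seven moment invariants on a small binary
# image given as a list of 0/1 rows. Pure Python, no libraries.

def raw_moment(img, p, q):
    # m_pq = sum over pixels of x^p * y^q * intensity
    return sum(x**p * y**q * v
               for y, row in enumerate(img)
               for x, v in enumerate(row))

def hu_invariants(img):
    m00 = raw_moment(img, 0, 0)
    xbar = raw_moment(img, 1, 0) / m00          # centroid
    ybar = raw_moment(img, 0, 1) / m00

    def mu(p, q):  # central moment (translation invariant)
        return sum((x - xbar)**p * (y - ybar)**q * v
                   for y, row in enumerate(img)
                   for x, v in enumerate(row))

    def eta(p, q):  # scale-normalized central moment
        return mu(p, q) / m00**(1 + (p + q) / 2)

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2))**2 + 4 * eta(1, 1)**2
    return phi1, phi2

# An L-shaped blob and its 90-degree rotation give identical invariants,
# which is the whole point:
blob = [[1, 0, 0],
        [1, 0, 0],
        [1, 1, 1]]
rot90 = [list(r) for r in zip(*blob[::-1])]
print(hu_invariants(blob))
print(hu_invariants(rot90))
```

The heavy part in hardware is exactly the triple product sums in `raw_moment`/`mu`, which is what the "moment generator" chip or FPGA would accumulate per frame.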
I guess "very limited" depends on your frame of reference... :-/
That, too, can mean a lot of things. :<
Is this *one* object (multiple copies thereof)? Or, do you need to be able to re-train the system for a different object, e.g., "tomorrow"?
Is the image monochrome, greyscale, color?
How sophisticated does the recognizer need to be -- are you looking at two-dimensional objects and just checking for rotation/translation? Or, is it a complex 3D object that can be positioned with infinite variability along all three axes? Can you interact with the object?
How fast does the recognizer have to be? Will it be working from still images (flash photography) or live video?
Does the image give you the dimensional resolution you need?
Besides dimensional reasons, are there other "contaminants" that get into the input stream (e.g., what if someone puts a "rubber ducky" in front of your recognizer)?
Look into self-organizing maps and see if the approach fits your needs.
Looks similar to what I have worked on. In my case we were able to apply a simple threshold to turn the image into a two-level binary image with few false pixels, so in general a white pixel means the object and a black pixel is the background. A simple approach we chose was then to find the mass and the center of mass of the object: the mass is just the number of white pixels, and the center of mass is the average of the white pixels' coordinates. These two values give the area (easily convertible into a dimension) and the position. You said you did not want to calculate moments, but your image is so small that it may be reasonable to do so anyway.
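The threshold/mass/centroid step above is only a few lines; a sketch (the threshold value and the toy greyscale image are assumptions for illustration):

```python
# Sketch of the thresholding + mass/centroid step described above.

def mass_and_centroid(gray, threshold=128):
    # Binarize: a pixel >= threshold counts as "object" (white).
    white = [(x, y)
             for y, row in enumerate(gray)
             for x, v in enumerate(row) if v >= threshold]
    mass = len(white)                       # area in pixels
    cx = sum(x for x, _ in white) / mass    # centroid x
    cy = sum(y for _, y in white) / mass    # centroid y
    return mass, (cx, cy)

gray = [[0,   0, 200,   0],
        [0, 200, 200, 200],
        [0,   0, 200,   0]]
print(mass_and_centroid(gray))   # mass 5, centroid (2.0, 1.0)
```

Mass and centroid are just the raw moments m00, m10/m00 and m01/m00, so this is the cheapest possible subset of a moment-based approach.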
Finding the rotation may be trickier and depends on the object shape. A simple way that often works is to find the pixel which is farthest from the center.
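That "farthest pixel" estimate is one `max` and one `atan2`; a sketch (the arrow-shaped test blob is my own illustration):

```python
# Sketch of the "farthest pixel from the center" orientation estimate.
import math

def orientation(white_pixels):
    n = len(white_pixels)
    cx = sum(x for x, _ in white_pixels) / n   # center of mass
    cy = sum(y for _, y in white_pixels) / n
    # The object pixel farthest from the center of mass ...
    fx, fy = max(white_pixels,
                 key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)
    # ... defines the orientation angle relative to the centroid.
    return math.atan2(fy - cy, fx - cx)

# An arrow-like blob whose shaft points in the +x direction; its tip is
# farther from the CoM than the fins, so the estimate is 0 degrees.
arrow = [(0, -1), (0, 0), (0, 1), (1, 0), (2, 0), (3, 0), (4, 0)]
print(math.degrees(orientation(arrow)))  # 0.0
```

Note this fails on shapes where the farthest point is ambiguous (ties on a symmetric shape), which is why it is shape-dependent as stated above.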
Knowing that there is only a little deviation, you can check just the parts of the image which are uncertain: assume that a certain object interior (e.g. a rectangle) is always white, and that everything outside some known larger rectangle is always black. Then you never look inside the small rectangle nor outside the big rectangle; there is only a narrow frame of interest.
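As a sketch of that narrow-frame idea (the rectangle coordinates here are hypothetical calibration values, not from the original setup):

```python
# Sketch of the "narrow frame of interest" check described above.

def frame_pixels(binary, inner, outer):
    """Rectangles are (x0, y0, x1, y1), inner inside outer. The interior
    of `inner` is assumed always white and everything outside `outer`
    always black, so neither region is ever examined; only the uncertain
    ring between the two rectangles is returned for inspection."""
    ix0, iy0, ix1, iy1 = inner
    ox0, oy0, ox1, oy1 = outer
    ring = []
    for y, row in enumerate(binary):
        for x, v in enumerate(row):
            in_inner = ix0 <= x <= ix1 and iy0 <= y <= iy1
            in_outer = ox0 <= x <= ox1 and oy0 <= y <= oy1
            if in_outer and not in_inner:
                ring.append(((x, y), v))
    return ring

img = [[0, 0, 0, 0, 0],
       [0, 1, 1, 1, 0],
       [0, 1, 1, 1, 0],
       [0, 0, 0, 0, 0]]
# Inner 3x2 rectangle assumed white, everything outside the 5x4 outer
# rectangle assumed black: only 20 - 6 = 14 pixels need to be looked at.
print(len(frame_pixels(img, inner=(1, 1, 3, 2), outer=(0, 0, 4, 3))))  # 14
```

The payoff is that the per-frame work scales with the width of the uncertain ring, not the image area.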
As a data point, this kind of thing (including color recognition) has been implemented by Sunplus in some of their tiny 8-bit chips (6502- type core) for toy applications. For example, can recognize triangle, circle, square, and basic colors. Maybe you can use a canned solution of this kind and just layer your additional logic on top?
Yes, but is that resolution sufficient to act as go/no-go? I.e., if you are "off" by one pixel, does that mean the object is out-of-spec?
Are your camera and objective in fixed places (i.e., is there any variation in apparent scale from image to image)?
Are there any easily recognizable "features" (grrrr... I'd have preferred not using that term here -- how about "characteristic" instead!) in the object(s)?
What comes to mind is a feature-based recognizer similar to that used in gesture recognizers -- if you can identify features that are efficiently computable. E.g., treat the outline as a simple shape. Then "walk" around it from some convenient starting point and compute the "tangent" (dy and dx, not the *true* tangent) at each point. Take this array of values and compare it against a template, "sliding" it along -- which corresponds to rotation.
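A sketch of that tangent-walk matcher, under two simplifying assumptions of mine: the contour is already an ordered list of boundary points, and the template was recorded at the same orientation convention (a physically rotated object also adds a constant offset to every angle, which a fuller matcher would estimate as well):

```python
# Sketch of the tangent-walk matcher: record a crude tangent angle at
# each contour point, then cyclically slide a template over the
# signature; the best shift corresponds to the rotation.
import math

def tangent_signature(contour):
    # dy/dx to the next point, expressed as an angle (not a true tangent).
    n = len(contour)
    return [math.atan2(contour[(i + 1) % n][1] - contour[i][1],
                       contour[(i + 1) % n][0] - contour[i][0])
            for i in range(n)]

def best_shift(sig, template):
    # Smallest summed angular difference over all cyclic shifts.
    n = len(sig)
    def angdiff(a, b):
        d = abs(a - b) % (2 * math.pi)
        return min(d, 2 * math.pi - d)
    scores = [sum(angdiff(sig[(i + s) % n], template[i]) for i in range(n))
              for s in range(n)]
    s = min(range(n), key=scores.__getitem__)
    return s, scores[s]

# Walking a small square's outline, against a template whose starting
# point is two steps "later" along the same outline:
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
sig = tangent_signature(square)
template = sig[2:] + sig[:2]
print(best_shift(sig, template))      # (2, 0.0)
```

The brute-force slide is O(n^2) in contour length, which is fine at these tiny image sizes.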
Trick is to come up with scale/rotation invariant features which can be used to determine those things. Once you have identified the orientation of the object, you can map it onto your model and note discrepancies.
I recall stumbling on a crude (gesture) recognizer that was rotation-invariant (something that was unacceptable to my needs at the time). I will see if I kept the paper in my files...
Ah, this sounds promising! I would be concerned as to how precisely the OP can identify the *actual* edge of the object, though. If he has some knowledge of the intended shape, perhaps it would be easy to fit an appropriate curve to the "hypothesized" points on the perimeter to give better data for the "fuzzy" points (i.e., a more exacting form of antialiasing). With as few as 10K image pixels to work with, being off by half a pixel all around the perimeter can represent ~0.5% measurement error or more.
That would depend on the actual shape of the object -- the outer "black" (background) would have to accommodate the convex hull in any conceivable orientation (e.g., a triangular object would be annoying).
Likewise, the "white interior" could end up being equally unrestrictive (e.g., consider a long, slender object at 45 degrees).
I didn't get around to finishing it yet, but my incomplete description might give you an idea of how it works (I guess you don't mind that the text is in German and buggy):
My picture is 16x64 pixels, black & white, coming from a vintage Siemens Nixdorf OCR "pistol" from the '80s. The originally intended hardware was the 6502 -- several of them -- but they are insufficient. I am waiting for an ARM7 until I have another go at it.
I don't have it, so I can't comment. My two recommendations are: Pratt, "Digital Image Processing"; Gonzalez/Wintz, "Digital Image Processing". Also OK: Jähne, "Digital Image Processing", which has a German version too: Jähne, "Digitale Bildverarbeitung". These have short descriptions of Hu.
There is not much literature. Collecting on an exotic topic is a job of years rather than days. But if you go for Hu and are in Germany, I might loan you the box with the photocopies & books I am hoarding. It will take some years until I get around to finishing my Hu-machine.
Do you have to investigate alternatives?
There are variants on moments, like:
* Zernike
* (Prof. Burkhardt) Fourier descriptors
There are variants on Walsh: the R-transform by Reitboeck/Brody (not recommended for actual use here).
Actually, Prof. Burkhardt might be the man most knowledgeable on that topic in Germany.
He started out with Walsh, but soon got working on alternative methods: H. Burkhardt, "Transformationen zur lageinvarianten Merkmalgewinnung", VDI-Verlag, 1979 (available from the secretariat at a cost price of 5 EUR).
If the resolution is the *minimum* required to differentiate between "good" and "bad", then you have to be able to accurately map the object's arbitrary position (i.e., unaligned to the camera's pixels, translated by a fraction of a pixel) onto that grid.
I.e., take a ruler ("scale") that is graduated in some arbitrary unit. Define that unit to be the resolution you claim to need. Now, place that ruler on top of an arbitrary object. Neither edge of the object will be guaranteed to perfectly line up with the graduations on the ruler. In fact, the exact position of each edge will further be blurred so that you can't easily judge where the edge lies *between* two adjacent graduations. *Now*, measure the object TO THE PRECISION OF THE SCALE (ruler).
[see my point?]
Depends on the features you need to compute and the cost of computing them. You only have to "walk the edge" instead of the entire "body" of the object.
Do you think that sounds ambiguous? Consider the "feature" that Wojciech suggested: distance (of each point on the periphery) from center of mass. This is rotation invariant:
- center of mass computation does not rely on object orientation
- distance of point to CoM does not rely on object orientation
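A sketch of that radial feature (my own formulation: sorting the distances makes the comparison independent of the walk's starting point; a real matcher would likely keep the cyclic order and slide it, as discussed earlier):

```python
# Sketch of the rotation-invariant "distance from the CoM" feature.
import math

def radial_signature(points):
    n = len(points)
    cx = sum(x for x, _ in points) / n     # center of mass
    cy = sum(y for _, y in points) / n
    return sorted(math.hypot(x - cx, y - cy) for x, y in points)

# Rotating the point set changes neither the CoM geometry nor the
# distances, so the signature is unchanged:
theta = math.radians(30)
shape = [(2, 0), (0, 1), (-1, 0), (0, -1)]
rotated = [(x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))
           for x, y in shape]
a = radial_signature(shape)
b = radial_signature(rotated)
print(all(abs(p - q) < 1e-9 for p, q in zip(a, b)))  # True
```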
OTOH, a feature like "bounding box aspect ratio" varies with rotation (consider how this would vary for a long, skinny rectangle as it is rotated). Furthermore, that variation isn't necessarily germane to determining the actual orientation of the object (imagine that rectangle has a small notch cut in one side -- which does not affect the aspect ratio of the bounding box yet clearly makes "notch up" vs. "notch down" very different orientations).
[we've not mentioned mirror images -- i.e., if the part is placed on the inspection surface upside down and isn't symmetrical]
Exhibits at the "vintage computer festival europe" should be at least 10 years old ...
That's true of many books that cover moments; Hu dates from 1962. Old algorithms that were viable in the '70s on minicomputers are now OK for microcontrollers. They are usually very basic and simple, and often have a (modest) track record of successful application.
The alternatives I mentioned fall into the two usual categories:
Zernike may get somewhat better results at much added complexity.
The R-transform may be insufficiently powerful for the real world; I have never heard of industrial use. But if you have to write a fat report with a chapter on "alternatives", it may be useful to have a look at them -- less so for practical implementation.
Correlators did exist:
(text in German.) But rarely are things that simple.
If you ask for outside opinions (I mentioned Freiburg), you have to work on defining what the device is supposed to do and what the constraints are (academic project: one prototype; industrial project: volume production; ...). People can only get as much out of newsgroups as they put in.