"Filtering" pointing device data

Hi,

I have no idea what the best forum for this question might be -- so, I'll start here... I'd appreciate pointers to other fora that may be more appropriate!

[My comments are lengthy -- intended to anticipate problems with some solutions that might be offered.]

----

I'm looking for ideas on how best to implement a "filter" for preprocessing of 2D point data (x,y) from a "pointing device".

Pointing devices can include things like mice, digitizing tablets, "nibs" (trackpoint), touchpads, ocular mice, "air mice" (gyromouse), etc.

Data are sampled, equally spaced in time. Assume the data represent absolute (x,y) coordinates in some arbitrary space (i.e., devices that inherently generate relative data -- like mice -- can be trivially mapped into this space). For the time being, assume the size of a unit distance is unknown but constant (i.e., it may vary between devices but will not for a particular device). Assume that data space is (effectively) boundless.
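
For concreteness, "trivially mapped" means nothing more than integrating the deltas. A minimal illustrative fragment (names and types are mine, in C):

    /* Fold relative (dx,dy) reports from a mouse-like device into
       the absolute, boundless (x,y) space assumed above.  Units
       are raw device counts; the origin is arbitrary. */
    typedef struct {
        double x, y;    /* absolute position, arbitrary units */
    } point_t;

    static point_t pos = { 0.0, 0.0 };  /* arbitrary origin */

    void accumulate(double dx, double dy)
    {
        pos.x += dx;    /* boundless space: no clamping needed */
        pos.y += dy;
    }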

The characteristics of the filter may need to change based on the pointing device used. E.g., a mouse's ballistics are different from those of a digitizing tablet. Indeed, each axis may require different processing (e.g., Y-axis motion is more difficult for a mouse than X-axis; the converse is true for a pen interface!). Also, the parameters governing a particular filter may require tuning to fit the individual, etc.
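
(To make that concrete: one might carry the tunables in a per-device, per-axis profile along these lines. A sketch with invented field names, not a prescription:)

    /* Filter tunables: one set per axis, selected per device and
       adjusted per user.  Field names are invented for illustration. */
    typedef struct {
        double cutoff_hz;   /* how "heavy" the smoothing is */
        double gain;        /* ballistics scaling           */
    } axis_params_t;

    typedef struct {
        axis_params_t x;    /* the "easy" axis for a mouse...   */
        axis_params_t y;    /* ...the "hard" one; the converse
                               holds for a pen interface        */
    } filter_profile_t;     /* one per device (and per user)    */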

Assume the digitizing device provides "accurate" data. So, the purpose of the filter is not to compensate for hardware problems but, rather, the user's capabilities and/or limitations.

Having said all that, the question that remains is: "Why is a filter necessary?".

First, consider that different pointing devices have different "problems". As mentioned above, a user will typically be much more comfortable making side-to-side motions with a mouse than vertical ones (try it!). Likewise, pen users will find vertical motions much easier than side-to-side motions (again, try it!).

Second, users have different physical capabilities. These capabilities change with age. Very young users may have problems with fine positioning. The motion may exhibit lots of starting and stopping. They may overshoot their intended mark, etc. The same issues may hold true with elderly users. Users with tremor (e.g., Parkinsonian) may have problems controlling the trajectory of a particular motion. Others may have strength or stamina issues that come into play. Or, other physical disabilities (e.g., use of a mouth stick).

Finally -- and most importantly -- a pointing device may be used for purposes other than "pointing", per se.

There are lots of approaches I can try that will mask these problems to varying degrees. But, that is often a side-effect of the filter implementation instead of a deliberate compensation.

For example, one could simply compute a running average (sliding window) of the X and Y coordinates (independently). The size of the window and the weighting applied to each datum in that window will determine how "heavy" the filter action is.
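
(Such a filter is all of a dozen lines -- which is surely part of its appeal. A sketch, with the window length pulled out of thin air:)

    /* Naive unweighted sliding-window average; one instance per
       axis.  The only "design knobs" are the window length and
       the weights.  Note: an unweighted N-tap average has its
       first spectral null at fs/N, so at, say, fs = 100 Hz you
       would need N = 25 just to null out a 4 Hz tremor -- those
       numbers are purely illustrative. */
    #define N 8                 /* window length: why 8? Exactly. */

    static double window[N];    /* zero-initialized at file scope */
    static int    head = 0;

    double filter(double sample)
    {
        double sum = 0.0;
        int i;

        window[head] = sample;
        head = (head + 1) % N;
        for (i = 0; i < N; i++)
            sum += window[i];   /* equal weights -- see below */
        return sum / N;
    }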

But, this is a naive approach, IMO. What is the rationale for such an approach (besides being easy to implement)? Why choose one particular set of weights over another? How does it correlate to each of the particular "user issues" mentioned above (e.g., if the window isn't wide enough, how do you address very low frequency Parkinsonian tremors)?

Most of the prior art that I have located glosses over the criteria they used (if any!) in the design of their filters. Nor do they analyze either the effectiveness or the value of their particular filter implementation(s). It's as if someone, on Day One, decided they should include "some sort of a filter" without giving any further thought to how that filter would act and what it would do for their implementation!

I'd really like solution(s) that have some sort of reasoning behind the design/approach.

Thanks!

Reply to
D Yuniskis

Your questions are really about human factors more than filtering. You might try comp.dsp for some insight.

When you get right down to it, you may be doing something unique enough that there's no experienced people out there, although I would expect that anyone who works with the neurologically impaired would have some opinions.

--
http://www.wescottdesign.com
Reply to
Tim Wescott

By this point in your question, you've already rendered it most likely meaningless. As you point out yourself, not just the artifacts such a filter could be meant to weed out, but the very need for one in the first place, is totally dependent on the type of pointing device. That means it's pretty much impossible for one type of such filter to be useful for all pointing devices.

You seem to have set out to design a panacea. Doesn't work.

That's not a job for a filter. It's a job for visual feedback to enable the user's hand-eye coordination to work with. That takes training on the user's side, including adaptation to the type of pointing device, and I seriously doubt any filter will work to make that noticeably easier. Actually, it's quite likely to make things worse --- the user's hand-eye coordination and your filter would end up fighting each other.

Hmm... didn't you just say we were to assume that the device has no problems (i.e. it's accurate as-is)?

Reply to
Hans-Bernhard Bröker

It doesn't really matter. If you were looking for a PhD thesis topic, you apparently found one that meets all the normal criteria - something obscure (so generating "original" research is relatively easy) and something that's probably useless and/or insoluble.

If you were just looking for a topic of conversation, I don't think this particular one will earn you much face-time with members of the opposite sex, but you could always try it out and let us know.

Reply to
larwe

That's a little cynical. I agree the inquiry is unfocused but from the sound of it, the OP is likely working on something for the handicapped.

George

Reply to
George Neuner

Did you ever meet an engineer who wasn't cynical?

If the OP does indeed have such a project in mind, then fine - I withdraw the comment, with apologies. It sounded to me however like a vague academic question.

Reply to
larwe

[snip]

[snip]

[snip]

Yes -- and no. I don't expect folks in this forum to have much (if ANY) exposure to designing for disabilities, etc. It's actually embarrassing how few designers have any REAL experience in that field or curiosity about how to improve their devices to address these other needs. Large firms *might* have a group dedicated to this sort of thing. In which case, if you're not in that group, you're clueless as to the issues involved. Small firms often don't have the resources to throw at the problem so it just gets ignored. I guess the sentiment is that "there are enough non-disabled users out there so why worry about those few (?) that aren't?"

It will be amusing to see how this sentiment changes as this population of designers ages and suddenly realizes how many products (including ones that THEY may have designed) simply are unusable as THEY develop these "disabilities" :-/ Diabetic retinopathy, Parkinson's Disease, Macular Degeneration, infirmity, etc. -- gee, ain't old age grand? ;-)

But, I think these issues also pertain to "normal" users as well. E.g., I can't believe the folks who design micromanipulators, surgical robots, and/or VR/remote actuator technologies "DC couple" the controls to the actuators. It would be just too easy for a simple spasm, twitch or careless motion to result in damaging the item(s) being manipulated (paraphrasing B.B. "Don't tug on that; you never know what it might be attached to!")

I think they can help with the implementation details of a particular filter. But, I suspect most folks over there aren't exposed to as wide a range of applications. Most DSP work is still devoted to higher bandwidth issues (whereas I'm talking about infrasonic).

As I mentioned in my post (i.e., the last paragraph quoted above), there obviously is at LEAST a perceived need for such filtering (in some application domains) as prior art *includes* such things. My dismay is that there is no explanation or rationale given for the actual implementation choices made nor an evaluation of their effectiveness.

E.g., one might consider it "good practice" to put a small anti-aliasing filter on an ADC input. But, doing so blindly can be wasteful or counterproductive. If, for example, you are monitoring the voltage on a battery (i.e., very low output impedance) with a constant load, those components don't add to the quality of the data that you get. At the other extreme, fast signals will be distorted/attenuated/phase shifted by such a network, whereas a good S&H with appropriate aperture sizing will eliminate the need for it.

(My point being, blindly adding a filter without a reasoned explanation, justification and EVALUATION is just silly -- it shows a lack of understanding of the issues at hand)

Yes, that is a good insight! Ditto neuromuscular disorders.

Since this was relatively easy to do (i.e., you can find an occupational/physical therapist "in the phonebook" whereas it's hard to find "micromanipulator designers" -- at least not in MY yellow pages! :> ), I contacted a few (three) different therapists and asked the question in a form more suitable to their experiences.

All were able to describe various ailments that cause motor issues. These range from chronic "diseases" like Parkinson's to the after-effects of disease (various forms of polio) to physical trauma (head or limb injury) to congenital issues. Some were able to give me quantitative figures regarding the amplitudes and frequencies of tremors in certain of these cases (Parkinson's seems to be the most widely documented). Of course, these folks don't think of these tremors in terms of amplitudes, frequencies, etc. but you can easily convert their descriptions to those terms.
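
(And once you have a frequency figure, a *reasoned* filter becomes possible. E.g., a sketch of a standard second-order "cookbook" notch aimed at a resting tremor -- the 5 Hz / 100 Hz / Q values below are placeholders, NOT data from these therapists; the real numbers would have to come from measuring the individual user:)

    /* Biquad notch filter (RBJ cookbook coefficients).  Run one
       instance per axis, once per sample.  F0, FS and Q are
       assumed values, for illustration only. */
    #include <math.h>

    #define F0 5.0      /* notch center, Hz -- assumed tremor  */
    #define FS 100.0    /* sample rate, Hz -- assumed          */
    #define Q  2.0      /* notch width: higher Q = narrower    */

    static double b0, b1, b2, a1, a2;   /* coefficients        */
    static double xm1, xm2, ym1, ym2;   /* delayed samples     */

    void notch_init(void)
    {
        double w0    = 2.0 * M_PI * F0 / FS;
        double alpha = sin(w0) / (2.0 * Q);
        double a0    = 1.0 + alpha;

        b0 = 1.0 / a0;
        b1 = -2.0 * cos(w0) / a0;
        b2 = 1.0 / a0;
        a1 = -2.0 * cos(w0) / a0;
        a2 = (1.0 - alpha) / a0;
    }

    double notch(double x)
    {
        double y = b0*x + b1*xm1 + b2*xm2 - a1*ym1 - a2*ym2;
        xm2 = xm1; xm1 = x;
        ym2 = ym1; ym1 = y;
        return y;
    }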

However, like most "professionals", their experiences are severely limited to the application domains in which they work. So, the very idea of alternative approaches to alleviating the consequences of these issues doesn't occur to them "unprovoked". They simply accept whatever new innovations come along and work them into their therapies.

As such, they weren't able to relate the needs of their clients ("patients") to the technologies that I am working with (this is what makes MY job so exciting and challenging).

I've talked with two MD's who, once the suggestion was made to them, *could* see how technology could address this issue (perhaps a consequence of broader educational or experience background?). But they can offer no more specific guidance.

It's, as you said, largely "unique enough that there's no (few?) experienced people out there". I suspect I will be able to add one to that list! ;-) I'm currently exploring a promising approach that I hope to test on a small group, soon. (gee, if I were the type that liked to see my name in print, I could publish an article! :> Not worth the effort! ;-)

Thanks for your comments!

Reply to
D Yuniskis

Sorry, I didn't mean to imply that there was a *single* filter design that could solve all of these problems. Rather, I was trying to enumerate various types of pointing devices that are available (and in common use) as some folks may not have direct experience with some/all of them. Or, may simply be unaware of the variety of such devices available. I am amazed each time I stumble upon a new technology for these things -- who'd have thought there was that much variation in what most folks would just consider as "a mouse"?

At the very least, I wanted to avoid the assumption that "pointing device" was synonymous with "mouse".

(sigh) You're thinking too narrowly -- most probably a consequence of relating my "problem" to the "typical" environment in which you no doubt work with a "pointing device": your "PC". (this was one reason I wanted to avoid the association of "pointing device" with "mouse" as it naturally leads to the assumption that you are dealing with a PC (or, PC-like device)).

You are assuming that there is a direct feedback loop "closed" by the user's eyes (and brain). That's another consequence of the PC/pointing device model that most folks think of. :< I guess I wasn't sufficiently general in my description of the problem space.

Your assumption implies that blind/visually impaired individuals simply can't use "pointing devices" because they can't SEE to close the loop.

Of course, other feedback mechanisms are available for this in the "PC" environment as well as other environments where pointing devices are used. (I *really* would like to get away from the association with PC's -- after all, this *is* c.a.E, is it not?)

But, there are other applications where pointing devices are used without *direct* feedback. E.g., gestural interfaces can be used "eyes-free" (sightless) and still work fine. The feedback isn't direct in that the user doesn't necessarily see where his "pointing device" happens to be at each instant of its trajectory. Instead, his feedback comes when the device interprets his complete gesture. That gesture may be "displayed" in some way, *or*, simply acted upon. If the device's actions aren't what he expected, then he knows his gesture wasn't formed in a manner consistent with the device's ability to recognize same. (and, his remedy is typically to repeat the ENTIRE gesture; so, the immediacy of eye-hand feedback is completely gone -- the system has LOTS of lag!)

Also, consider that some (sighted) users could still find your "eye-hand feedback" solution unusable. Have you ever analytically watched a user with Parkinsonian tremor try to perform fine motor tasks? Sure, he can see the "display" showing his "mouse position" (grudgingly falling back on the PC analogy to simplify this explanation). But, all he sees is a cursor jerking around on that screen. And, furthermore, the screen itself seems (to him) to be jerking around as a consequence of his accompanying head tremors. Do you think he considers this a friendly interface?

Finally, consider how users with physical disabilities have had those devices "adapted" to fit their disabilities. Or, how they have adapted their skillsets to conform to the device's "needs".

A person with Parkinsonian tremor might resort to using his non-dominant hand to "steady" the hand manipulating the pointing device. Or, resting *it* (not just his palm) against a stationary surface. These are MECHANICAL FILTERS.

Persons with other physical disabilities that use "speech boxes" (not speech synthesizers, different animal entirely) find each of the (numerous!) buttons on the control panel bordered by "walls" (i.e., each button effectively sits at the bottom of its own 0.25" deep well). Again, a MECHANICAL FILTER that guides the user's finger tip to a specific point on the panel.

Similar techniques have been applied to allow the use of gestural tablets (like PDA's) for this type of user.

So, your statement that this is "not a job for a filter" seems to be in conflict with current practice. :< If mechanical filters are effective, then digital ones should be equally so. And, digital filters have the advantage of being adaptable to the user, the technology and the application. Mechanical ones much less so (in practice).

["Assume the digitizing device provides 'accurate' data. So, the purpose of the filter is not to compensate for hardware problems but, rather, the user's capabilities and/or limitations."]

I said that the device itself provides accurate data. I.e., there is no "device generated noise" added to the "signal" that is ultimately processed. Any "noise" in the signal is actually *signal* itself (i.e., the result of problems in the user, not the pointing device or the electronics that process that information!).

I deliberately wanted to avoid issues inherent in a particular device technology (e.g., a resistive tablet requires digitizing analog signals to report the X,Y position of the stylus. Noise present on those signals could -- with bad conditioning -- look like errors in the "position data"). For the purpose of this discussion, I wanted to address only "real data".

OTOH, the paragraph that you cited above is intended to point out that different devices have different *usage* "problems". E.g., Make a trackball out of a bowling ball (it's actually an amusing exercise!) and you will discover how severely inertia affects your abilities to make accurate, high speed motions!

I suspect, without actively thinking about it, that most folks haven't considered the difference between pen and mouse usability. At least not in the deliberate terms that I expressed it here (e.g., motion in X vs. Y). OTOH, if you play with a variety of pointing devices, you quickly get an idea for which "feel right" in a given application (try writing your name with a mouse). Yet, you probably never take that a step further to figure out *why* a device "feels wrong" or "feels right" in that context.

As I said in another post, it seems most folks don't have much experience with these sorts of issues. I am hoping that I can find someone who has addressed similar issues in a different application space who might be able to relate the techniques used (e.g., micromanipulators, surgical robots, VR technology, etc.). If not, I'll find a way on my own (that's why they call it "engineering"!) An approach that I am currently following seems to have lots of promise. And, I can, at least, document *why* the various decisions were made instead of just glossing over this (big) aspect of the design.

Thanks for your comments! I hope I've clarified any misunderstandings in my original post.

Reply to
D Yuniskis

Well, then you picked rather a bad word for it, because "pointing device" has been the widely accepted name for the class of devices best described as "a computer mouse or anything else to be used instead of one" for decades.

If there isn't, there's no valid reason left to refer to this as a "pointing" device. You don't just point, you point _at_ something. Without feedback, you can't know what you're pointing at, so the device can't be a pointing device. It might still be a position acquisition device, though.

That's because they can't. You can't point at something if you have no way of knowing where it is.

Without the display, it would be quite impossible to learn to use such a device. Blind people can't learn to write with a pen on paper, and no amount of filtering their pen's movements will change that one bit.

And without seeing what he gestured, he stands not the slightest chance of ever finding out what was wrong, much less correcting the mistake.

Reply to
Hans-Bernhard Bröker

But they can use sound guides. It's painfully slow, but it gets the job done.

No, but the blind can learn to spell and to read and type braille.

Wrong. Most blind people still learn finger spelling, which is done on the palm of the reader's hand. Though I haven't seen it done, I can easily envision using character/stroke recognition with a touch pad to allow free form finger spelling input. The computer can echo back the input in braille, in speech or both.

George

Reply to
George Neuner

(This is the Curse of USENET: Reading a post and seeing one small point that doesn't seem to fit... and feeling the Need To Correct It.)

If you limit yourself to a simple eye-effector model, this is true, but humans also have touch, and (if I recall the correct term) a "kinesthetic" sense.

Touch typing involves (or so I imagine, since I've never been very good at it) creating reflexive arm, forearm, wrist, and finger movements based on visual feedback from the letters/characters appearing on paper (or a screen). Further, learning to _avoid_ looking at the keyboard is (as I was repeatedly told in the 7th grade) critical to learning touch-typing, so in a way it depends on "blindness" as far as the keyboard and fingers are concerned. But, even if your eyes are perfect, I don't see any way you can "think" (say) "a" and have your "fingers" (shorthand for all the above) move to even properly-sequenced positions without touch and this kinesthetic sense.

With my eyes shut, I can move my mouse in a circle at least as well as I (currently) can with my eyes open and tracking a mouse cursor. I'll believe that there are people who can make better circles under both sets of circumstances (I certainly hope so), but I don't think that that invalidates my argument.

Think about a blind person learning to touch-type on a computer. The visual feedback may be missing, but I suspect he could learn through sound: hitting "a" causes "aaaaaayyy" to come out of the computer speakers. Or -- perhaps better -- he hears a brief 440Hz tone, since it won't slow his typing as much.

After learning the basic (not BASIC) keyboard layout, he starts up a better application ("Much, MUCH better than new!") with a Text-To-Speech engine that translates his typing into words. (Come to think of it, a good typist might prefer the tone feedback on the basis of speed, and leave TTS for later proofreading.)

Agreed, a keyboard is a quantized pointing device (and I've used up my Pedantry allowance, so I won't... er, "point" out that a mouse's position is also quantized). But I think that if one were to use a draw/paint package in "freehand" mode, a sighted person could learn a wide variety of mutually-distinguishable gesture-patterns by drawing with one's eyes closed, then looking at the results afterward. Yes, it's cheating (a little), but the real point is that learning _can_ take place under these circumstances.

Now, it's true that a sightless person would need theremin-like feedback to use a pointing device to actually _point_, but that's not the same as saying a sightless person can't use it to record gestures. He just needs non-visual feedback.

Frank McKenney

-- The vice of the modern notion of mental progress is that it is always concerned with the breaking of bonds, the effacing of boundaries, the casting away of dogmas. But if there is such a thing as mental growth, it must mean the growth into more and more definite dogmas. The human brain is a machine for coming to conclusions; if it cannot come to conclusions it is rusty. -- G.K. Chesterton: Concluding Remarks on the Importance of Orthodoxy (1905)

-- Frank McKenney, McKenney Associates Richmond, Virginia / (804) 320-4887 Munged E-mail: frank uscore mckenney ayut mined spring dawt cahm (y'all)

Reply to
Frnak McKenney

You seem to be stuck with this particular notion of "pointing device". See the Wikipedia "Pointing device" article (quoted below). Contrast "pointing device" with "motion controller", for example.

Mice are pointing devices. Not all pointing devices are mice. The Public tends to think mice are the only means of interacting with a computer. Few of them would even recognize the term "pointing device". Even though a good number of them are using touchpads (GlidePoint) and TrackPoints (on their laptops), trackballs, electric whiteboards, etc. Likewise, most folks think of aspirin and other "solid dose forms" of pharmaceuticals as "pills" (they're not; they are *tablets*).

Please note that I made a very deliberate point of describing "pointing devices" as a class of devices and enumerated the various types of devices found in that class. Anticipating confusion on this subject, I deliberately included pointing devices that are *not* "mice" by any stretch of the imagination: "... digitizing tablets, "nibs" (trackpoint), touchpads, ...". And, I did this right at the start of my original post to try to head off any confusion in the balance of the question.

Note that tablets have been around for at least 20 years as I was using a Summasketch MM1201 (?) under AutoCAD in the mid 1980's. At that point, the tablet was already introduced to the "consumer culture" -- not just high end, proprietary CAD systems (ComputerVision, etc.).

The nib (I guess its official (?) name is "TrackPoint") has been around since about 1990. Given that the "IBM PC" wasn't released until 1981 (?), I would think it fair to say that both of these "pointing devices" have been around roughly as long as the ubiquitous "mouse".

Touchpads are at least 10 years old -- as the one in my hand has a manufactured date of 1996. I'm not sure how much further back they originated.

Some devices predate the PC (I was using a different sort of tablet on a GenRad system in the late 1970's -- and a lightpen before that!). Other devices are considerably newer (SAW touchpanels, electric whiteboards, etc.).

I'm sure you can find histories of each of the devices that I listed in my original post somewhere online. Possibly even reproductions of advertisements in hobbyist magazines!

No. You might cling to this metaphor but a pointing device is just something that enters spatial data. There needn't be anything "special" about those points. In fact, a mouse specifies only *relative* position information!

Remember, the *mouse* doesn't point *at* anything! Instead, relative position data from the mouse is used to control the position of a *cursor* that is displayed on a screen. In the model that you've adopted, the *cursor* points AT something -- but the mouse DOESN'T! E.g., my mouse is currently dangling off the edge of the desk by its cord. What is *it* pointing *at*??

As an extreme example, I can take my gyroMOUSE into the back *yard* (wireless device -- 100' range) and still interact with my computer even though I can't see the screen! I can be facing WEST (the house is EAST of the back yard) so the gyroMOUSE isn't even pointing at the *house* while I am using it.

In case you can't envision such an application, imagine using that device to control the media player application running on my computer: advance to next song, return to beginning of current title, increase volume, etc.

Furthermore, I can design a user interface that does NOT implement the cursor that you keep thinking of and STILL use a mouse to interact with that system. E.g., I could use a gestural interface to convey information (from the "pointing device") to the system. You never need LOOK at the mouse and the mouse never directly manipulates anything "on the screen" that you could use for eye-hand feedback. (this is one mode of operation for the gyromouse; flick the mouse up, shake it side to side, etc.)

I.e., your concept of "at" drops out of the equation entirely!

From the wikipedia article:

"A pointing device is an input interface (specifically a human interface device) that allows a user to input spatial (ie, continuous and multi-dimensional) data to a computer." (gee, sounds like what you are now calling a "position acquisition device"!)

You might enjoy reading that article cited earlier in this reply. It lists many devices along with characteristics of each of them. I'm sure if you think about them individually for a while, you can see that none of them *have* to "point AT" something to be used and useful.

You're stuck in your model of how a pointing device works. If the *system* to which the pointing device is connected forces the user to point AT things, then it is the *system* that is placing this requirement on the user -- NOT the "pointing device".

E.g., in Windows, you point AT things. Note that if the mouse is not detected, you can still *use* Windows -- by a variety of TAB, Alt-Tab, etc. keystrokes to move the FOCUS around the screen. (the pointing device, under windows, is simply a means of more efficiently moving that focus!).

But, just because some (many?) GUI's use the focus model, doesn't mean that all *must*! Yet, pointing devices (including mice!) can still be used in those other approaches.

I'm sure you don't REALLY mean that literally! Of course blind people can write with pen and paper! Lack of vision doesn't prevent their hands from holding pens nor their arms from moving those pens along trajectories that describe glyphs appropriate to the alphabet in question.

Most blind folks don't bother to write (in this way), much. It is considerably more difficult for them to read what they may have written than it would be for them to take advantage of a Brailler or Slate. But, a blind person could easily "leave a note" for a sighted person to read -- assuming the sighted person was unable to read Braille. (e.g., how would a blind person leave a note for the milk man, UPS delivery man, letter carrier, etc. otherwise? Surely these people don't all learn Braille just to cater to one person on their "delivery route"?)

Close your eyes. Get a pencil and paper. Draw a circle with an X in it with your eyes closed. Wow! Amazing! I suspect even blind children are aware of what a circle "looks like" and, given writing instruments, could reproduce a passable example of same on their first attempt. Ditto for an 'X'. Two letters of the alphabet already learned! :>

I am, perhaps, biased since I spent a good deal of time working with blind and visually impaired individuals. They would take great offense at your characterization! :<

(sigh) I think you are also operating under a limited model of what constitutes a "gesture". Granted, the term is heavily "overloaded" (in the sense of C++). But, a gesture need not be a "drawn symbol" as one might encounter when interacting with a PDA. Rather, it can be a "time varying set of spatial position data" -- of which the PDA is a single case (I believe most PDA recognizers exploit this definition though, technically, gesture recognition can also be done as a _static_ operation -- by examining the resulting "ink trail" without regard for the time sequence in which the "ink" was deposited (NB, here, "ink" is a virtual term, not a literal term)). Dynamic gesture recognition usually uses fewer overall resources than static recognizers and also (can) produce lower recognition latencies.

Gestures can also be *hand* gestures made "in space". An instrumented "data glove" can watch the motions of the individual digits and present them to a computer for analysis. A camera could do likewise. Note that these gestures may be static (e.g., recognizing single ASL "characters") or dynamic (e.g., recognizing the wave of a hand, two hands coming together in a "clap", etc.).

To specifically address your assertion that "without seeing what he gestured, he stands not the slightest chance of ever finding out what was wrong, much less correcting the mistake", consider a hypothetical system:

Gesture1: vertical motion, top to bottom
Gesture2: horizontal motion, left to right
Gesture3: Gesture1, inverted
Gesture4: Gesture2, reversed

Assume that the device "recognizing" these gestures has four associated "actions" (responses), one per "command". For simplicity, assume the device beeps N [0..4] times, where N is the index of the gesture recognized (i.e., 0 indicating "nothing recognized").
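
(For the skeptical: a recognizer for this four-gesture set is nearly trivial. A sketch, classifying a completed stroke by its net displacement; the threshold is invented:)

    /* Classify a completed stroke into gestures 1..4 (0 = nothing
       recognized) by net displacement.  Convention: y grows
       downward.  MIN_TRAVEL is an invented threshold to reject
       jitter. */
    #include <math.h>

    #define MIN_TRAVEL 10.0     /* arbitrary units */

    int classify(double sx, double sy, double ex, double ey)
    {
        double dx = ex - sx, dy = ey - sy;

        if (fabs(dy) >= fabs(dx)) {             /* mostly vertical   */
            if (dy >  MIN_TRAVEL) return 1;     /* top to bottom     */
            if (dy < -MIN_TRAVEL) return 3;     /* bottom to top     */
        } else {                                /* mostly horizontal */
            if (dx >  MIN_TRAVEL) return 2;     /* left to right     */
            if (dx < -MIN_TRAVEL) return 4;     /* right to left     */
        }
        return 0;
    }

Beep the return value and the user has all the feedback the scheme requires.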

Now, assume the user makes *some* gesture. As a result, the device beeps twice. WITHOUT SEEING WHAT HE HAS DONE *NOR* WHAT THE MACHINE "SAW" (i.e., his gestures aren't traced out on a display that he can "watch" while making them), the user hears 2 beeps.

[Note that I haven't even described the APPARATUS used to convey the gesture to the machine! It could be a pointing device, motion controller, camera, etc.]

"Gee, machine thinks I made a left-to-right gesture. But, I know I was trying to make a top-to-bottom gesture. Having never had the gift of vision in my entire life, BUT STILL ABLE TO UNDERSTAND CONCEPTS LIKE UP, DOWN, LEFT, RIGHT and notions like ROTATED, TILTED, etc., I have to deduce that the machine's idea of up/left and my idea of up/left must be skewed. Let me try again..."

Note that this simple system is enough to remotely control a television! In that case, the actions that the device performs might be "volume down", "next channel", "volume up" and "previous channel", respectively. No "beeps" involved!
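
(In code terms, swapping the beeps for actions is a four-way switch on the recognizer's result. Sketch only; the action hooks are hypothetical stubs:)

    /* Hypothetical action hooks -- stubs for illustration. */
    static void volume_down(void)      { /* ... */ }
    static void next_channel(void)     { /* ... */ }
    static void volume_up(void)        { /* ... */ }
    static void previous_channel(void) { /* ... */ }

    /* Map the gesture index from classify() to a TV action. */
    void dispatch(int gesture)
    {
        switch (gesture) {
        case 1: volume_down();      break;  /* top to bottom  */
        case 2: next_channel();     break;  /* left to right  */
        case 3: volume_up();        break;  /* bottom to top  */
        case 4: previous_channel(); break;  /* right to left  */
        default: break;                     /* unrecognized   */
        }
    }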

The challenge in designing gesture interfaces is coming up with gestures that are intuitive (as I think the TV example would be -- think about it) as well as unambiguous (especially to the recognizer) and efficient to process (large gesture sets require enormous processing power compared to something trivial like "moving a cursor to track a mouse's movements").

OTOH, they are highly portable, often position and size invariant, and very efficient. If you ever have opportunity to use a purely gesture-based system, give it a try! You'll be amazed at how quickly you can adapt to its paradigm. And, will, thereafter, curse the encumbrance that the traditional "focus-based" (point-and-click) systems impose on your work.

Reply to
D Yuniskis

The problem lies in Hans' introduction of the preposition "at". It implies that there is something that you must *see* -- or otherwise have feedback (directly) from.

As I stated in another reply, the *mouse* doesn't point AT *anything*! The mouse's data is used to update the position of a "cursor" -- which, in turn, *could* point AT something (on the same surface on which it is displayed). The mouse simply provides (relative) position data that the system has decided will be used to *adjust* the position of the cursor.

Yes, but this avoids the point. A blind person CAN learn to write. They (with current technology) can't *read* what they have written but that doesn't mean that they can't write for *others* to read. (most *affordable* OCR technology, currently, only supports more rigidly defined glyphs -- typewritten or typeset, etc.)

I have blind acquaintances who regularly use typewriters (no, not Braillers, but conventional typewriters). Just because they can't see the keys, doesn't mean they can't learn their positions on the keyboard and how to depress them! (in fact, when taught "touch typing" you are instructed NOT to look at the keys but to act solely from memory of the layout). Current technology *does* allow blind users to read their *typewritten* work. But, if that work is solely for their own consumption, it is more efficient for them to use a Brailler or a slate to leave themselves notes. (of course, if that "typewriter" is also a "computer", then reading their own work is trivial!)

OTOH, if they have to interface with sighted office staff, the typewriter (or, its modern equivalent -- the computer!) makes more sense (since very few sighted folks can read even Grade 1 Braille -- let alone Grade 2 commonly used by The Blind).

Again, I think Hans has preconceived specific notions of what constitutes a "gesture". Have you never seen a blind man use "digitus impudicus"? I'm willing to bet that he was 100% sure of what he was doing at the time *and* that passersby ALSO were able to recognize his intent! :>

Gestures are typically the time varying output of a pointing device. That pointing device could be a mouse (though usually it isn't, since it is a poor device for the full range of motion that gestures often like to exploit), gyromouse (operating as a motion controller, of sorts), touch pad, tablet/stylus, "hands-in-front-of-a-camera", instrumented glove, etc.

Gestural interfaces are much richer than "mouse and cursor" interfaces -- and considerably more portable as well as intuitive! Gestures, however, take a lot more effort to "recognize" algorithmically (depending on the richness of the gesture set). An examination of the current state of the art shows that most gesture recognition algorithms are pretty much "ad hoc" techniques that *seem* to work -- as long as you don't look too closely! :> (the PDA is probably the most ubiquitous example of a gestural interface. If you have used one for any length of time, you realize that *it* trains *you* -- not the other way around! :-/)

Almost all rely on mechanisms that let the user "fix" (undo) actions that were initiated as a result of misrecognized gestures. This is to compensate for the high frequency of misrecognized gestures. This places a heavy burden on the rest of the system design.

Imagine if every keystroke on a keyboard had to be completely "undo-able" in case the keyboard had transmitted the wrong keycode for the key you had depressed. Think about it. Not just a trivial "typing application". Imagine where each key *did* something (changed an item's color, shape, size, etc.; moved an actuator through a displacement; drilled a hole in a piece of metal -- oops!). Designing a system where you can (must!) undo *any* action is considerably harder than ones where you can't undo any -- or, can only undo certain "chosen" actions.
Reply to
D Yuniskis
[Hmmm... "Frnak" -- intentional? Or "keyboard error"?]

No, the Curse of USENET is having to spend a lot of time arguing over semantics of your initial *question* instead of actually addressing the question itself! ;-)

He also has other ways of understanding/deducing what the "machine thought" of his gesture -- i.e., by evaluating what the machine *did* as a result of it (and knowing the range of actions that the machine is *supposed* to perform for each type of gesture)

I think the prohibition on "looking at the keys" is intended to force (young) learners to remember where things are and what those keys "feel like" (e.g., the 'A' really sucks! :> ). Its real value is as a productivity enhancement -- looking at the keys means you look AWAY from whatever it is you are "transcribing". When you consider many people (at least those of us old enough to remember manual typewriters!) were exposed to typewriters when typewriters were "secretarial tools" -- you were transcribing something that was written/typed+corrected on a sheet propped up next to the typewriter.

E.g., *composing* at the typewriter was a far less frequent activity. If you think about it, whenever an author is portrayed composing a manuscript on a typewriter, he *tends* to watch the keyboard and/or the "output" from the typewriter. Had he adhered to these teachings, one would imagine him staring off into space while his fingers hammered away at the keys!

I can't "touch type". But, I almost never am in a position where I must "transcribe" something. Most of my keyboard activity involves composition or revision. So, I can freely watch the keys without incurring any "performance penalty".

Consider how a blind person uses a Brailler. They can't *see* what they have "typed"! Instead, they leave their hands on the keys and think about what key COMBINATIONS (up to 6 keys) they must press for the next Braille symbol. Sure, they could stop, "expose" that portion of the paper that they are working on and "read" (feel) what they have written, but that is a HUGE performance penalty -- much more so than just "looking" at what you've typed!

Also keep in mind that (Grade 2) Braille is not a literal one-to-one correspondence between "English letters" and "Braille symbols". There are many "abbreviations" used in Braille to economize on paper (!). [A "standard" line of Braille is only about 40 cells] So, (some) common words like "the" may take just a single cell (plus a cell following it for the "space", punctuation mark, etc.) and "comment" will take *two*. Yet, others -- like "add" -- show no economies (three symbols, in that case).

OTOH, some things carry extra costs. For example, a double dash (which I tend to exploit! :> ) is four cells. Numbers must be prefaced with the "number sign" (not to be confused with '#'). Uppercase letters must be prefaced with the "capital sign".

On top of all of this, keep in mind that a blind user must learn Braille in its "normal orientation" (i.e., the way a sighted person would perceive it when "looking at a page of Braille text") AS WELL AS ITS REFLECTED IMAGE! The latter is a consequence of how a user interacts with a Braille *slate* -- the individual Braille "dots" are "punched" into the card FROM THE REVERSE SIDE (so they "stand up" when "read" from the front). So, the user has to learn to "write" right to left and "read" left to right! :-/

Imagine having to learn to write *code* like this! :-/ (note that, in regular Braille, there is no way to differentiate between {} and () (or []). So, the writer has to be keenly aware of context to make sense of:

if (foo[i] == 'x') { exit(); }

In fact, '(' and ')' are indistinguishable in Braille (though '[' and ']' are unambiguous). Think about how you could use these various "delimiters" to form a mathematical expression (in BASIC! :> ) and what sorts of ambiguity might ensue! :>

[NB there *is* a "computer Braille" with different codings]

Exactly. Once you have a name for that "figure" (circle -- 'O'), you could reproduce it at will. And, could "visualize" that figure in descriptions of other figures of which it is a component (e.g., "two circles, touching, located one above the other" -- '8')

I think the latter is probably true. Consider how many people find MS's silly "paperclip" annoying. And, how distracting "on-line spell-checking" can be when composing a document. These things interrupt your work flow. It seems easier to go back and "fix" things in "batch mode".

The last point is the important one. He needs *some* form of feedback *if* the interface requires pointing AT things (e.g., the Windows GUI). But, that feedback can come in a variety of forms using other senses. [I am always amazed at how Helen Keller was able to learn as much as she did with such profound sensory deprivation!]

E.g., watch a blind person eat a meal. Their motions might be *slightly* more tentative than that of a sighted person. But, you don't end up with the mess that a two-year old would make -- despite the fact that they can't *see* (or HEAR!) "where" the food is! Sure, they could possibly hold the plate to their nose and try to locate items by *smell*. But, that's impractical in a "public" setting.

Instead, they figure out where things are on their plate. This can be done in a variety of ways:

- a sighted companion (or the waiter) can tell them the relative locations (peas at 3 o'clock, steak at 6)

- they can discreetly "touch" the plate's contents as they are accepting/repositioning the plate in front of them (e.g., let the index fingers spill over *into* the plate to test the texture of the items located adjacent to each finger)

- they can probe the plate's contents with their eating utensils (steak is hard, mashed potatoes are soft, etc.)

- they can *sample* something on the plate and, from its taste, determine what it is

They *remember* what they ordered and they remember where they discover items on the plate. If I told you that your steak was at 6 o'clock, you could easily locate it and slice it into bite-sized pieces with your eyes closed. And, return to that location repeatedly throughout the course of the meal. Furthermore, you could rotate your plate (to make the peas more accessible, for example) and *still* know how to find the steak in its new location!

Memory seems to be a big part of a blind user's interaction with the world around him. No doubt because he can't "refresh" that memory with "just a casual glance" as a sighted user might. :-(

Reply to
D Yuniskis
