Keyboard / Mouse Input Device Design??

- James Waldby

Wed, Nov 1, 2006 4:32 AM

...

The C language Xlib "SendEvent" protocol request does that, on systems where X Window System is in use. For a Perl version see

formatting link

Jan is correct that not all Linux systems and applications use X. I suspect that more than 95% of end-user Unix and Linux systems use X, but I imagine that a large fraction of servers do not. E.g., the tens of thousands of machines in Google's server farms don't need displays. If you plan to address machines such as those, you'd need another set of system calls.

-jiw

- Peter Olcott

Wed, Nov 1, 2006 4:34 AM

No, it is not stochastic at all; the whole process is completely deterministic.

You must provide it with the means of knowing the precise pixel pattern of every Glyph that it must recognize. This is typically done by specifying a FontInstance: (a) Font Typeface Name, (b) Point Size or PixelHeight, (c) Style (Bold, Italic, Underline, and StrikeOut), and (d) Foreground Color and Background Color.

It can process many different FontInstances simultaneously. This part of the system is operational and fully tested. It can provide 100% accuracy on any FontInstance that is not inherently ambiguous. The default FontInstance for much of MS Windows, Tahoma 8 point, is processed with 100% accuracy. Simple heuristics can be applied to get very close to 100% accuracy on most FontInstances.
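As a rough illustration of why exact pixel matching is deterministic, here is a minimal Python sketch. It is not the DFA described in this thread; it swaps in a plain dictionary lookup, and the names `FontInstance`, `GlyphTable`, and the tiny bitmaps are assumptions made up for the example. A pixel pattern shared by two characters is treated as "inherently ambiguous" and not recognized.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple

# A glyph bitmap as a tuple of row strings: '#' is foreground, '.' is background.
Bitmap = Tuple[str, ...]


@dataclass(frozen=True)
class FontInstance:
    """Hypothetical record of the four parameters listed above."""
    typeface: str
    point_size: int
    style: frozenset          # e.g. frozenset({"Bold"})
    foreground: str
    background: str


class GlyphTable:
    """Exact-match lookup from pixel pattern to character for one FontInstance."""

    def __init__(self, font: FontInstance, glyphs: Dict[str, Bitmap]) -> None:
        self.font = font
        self._by_pixels: Dict[Bitmap, str] = {}
        self.ambiguous: Set[Bitmap] = set()
        for char, bitmap in glyphs.items():
            if bitmap in self._by_pixels:
                # Two characters share one pixel pattern: inherently ambiguous.
                self.ambiguous.add(bitmap)
            self._by_pixels[bitmap] = char

    def recognize(self, bitmap: Bitmap) -> Optional[str]:
        """Return the character for this exact pattern, or None if unknown or ambiguous."""
        if bitmap in self.ambiguous:
            return None
        return self._by_pixels.get(bitmap)
```

The same input always produces the same answer, and the only failures are patterns that are absent from the table or shared by more than one character.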

- Peter Olcott

Wed, Nov 1, 2006 4:45 AM

Actually, it could be set up to process all font families currently known. The simplest way to do this would be to build the DFA for the lower-case vowels of every FontInstance in black on white. The text would then have to be transformed to black on white. It could then quickly determine the correct FontInstance on its own and load the appropriate full DFA(s). This assumes machine-generated text that is not dithered or anti-aliased. With dithering, the problem of transforming the text to black on white becomes more complex, yet still feasible.
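The two steps above (normalize to black on white, then probe with vowel glyphs) might look roughly like the following Python sketch. It uses set membership rather than a DFA, assumes text that is neither dithered nor anti-aliased, and all names (`to_black_on_white`, `guess_font`, the font labels) are hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple

Pixel = Tuple[int, int, int]     # RGB
Bitmap = Tuple[str, ...]         # rows of '#' (ink) and '.' (background)


def to_black_on_white(rows: Sequence[Sequence[Pixel]], background: Pixel) -> Bitmap:
    """Map background-colored pixels to '.' and every other pixel to '#'."""
    return tuple(
        "".join("." if pixel == background else "#" for pixel in row)
        for row in rows
    )


def guess_font(observed: List[Bitmap], vowel_sets: Dict[str, Set[Bitmap]]) -> List[str]:
    """Return the candidate fonts whose vowel glyphs match the most observed glyphs.

    An empty list means no candidate matched anything; more than one name
    means the vowels alone did not separate the candidates.
    """
    scores = {
        name: sum(1 for glyph in observed if glyph in vowels)
        for name, vowels in vowel_sets.items()
    }
    best = max(scores.values(), default=0)
    if best == 0:
        return []
    return sorted(name for name, score in scores.items() if score == best)
```

Once a single candidate remains, the full glyph set for that FontInstance would be loaded for the actual recognition pass.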

- Le Chaud Lapin

Wed, Nov 1, 2006 5:05 AM

With those parameters, it is indeed possible to find matches. How could you not? If your software runs on the same computer as the windows that it is monitoring, then certainly if you render a piece of text using the parameters that match what is displayed, you will have an exact match, even with effects of anti-aliasing, transformations, etc.

However, I should point out again: given that the user of your software has to specify these parameters anyway, and given that text that was not generated by the underlying font system will not, in general, be recognized by your software, it remains that the most important elements of recognition are pieces of text that are generated by the GUI system.

But it is possible to intercept _all_ rendering of such text through well-defined APIs. In other words, if I were interested in knowing whether there were a window that had the word "JFET" in it, I would have two options:

1. Use your system and enter the above information.
2. Use my hypothetical system, and just specify "JFET".

Do you see? By interposition into the GUI subsystem, it becomes far easier to describe what you are looking for. Font face, point size, styling, and color become irrelevant if they don't matter to you.

There is something else that is important. With your system, it seems that you are taking snapshots. The problem with snapshots is that there is a chance you will miss something, unless you are planning to bump up the rate of frame-grabbing so high that you miss nothing. With my hypothetical system, there would never be a need to take a snapshot. You'd always know the state of the system.
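To make the snapshot concern concrete, here is a small Python sketch under a simplified model: snapshots are taken at times 0, `interval`, 2 × `interval`, and so on, and a window is caught only if one of those instants falls while it is visible. The function name and the numbers in the usage lines are illustrative assumptions.

```python
import math


def snapshot_catches(appears_at: float, visible_for: float, interval: float) -> bool:
    """True if a snapshot taken at 0, interval, 2*interval, ... falls inside
    the half-open window [appears_at, appears_at + visible_for)."""
    # Time of the first snapshot taken at or after the window appears.
    first_snapshot_after = math.ceil(appears_at / interval) * interval
    return first_snapshot_after < appears_at + visible_for


# A window shown at t = 0.5 s for 3 s, grabbed every 5 s and every 1 s.
print(snapshot_catches(0.5, 3.0, 5.0))   # the 5 s grab lands after the window closed
print(snapshot_catches(0.5, 3.0, 1.0))   # the 1 s grab lands inside the window
```

In this model a grab interval no longer than the time the window stays visible always catches it; anything slower depends on luck with the timing.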

-Le Chaud Lapin-

- Le Chaud Lapin

Wed, Nov 1, 2006 5:16 AM

OK, I see what you are doing now. I hate to rain on anyone's parade, especially one where the objective is ambitious, but you should know that the ultimate result of what you are doing could be achieved in a way that is probably superior in many respects to the image-based method.

One example is simple. Let's say that a programmer wants to use your software to know whenever the string "You Have Mail" appears anywhere on the screen, knowing that there is a mail application that pops up a window with this message. He specifies the font family, point size, style, and background/foreground colors of the little window that contains this message. To get this information, he spends 10 minutes repeatedly sending mail messages to himself to force the window to pop up, and when it does, he eyeballs the message to ascertain the parameters. Finally he goes to your software and enters arguments for these parameters. Then he tells your software to run, and specifies a rate-of-grab of frame buffers so that the window, which pops up for only three seconds, is not missed.

Compare that to not having to force anything to pop up or eyeball anything: simply typing in "you have mail", checking a case-insensitive box, and being done with it. No rate-of-grab would be necessary because there would be no frame grabbing. The monitoring software would simply "know" the state of the entire GUI system at any point in time.
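The interception idea can be sketched in a few lines of Python. This is not a real GUI API: `TextInterceptor` and its `draw_text` method are hypothetical stand-ins for a hook placed in front of a system's draw-text primitive, which on a real platform would forward the call to the actual renderer.

```python
from typing import Callable, List, Tuple

Callback = Callable[[str, str], None]


class TextInterceptor:
    """Hypothetical hook sitting in front of a GUI's draw-text primitive."""

    def __init__(self) -> None:
        self._watches: List[Tuple[str, bool, Callback]] = []

    def watch(self, needle: str, callback: Callback, ignore_case: bool = True) -> None:
        """Ask to be told whenever rendered text contains `needle`."""
        self._watches.append((needle, ignore_case, callback))

    def draw_text(self, window: str, text: str) -> None:
        """Called for every text-rendering request, before any pixels exist."""
        for needle, ignore_case, callback in self._watches:
            if ignore_case:
                found = needle.lower() in text.lower()
            else:
                found = needle in text
            if found:
                callback(window, text)
        # A real hook would now forward the request to the actual renderer.


hits: List[Tuple[str, str]] = []
hook = TextInterceptor()
hook.watch("you have mail", lambda window, text: hits.append((window, text)))
hook.draw_text("mail-popup", "You Have Mail")
hook.draw_text("editor", "Hello")
print(hits)
```

No font parameters and no grab rate appear anywhere; the match happens on the string itself at the moment it is drawn.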

Certainly you will agree that, if this is what your software does, the latter method has significant advantages?

-Le Chaud Lapin-

- Jan Panteltje

Wed, Nov 1, 2006 11:21 AM

On a sunny day (Tue, 31 Oct 2006 19:24:17 -0600) it happened "Peter Olcott" wrote:

I am sure you can, but because of the large amount of stuff that potentially _can_ run, X11 (own drivers), text console (own drivers), vgalib, I think you'd first have to find out what is running, and how, before you can access any display buffer[s]. There may be more than one graphics card too :-)

- Peter Olcott

Wed, Nov 1, 2006 12:26 PM

My system is the only possible way that is inherently compatible with every system, platform, and application. There are many cases where the required information is unavailable from the system internals. My system handles all of those cases. Now that we have dual-core machines, it is possible, using a DFA, to process many screens very quickly. I expect that my system could even play and win fast-paced video games.

- Peter Olcott

Wed, Nov 1, 2006 12:32 PM

And significant disadvantages, for example a false positive match. For something as simple as that, my system might be able to process as many as 100 frames per second. In fact that may be the biggest problem with the approach you are proposing over my method. Another problem is that there are times when this "text" message is displayed using a bitmap, rather than text itself.

- Roger Hamlett

Wed, Nov 1, 2006 2:05 PM

While I don't actually 'like' the keyboard interface approach being asked for, these devices are readily available, and so it should be easy to test how it all behaves. Look at the Hagstrom KE72

formatting link

or the PI Engineering X-Keys control board (the USB versions may well be the better long-term solution)

formatting link

The Vetra Systems VIP module might also be worth a look (it allows direct RS232 input to the keyboard port)

formatting link

Best Wishes

- Le Chaud Lapin

Wed, Nov 1, 2006 5:14 PM

The method I was suggesting could not generate a false positive for text that is not regarded as simply an image. The reason is that the very objective of your software, to determine what text is being rendered, is actually accomplished before the text even hits the screen. If there is any program anywhere on the computer that tries to display "MOSFET" using any DRAW-TEXT primitive in the system, my method would catch it. So in fact, I would get a 100% hit rate on text that is normally rendered by the system.

For text where the programmer first converted it to an image and told the GUI subsystem to render it, my method would fail with OCR. But then, the problem reverts to OCR anyway.

Now consider: we do not have an exhaustive list of the fonts to be used, and your method would need one to approach a hit rate of 90% without help from the user. Of course, if the user tells you what the font face is, and all of the other parameters, then yes, your software would approach 100%.

However, as mentioned, my gut feeling is that "in-line-interception-of-text" is superior to "snapshot-of-graphics". One has to imagine the headache vs. % effectiveness of using each model.

Which would you rather have? A 100% hit rate on 95% (perhaps) of the situations by simply declaring what text needs to be sought, or a 98%+ hit rate on 98% of the situations with painstaking determination of font face, pitch, and foreground and background colors each time, not to mention the possibility that you will miss an "easy" true positive because you're taking snapshots?
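For reference, the two estimates above multiply out as follows. This is only the arithmetic on the figures quoted in the question (both are rough guesses, and "98%+" is a lower bound); it says nothing about setup effort or snapshots that miss a short-lived window. The function name is made up for the example.

```python
from fractions import Fraction


def overall_hit_rate(hit_rate_percent: int, coverage_percent: int) -> Fraction:
    """Fraction of all situations handled: hit rate times share of situations covered."""
    return Fraction(hit_rate_percent, 100) * Fraction(coverage_percent, 100)


interception = overall_hit_rate(100, 95)   # 100% of 95% of situations
snapshot = overall_hit_rate(98, 98)        # 98% of 98% of situations

print(float(interception))   # 0.95
print(float(snapshot))       # 0.9604
```

On these numbers the raw coverage is nearly identical, so the comparison comes down to the per-use effort and the risk of missed snapshots.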

-Le Chaud Lapin-

- Peter Olcott

Wed, Nov 1, 2006 11:18 PM

Including all the cases where the string "MOSFET" was not placed in the right place in the right window to be the correct trigger event for the required action.

So my system can ALWAYS work, whereas yours only works some of the time. SeeScreen is inherently compatible with every system, platform and application.

The next best alternative is a hodgepodge of many different complex technologies, limited to simulating user actions on far fewer applications and operating system platforms.

I already posted one of several ways that my system can determine with certainty the exact set of FontInstances.