OT, but may be of interest: Speech recognition

Speech recognition seems to be one of the things that is always just 5 years from being accepted. Back in about 1998, I convinced the company to buy a copy of Dragon and they set it up for one of the supervisors who did not use a computer very much. And sure enough he also did not use the Dragon program.

And then in 2006 I bought a copy for my son, and he did not find it very helpful.

So here it is 2017 and the WSJ had an article where someone said that speech recognition had progressed from being 95% accurate to 99% accurate, and that made it worthwhile.

I have a grand niece who lives in Alaska, and I live in Delaware, so there is a 4-hour time difference. We do not do very well at connecting by phone. Email could be a good thing, but she never replies. I am not sure if it has to do with the speed at which she types. (It could be that she dislikes me, but I do not think so; we get along well when we do connect by phone.)

So I thought maybe a speech recognition program might be worth trying.

So do any of you use speech recognition? And if so, what program do you find worthwhile?

One thing I found is "Google Docs Voice Typing". I think all the heavy lifting is done in the cloud. That is, I think your voice is digitized essentially the same as VOIF and the recognition is not done on your own computer. I have not tried it so far. I need to reconfigure some things before it will work on my usual computer.

Dan

Reply to
dcaster

I use Google speech recognition in various programs, mostly running under ChromeOS on a Chromebook. I also use it in the Chrome browser under Windoze. Mostly, I use it for commands and searches, not for typing. I've had numerous experiences with the multitude of Dragon Naturally Speaking mutations and discovered a few things.

  1. People talk differently than they write. About the only thing that comes out correctly when dictating to a speech recognizer is speech writing.
  2. It takes less time to type badly and correct the obvious spelling and typo mistakes than to fix both the grammar and spelling errors produced by voice dictation, which can be obscured by bad guesses and will not be obvious to a spelling chequer.
  3. Doing computer dictation while watching the screen is much like playing piano while watching your fingers. Neither works well because the brain is being asked to do two things at the same time. Best to not look at the screen while dictating and then go back and clean up the mess. Untraining users with this bad habit has proven to be impossible.
  4. Any sufficiently advanced voice command structure is indistinguishable from floundering around the keyboard at random. Most of these commands and structures require far too much memorization to be useful.

Bottom line: Voice dictation and command are not a good replacement for learning to type and think.

Correct. You'll find that it doesn't work without an internet connection.

I think you mean VoIP (Voice over Internet Protocol).

Correct.

You should. Fire up Chrome browser, enable the voice recognition and OK Google feature. You may need to do the voice training procedure. For the Chrome browser on Windoze, OK Google doesn't work. You will need to click on the microphone icon in the search box.

Try the commands. Some of my favorites:

  "What time is it"
  "Convert 184 pounds to kilograms"
  "How far to Dominican Hospital"
  "How far to Paris France"
  "Weather 95005" (my zip code)
  "In French say I hate computers"
  "Show my Google search history"
  "When is the next meeting"
  "Take a picture"
  "Record a video"
  "What is 15 percent of 24 dollars" (tip calculator)
  "What is the phone number of Donald Trump"

More:

Hint: You'll need a cheat sheet, and the commands and syntax are changing/improving constantly.

Well, that's better than an unusual computer.

--
Jeff Liebermann     jeffl@cruzio.com 
150 Felker St #D    http://www.LearnByDestroying.com 
Reply to
Jeff Liebermann

Just like AI ("Any day, now!")

"Speech recognition" can cover a lot of different applications and expectations.

"Limited domain" recognizers are usually VERY accurate -- but of little interest to the population as a whole (outside of IVR systems provided by their banks, etc.)

Recognizers that require training can be difficult to use -- because the training phase is boring and tends to cause the user to provide "unnatural" voice samples (when speaking colloquially, there is usually a lot of variation in our speech patterns).

Unconstrained recognition among a variety of users is the holy grail and requires a fair bit of technology as well as "investment" on the part of the user (speaker). How patient would *you* be shepherding a recognizer through its mistakes? If you (naively) assume 99% accuracy means it screws up "1 in 100 words", then that's several words in each MINUTE of speech!
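To put rough numbers on that naive reading, here is a minimal sketch. The speaking rate of 150 words per minute is an assumption (it is not from this thread), and treating "accuracy" as an independent per-word error rate is exactly the naive model described above, not how any particular recognizer reports its figures.

```python
def expected_errors_per_minute(accuracy, words_per_minute=150):
    """Expected number of misrecognized words per minute of speech.

    Naive model: each word is wrong with probability (1 - accuracy).
    words_per_minute=150 is an assumed conversational rate.
    """
    return (1.0 - accuracy) * words_per_minute


for acc in (0.95, 0.99):
    errors = expected_errors_per_minute(acc)
    print(f"{acc:.0%} accurate: about {errors:.1f} wrong words per minute")
```

Change words_per_minute to match your own speaking rate; the point is only that even a small per-word error rate adds up quickly over a few minutes of dictation.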

Will you manually correct the text output? Invest the time to improve the recognition? Or, simply decide to email an analog recording of your speaking and let the other party "recognize" it?

I use it in SWMBO's vehicle (*rarely* as it is MORE distracting than pushing buttons -- even on a busy screen!). I also use it in my current project (where there is no alternative input mechanism).

In both of these cases, the domain is highly constrained. Neither ever has to worry that I might utter "aardvark" or even "grapefruit"! :> And, in the latter case, the system can ALWAYS be learning and refining its models.

I've used Dragon in the past but had limited success. Ideally, (IMO) you want to be able to close your eyes and talk. With Dragon (or the car example), you are in a perpetual state of apprehension -- watching as each word is recognized and prepared to "correct" it. It's the sort of frustration that you'd experience speaking to someone over a very long-distance satellite link (never knowing whether the other party is done speaking and if it is "safe" for you to start -- or, if there will be an "audio collision" and you'll both back away from it and try again).

My MD uses speech to summarize each office visit, issue followup orders, etc. But, his eyes are glued to the screen while speaking and the environment is relatively benign. Despite that, he is often "correcting" the machine. Consider: the machine KNOWS that it is "him"; he uses it extensively EVERY (work) DAY; and the domain is largely limited (he rarely talks about aardvarks *or* grapefruit) and known to the system designers ahead of time!

We had a friend with ALS that I tried to convince to use "voice mail" in lieu of having to spend hours hunt-and-pecking on a keyboard to compose an email (as her motor skills degraded). She resisted the suggestion, largely because she was losing the ability to speak, as well. And, she could put on a braver face hiding behind printed glyphs than her halting speech!

[Lousy way to die! :< ]

Yes, most of the "unconstrained" recognizers work by shipping the audio off to some remote server (e.g., phones, TVs, google, etc.). This makes some sense as it lets those folks throw gobs of resources at the problem instead of being limited to what you *might* have on your desktop.

It also lets them revise their algorithms "on the fly" as well as keeping them under their own control.

Of course, the downside is you've now disclosed the content of your conversation along with a unique biometric -- *your* voice characteristics. And, doubtless, there aren't any "rights" that you retain to the information that is extracted from that.

["Send your DNA sample to ancestry.com and we'll tell you where your ancestors originated... (we WON'T tell you of any OTHER uses we might make of it!)"]
Reply to
Don Y

There's an app in the Android Play Store called ListNote (package name: com.khymaera.android.listnotefree).

I found it to be very helpful. It's not perfect, but very useful for communication. I've seen it get the word wrong, but after more input, it went back and changed the word to fit the context.

If you expect EXACTLY correct everything, you probably will be disappointed. Don't expect it to do well with technical jargon or weird sentence structures. Or with atypical dialect/accent.

If you just want to communicate using natural language, I feel I could use it as-is without correcting anything. I didn't have to train it at all.

Reply to
mike

I have a few-year-old copy of Dragon Naturally Speaking, version 12. It's surprisingly good. But as Don Y mentions, you have to correct it constantly. "Their automobile..." may well come out "There automobile," or "They're automobile."
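That kind of homophone error is also why a spelling checker is no help in cleaning up dictation: every wrong guess is still a correctly spelled word. A minimal sketch of the idea, where the tiny word set is a toy stand-in for a real checker's dictionary and the function name is made up for illustration:

```python
# Toy dictionary standing in for a real spelling checker's word list (assumption).
KNOWN_WORDS = {"their", "there", "they're", "automobile", "is", "red"}


def misspelled(text):
    """Return the words in text that are not in the toy dictionary."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return [w for w in words if w and w not in KNOWN_WORDS]


# A recognizer's homophone error: wrong word, but nothing gets flagged.
print(misspelled("There automobile is red"))   # []

# A typist's error: the misspelling is flagged.
print(misspelled("Thier automobile is red"))   # ['thier']
```

Catching the first case takes grammar or context checking, which is the extra cleanup work people in this thread are complaining about.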

Still, for transcribing, for example, a documentary, it's a lot more fun to auto-recognize, then go back and correct than to type it all in by hand.

(Note that it's speaker-dependent recognition, so to transcribe a documentary *you* have to listen to the documentary in one ear, and repeat it in your own voice into Dragon's microphone. Which is still about half the work of straight listening and typing it all.)

Google loves to save your voice, since that way they can add biomarker recognition to their Google cache on you. Which is why they offer it. :-)

Cheers, James Arthur

Reply to
dagmargoodboat
