Anyone for Mux?

Hi,

In multiplex comms systems, what's the minimum sampling percentage of plain speech necessary to make it intelligible at the other end?

Thanks,

p.

--

"What is now proved was once only imagin'd." - William Blake, 1793.
Reply to
Paul Burridge

I'll confess to being confused by the question -- *percentage* of plain speech.

If you use a 2400 Hz bandwidth (as in ham radio SSB; not hi-fi, but it works), then Nyquist says at least 4800 samples/second. The sampling percentage goes to zero as the aperture time of the sample-and-hold goes to zero.
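A rough sketch of that arithmetic in Python, for what it's worth. It assumes "percentage" means the fraction of each sample period during which the sample-and-hold is actually looking at the signal; the function names and the aperture times are made up for illustration.

```python
def nyquist_rate(bandwidth_hz):
    """Minimum sample rate (samples/second) for a signal band-limited to bandwidth_hz."""
    return 2.0 * bandwidth_hz


def sampling_percentage(sample_rate_hz, aperture_s):
    """Percent of the time the sample-and-hold is open (assumed meaning of 'percentage')."""
    return 100.0 * sample_rate_hz * aperture_s


rate = nyquist_rate(2400)  # the SSB bandwidth mentioned above
for aperture_s in (10e-6, 1e-6, 0.0):  # hypothetical aperture times, in seconds
    print(rate, aperture_s, sampling_percentage(rate, aperture_s))
```

The sample rate stays fixed by the bandwidth; only the aperture time drives the percentage, which is why there is no meaningful minimum.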

So, 4800 samples/second before any games (coding, compression, modeling, etc.). You can make the most of the available bandwidth, or reduce the bits/second, using coding techniques (mu-law, ADPCM), compression (all sorts of techniques), or modeling (12-point LPC is great for one voice at a time).
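As an illustration of the coding idea, here is a minimal sketch of mu-law companding. It uses the continuous textbook formula with mu = 255, not the segmented tables a real telephone codec uses, and the 8-bit quantizer and the helper names are assumptions for the example.

```python
import math

MU = 255.0  # common choice for mu-law; assumed here


def mulaw_compress(x, mu=MU):
    """Map x in [-1, 1] to a companded value in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mulaw_expand(y, mu=MU):
    """Inverse of mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def quantize(y, bits=8):
    """Uniformly quantize y in [-1, 1] to an integer code 0 .. 2**bits - 1."""
    return int(round((y + 1.0) / 2.0 * (2 ** bits - 1)))


def dequantize(code, bits=8):
    """Map an integer code back to [-1, 1]."""
    return code / (2 ** bits - 1) * 2.0 - 1.0


x = 0.01  # a quiet sample, where companding helps most
companded = mulaw_expand(dequantize(quantize(mulaw_compress(x))))
linear = dequantize(quantize(x))
print("companded error:", abs(companded - x))
print("linear error:   ", abs(linear - x))
```

Companding spends more of the 8 bits on quiet samples, so the error on small amplitudes comes out lower than with plain linear quantization at the same bit rate.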

War story:

Once upon a time, a long time ago, the (DARPA) Network Speech Compression project investigated sending speech over the (then ARPA)net. We used a 12-point LPC (Linear Predictive Coding) running on a very advanced DEC PDP-11/45 with a butterfly box attached to do the hard work. The LPC modeled the vocal tract (see, for example, Markel and Gray). The digitized audio waveform (recorded in a very quiet sound booth, more on that shortly) goes into the model. Rather than sending the digitized audio over the net, the model parameters are transmitted, resulting in an amazing drop in data rate. The LPC modeling approach was also used by the TI Speak & Spell toys; a LOT of preprocessing gives you a very low data rate. Reconstruction is fairly easy.
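To make the "send the model, not the samples" idea concrete, here is a minimal sketch of 12th-order linear prediction using the textbook autocorrelation method and the Levinson-Durbin recursion. It is not the project's actual code; the 240-sample frame and the synthetic test signal are assumptions, and a real coder would also send gain, pitch and voicing information.

```python
import random

ORDER = 12  # "12-point" LPC, as in the post


def autocorr(frame, max_lag):
    """Autocorrelation of the frame for lags 0 .. max_lag."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a[1..order], where
    x[n] is approximated by sum(a[k] * x[n - k]). Returns (coeffs, residual energy)."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def lpc(frame, order=ORDER):
    return levinson_durbin(autocorr(frame, order), order)


# Synthetic stand-in for a speech frame: noise through a resonant filter.
random.seed(0)
frame, x1, x2 = [], 0.0, 0.0
for _ in range(240):  # assumed frame length
    x = 1.3 * x1 - 0.6 * x2 + random.gauss(0.0, 1.0)
    frame.append(x)
    x2, x1 = x1, x

coeffs, residual = lpc(frame)
print(len(frame), "samples in,", len(coeffs), "coefficients out")
```

The frame of 240 samples is replaced by a dozen numbers describing a filter, which is where the drop in data rate comes from, and also why anything that is not a single voice gets forced through the same filter shape.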

But there's a price to be paid, and it isn't just in the amount of preprocessing required (which is trivial today, but a pain in the ass in the 70's). Remember, the LPC models the vocal tract. So if someone comes into the booth and knocks a book into a metal trashcan, or slams the door while the system is live, what the listener on the other end gets is that sound -- as imitated by the human vocal tract! Two people speaking at the same time? Comes out as one vocal tract trying to create that sound (and not doing too well at it)!

Conference calls? No "easy" way to sum multiple LPC datastreams. Oh, you can reconstruct them all to audio samples, and then sum, but you can't re-encode and transmit as one LPC -- as all those sounds are going to be modeled coming from *one* vocal tract. And when you reconstruct multiple LPC streams and sum, it gets very confusing, since a lot of what makes individual voices individual gets lost through the LPC filtering process.

Still, it was a fun project and kept a bunch of us off the streets.

--
Namaste--
Reply to
artie

Hi Paul,

Probably a cell phone network designer can answer that best, since in that business profits are inversely proportional to the number of data packets sent per second for each connection. But they sometimes seem to believe it can be even lower than the percentage required for 'intelligible'. I have had conversations where I am certain I wouldn't have understood much had I not known the person at the other end and the way he or she usually speaks. For example, once I didn't even recognize who was on the phone or what was said to me, but after she handed the cell phone to my wife I could understand. My wife had to repeat it all, and that way the provider could clock another minute. Ka-ching.

Anyway, it might make sense to use Google to obtain some numbers for TDMA or GSM networks.

Regards, Joerg


Reply to
Joerg

8K samples/second of 2 bits each will give reasonable speech. As little as 8K samples/second of one bit will work, but it gets pretty rough. Lower sample rates will work, but limit the bandwidth even more than fewer bits. Normal (ISDN) digital phones use 64 kb/sec.
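The bit rates behind those figures are just samples/second times bits/sample. A quick sketch, assuming "8K" means 8000 samples/second (the function name is made up):

```python
def bit_rate(samples_per_second, bits_per_sample):
    """Raw bit rate in bits/second, before any coding or compression."""
    return samples_per_second * bits_per_sample


for label, bits in (("1-bit", 1), ("2-bit", 2), ("8-bit (ISDN)", 8)):
    print(label, bit_rate(8000, bits), "bits/second")
```

So the 2-bit case is a quarter of the ISDN rate, and the 1-bit case an eighth.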
Reply to
Clarence

Yeah, cell phones really rot, IMHO. There might be a market for "medium-fi" cell phones doing a solid 8ksps * 8bits.

Oh what the heck, let's start a 48ksps * 16 bits stereo cell phone network!

--
_______________________________________________________________________
Christopher R. Carlen
Principal Laser/Optical Technologist
Sandia National Laboratories CA USA
crcarle@sandia.gov -- NOTE: Remove "BOGUS" from email address to reply.
Reply to
Chris Carlen

My age is 63 and I find cell phones dicey, at best. Hearing tests show me to be sort of average in acuity for my age. However, there are "notched frequencies" that vary from one individual to another. Those darned hairs in the transducer in our inner ear are sharply tuned! I'm guessing the current number of bits is OK, but a higher sampling rate is in order. I'd think a user boost in the frequencies around 2 to 3 kHz might be very helpful for older folks.
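A user-adjustable boost like that could be done with a single peaking biquad. Here is a minimal sketch using the widely published "Audio EQ Cookbook" peaking-filter formulas; the 2.5 kHz centre, +6 dB gain, Q of 1 and 8 kHz sample rate are all assumed values picked for illustration, not anything a handset actually uses.

```python
import cmath
import math


def peaking_eq(f0_hz, gain_db, q, fs_hz):
    """Biquad peaking filter coefficients (b, a), normalised so a[0] == 1."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp]
    a = [1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp]
    return [c / a[0] for c in b], [c / a[0] for c in a]


def gain_db_at(b, a, f_hz, fs_hz):
    """Magnitude response of the biquad at f_hz, in dB."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * f_hz / fs_hz)  # z^-1 on the unit circle
    h = (b[0] + b[1] * z1 + b[2] * z1 * z1) / (a[0] + a[1] * z1 + a[2] * z1 * z1)
    return 20.0 * math.log10(abs(h))


b, a = peaking_eq(2500.0, 6.0, 1.0, 8000.0)  # assumed settings
for f in (300.0, 1000.0, 2500.0, 3400.0):
    print(f, "Hz:", round(gain_db_at(b, a, f, 8000.0), 2), "dB")
```

The filter leaves low frequencies essentially alone and lifts the band around the chosen centre, so the gain and centre frequency could be exposed as user settings.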

Reply to
Charles Schuler

Hi Chris,

I believe it has gotten worse over the last few years. I used to never have any problems understanding someone on a cell phone or even recognizing the voice. Nowadays, when someone doesn't start with their name but says "Hello Joerg" I often have to ask back "Who is it?".

Or three different plans. Basic: You'll be able to notice that someone is telling you something. Intermediate: You can understand more than 50% or your money back. Premium: You can actually experience the voice quality of a land line. Say, for just $9.95 more a month...

Just FYI: Your email address shows up non-munged at the bottom of your posts. That could invite spam crawlers to pick it up.

Regards, Joerg


Reply to
Joerg

Cheer up, Charles. You may not be able to hear what your interlocutor is saying, but you'll be able to see him/her in much finer detail. All the extra available bandwidth is going into the video side of it! :-/

--

"What is now proved was once only imagin'd." - William Blake, 1793.
Reply to
Paul Burridge

Yeah, well I've known for several years now that most markets are driven by youth and that youth is wasted on the young. Cheers.

Reply to
Charles Schuler
