Detecting the original pitch of the human voice

Suppose you have a recording of a human voice singing a song. The recording has been sped up or slowed down, so that both the tempo and the pitch have changed. The aim is to work out, as closely as possible, how much you need to compress or expand the waveform (to speed it up or slow it down) in order to restore it to the pitch at which it was originally recorded.

The human voice has a limited range, so you could easily get it within this range, just by knowing that most people would not be able to sing outside this range. You also know that the song is sung in tune, in equal temperament, so the pitches will need to align exactly to a set of defined notes.
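To make the equal-temperament constraint concrete, here is a rough sketch of how measured note frequencies could be snapped onto the note grid. It assumes a reference tuning of A4 = 440 Hz (not stated above), and the function names are made up for illustration. Note that this only recovers the speed error to within a whole number of semitones; the vocal-range argument would still be needed to pick which semitone.

```python
import math

A4_HZ = 440.0  # assumed reference tuning; the post does not specify one


def cents_off_grid(freq_hz, a4_hz=A4_HZ):
    """Signed distance in cents from freq_hz to the nearest equal-tempered note."""
    semitones = 12.0 * math.log2(freq_hz / a4_hz)
    return 100.0 * (semitones - round(semitones))


def speed_correction(freqs_hz, a4_hz=A4_HZ):
    """Playback-speed factor that pulls the notes onto the nearest grid positions.

    Uses a plain average of the offsets, so it assumes they do not straddle
    the +/-50 cent wrap-around point.
    """
    mean_cents = sum(cents_off_grid(f, a4_hz) for f in freqs_hz) / len(freqs_hz)
    return 2.0 ** (-mean_cents / 1200.0)
```

For example, if every measured note sits about 30 cents sharp of the grid, `speed_correction` returns a factor slightly below 1, meaning the recording should be played back a little slower.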

If you knew anything about the singing ability of the singer, you might also be able to infer something based on how strained the singing of each note is, but assume you only have the recording, and no prior information about the singer.

How would you do it?

Reply to
C3

Dunno, but you might want to go all Google on "vocoder" - might give you some ideas.

Reply to
Poxy

On Mon, 12 Jun 2006 21:37:10 +1000, "C3" put finger to keyboard and composed:

Depending on the quality of the recording, you may be able to detect 50Hz or 60Hz hum.
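A minimal sketch of the hum idea, assuming the hum was picked up at recording time (so it shifted along with the voice) and is the strongest component between 40 and 80 Hz. The function names, the band limits and the 0.25 Hz search step are all illustrative choices, and the brute-force scan is slow on long recordings, so a short excerpt is enough.

```python
import cmath
import math


def tone_strength(samples, sample_rate, freq_hz):
    """Magnitude of a single-frequency DFT of samples at freq_hz."""
    step = -2j * math.pi * freq_hz / sample_rate
    return abs(sum(s * cmath.exp(step * n) for n, s in enumerate(samples)))


def hum_frequency(samples, sample_rate, lo_hz=40.0, hi_hz=80.0, step_hz=0.25):
    """Strongest frequency in the band where shifted mains hum is expected."""
    count = int(round((hi_hz - lo_hz) / step_hz)) + 1
    candidates = [lo_hz + i * step_hz for i in range(count)]
    return max(candidates, key=lambda f: tone_strength(samples, sample_rate, f))


def speed_candidates(measured_hz):
    """Playback-speed factors that would put the hum back at 50 Hz or 60 Hz."""
    return {nominal: nominal / measured_hz for nominal in (50.0, 60.0)}
```

You still have to decide whether the original mains frequency was 50 Hz or 60 Hz; usually only one of the two candidates puts the voice in a plausible range.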

- Franc Zabkar

--
Please remove one 'i' from my address when replying by email.
Reply to
Franc Zabkar

If you know the key of the song you can fairly easily 'digitally' pitch / tempo adjust.

Regards, Mitch... (I have done this when remastering some cassette tapes onto CD.)

Reply to
Mitchell

Pick a piece where you know the note. Work out the difference between that note and the recording. Use the change in frequency to shift the entire recording in the frequency domain, e.g. modulate up to a higher frequency, then back down again but to the correct frequency this time. I can think of several ways to do that, so there must be dozens; it's not hard.

Digitise it, then play it back with a variable sampling rate, and adjust until that note comes out right.
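In software, the variable-sample-rate trick amounts to resampling. A rough sketch, assuming you already know what one note should be and have measured what it is in the recording; the names are made up for illustration, and linear interpolation is only a crude stand-in for a proper resampling filter.

```python
def playback_speed(true_hz, measured_hz):
    """Speed factor that moves a note measured at measured_hz to true_hz."""
    return true_hz / measured_hz


def resample(samples, speed):
    """Samples as heard when played back `speed` times faster (linear interpolation)."""
    out_len = int((len(samples) - 1) / speed) + 1
    out = []
    for i in range(out_len):
        pos = i * speed                       # fractional read position in the input
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append((1.0 - frac) * samples[left] + frac * samples[right])
    return out
```

A factor below 1 stretches the waveform (slower and lower), and a factor above 1 compresses it (faster and higher), which changes tempo and pitch together just as the original speed change did.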

I once (1990) made a doorbell that had an EPROM, DAC, counter and audio amplifier. I stuck a speed pot on it, controlling the counter clock. Made for fun times, recording things on PC, burning to EPROM then adjusting playback speed.

Cheers Terry

Reply to
Terry Given

If it was digital I would convert straight to the frequency domain and digitally band-pass filter the most common frequency range of the human voice, then have a look at all the harmonics. Do this for heaps of standard noises and you should see that you get almost the same thing every time. Then compare this to a file that has been sped up or slowed down and you should find that you don't get the same harmonic peaks at the same frequencies.

Reply to
Dac
