Audio Quality Help

I am attempting to replace an ISD17xx device (datasheet here: [link]) with audio produced using a PWM peripheral from my micro. I've got audio output working, but the quality isn't quite there. I don't know why, so I was hoping that with some samples someone might be able to help me identify the issue.

First, here are small samples of the audio playback from each device taken from a microphone in a very quiet room.

ISD17xx playback: [link]
Micro playback: [link]

Note, the ISD17xx device is set to sample at 5.3kHz so there is that immediate difference. However, the audio playback from the micro is very tinny or robotic sounding. Viewing the waveforms in Audacity you can see that the ISD playback is 'smoother'. Lots of 'jagged edges' in the micro playback...

My micro is pulling raw audio (8 bit at 8kHz) from memory and stuffing it into the PWM output register. My PWM peripheral is set up at 32kHz (4x sampling rate). The PWM timer is 16 bit and the clock to the PWM peripheral is 20MHz. This results in a timer load value of 625 counting down towards zero. In other words, 20MHz/(8kHz*4x) yields 625 counts. So, each 8 bit value is scaled using (value*625)/256.
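The timer arithmetic described above can be checked with a few lines of Python. This is only a sketch of the calculation, not the firmware; the function and constant names are made up for illustration.

```python
PWM_CLOCK_HZ = 20_000_000   # clock feeding the PWM peripheral
SAMPLE_RATE_HZ = 8_000      # rate of the stored 8-bit audio
OVERSAMPLE = 4              # PWM periods per audio sample

def pwm_period_counts(clock_hz, sample_rate_hz, oversample):
    """Timer counts in one PWM period."""
    return clock_hz // (sample_rate_hz * oversample)

def scale_sample(value, period_counts):
    """Map an 8-bit sample (0..255) onto the timer's compare range."""
    return (value * period_counts) // 256

period = pwm_period_counts(PWM_CLOCK_HZ, SAMPLE_RATE_HZ, OVERSAMPLE)
print(period)                     # 625
print(scale_sample(128, period))  # 312
print(scale_sample(255, period))  # 622
```

Note that the integer division truncates, so mid-scale (128) lands on 312 rather than 312.5 and full scale (255) on 622.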

The PWM value is updated every 4th PWM interrupt (32kHz/4=8kHz). You can clearly see this here: [link]

The screenshot shows the PWM (yellow) and the filtered signal (magenta). Note the two cursors are 125us apart (8kHz).

The relevant portion of the schematic can be seen here: [link]

I've got a simple RC lowpass filter on the output of the PWM. The knee point is set at roughly 3.8kHz (4.7k ohm + .01uF). This is then capacitively coupled to the amplifier. The gain on the amplifier is controlled through a digital potentiometer.
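For reference, the corner frequency and roll-off of a single RC section can be worked out as below. This is a sketch using the nominal part values from the post and the ideal first-order response; component tolerances and loading by the amplifier stage are ignored, and the function names are illustrative.

```python
import math

def rc_corner_hz(r_ohms, c_farads):
    """-3 dB frequency of a single RC low-pass section."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)

def first_order_gain_db(f_hz, fc_hz):
    """Gain of an ideal first-order low-pass at f_hz, in dB."""
    return -10.0 * math.log10(1.0 + (f_hz / fc_hz) ** 2)

fc = rc_corner_hz(4700.0, 0.01e-6)
print(round(fc))  # 3386

# Attenuation at the top of the voice band, the sample rate, and the PWM rate
for f in (4_000, 8_000, 32_000):
    print(f, round(first_order_gain_db(f, fc), 1))
```

With these nominal values the corner comes out near 3.4kHz, a little below the 3.8kHz quoted, and the 32kHz PWM carrier is attenuated by only about 20 dB.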

Anything immediately stand out?

--------------------------------------- Posted through

formatting link

Reply to
eeboy

Are there some "free" FFT / spectrum analyser progs you could feed the .wav file to and see if anything useful shows up?

Reply to
Dennis

It is quite possible to get better than phone-line quality from PWM generated by an MCU at 32 kHz.

You should use the noise shaping technique when converting PCM to PWM. Also, make sure that you are using the dual edge phase correct PWM.
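One simple form of noise shaping is first-order error feedback: whatever the quantizer throws away on one sample is added to the next, so the error averages out over time instead of accumulating. The sketch below assumes a source with more resolution than the output (for example 16-bit samples reduced to 8-bit steps, `step = 256`); it is an illustration of the general idea, not necessarily the exact scheme meant above.

```python
def noise_shape(samples, step):
    """Quantize to multiples of `step`, carrying each error into the next sample."""
    out = []
    error = 0
    for s in samples:
        target = s + error            # add what was dropped last time
        q = (target // step) * step   # coarse output value (floor quantizer)
        error = target - q            # remainder to carry forward
        out.append(q)
    return out

# A constant input that falls between two output levels
samples = [300] * 256
shaped = noise_shape(samples, 256)
print(sorted(set(shaped)))          # only the two neighbouring levels are used
print(sum(shaped) / len(shaped))    # long-run average tracks the input
```

The output toggles between the two nearest levels so that its average matches the input, which pushes the quantization error toward high frequencies where the analog filter can remove it.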

Here lies the first problem. The 8kHz should be upsampled to 32kHz by interpolation. Even a simple linear interpolation will make a lot of difference in the quality.

Apply the preemphasis to the digital audio and do the deemphasis in the analog. Calculate the lowpass accordingly. That will substantially improve the quality, too.


IHTH

Vladimir Vassilevsky DSP and Mixed Signal Design Consultant


Reply to
Vladimir Vassilevsky


You need to do better than a 1st order low pass. At least 3rd order. Also, drop the cutoff to 3kHz.

Scaling 256 to 625 will introduce additional quantization. Rather scale to 512, i.e. double it.
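The effect of the non-integer scale factor can be seen by listing the step sizes between adjacent output codes. A small sketch, assuming the truncating (value*625)/256 scaling from the original post; `step_sizes` is a made-up helper name.

```python
from collections import Counter

def step_sizes(scale):
    """Count the sizes of the steps between adjacent 8-bit input codes."""
    levels = [scale(v) for v in range(256)]
    return Counter(b - a for a, b in zip(levels, levels[1:]))

uneven = step_sizes(lambda v: (v * 625) // 256)
uniform = step_sizes(lambda v: v * 2)

print(sorted(uneven.items()))   # a mix of 2-count and 3-count steps
print(sorted(uniform.items()))  # every step is exactly 2 counts
```

With the 625/256 factor the output steps alternate irregularly between 2 and 3 timer counts, which is a small added nonlinearity; doubling gives uniform steps at the cost of using only 512 of the 625 counts.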

Make sure that the recorded wav file has been properly low pass filtered. If done through a sound card, then it is probably OK; if recorded by the micro, then you also need a nice steep filter. The ISD chips use pretty steep filters.

Reply to
Rocky

I would have thought that if you're using PWM and an RC as a DAC then you are introducing HF from the PWM switching, and this would give you the tinny sound? Your knee point may be at 3.8kHz, but you're still letting through higher frequencies and your ears will detect them.

Reply to
DaveN

Just listening, my impression is that you have some high-frequency components in there that you don't want, and they are contributing to the "tinny" sound.

Please remember that a first-order low-pass filter doesn't roll off very sharply ... you are getting stuff in there that you can hear way beyond 3.8kHz. I'd try knocking the corner frequency down to 1kHz, plus using a higher-order filter.

DFC

Reply to
Datesfat Chicks


That's a good point. I had done an FFT on the signal using my scope; the result is here: [link]

It didn't provide much useful information, mostly because I don't fully understand how to use it. What I could see is the first harmonic at the PWM frequency (32kHz).

After you mentioned that, I poked around in Audacity and found that I could produce FFT plots from the application and they are much nicer. I have plots of each...

ISD: [link]
Micro: [link]

This points to what a lot of the respondents have said... 1st order filtering isn't enough.


Reply to
eeboy

Any reason this couldn't be achieved using several (3) RC filters to keep costs low?

I can do that. It would affect my volume, but if it improves quality I am all for it. I'll give it a shot.

The audio is recorded via a sound card at a much higher sampling rate (44,100) and then converted to 8,000. I can apply a lowpass during the conversion. Should the source audio be low passed at the same frequency as the knee point?


Reply to
eeboy

Can you elaborate or point me to more information?

I was using a right aligned PWM. I've changed this and it seems to have made the biggest impact. It seems to have reduced the robotic sound. Although it is still present, it's not as prevalent.

I've implemented linear interpolation by tracking the last PWM value, subtracting the current PWM value, dividing the difference by the oversampling rate (4) and adding it in each of the oversample periods. I was expecting this to have the biggest impact but it did not. I really could not hear the difference.
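A linear interpolator along these lines might look like the sketch below. It is written in Python for clarity rather than as firmware, and it multiplies before dividing so that truncation error does not build up across the four sub-steps, which differs slightly from adding a pre-divided increment each period.

```python
def upsample_linear(samples, factor=4):
    """Ramp linearly from the previous sample to the current one in `factor` steps."""
    out = []
    prev = samples[0]
    for cur in samples:
        for k in range(1, factor + 1):
            # Integer math only: multiply first, then divide
            out.append(prev + (cur - prev) * k // factor)
        prev = cur
    return out

print(upsample_linear([0, 100, 60]))
# [0, 0, 0, 0, 25, 50, 75, 100, 90, 80, 70, 60]
```

Each output ramp finishes on the current sample, so the scheme adds one input sample (125us) of delay. Linear interpolation is a fairly gentle low-pass, so a modest audible change is not surprising on its own.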

Maybe a silly question but what is meant by preemphasis and deemphasis in this context?


Reply to
eeboy


That's because the THD of dual edge PWM is lower by an order of magnitude or so than that of single edge PWM.

If there is enough CPU speed to do the pseudo natural PWM correction (PNPWM), that will improve the THD by another order of magnitude.

If the interpolation is done right, there should be substantial improvement in the high frequency area.

Apply preemphasis to the audio source. Apply deemphasis to the analog signal produced by PWM. That will clean up the high frequency area of the signal by several dB.

Preemphasis = high frequency boost, low frequency cut. Deemphasis = low frequency boost, high frequency cut. Look for the CCITT J.17 curve or similar.
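To make the idea concrete, here is a minimal first-order preemphasis/deemphasis pair in Python. It is only an illustration: the coefficient `a = 0.9` is an arbitrary assumption, the curve is not J.17, and in the scheme described above the deemphasis half would be done by the analog filter rather than in software.

```python
def preemphasis(x, a=0.9):
    """y[n] = x[n] - a*x[n-1]: boosts highs relative to lows."""
    y = []
    prev = 0.0
    for s in x:
        y.append(s - a * prev)
        prev = s
    return y

def deemphasis(y, a=0.9):
    """x[n] = y[n] + a*x[n-1]: the exact inverse of preemphasis()."""
    x = []
    prev = 0.0
    for s in y:
        prev = s + a * prev
        x.append(prev)
    return x

signal = [0.0, 1.0, 0.5, -0.25, -1.0, 0.0]
restored = deemphasis(preemphasis(signal))
print(restored)
```

The two filters cancel for the wanted signal, while any noise added between them (here, PWM quantization and carrier residue) only sees the high-frequency cut.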

Vladimir Vassilevsky DSP and Mixed Signal Design Consultant


Reply to
Vladimir Vassilevsky

Playback is getting closer (less robotic sounding) but I think I need to take a look at the conversion before it gets to my device. The source files are all recorded via a sound card at 44,100 - 16 bit (stereo). I am then using SoX to convert the audio to 8,000 - 8 bit (mono).

SoX recommends (and defaults to) dithering when converting bit depth. The resulting audio file can be heard here: [link]

There is quite a bit of noise in the resulting file.

If I remove dithering, it sounds better as the noise is reduced. The result can be heard here: [link]

The noise is more prevalent in the speech than it is during silence.

I assume that it would be best if I re-recorded all the source files at 8 bit so that there is no quantization error associated with the conversion, correct?

Anyone familiar with SoX and techniques which might help in producing a better output from my existing source files?


Reply to
eeboy


I don't know anything about SoX, but you don't reduce quantization noise by quantizing to 8 bits at the start rather than later. Quantization noise is the result of quantizing, period. An 8 bit quantity can't represent analog values that are between the values specified by the 8 bit number. The difference between the analog value and the value represented by the 8 bit number is the quantization noise.
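The point can be illustrated numerically: the error is set by the final word length, whenever the rounding happens. The sketch below quantizes a near-full-scale sine and measures the signal-to-noise ratio; the test tone and helper names are arbitrary choices for the example.

```python
import math

def quantize(x, bits):
    """Round x in [-1, 1) to the nearest of 2**bits uniform levels."""
    half = 2 ** (bits - 1)
    q = round(x * half)
    q = max(-half, min(half - 1, q))  # clamp to the representable range
    return q / half

def snr_db(bits, n=10_000):
    """SNR of a 0.99 full-scale sine after quantizing to `bits` bits."""
    signal = [0.99 * math.sin(2 * math.pi * 0.01234 * i) for i in range(n)]
    noise = [quantize(s, bits) - s for s in signal]
    p_signal = sum(s * s for s in signal)
    p_noise = sum(e * e for e in noise)
    return 10 * math.log10(p_signal / p_noise)

print(round(snr_db(8), 1))   # close to the 6.02*N + 1.76 dB rule of thumb
print(round(snr_db(16), 1))
```

For 8 bits this comes out near 50 dB, roughly 48 dB worse than 16 bits, and that gap is the same whether the audio is recorded at 8 bits or converted later.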

I listened to your audio samples from your first post, but they are so short I can't really tell what they are supposed to sound like.

When you cascade multiple passive filters, you need to consider the design as a whole. The input of one filter loads the output of the previous one and changes the corner frequency. In general, the first stage will use lower value components to present a lower impedance. Each successive stage will use higher value components to present a smaller load to the previous stage.
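The loading effect can be quantified for two RC sections. The sketch below compares an unbuffered pair, using the standard two-section ladder transfer function, against an idealized buffered pair; the 4.7k/10nF values come from the original post, while the 47k/1nF second stage is just an assumed example of impedance scaling.

```python
import math

def two_stage_gain(f_hz, r1, c1, r2, c2, buffered=False):
    """Magnitude response of two RC low-pass sections in series."""
    s = 1j * 2 * math.pi * f_hz
    if buffered:
        # Ideal buffer between stages: the responses simply multiply
        h = 1 / ((1 + s * r1 * c1) * (1 + s * r2 * c2))
    else:
        # Second stage loads the first: extra r1*c2 term appears
        h = 1 / (1 + s * (r1 * c1 + r1 * c2 + r2 * c2)
                 + s * s * r1 * r2 * c1 * c2)
    return abs(h)

def corner_hz(**stage):
    """Find the -3 dB frequency by bisection on a log scale."""
    lo, hi = 1.0, 1e6
    target = 1 / math.sqrt(2)
    for _ in range(60):
        mid = math.sqrt(lo * hi)
        if two_stage_gain(mid, **stage) > target:
            lo = mid
        else:
            hi = mid
    return lo

identical = corner_hz(r1=4700, c1=10e-9, r2=4700, c2=10e-9)
scaled = corner_hz(r1=4700, c1=10e-9, r2=47000, c2=1e-9)
buffered = corner_hz(r1=4700, c1=10e-9, r2=4700, c2=10e-9, buffered=True)
print(round(identical), round(scaled), round(buffered))
```

Two identical unbuffered sections end up with a -3 dB point far below the single-section corner, while scaling the second stage to ten times the impedance brings the result much closer to the buffered case.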

It's not clear to me that you need a higher order filter. I can't tell what the cause of your problem is yet.

Rick

Reply to
rickman


Yes, you should low pass filter the original recording, but I expect that is being done when you change the sample rate. Is this your software or a canned package? No point in doing it twice. The goal of the low pass is to remove frequencies above the Nyquist frequency (half the new sample rate). Otherwise these frequencies will alias into the signal during the conversion, so you need to remove them with the low pass filter.
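As a quick illustration of where unfiltered content ends up, the folded frequency of a tone after sampling can be computed directly. This is a minimal sketch with a made-up helper name, considering a single pure tone only.

```python
def alias_hz(f_hz, sample_rate_hz):
    """Frequency at which a tone at f_hz appears after sampling."""
    f = f_hz % sample_rate_hz
    return min(f, sample_rate_hz - f)

print(alias_hz(3_000, 8_000))  # 3000: below Nyquist, unchanged
print(alias_hz(5_000, 8_000))  # 3000: folds down into the voice band
print(alias_hz(7_500, 8_000))  # 500
```

Anything above 4kHz in the 44.1kHz recording lands somewhere inside the 0 to 4kHz band after conversion to 8kHz unless it is filtered out first.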

That means the goal of the filter is to cut off, as well as required, everything above the Nyquist frequency. Since the filter has a finite transition band, it is typical to set the transition across the Nyquist frequency. I assume you are processing only voice. I think the upper frequency for voice is typically 3.4 kHz, so that gives you plenty of room for the transition band. These are standard telephone issues, so you should be able to get sound quality at least as good as standard telephone... well, except that telephone is actually 12 or 13 bits compressed to 8 bits with µ-law or A-law. Why are you using linear 8 bits?
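The benefit of companding for quiet passages can be shown with the continuous µ-law formula. Real telephone codecs use a segmented approximation of this curve, which the sketch below does not reproduce; the 127-level rounding and the helper names are assumptions made for the example.

```python
import math

MU = 255.0

def mulaw_compress(x):
    """Map x in [-1, 1] to [-1, 1] with logarithmic spacing."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mulaw_expand(y):
    """Inverse of mulaw_compress()."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def codec_8bit(x, compand):
    """Quantize x to roughly 8 bits, optionally through the compander."""
    v = mulaw_compress(x) if compand else x
    q = round(v * 127) / 127
    return mulaw_expand(q) if compand else q

quiet = 0.01  # a sample about 40 dB below full scale
print(abs(codec_8bit(quiet, compand=False) - quiet))  # linear error
print(abs(codec_8bit(quiet, compand=True) - quiet))   # companded error
```

For small signals the companded path has far finer resolution than linear 8 bits, which is why telephone-grade speech sounds cleaner than raw 8-bit linear PCM.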

Rick

Reply to
rickman

The issue with just cascading RC filters is that they have a very poor cutoff and a lot of interaction. Basically the simplest would be an RC with a 2nd order Sallen-Key. You should be able to find a lot of references to "cook-book" circuits using only one opamp on the web.

The distortion caused by the single edge aligned PWM is due to phase modulation of the audio. However, as your PWM is at 32kHz and your sample rate is only 8kHz, it should be less than a few percent. You could consider increasing the PWM rate to 64kHz since you only have 8 bit data.
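The phase modulation can be seen by tracking where the middle of the pulse falls within each period. The sketch below is a simplified model of one 625-count PWM period, not of any particular timer peripheral, and the function names are illustrative.

```python
def pwm_period(duty, period, center_aligned):
    """One PWM period as a list of 0/1 levels, one entry per timer count."""
    start = (period - duty) // 2 if center_aligned else 0
    return [1 if start <= t < start + duty else 0 for t in range(period)]

def pulse_centroid(wave):
    """Mean time (in counts) of the high portion of the period."""
    on_times = [t + 0.5 for t, level in enumerate(wave) if level]
    return sum(on_times) / len(on_times)

edge = [pulse_centroid(pwm_period(d, 625, False)) for d in (100, 300, 500)]
center = [pulse_centroid(pwm_period(d, 625, True)) for d in (100, 300, 500)]
print(edge)    # centroid moves with the duty value
print(center)  # centroid stays at mid-period
```

With edge alignment the pulse centre shifts in time as the sample value changes, which is the signal-dependent delay that shows up as distortion; with centre alignment it stays put to within half a count.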

Another approach can be to only use 256 levels out of your 625, because the phase modulation distortion depends on the depth of the PWM modulation and the audio frequency. You should have sufficient gain with your amplifier to compensate for the reduced output.

Reply to
Rocky
