I think I just understood what is wrong with my thinking: I was only looking at one direction. The OP's problem is that he is looping back the received data and retransmitting it at the same, or slightly slower, rate than it is being received. Since the transmitter must send a full 10 bits per byte, the buffer between the receiver and transmitter will overflow at some point.
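A quick back-of-the-envelope sketch of why the loopback must eventually overflow. The baud rate, clock error, and FIFO depth below are made-up illustrative numbers, not the OP's actual figures:

```python
# Hypothetical numbers: a continuous 8N1 stream at 9600 baud, where the
# sender's clock runs 100 ppm faster than the echoing device's clock.
baud = 9600.0
frame_bits = 10                  # start + 8 data + 1 stop
clock_error = 100e-6             # 100 ppm mismatch between the two clocks

rx_bytes_per_s = (baud * (1 + clock_error)) / frame_bits
tx_bytes_per_s = baud / frame_bits
excess = rx_bytes_per_s - tx_bytes_per_s   # bytes piling up per second

fifo_depth = 16                  # a typical small UART FIFO
overflow_seconds = fifo_depth / excess
print(round(overflow_seconds))   # FIFO overflows after ~167 s
```

No amount of buffering fixes this; a deeper FIFO only stretches the time until it fills.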
This is a classic problem in communications systems. All units must run at the same average rate, but each may use a different reference. So if a unit receives data at 8.0000 kHz and tries to play it out through its DAC at 7.9999 kHz, about once every ten seconds there will be one sample too many, and a sample will have to be dropped. This causes a noticeable click or other distortion of the voice.
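The once-per-ten-seconds figure falls straight out of the rate difference, using the same numbers as above:

```python
# ADC at 8.0000 kHz on one end, DAC at 7.9999 kHz on the other.
f_in = 8000.0      # samples/s produced by the far-end ADC
f_out = 7999.9     # samples/s consumed by the local DAC

excess = f_in - f_out            # ~0.1 extra sample per second
drop_interval = 1.0 / excess     # one sample dropped every ~10 s
print(round(drop_interval, 1))   # 10.0
```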
Some systems try to buffer this out, but that only postpones the problem. Most systems use a common reference that is very accurate and stable, and the ADC and DAC clocks must then be synchronized to it. I worked on a system that used a pullable ("bendable") VCXO to establish the 8 kHz rate. As the buffer fill level drifted from half full, the VCXO frequency was nudged up or down by small amounts to keep the average rates matched.
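Here is a toy simulation of that buffer-fill feedback scheme; all of the numbers (pull gain, buffer depth, rate offset) are invented for illustration, not from the system I described. The output rate is nudged in proportion to how far the elastic buffer sits from half full, so the average output rate locks onto the incoming rate:

```python
# Toy model: a pullable VCXO disciplined by elastic-buffer fill level.
f_in = 8000.10          # incoming sample rate (Hz), slightly off-nominal
f_nom = 8000.00         # VCXO center frequency (Hz)
pull_gain = 0.05        # Hz of correction per sample of buffer error

depth = 64              # elastic buffer depth in samples
fill = depth / 2.0      # start half full

dt = 0.001              # simulation step (s)
f_out = f_nom
for _ in range(200000):                  # simulate 200 s
    fill += (f_in - f_out) * dt          # buffer grows/shrinks with rate error
    error = fill - depth / 2.0           # deviation from half full
    f_out = f_nom + pull_gain * error    # bend the VCXO toward balance

print(round(f_out, 3))   # settles at the incoming rate, ~8000.1
print(round(fill, 1))    # buffer parks slightly above half full, ~34.0
```

Note that the buffer settles a couple of samples off center: that standing offset is exactly what holds the VCXO off its center frequency.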
In the OP's case, the first channel can use two stop bits and the echo channel one stop bit, so the echo path is always slightly faster than the incoming data. It may not be pretty, but it will work.
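A sketch of why the stop-bit trick works (the 9600 baud figure is just an example): with 8N2 on the incoming link each byte occupies 11 bit times, while the 8N1 echo needs only 10, so at the same baud rate the echo path has about 10% of headroom and the loopback buffer can never fill.

```python
baud = 9600  # hypothetical common baud rate for both directions

incoming_frame_bits = 1 + 8 + 2   # start + 8 data + two stop bits (8N2)
echo_frame_bits = 1 + 8 + 1       # start + 8 data + one stop bit (8N1)

incoming_bytes_per_s = baud / incoming_frame_bits   # ~872.7 bytes/s
echo_bytes_per_s = baud / echo_frame_bits           # 960.0 bytes/s

# The echo side can always keep up, regardless of clock tolerance,
# as long as the clock error is well under the 10% framing margin.
assert echo_bytes_per_s > incoming_bytes_per_s
print(round(echo_bytes_per_s / incoming_bytes_per_s, 2))  # 1.1
```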