What do you mean by error-linearity?!?
Huh? 4046s only work up into the low MHz; with big enough capacitors they'll work down to single-digit Hz.
This is just something you'll have to live with.
I've implemented software PLLs up to 80kHz on modern DSP chips. You need a fast DSP to update the control loop at that rate, or you need to figure out how to subsample the input without messing up the loop. If you can generate a sine wave at 20kHz in your DSP you must be running your update routine at more than 40kHz, so I assume you have a fast DSP.
The key is to use a timer for the phase detector. If you have a timer capture input you can latch the timer's count (in effect, its phase) at the instant the input edge arrives. If the capture timer runs from the same clock that generates your sine wave, you can easily relate the input phase to your output phase.
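
Here is a minimal sketch of that timer-capture phase detector, written in Python for readability rather than as real DSP code. The 16-bit free-running timer, the 32-bit NCO phase accumulator that advances by phase_inc on every timer tick, and all of the function names are assumptions made for illustration; your hardware will differ.

```python
PHASE_BITS = 32
PHASE_MOD = 1 << PHASE_BITS   # assumed 32-bit NCO phase accumulator
TIMER_MOD = 1 << 16           # assumed 16-bit free-running capture timer


def nco_phase_at_capture(phase_ref, count_ref, capture, phase_inc):
    """NCO phase (accumulator units) at the captured timer count.

    phase_ref is the NCO phase that was sampled when the timer read
    count_ref; both run from the same clock, so the phase at any other
    count follows by counting ticks.
    """
    ticks = (capture - count_ref) % TIMER_MOD   # modulo handles timer wrap
    return (phase_ref + ticks * phase_inc) % PHASE_MOD


def phase_error(nco_phase, target_phase=0):
    """Signed phase error in cycles, wrapped to [-0.5, 0.5)."""
    diff = (nco_phase - target_phase) % PHASE_MOD
    if diff >= PHASE_MOD // 2:
        diff -= PHASE_MOD
    return diff / PHASE_MOD


if __name__ == "__main__":
    inc = 1 << 22   # 1024 timer ticks per NCO cycle
    # Input edge captured 256 ticks after the reference, across a timer wrap.
    p = nco_phase_at_capture(0, 65500, (65500 + 256) % TIMER_MOD, inc)
    print(phase_error(p))
```

In the example, the edge lands a quarter of a cycle after the NCO's zero phase, so the detector reports an error of 0.25 cycles. Because the result is a straight line in the tick count, the detector is linear over the whole ±half-cycle range.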
Yes, your loop will be more complex. But with this complexity you can buy easy linearization, gain scheduling to account for the changing sampling rate as your input rate gets low, and good fast rough frequency estimation to start your NCO at the right frequency and phase for quick locking.
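
As a rough sketch of the last two items, again in Python and again under my own assumptions (a 16-bit capture timer, a 32-bit phase accumulator, a proportional-plus-integral loop filter updated once per input edge, and hypothetical function names): the rough frequency estimate comes from the tick count between two successive captures, and the gain scheduling uses the common small-angle approximation kp = 2*zeta*wn*T, ki = (wn*T)^2, which only holds while wn*T is well below 1.

```python
import math

PHASE_MOD = 1 << 32   # assumed 32-bit NCO phase accumulator
TIMER_MOD = 1 << 16   # assumed 16-bit free-running capture timer


def initial_phase_inc(capture_prev, capture_now):
    """Rough NCO increment (per timer tick) from one measured input period."""
    period_ticks = (capture_now - capture_prev) % TIMER_MOD
    if period_ticks == 0:
        raise ValueError("captures are identical; no period to measure")
    # Rounded integer division: one NCO cycle per measured input period.
    return (PHASE_MOD + period_ticks // 2) // period_ticks


def scheduled_gains(period_ticks, timer_hz, natural_hz, damping=0.707):
    """PI gains for a loop that updates once per input edge.

    T is the current update interval, so the gains follow the input
    rate and the loop's natural frequency in Hz stays roughly fixed.
    Valid only while wn * T is much less than 1.
    """
    t_update = period_ticks / timer_hz
    wn_t = 2.0 * math.pi * natural_hz * t_update
    kp = 2.0 * damping * wn_t
    ki = wn_t * wn_t
    return kp, ki


if __name__ == "__main__":
    print(initial_phase_inc(65500, (65500 + 1024) % TIMER_MOD))
    print(scheduled_gains(1000, 1_000_000, 10.0, damping=0.5))
```

Seeding the NCO with initial_phase_inc (and setting its phase from the same capture) puts the loop close to lock before it ever runs. Recomputing the gains whenever the measured period changes is one simple way to do the scheduling; a lookup table indexed by period would be cheaper on a real DSP.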