OK, I'm sort of thinking this through.
Suppose I have a stream of parallel digital words that represent a waveform. Assume it's audio or something similar. Further assume that it has an arbitrary number of bits available at an arbitrarily high rate; we can fake the rate by digital interpolation if the raw data stream isn't fast enough.
Assume that I want to use a 16 bit DAC but I need close to 120 dB (around one part per million) dynamic range. The 16 bit DAC quantizes the signal to 15 PPM, and even worse, the DAC that we have in stock is only guaranteed monotonic to 14 bits.
We can clock this DAC really fast and lowpass filter the output.
If the data is, say, 24 bits wide, we can call the top 16 bits H and the low 8 bits L.
If we just stuff H into the DAC, we're quantized to 16 bits, 15 PPM.
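Something like this, in C; dac_write() is a made-up stand-in for whatever actually drives the 16-bit converter:

  #include <stdint.h>

  extern void dac_write(uint16_t code);   /* hypothetical 16-bit DAC driver */

  /* Straight truncation: the top 16 bits (H) go to the DAC and the low
     8 bits (L) are thrown away, so we're quantized to 1/65536, ~15 PPM. */
  void dac_update_truncate(uint32_t x24)  /* x24 = one 24-bit sample */
  {
      uint16_t H = (uint16_t)(x24 >> 8);  /* top 16 bits */
      dac_write(H);                       /* L = x24 & 0xFF is discarded */
  }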
We could take the L bits and turn them into a PWM waveform, and when the PWM was high, add one LSB to the 16 bit DAC. The averaged value of the DAC now has 24 bit resolution. But the PWM signal winds up as a spurious signal in the DAC output, at 1/256 the clock rate, and we'd have to filter that out. That could be inconvenient.
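A minimal sketch of that PWM scheme, same made-up dac_write(), with a free-running 8-bit counter:

  /* PWM the low byte: over each 256-clock frame, output H+1 for L clocks
     and H for the other 256-L. Average code = H + L/256, but the pattern
     repeats every 256 clocks, so the spur lands at Fclock/256. */
  static uint8_t pwm_phase = 0;           /* free-running 8-bit counter */

  void dac_update_pwm(uint32_t x24)
  {
      uint16_t H = (uint16_t)(x24 >> 8);
      uint8_t  L = (uint8_t)(x24 & 0xFF);
      dac_write((uint16_t)((pwm_phase < L) ? H + 1 : H));  /* assumes H < 0xFFFF */
      pwm_phase++;                        /* wraps mod 256 by itself */
  }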
A better way to do this is to add the LSB correction at the same duty cycle but splatter the highs and lows in time. That is, basically, a first-order delta-sigma algorithm. The wiggled LSB is no longer a low-frequency whine; it becomes wideband noise.
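One standard way to get the first-order behavior is an error accumulator; a sketch, under the same assumptions as above:

  /* First-order delta-sigma: add L into an 8-bit residue every clock and
     let the carry out make the "+1 LSB" decision. Same average duty
     cycle as the PWM, but the highs and lows get spread out in time
     instead of bunched into one pulse per frame. */
  static uint16_t ds_acc = 0;             /* 8-bit residue plus carry */

  void dac_update_ds1(uint32_t x24)
  {
      uint16_t H = (uint16_t)(x24 >> 8);
      uint8_t  L = (uint8_t)(x24 & 0xFF);
      ds_acc += L;
      dac_write((uint16_t)(H + (ds_acc >> 8)));  /* add the carry, 0 or 1 */
      ds_acc &= 0xFF;                     /* keep the residue (the feedback) */
  }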
One problem is that we can't be sure that the LSB we dither is exactly 1/65536 of the DAC full-scale amplitude. The DAC is only monotonic to 14 bits, so with bad luck the dithered bit could be 4x as big as expected. Or negative.
Assuming the DAC has the biggest nasties at major transitions, like 0x7FFF to 0x8000, we could add some DC offset to avoid that step when we're making small signals. Add 64 or 293 or something maybe, to get into a zone where it's probably pretty linear.

One could accomplish the PWM by adding a sawtooth or triangle waveform to L before quantizing to 16 bits. Or splatter the triangle samples randomly, to get the delta-sigma noise shaping.
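Sawtooth version, as a sketch; the 293 offset is just a for-instance:

  /* DC offset plus sawtooth dither: add the offset (to stay away from the
     0x7FFF -> 0x8000 major carry) and a ramp spanning one H-LSB (0..255
     in 24-bit units), then truncate. The carry into H fires L times per
     256 clocks, the same duty cycle as the PWM. Assumes the signal
     leaves headroom for the offset. */
  #define DC_OFFSET (293u << 8)           /* 293 H-LSBs in 24-bit units */
  static uint8_t ramp = 0;

  void dac_update_saw(uint32_t x24)
  {
      uint32_t y = x24 + DC_OFFSET + ramp;  /* dither before quantizing */
      dac_write((uint16_t)(y >> 8));        /* truncate to 16 bits */
      ramp++;                               /* sequential = sawtooth; step
                                               through in scrambled order
                                               for the noise shaping */
  }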
If you added a bigger triangle than 1 LSB of H, the DAC output would splatter over more than one bit. That reduces the probability that we are using a "bad bit" transition, a non-monotonic step, for the dither interpolation.
So it looks like I could have a lookup table of 24-bit numbers centered in the ballpark of 293*256 or whatever, randomized with a flat probability distribution (a scrambled triangle), some modest number of H LSBs in p-p amplitude. Every DAC clock, add the next one to the 24-bit data before truncating to 16 bits for the DAC. That's sort of like a higher-order "noise shaped" delta-sigma, but there's no feedback loop. The bigger the spread of the random numbers, the more noise we make, but the better we cope with DAC monotonicity errors.
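A sketch of that table version: fill the table with one copy of each offset so the distribution is exactly flat, then shuffle it. The table size and the 8-LSB span are illustrative numbers, not recommendations:

  #include <stdlib.h>                     /* rand(), for the scramble */

  /* Table dither: offsets cover [BASE, BASE+SPAN) exactly once each, so
     the distribution is flat; a Fisher-Yates shuffle scrambles the order.
     A SPAN of 8 H-LSBs splatters the output over ~8 DAC codes, diluting
     any single bad transition. */
  #define SPAN  (8u << 8)                 /* 8 H-LSBs p-p, in 24-bit units */
  #define BASE  ((293u << 8) - SPAN / 2)  /* centered on the DC offset */
  static uint32_t dither_tbl[SPAN];
  static unsigned di = 0;

  void dither_init(void)                  /* fill and scramble once at startup */
  {
      for (unsigned i = 0; i < SPAN; i++)
          dither_tbl[i] = BASE + i;       /* flat: each value exactly once */
      for (unsigned i = SPAN - 1; i > 0; i--) {
          unsigned j = (unsigned)rand() % (i + 1);
          uint32_t t = dither_tbl[i]; dither_tbl[i] = dither_tbl[j]; dither_tbl[j] = t;
      }
  }

  void dac_update_table(uint32_t x24)
  {
      dac_write((uint16_t)((x24 + dither_tbl[di]) >> 8));
      di = (di + 1) % SPAN;               /* cycle through the scrambled table */
  }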