I've read the Wikipedia article about Direct Digital Synthesis, and building a DDS generator with an FPGA that interpolates between adjacent entries in the lookup table looks like fun. This is my first try:

formatting link

Maybe when I have some more time, I'll add more features, like a SPI interface to control it from an external microcontroller and multiple outputs.

Any ideas how to improve it? I have read this paper:

formatting link

In this document an Inverse Sinc Filter is mentioned, but without details about it. Do you know how to implement it? And does it improve the output of an interpolating generator or is this useful for non-interpolating generators, only?

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de

Use a longer Sine lookup table instead of interpolation.

Store only 1/4 cycle of the sinewave in the lookup table and use bit operations on the address and output to map the 1/4 cycle onto the full wave.
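The quarter-wave trick can be modelled in a few lines of Python. This is only a sketch, not anyone's actual design: the 10-bit address width, the 14-bit output scaling and the names `TABLE` and `sine_lookup` are assumptions chosen for illustration. The table is sampled at mid-points so that mirroring is a plain bitwise inversion of the address.

```python
import math

ADDR_BITS = 10                    # assumed full-cycle phase address width
QUARTER = 1 << (ADDR_BITS - 2)    # number of entries actually stored
AMPL = (1 << 13) - 1              # assumed peak for a 14-bit signed output

# Quarter-wave table sampled at mid-points (i + 0.5), so the mirrored
# sample sits exactly at the bit-inverted address.
TABLE = [round(AMPL * math.sin(2 * math.pi * (i + 0.5) / (1 << ADDR_BITS)))
         for i in range(QUARTER)]

def sine_lookup(addr):
    """Return the sine sample for a full-cycle address using only TABLE."""
    addr &= (1 << ADDR_BITS) - 1
    negate = (addr >> (ADDR_BITS - 1)) & 1   # MSB selects the negative half-cycle
    mirror = (addr >> (ADDR_BITS - 2)) & 1   # next bit selects quadrants 2 and 4
    index = addr & (QUARTER - 1)
    if mirror:
        index ^= QUARTER - 1                 # invert the address bits
    value = TABLE[index]
    return -value if negate else value
```

In hardware the two top phase bits would drive an XOR on the address and a conditional negate on the output, so the mapping costs almost no logic.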

Use a dual-port RAM for the lookup table to simultaneously generate sine & cosine (useful for digital radio applications)

Inverse sinc filters are common ways to equalize the spectral droop caused by the zero-order hold nature of the DAC. Typically a simple FIR filter with a few taps (

You should synchronize 'reset' to the clock and then use that synchronized reset to reset your state machine (and the counter signal). If reset happens to be released inside the setup/hold window of any of the flops, you'll get into an odd initial state or count.

Consider redesigning it to incorporate the following features:

Produce an output every clock. This allows much higher output update rates, which can greatly improve the quality of the output. You'll have to throw away your state machine and go to a fully pipelined datapath approach.

Increase the maximum clock rate. Design it so that the Fmax of the ram is the only limitation.

The mid-sized rams from both X and A are dual port. This allows you to do two lookups in the same clock, which is needed for step 1.

As others have pointed out, utilise the symmetry of the sine function to reduce the size of the lookup table by a factor of 4. Equivalently, get 4 times the phase resolution in your table for a given size of ram.

We've done that: dual-port out of the lookup table, mul+add to interpolate based on lower-order bits of the phase accumulator, do all that and load the dac every clock. We got 128 MHz on a Spartan3, with a 4k x 16 full-cycle lookup table (we need to do arbs, too, and you can't fold arbitrary waveforms) and 16 bit interpolation into, err, a 14 bit dac.
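A rough software model of that datapath (two table reads, one multiply and one add per sample) might look like the following. The 4k x 16 table comes from the post; the 32-bit accumulator, the 16-bit fraction and the names `dds_sample` and `generate` are assumptions made for this sketch.

```python
import math

ACC_BITS = 32     # assumed phase accumulator width
ADDR_BITS = 12    # 4k-entry full-cycle table, as described in the post
FRAC_BITS = 16    # assumed number of phase bits used as interpolation fraction
AMPL = 32767      # 16-bit signed table entries

TABLE = [round(AMPL * math.sin(2 * math.pi * i / (1 << ADDR_BITS)))
         for i in range(1 << ADDR_BITS)]

def dds_sample(phase, interpolate=True):
    """Map one accumulator value to an output sample."""
    index = phase >> (ACC_BITS - ADDR_BITS)
    a = TABLE[index]
    if not interpolate:
        return a
    # Second read port: the next table entry, wrapping at the end.
    b = TABLE[(index + 1) & ((1 << ADDR_BITS) - 1)]
    # The bits just below the table address form the fraction.
    frac = (phase >> (ACC_BITS - ADDR_BITS - FRAC_BITS)) & ((1 << FRAC_BITS) - 1)
    return a + (((b - a) * frac) >> FRAC_BITS)   # mul + add

def generate(tuning_word, count, interpolate=True):
    """Produce `count` samples, one per clock, for a given tuning word."""
    phase, out = 0, []
    for _ in range(count):
        out.append(dds_sample(phase, interpolate))
        phase = (phase + tuning_word) & ((1 << ACC_BITS) - 1)
    return out
```

Calling `generate(tuning_word, n, interpolate=False)` models the option of switching interpolation off, which is useful for comparing the two outputs.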

It doesn't help harmonic distortion much (that's mostly an analog problem) but it whacks the heck out of close-in spurs.

This sounds interesting, but I wonder why you need to interpolate at all if you have only a 14 bit dac, because then it should be possible to use 14 bit wide words and a dense table, with at most 1 LSB difference between two adjacent samples. The only interpolation needed then is to decide whether to use the index from the higher order bits of the accumulator, or the next index if the lower order bits are >0.5. Of course, you'll need a bit more memory, but e.g. a 14k, 14bit memory could fit easily in the smallest Cyclone III, which you can buy for $15.

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de

We have 8 channels, which used up all the block rams in the biggest Spartan. So we could only use 4k points per channel. The interpolation does have a dramatic effect on near-carrier spurs. We have an option to turn it off, for situations where the customer wants to make step edges in an arbitrary waveform... the interpolation turns everything into slopes.

Xilinx has a good DDS in their CoreGen library so you could just use that if you need something already completed. The sinx/x filter could be skipped if your DAC sample rate is high enough compared to the DDS frequency. Some DACs also have built-in sinx/x correction.
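To judge whether the sinx/x correction can be skipped, it helps to put a number on the droop. The sketch below evaluates the standard zero-order-hold response sin(pi*f/fs)/(pi*f/fs) in dB; the function name `zoh_droop_db` is just an illustrative choice.

```python
import math

def zoh_droop_db(f_out, f_sample):
    """Amplitude droop in dB of a zero-order-hold DAC at output frequency f_out."""
    x = math.pi * f_out / f_sample
    if x == 0:
        return 0.0
    return 20 * math.log10(abs(math.sin(x) / x))

# Example: droop at a few fractions of the sample rate.
for ratio in (0.01, 0.1, 0.25, 0.5):
    print(f"f/fs = {ratio:4.2f}: {zoh_droop_db(ratio, 1.0):6.2f} dB")
```

The droop reaches roughly 3.9 dB at fs/2 and shrinks quickly as the output frequency drops relative to the sample rate. An inverse sinc filter is one whose gain is approximately the negative of this curve over the band of interest.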

Interpolation in a DDS is usually handled differently than in, say, an interpolation filter. Normally it is done with a Taylor polynomial, which yields much better results than a linear interpolator. The usual problem with a Taylor polynomial is that it requires derivatives of the function. In a DDS, though, the derivatives of the sine and cosine functions are the very same sines and cosines (and their opposites). So with a BRAM-based lookup table with two read ports, you can read both sine and cosine at the same time, so you effectively also have, for free, the first (and second and third, etc.) derivatives of the outputs. So then with little hardware you can make a first-order Taylor, and with a bit more you could even make a second-order, although rarely is this necessary. I'll send you a Xilinx paper that explains this. It's by Chris Dick and Fred Harris and called "Direct Digital Synthesis - Some Options for FPGA Implementation".
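The Taylor idea can be tried in floating point before committing to hardware. The sketch below is not taken from the Dick/Harris paper; it only illustrates the expansion sin(t + d) ~ sin(t) + d*cos(t) - (d^2/2)*sin(t), with an assumed 1024-entry table and an illustrative function name `taylor_sine`.

```python
import math

TABLE_BITS = 10                       # assumed table size: 1024 entries
N = 1 << TABLE_BITS
SIN = [math.sin(2 * math.pi * i / N) for i in range(N)]
COS = [math.cos(2 * math.pi * i / N) for i in range(N)]   # second read port

def taylor_sine(phase, order=1):
    """Sine of `phase` (in turns, 0 <= phase < 1) from table plus Taylor terms."""
    scaled = phase * N
    index = int(scaled) % N
    # Residual angle in radians between the table point and the true phase.
    delta = 2 * math.pi * (scaled - int(scaled)) / N
    s, c = SIN[index], COS[index]
    result = s
    if order >= 1:
        result += delta * c               # first derivative of sin is cos
    if order >= 2:
        result -= 0.5 * delta * delta * s  # second derivative is -sin
    return result
```

With this table size the worst-case error drops from about delta to about delta^2/2 for first order and delta^3/6 for second order, where delta is at most 2*pi/1024.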

I assume the primary goal of your application was to generate arbitrary waveforms, because for sine and 14 bit output, one table with 51,471 words of size 14 bit is sufficient for max 1 LSB difference for adjacent values, so using the symmetry of sine you'll need 12,868 words to produce a perfect sine. This is 180,152 bits and would fit in the second smallest Spartan3E, the XC3S250E. But when using 8 channels, it could be difficult for high speed output to use only one lookup table. 8 sine tables would require 1,441,216 bits. This would fit in the EP3C55 Cyclone 3. There is no price for it, but three EP3C25, which would work, too, cost about $135, a bit expensive just for a sine generator :-)
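The arithmetic behind those figures is easy to reproduce. In this sketch the steepest slope of a sine with peak amplitude A over an N-entry cycle is 2*pi*A/N LSB per step, so N of about 2*pi*A keeps adjacent entries within 1 LSB. The function name `dense_sine_table` is illustrative, and the truncation of N is chosen to match the numbers quoted in the post.

```python
import math

def dense_sine_table(dac_bits, channels=1):
    """Return (full-cycle words, quarter-wave words, total bits) for a dense table."""
    amplitude = 2 ** (dac_bits - 1)                # peak value in LSBs
    full_cycle = int(2 * math.pi * amplitude)      # truncated, as in the post
    quarter = math.ceil(full_cycle / 4)            # using the sine symmetry
    total_bits = quarter * dac_bits * channels
    return full_cycle, quarter, total_bits

print(dense_sine_table(14))               # single channel
print(dense_sine_table(14, channels=8))   # eight independent tables
```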

Kevin Neilson has sent me a nice paper about Error Feed-Forward DDS, which helps to reduce the error, but I wonder if it would be possible to create a mathematically perfect 24 bit sine output, e.g. with fdlibm of netlib:

formatting link

There are many muls and adds, but how high do I need the degree of a polynomial to get 24 bit resolution and how do I calculate it? Would be cool to create a pure functional function generator in FPGA, which can produce perfect sine, square, triangle, sawtooth and noise at 24 bit, and with some memory for arbitrary waveforms. Do you think cubic interpolation would help for interpolating the arbitrary waveforms?
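One way to get a first estimate of the polynomial degree is the remainder bound of the plain Taylor series: for an alternating series the error is at most the first omitted term. The sketch below uses that bound; it is not how fdlibm derives its coefficients, and a minimax fit over the same range would typically need the same or a lower degree. The names `taylor_degree` and `taylor_sin` are illustrative.

```python
import math

def taylor_degree(bits, x_max):
    """Smallest odd Taylor degree whose first omitted term is below 2**-bits."""
    limit = 2.0 ** -bits
    degree = 1
    while x_max ** (degree + 2) / math.factorial(degree + 2) >= limit:
        degree += 2
    return degree

def taylor_sin(x, degree):
    """Evaluate x - x^3/3! + x^5/5! - ... up to the given odd degree."""
    total, term = 0.0, x
    for k in range(1, degree + 1, 2):
        total += term
        term *= -x * x / ((k + 1) * (k + 2))   # next odd-power term
    return total

# Degree needed for 24-bit accuracy with quadrant and with octant folding.
print(taylor_degree(24, math.pi / 2), taylor_degree(24, math.pi / 4))
```

Folding the argument into a smaller range (an octant instead of a quadrant) lowers the required degree, which is why range reduction comes first in software sine routines as well.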

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de

Thanks, the Error Feed-Forward DDS looks interesting. But it is patented:

formatting link

This is no problem for me, because in Germany you can't patent algorithms and formulas (at least this is what I know, hope they didn't change it), but it would be nice to have a free algorithm, because I plan to publish it on my website, so everyone can use and build it for whatever they want, without the danger of maybe paying licence costs to Xilinx.

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de

does sine-triangle-sawtooth-pwm, with 14 bit dacs, 2K points per channel, 64 MHz dac clock, and uses a smallish fpga.

This

formatting link

does 4k points at 128 MHz dac rate, standard waveforms plus arb plus noise. Both have interpolation available. Rams are 16 bits wide to add some headroom for amplitude scaling. Both are primitive relative to the levels you propose.

At any decent speed, analog issues (noise, nonlinearity, thd, drift, crosstalk) overwhelm math accuracy, and at 14 dac bits we're already there. At 32 MHz and healthy swings, the thd limit is the output amplifiers, with 50-60 dB tough to hit. But most commercial arbs and RF signal generators have ghastly thd specs, like -30 or even -20 dBc.

Your very hint of using non-powers-of-two memory size gives me the willies.

I want to generate AES3 and S/PDIF as well, which needs 24 bit resolution. But for the audio signals the frequency only needs to go up to 20 kHz, so I think I can split this: a very good 24 bit but slow generator with direct sine calculation from fdlibm, and DDS table lookup for faster signals.

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de

An additional suggestion to what Emeb has said... Instead of directly storing the sine samples in the lookup table you may store the difference between the sine function and y=x function. You'll save some bits in your table.
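How many bits this saves can be checked numerically. The sketch below assumes a 1024-entry quarter-wave table with a 13-bit magnitude and a straight line scaled so that it reaches full amplitude at the end of the quarter cycle; all names are illustrative.

```python
import math

ENTRIES = 1024          # assumed quarter-wave table length (power of two)
AMPL = (1 << 13) - 1    # assumed peak magnitude for a 14-bit signed output

def ramp(i):
    """Straight-line term y = x scaled to AMPL; a multiply and shift in hardware."""
    return (i * AMPL) // ENTRIES

# Conventional table and the difference table sin(x) - x.
DIRECT = [round(AMPL * math.sin(math.pi / 2 * i / ENTRIES)) for i in range(ENTRIES)]
DIFF = [DIRECT[i] - ramp(i) for i in range(ENTRIES)]

def word_bits(table):
    """Bits needed to store the largest (non-negative) entry."""
    return max(table).bit_length()

def sine_from_diff(i):
    """Reconstruct the sine sample: table read plus the regenerated ramp."""
    return DIFF[i] + ramp(i)

print(word_bits(DIRECT), word_bits(DIFF))
```

Under these assumptions the difference never exceeds about 21% of full scale, so each word is two bits narrower, at the cost of one extra adder on the output.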

Interesting--I didn't know Chris had the patent on that. And in this millennium, no less! It's basically a truncated Maclaurin series--probably similar to the way Babbage cranked out sine tables on his Difference Engine. But it is a great idea, and witty because you get the derivatives for free, and apparently no one had used it in a DDS before. You can still get this feature if you use the Xilinx CoreGen DDS core. (Perhaps the patent is not valid if a 2nd-order polynomial is used!)

For high precision sinusoids in FPGAs with multipliers, I'd try dusting off the technique from the vintage 1970 Tierney/Rader/Gold paper [Ref 1] and doing something like:

- Upper two{three} phase bits used for quadrant{octant} folding

- next N phase bits look up a 'coarse' IQ value ( coarse phase index, yet precise amplitude )

- next M phase bits look up a 'fine' IQ value ( residual rotation )

- complex multiply rotates coarse IQ by fine IQ

Figure six of their paper has a nice graphical summary of the technique.

The beauty of this scheme is that it is an exact computation, not an approximation; I haven't worked out the error terms for 18x18 or 36x36 multipliers, but I'd expect you could easily do a computation to twenty-something bits of precision with two comfortably-fit-in-BRAM sized lookup tables and one complex multiply.
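The coarse/fine rotation listed above can be sketched in floating point as follows. This is not the circuit from the Tierney/Rader/Gold paper: the 8+8 bit split is an assumption, quadrant/octant folding is left out for brevity, and Python's complex type stands in for the hardware complex multiplier.

```python
import cmath
import math

COARSE_BITS = 8    # assumed: N phase bits for the coarse table
FINE_BITS = 8      # assumed: M phase bits for the fine table
PHASE_BITS = COARSE_BITS + FINE_BITS

# Coarse table: unit vectors at widely spaced angles.
COARSE = [cmath.exp(2j * math.pi * i / (1 << COARSE_BITS))
          for i in range(1 << COARSE_BITS)]
# Fine table: small residual rotations that fill in between coarse points.
FINE = [cmath.exp(2j * math.pi * j / (1 << PHASE_BITS))
        for j in range(1 << FINE_BITS)]

def rotate_iq(phase):
    """Return (cos, sin) for a PHASE_BITS-wide phase word via one complex multiply."""
    coarse = COARSE[phase >> FINE_BITS]
    fine = FINE[phase & ((1 << FINE_BITS) - 1)]
    iq = coarse * fine          # angles add when unit vectors multiply
    return iq.real, iq.imag
```

Two 256-entry tables here cover 65,536 distinct phases. The accuracy is limited only by the table word width and the multiplier width, not by the table length.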

Their actual implementation with 1970-era TTL took some shortcuts to conserve hardware, e.g. approximate the fine cosine values as ~1.0

[Ref 2] is a great DDS reference that reprints that early paper, along with summaries of other sine computation methods [Ref 3, Ref 4]

Brian

[Ref 1] "A Digital Frequency Synthesizer", Tierney/Rader/Gold, IEEE Transactions on Audio and Electroacoustics, March 1971
[Ref 2] "Direct Digital Frequency Synthesizers", Kroupa (ed), IEEE Press, 1996
[Ref 3] "The Optimization of Direct Digital Frequency Synthesizer Performance in the Presence of Finite Word Length Effects", Nicholas/Samueli/Kim, Proceedings of the 42nd Annual Frequency Control Symposium, 1988
[Ref 4] "Methods of Mapping from Phase to Sine Amplitude in Direct Digital Synthesis", Vankka IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, March 1997
