Sine Lookup Table with Linear Interpolation

An in-between approach is quadratic interpolation. Rickman reads Forth.

formatting link
may inspire him. (It's all fixed-point math.) (As long as there is interpolation code anyway, an octant is sufficient.)

Jerry

--
Engineering is the art of making what you want from things you can get. 
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
Reply to
Jerry Avins

formatting link
should be a good start. Do you remember Ray Andraka?

Jerry

--
Engineering is the art of making what you want from things you can get. 
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
Reply to
Jerry Avins

He indicated that his target is an FPGA that has no multipliers. This is exactly the environment where the CORDIC excels. LUTs can be expensive in FPGAs, depending on the vendor and the device.

Almost any DSP process, including a large DFT, can be done in a LUT; it's just that nobody wants to build memories that big. It's all about the tradeoffs. FPGA with no multipliers -> CORDIC, unless there are big memories available, in which case additional tradeoffs have to be considered. Most FPGAs that don't have multipliers also don't have a lot of memory for LUTs.

There are lots of ways to do folded-wave implementations for sine generation, e.g., DDS implementations, etc. Again, managing the tradeoffs with the available resources usually drives how to get it done. Using a quarter-wave LUT with an accumulator has been around forever, so it's a pretty well-studied problem.
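For reference, the quarter-wave LUT plus phase accumulator can be sketched in a few lines of Python (table size and bit widths here are illustrative, not anyone's actual design):

import math

PHASE_BITS = 12                        # assumed phase accumulator width
QUARTER = 1 << (PHASE_BITS - 2)        # table covers 0..90 degrees only
AMPL = (1 << 15) - 1                   # assumed 16-bit signed full scale

# Quarter-wave table: sin() over the first quadrant.
quarter_lut = [round(AMPL * math.sin(math.pi / 2 * i / QUARTER))
               for i in range(QUARTER)]

def sine_from_quarter(phase):
    """Top two phase bits select the quadrant; the rest index the table."""
    quadrant = (phase >> (PHASE_BITS - 2)) & 3
    index = phase & (QUARTER - 1)
    if quadrant & 1:                   # quadrants 2 and 4: mirror the index
        index = QUARTER - 1 - index
    value = quarter_lut[index]
    return -value if quadrant & 2 else value  # quadrants 3 and 4: negate

def dds(tuning_word, n):
    """Phase accumulator stepped by a tuning word sweeps the table."""
    acc = 0
    for _ in range(n):
        yield sine_from_quarter(acc)
        acc = (acc + tuning_word) & ((1 << PHASE_BITS) - 1)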

Eric Jacobsen
Anchor Hill Communications

formatting link

Reply to
Eric Jacobsen

Mostly because the table is already optimal, unless you want to correct for other components.

Ok, then you can add that to the LUT. As you say, it will require a full 360 degree table. This will have a limit on the order of the harmonics you can apply, but I expect the higher order harmonics would have very, very small coefficients anyway.
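Rough illustration (harmonic coefficients invented for the example) of baking correction harmonics into a full-cycle table; arbitrary harmonic content is what breaks the quarter-wave symmetry and forces storing the full 360 degrees:

import math

N = 1024                                 # full-cycle table length (illustrative)
AMPL = (1 << 17) - 1                     # 18-bit signed full scale
harmonics = {1: 1.0, 2: 0.005, 3: 0.01}  # made-up correction terms

# In practice the fundamental would be scaled down slightly so the
# sum keeps some headroom and never overflows the output width.
full_lut = [round(AMPL * sum(a * math.sin(2 * math.pi * k * i / N)
                             for k, a in harmonics.items()))
            for i in range(N)]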

--

Rick
Reply to
rickman

Ok, as a mathematician would say, "There exists a solution!"

--

Rick
Reply to
rickman

Almost exact: sin(90-x) = cos(x), not -cos(x). More important is sin(180-x) = sin(x) from the same equation above. This is the folding around 90 degrees. Or sin(90+x) = cos(x), but I don't have a cos table... lol

This is the two levels of folding that are "free" or at least very inexpensive. It requires controlled inversion on the bits of x and a controlled 2's complement on the LUT output.
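In code the two folds look like this (widths assumed); note that for a power-of-two table, QUARTER-1-index is exactly index XOR (QUARTER-1), so the mirror really is just a controlled inversion of the index bits:

def fold(phase, phase_bits=12):
    """Map a full-circle phase onto a quarter-wave table index.

    Returns (index, negate); negate says whether the LUT output needs
    the controlled 2's complement (quadrants 3 and 4).
    """
    mask = (1 << (phase_bits - 2)) - 1
    quadrant = (phase >> (phase_bits - 2)) & 3
    index = phase & mask
    if quadrant & 1:          # controlled inversion of the index bits
        index ^= mask
    return index, bool(quadrant & 2)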

You really lost me on this one. How do I get cos(45-x)? If using this requires a multiply by sqrt(2), that isn't a good thing. I don't have a problem with a 512 element table. The part I am using can provide up to 36 bit words, so I should be good to go with the full 90 degree table.

Separating x into two parts, 45 and x', where x' ranges from 0 to 45, the notation should be, sin(45+x') = sin(45)cos(x') + cos(45)sin(x') = sin(45) * (cos(x') + sin(x')). I'm not sure that helps on two levels. I don't have cos(x') and I don't want to do multiplies unless I have to.

I don't have resources to do a lot of math. A linear interpolation will work ok primarily because it only uses one multiply of limited length. I can share this entire circuit between the two channels because the CODEC interface is time multiplexed anyway.

I did some more work on the interpolation last night and I put to rest my last nagging concerns. I am confident I can get max errors of about ±2 lsbs using a 512 element LUT and 10 bit interpolation. This is not just the rounding errors in the calculations but also the error due to the input resolution which is 21 bits. The errors without the input resolution are limited to about 1.5 lsbs. I expect the RMS error is very small indeed as most point errors will be much smaller than the max.
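For the curious, that error claim can be sanity-checked with a brute-force sweep in Python; the same 512-entry quarter-wave table and 10-bit interpolation fraction are assumed, with errors measured in lsbs of an 18-bit output:

import math

LUT_BITS = 9                   # 512-entry quarter-wave table
FRAC_BITS = 10                 # interpolation fraction width
OUT_SCALE = 1 << 17            # full scale of the 18-bit output
N = 1 << LUT_BITS

lut = [round(OUT_SCALE * math.sin(math.pi / 2 * i / N)) for i in range(N + 1)]

def interp_sin(m, frac):
    """sin at (m + frac/2**FRAC_BITS) table steps, linearly interpolated."""
    delta = lut[m + 1] - lut[m]
    return lut[m] + ((delta * frac) >> FRAC_BITS)

# Worst-case error over a subsampled sweep of the quadrant
worst = 0.0
for m in range(N):
    for f in range(0, 1 << FRAC_BITS, 8):
        angle = math.pi / 2 * (m + f / (1 << FRAC_BITS)) / N
        err = interp_sin(m, f) - OUT_SCALE * math.sin(angle)
        worst = max(worst, abs(err))
print("worst-case error: %.2f lsb" % worst)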

--

Rick
Reply to
rickman

Is he still around? I just checked his page; the last update was 2008.

A quick Google search finds something that might give a quick view of how it works:

formatting link

-Lasse

Reply to
langwadt

formatting link

It is a lot of document for just two pages. Try this link to see the short version...

arius.com/foobar/1972_National_MOS_Integrated_Circuits_Pages_273-274.pdf

--

Rick
Reply to
rickman

Yes, but the error is much larger than the errors I am working with. I want an 18 bit output from the process... at least 16 bits accurate, since the DAC is only ~90 dB SINAD. I expect to get something that is about 17-17.5 ENOB going into the DAC.

I don't think the method described in the paper is so good when you want accuracies this good. The second LUT gets rather large. But then it doesn't use a multiplier does it?

They just meant to use a table of length 2^n+1 rather than 2^n. No real problem if you are working in software with lots of memory, I suppose.

Yep. About 35 dB worse than my design, I expect.

One thing I realized about the descriptions most of these designs give is they only analyze the noise from the table inaccuracies *at the point of the sample*. In other words, they don't consider the noise from the input resolution. Even if you get perfect reconstruction of the sine values, such as a pure LUT, you still have the noise of the resolution of the input to the sine generation. Just my 2 cents worth...
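For what it's worth, that input-resolution floor is easy to bound: the slope of sin() never exceeds 1, so a worst-case phase error of half an lsb of a B-bit phase word caps the output error at pi/2^B of full scale. A one-liner check, with B = 21 as in this thread:

import math

B = 21                          # phase resolution discussed above
bound = math.pi * 2.0 ** -B     # half a phase lsb is pi/2**B radians
print("output error bound: %.3g of full scale" % bound)   # ~1.5e-6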

--

Rick
Reply to
rickman

His LinkedIn profile was updated in October of last year.

Reply to
krw

That is an interesting point, but this is actually harder to calculate with than the difference. First, cos(x) is sin(90-x). Second, where we are looking for the sin(M+L) (M is msbs and L is lsbs) the linear interpolation becomes,

delta = (sin(M+1) - sin(M)) * L

This is a simple multiply.

To do this using the cos function it becomes,

delta = sin(1/Mmax) * cos(M) * L

sin(1/Mmax) is a constant to set the proper scaling factor. With two variables and a constant, this requires two multiplies rather than one multiply and one subtraction. As it turns out, my LUT has lots of bits out, so I may just store the difference, eliminating one step. It depends on how I end up designing the actual circuit. The RAM can be either a true dual port with 18 bits or a single port with 36 bits. So if I need the dual port, then I might not want to store the delta values. Either way, this is no biggie.
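A sketch of that stored-delta arrangement (table sizes assumed as before): the subtraction moves to table-build time, leaving one multiply and one add at run time:

import math

N, SCALE, FRAC_BITS = 512, 1 << 17, 10

sin_lut   = [round(SCALE * math.sin(math.pi / 2 * i / N)) for i in range(N + 1)]
delta_lut = [sin_lut[i + 1] - sin_lut[i] for i in range(N)]  # stored alongside

def interp(m, frac):
    # one multiply, one add; the subtraction happened at build time
    return sin_lut[m] + ((delta_lut[m] * frac) >> FRAC_BITS)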

--

Rick
Reply to
rickman

This is the French spelling. Make sure you put the accent on the second sylLAble.

--

Rick
Reply to
rickman

Yes, of course. I should have thought of that. I don't go to his site often, but his stuff is always excellent. I may have even downloaded this paper before. I took a look at the CORDIC once a few years ago and didn't get too far. I'm happy with my current approach for the moment, but I will dig into this more another time.

I appreciate the link. Thanks Jerry.

--

Rick
Reply to
rickman

I did a little work with this tonight and guess what: this is nearly the same as the linear interpolation. cos(M) is, as you say, nearly one. So close, in fact, that using the 10 msbs for M and an 18 bit output word, the value calculated for the 1023rd entry is 0x40000.

That leaves sin(L). Surprise! This turns out to be a straight line! So the formula is FAPP the same as the linear interpolation, with the added fudge factor of multiplying sin(L) by cos(M) rather than just multiplying L directly by the difference sin(M+1)-sin(M). The sin*cos method does require a second table to be stored. In fact, the paper you reference uses a bit of sleight of hand to include the cos(M) * sin(L) term at lesser precision in both the input and output of this table. I'm not sure how well this would work with higher precision needs, but I suppose it is worth a bit further look. It would be nice to eliminate the multiplier which is used in the linear interpolation.
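The middle-bits trick reads like the classic coarse/fine ROM split (often credited to Sunderland): sin(A+B+C) ~ sin(A+B) + cos(A)*sin(C), where the correction ROM is indexed only by A and C. A Python sketch with made-up bit widths:

import math

A_BITS, B_BITS, C_BITS = 4, 4, 4     # assumed split of a 12-bit quarter phase
TOTAL = A_BITS + B_BITS + C_BITS
SCALE = 1 << 15
step = (math.pi / 2) / (1 << TOTAL)  # radians per phase lsb

coarse = [round(SCALE * math.sin(ab * (1 << C_BITS) * step))
          for ab in range(1 << (A_BITS + B_BITS))]
# Correction ROM: cos evaluated only at the coarse A angle, B discarded,
# pre-multiplied by sin(C), so no run-time multiplier is needed.
fine = [[round(SCALE * math.cos((a << (B_BITS + C_BITS)) * step)
               * math.sin(c * step))
         for c in range(1 << C_BITS)]
        for a in range(1 << A_BITS)]

def sunderland_sin(phase):
    a = phase >> (B_BITS + C_BITS)
    ab = phase >> C_BITS
    c = phase & ((1 << C_BITS) - 1)
    return coarse[ab] + fine[a][c]   # add only at run time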

So I think this is just different strokes for different folks in the end. The sin*cos method might work out a little better with the RMS noise depending on how the second table is handled. But I think it may end up... well, in the noise anyway.

I think the real problem is that as the resolution of the input increases, the second table gets harder to keep small. But their trick of essentially tossing the middle bits seems to work fairly well.

Yes, I see this note. This seems to be exactly what I was thinking. But I don't think the data has to be subtracted. The unadjusted error is zero at the end points of each interpolated line and consistently increases toward the middle from each end point. Adjusting the data in the table results in an error that instead goes positive and negative, but the table data is still always added, although they don't deal with the issues of implementing more than 90 degrees. If you go around the clock, you then need to consider sign bits (as opposed to sine bits) on the inputs to the tables.

I am working in a small FPGA with rather limited block RAM capabilities. It is already on the board, so I can't swap it out. I also need two generator circuits which pushes the limits a bit more. At least I don't have any shortage of adder/subtractors or all the other logic this will need.

Thanks for the pointer and the advice.

--

Rick
Reply to
rickman

formatting link

isn't it "fubar" ?

Jamie

Reply to
Jamie

You're no fun anymore. :-)

It does have a 0.01% FS range, so you'd expect to get a lot better than -80 dB.

But for fast stuff, I'd probably do the same as you--filter the daylights out of it and amplify the residuals.

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs 
Principal Consultant 
ElectroOptical Innovations LLC 
Optics, Electro-optics, Photonics, Analog Electronics 

160 North State Road #203 
Briarcliff Manor NY 10510 USA 
+1 845 480 2058 

hobbs at electrooptical dot net 
http://electrooptical.net
Reply to
Phil Hobbs

(snip, ending on CORDIC)

I once thought I pretty well understood binary CORDIC, but it is also used in decimal on many pocket calculators. I never got far enough to figure out how they did that.
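For reference, the binary version is compact; here is a float Python sketch (in fixed-point hardware the 2**-i products become arithmetic shifts, so no multiplier is needed). How the decimal calculators do it is another story.

import math

N_ITER = 16                                        # illustrative iteration count
angles = [math.atan(2.0 ** -i) for i in range(N_ITER)]
K = 1.0
for i in range(N_ITER):
    K /= math.sqrt(1.0 + 2.0 ** (-2 * i))          # pre-correct the CORDIC gain

def cordic_sincos(theta):
    """Return (sin, cos) of theta (|theta| <= pi/2) by CORDIC rotation."""
    x, y, z = K, 0.0, theta
    for i in range(N_ITER):
        d = 1.0 if z >= 0 else -1.0
        # each step is one add/subtract pair plus shifts in hardware
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y, x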

-- glen

Reply to
glen herrmannsfeldt

(snip, I wrote)

It has been a while since I used a table with actual interpolation, but as I remember it, they give you the spacing somewhere on the table. Now, you could have one table to look up the spacing and another to calculate the interpolation value (that is, a multiply table).

So, they divide the problem up into 16 different slopes, but the slopes are equally distributed in theta.

Well, it is also convenient for the size of ROMs they had available. (Or the other way around. A convenient use for the size of ROMs they had to sell.) The tradeoffs might be different for more, smaller ROMs. I believe the manual I had (and didn't find) has the actual ROM tables listed. At least I thought it did.

It could also be pipelined, especially with more levels of ROM.

(snip, I wrote)

Oh, I think I see what you mean. But it might have bigger slope discontinuities that way.

If you have more, smaller ROMs I think the idea can be repeated. That is, have High, Mid, and Low bits, and two levels of interpolation.

-- glen

Reply to
glen herrmannsfeldt

Ok, semantics. What word should I use instead of "app"?

No, that didn't come from my numbers, well, not exactly. You took my numbers and turned them into a full 360 degree table. I was asking where you got 10 bits, not why a 1024 table needs 10 bits. Actually, in your case it is a 1025 table needing 11 bits.

Because that is extra logic and extra work. Not only that, it is beyond the capabilities of the DAC I am using. As it is, the linear interp will generate bits that extend below what is implied by the resolution of the inputs. I may use them or I may lop them off.

OK, thanks for the insight.

It is simple: the LUT gives values for each point in the table, precise to the resolution of the output. The sin function between these points varies from close to linear to close to quadratic. If you just interpolate between the "exact" points, you have an error that is *always* negative anywhere except the end points of the segments. So it is better to bias the end points to center the linear approximation somewhere in the middle of the real curve. Of course, this has to be tempered by the resolution of your LUT output.
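One rough way to do that biasing (a heuristic pass, not an optimum): nudge each table entry by half the average midpoint sag of its two adjacent segments, which re-centers the always-negative chord error around zero:

import math

N, SCALE = 512, 1 << 17

def mid_dev(j):
    """How far sin rises above the chord at the middle of segment j."""
    if j < 0 or j >= N:
        return 0.0
    a0 = math.pi / 2 * j / N
    a1 = math.pi / 2 * (j + 1) / N
    return math.sin((a0 + a1) / 2) - (math.sin(a0) + math.sin(a1)) / 2

# Raising both ends of a segment by d raises its whole chord by d, so
# aim to raise each end point by half the local midpoint deviation
# (the /4 is the average of the two adjacent segments, then halved).
lut = [round(SCALE * (math.sin(math.pi / 2 * i / N)
                      + (mid_dev(i - 1) + mid_dev(i)) / 4))
       for i in range(N + 1)]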

If you really want to see my work it is all in a spreadsheet at the moment. I can send it to you or I can copy some of the formulas here. My "optimization" is not nearly optimal. But it is an improvement over doing nothing I believe and is not so hard. I may drop it when I go to VHDL as it is iterative and may be a PITA to code in a function.

I don't think I'll do that. I'll most likely code it in VHDL. Why use yet another tool to generate the FPGA code? BTW, VHDL can do pretty much anything other tools can do. I simulated analog components in my last design... which is what I am doing this for. My sig gen won't work on the right range, so I'm making a sig gen out of a module I build.

BTW, to do the calculation you describe above, it has to take into account that the end points have to be truncated to finite resolution. Without that it is just minimizing noise to a certain level only to have it randomly upset. It may turn out that converting from real to finite resolution by rounding is not optimal for such a function.

I don't get your point really. The LUT values will deviate from the sin function because of one thing, the resolution of the output. The sin value calculated will deviate because of three things, input resolution, output resolution and the accuracy of the sin generation.

I get what you are saying about the mean square error. How would that impact the LUT values? Are you saying to bias them to minimize the error over the entire signal? Yes, that is what I am talking about, but I don't want to have to calculate the mean square error over the full sine wave. Currently that's two million points. I've upped my design parameters to a 512 entry table and 10 bit interpolation.

Segments. I picture the linear approximations between end points as lines. When you improve one line segment by moving one end point, you also impact the adjacent line segment. I have graphed these line segments in the spreadsheet and looked at a lot of them. I think the "optimizations" make some improvement, but I'm not sure it is terribly significant. It turns out that once you have a 512 entry table, the sin function between the end points is pretty close to linear, or the deviation from linear is in the "noise" for an 18 bit output. For example, between LUT(511) and LUT(512) there is just a step of 1. How quadratic can that be?

God! Thanks but no thanks. Is this really practical to solve for the entire curve? Remember that all points affect other points, possibly more than just the adjacent points. If point A changes point A+1, then it may also affect point A+2, etc. What would be more useful would be to have an idea of just how much *more* improvement can be found.

Blame my parents!

--

Rick
Reply to
rickman

formatting link

I had used my "standard" Acrobat 4.0; it fails in that manner very reliably. Version 7.0 works 100 percent; thanks.

Reply to
Robert Baer
