Bezier lengths w/o lots of resources

That's it. In fact, your method has error O(h^4), which is the same (up to a factor) as using the distance between endpoints and then applying one round of Romberg. The difference is that your method uses 4 sqrts at each stage while mine uses 3 (one from the previous, lower resolution, plus 2 from the current split). However, your factor is smaller, so the actual error at each step is slightly smaller in your case. In the example table below, the raw length estimate for a row uses the number of square roots shown in that row's sqrts column. The Romberg result needs that row's sqrts plus the previous row's; the double Romberg needs those plus the previous two rows'. So my method can get to 1 part per billion (ppb) using 56 sqrts (32+16+8), while yours uses 96 (64+32).

Control points (0.000000,0.000000) (0.000000,0.500000) (0.500000,0.500000) (0.500000,0.000000)

sqrts   length        Romberg       Romberg^2
    1   0.500000000
    2   0.901387819   1.035183758
    4   0.975359556   1.000016801   0.997672337
    8   0.993857921   1.000024042   1.000024525
   16   0.998465633   1.000001538   1.000000037
   32   0.999616481   1.000000096   1.000000000
   64   0.999904125   1.000000006   1.000000000
  128   0.999976031   1.000000000   1.000000000
  256   0.999994008   1.000000000   1.000000000
  512   0.999998502   1.000000000   1.000000000
 1024   0.999999625   1.000000000   1.000000000

sqrts   DY length     DY Romberg
    4   1.000000000
    8   1.002470605   1.002635312
   16   1.000128079   0.999971911
   32   1.000007988   0.999999982
   64   1.000000499   1.000000000
  128   1.000000031   1.000000000
  256   1.000000002   1.000000000
  512   1.000000000   1.000000000
 1024   1.000000000   1.000000000
 2048   1.000000000   1.000000000
 4096   1.000000000   1.000000000

[snip]
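For concreteness, here is a minimal sketch of how a table like the ones above can be generated. This is my reconstruction, not the exact code used, and all the names are invented. The raw estimate chords the curve at 2^k equally spaced t values (error O(h^2)); one Romberg step combines adjacent rows as (4*L_fine - L_coarse)/3, giving O(h^4), and a second step as (16*R_fine - R_coarse)/15.

#include <stdio.h>
#include <math.h>

typedef struct { double x, y; } Pt;

/* Cubic Bezier point at parameter t. */
static Pt bez(const Pt p[4], double t)
{
    double u = 1.0 - t;
    Pt r;
    r.x = u*u*u*p[0].x + 3*u*u*t*p[1].x + 3*u*t*t*p[2].x + t*t*t*p[3].x;
    r.y = u*u*u*p[0].y + 3*u*u*t*p[1].y + 3*u*t*t*p[2].y + t*t*t*p[3].y;
    return r;
}

/* Chord-sum length using n segments: n sqrts, matching the sqrts column. */
static double polyline_len(const Pt p[4], int n)
{
    double len = 0.0;
    Pt a = p[0];
    for (int i = 1; i <= n; i++) {
        Pt b = bez(p, (double)i / n);
        len += hypot(b.x - a.x, b.y - a.y);
        a = b;
    }
    return len;
}

int main(void)
{
    Pt p[4] = { {0, 0}, {0, 0.5}, {0.5, 0.5}, {0.5, 0} };
    double prevL = 0.0, prevR = 0.0;
    for (int n = 1; n <= 1024; n *= 2) {
        double L = polyline_len(p, n);
        if (n > 1) {
            double R = (4.0*L - prevL) / 3.0;           /* first Romberg step  */
            if (n > 2)
                printf("%5d  %.9f  %.9f  %.9f\n", n, L, R,
                       (16.0*R - prevR) / 15.0);        /* second Romberg step */
            else
                printf("%5d  %.9f  %.9f\n", n, L, R);
            prevR = R;
        } else
            printf("%5d  %.9f\n", n, L);
        prevL = L;
    }
    return 0;
}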

Yes, you are right. Not particularly pathological. The point I was making is that I think there has to be a lower limit on the ratio of path length to distance along the control points; otherwise the path length can approach zero with the control points a finite distance apart.

[snip]

It's a cubic! How much change of curvature are you expecting?

And spend more effort working out which places they are (if any). That doesn't come for free.

But that has little to do with recursion. That I integrated along the path in t order is not the point; I could equally have done it via the curve splitting (had I known how to do it at the time). The point was to show that the Romberg method can reduce the amount of work dramatically without any smarts. In my example above it reduced 1024 sqrts (a proxy for work) at 375 ppb error down to 56 sqrts at about 1 ppb.

> There aren't any hard curves. It's a cubic.

if this curve turns up a lot then you are using the wrong technology :-)

Yes, a bit harder, particularly if you are looking for ppb accuracy.

That has to be a heuristic until you know how you are calculating the estimates. However, the ratio of the endpoint distance to the sum of the control-point distances will give some idea. If it is small then the curve has turned back on itself; if it's large (approaching 1) then it's nearly straight. It can't be nearly straight and highly twisted and still be a cubic...

Peter

Reply to
Peter Dickerson

The first iteration of my approach uses 4 sqrts. Thereafter, you only need 2.5 of them. (I haven't checked to see what other optimizations might be implied by the geometry)

"Double Romberg"? Is that like "double secret probation"?? :>

I'm not sure how to adjust the sqrts column, here, to reflect my 2.5/iteration... (?). I should just write it up and let it do the crunching. (I assume you are just forcing the algorithm to run to a depth of N regardless of the actual error that is estimated at that depth?)

Agreed.

But that's the point: see what it is that makes a curve (or, a "subcurve") too curvy and see how easy it is to identify those criteria *without* having to invest lots of effort. E.g., the example I gave shows one of the "control point vectors" as having a really big magnitude compared to the angle it forms with its neighbors --> sharp bend. (I already need to know that magnitude so that doesn't cost anything extra).

Agreed. :> But, any representation used for "curves" has to also handle it -- else you need to provide a different encoding and a "flavor marker".

[I have assumed any equivalent representation for said "curve" would be equally efficient, computationally. WITHOUT CHECKING FOR SPECIAL CONDITIONS, I am not sure that is always the case with this. I.e., it may be wiser to use a particular representation than some others...?]

But, I might be able to detect this "problem area" just by looking at dx,dy's of contiguous chords/vectors. I.e., as if doing trig but without the *actual* trig (e.g., if dx >= 100*dy ...)

Yes, but for a given ratio of summed control-point distances to endpoint distance, you can have very different curve characters. E.g., compare a closed/nearly closed curve (distance from start to end very small) to an "s/z" formed with endpoints the same distance apart but control points on opposite sides of the endpoint-line.

I think you have to look at two contiguous chords and their angle (plus magnitudes) to properly identify a "tight corner".
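A minimal sketch of that test (my construction; the names and the threshold are invented): the dot and cross products of the two chord vectors give the cosine and sine of the turn angle, each scaled by the product of the magnitudes, so the comparison needs only multiplies, with no trig and no sqrt.

typedef struct { double x, y; } Vec;

/* Returns nonzero if the turn from chord a to chord b exceeds the angle
   whose squared tangent is tan2_limit (e.g. 1.0 for 45 degrees). */
static int tight_corner(Vec a, Vec b, double tan2_limit)
{
    double dot   = a.x*b.x + a.y*b.y;   /* |a||b| cos(theta) */
    double cross = a.x*b.y - a.y*b.x;   /* |a||b| sin(theta) */
    if (dot <= 0.0)                     /* 90 degrees or more: sharp */
        return 1;
    return cross*cross > dot*dot * tan2_limit;
}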

Reply to
D Yuniskis

OK, I made a stab at trying to "adjust" this for a more efficient algorithm (2.5/iteration).

Trusting your "results", I assume the table would be:

sqrts   DY length     DY Romberg
    4   1.000000000
    5   1.002470605   1.002635312
   10   1.000128079   0.999971911
   20   1.000007988   0.999999982
   40   1.000000499   1.000000000

I.e., at this point, I have invested *79* sqrts (4+5+10+...). Note, however, that some of those might never occur if the estimate for a particular "subcurve" (ick!) looks "good enough" (it need not be further subdivided).

If I blindly force N subdivisions, it changes the mix.

I'll try to make some time to code up a "real" algorithm and instrument SQRT() to do the counting for me with various "estimate tolerances".

Reply to
D Yuniskis

[snip]

Sorry, I don't understand where the 2.5 sqrts/iteration comes from. When you split the curve and have two sets of four control points, you need eight distances (sqrts), but none of those have been computed at the previous level. The only place I see five appear is that there are five new points to calculate, one of which is used twice (giving six), plus the start and end points from the original (now we have eight). None of that uses sqrts, and the time to calculate it is potentially insignificant relative to the sqrts (e.g. it can be done with integers of sufficient size by rescaling by eight at each nesting).

You've still lost me.

Surely you can identify any 'sharp' regions of the curve from the radius of curvature (or at least its square, otherwise you'll need a sqrt for that). A proxy might be dx/dt*d2y/dt2 - dy/dt*d2x/dt2 being small.
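A sketch of evaluating that proxy for a cubic (my construction; P0..P3 are the control points as before): the squared radius of curvature is (x'^2 + y'^2)^3 / (x'y'' - y'x'')^2, which needs no sqrt and is small exactly where the curve is sharp.

typedef struct { double x, y; } Pt;

/* Squared radius of curvature of a cubic Bezier at parameter t.
   Caller must guard against the cross term being zero (straight line). */
static double radius2(const Pt p[4], double t)
{
    double u = 1.0 - t;
    /* First derivative of the cubic. */
    double dx = 3*u*u*(p[1].x-p[0].x) + 6*u*t*(p[2].x-p[1].x) + 3*t*t*(p[3].x-p[2].x);
    double dy = 3*u*u*(p[1].y-p[0].y) + 6*u*t*(p[2].y-p[1].y) + 3*t*t*(p[3].y-p[2].y);
    /* Second derivative of the cubic. */
    double ddx = 6*u*(p[2].x - 2*p[1].x + p[0].x) + 6*t*(p[3].x - 2*p[2].x + p[1].x);
    double ddy = 6*u*(p[2].y - 2*p[1].y + p[0].y) + 6*t*(p[3].y - 2*p[2].y + p[1].y);
    double cross = dx*ddy - dy*ddx;     /* the proxy term */
    double v2 = dx*dx + dy*dy;
    return v2*v2*v2 / (cross*cross);
}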

Peter

Reply to
Peter Dickerson

Given a Start point (S), departing control point (D), arriving control point (A), and End point (E), [i.e., these are P0 - P3]:

The first iteration computes the lengths of |SD|, |DA|, |AE| and compares them against |SE| (i.e., 4 sqrts).

Then the curve is subdivided into two half-curves ("Left" and "Right"). With M = (D+A)/2, the control points for the two half-curves are:

L0 = S R3 = E

L1 = (S+D)/2 R2 = (A+E)/2

L2 = (L1+M)/2 R1 = (M+R2)/2

L3 = R0 = (L2+R1)/2
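In code, that subdivision is just a direct transcription of the formulas above (a sketch; Pt and the function names are mine):

typedef struct { double x, y; } Pt;

static Pt mid(Pt a, Pt b) { Pt m = { (a.x+b.x)/2, (a.y+b.y)/2 }; return m; }

/* Split cubic p[] = {S, D, A, E} into Left and Right halves at t = 1/2. */
static void split(const Pt p[4], Pt left[4], Pt right[4])
{
    Pt M     = mid(p[1], p[2]);     /* M  = (D+A)/2         */
    left[0]  = p[0];                /* L0 = S               */
    right[3] = p[3];                /* R3 = E               */
    left[1]  = mid(p[0], p[1]);     /* L1 = (S+D)/2         */
    right[2] = mid(p[2], p[3]);     /* R2 = (A+E)/2         */
    left[2]  = mid(left[1], M);     /* L2 = (L1+M)/2        */
    right[1] = mid(M, right[2]);    /* R1 = (M+R2)/2        */
    left[3]  = right[0] = mid(left[2], right[1]);  /* L3 = R0 = (L2+R1)/2 */
}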

Looking at the Left half curve L0, L1, L2, L3:

Compute |L0L1|, |L1L2| and |L2L3| and compare against |L0L3|.

But, |L0L1| is actually just half of |AD| -- which we already have from the first iteration!

However, we do have to compute |L1L2|. And |L0L3|. There's two sqrts.

OTOH, when we compute |L2L3|, this is really half of |L2R1|. So, whatever result we get there we can use for the |R0R1| computation! So, that's a 0.5 sqrt.

(Did I do that right?)

Assume I am correct with 2.5 sqrts per "half curve". And 4 on the first iteration.

So, second iteration is 5 (2.5 + 2.5) sqrts (assuming I am not looking at the closeness of fit).

Third iteration cuts each of the two half curves from the second iteration in half using 5 sqrts for each of the second iteration half curves -- 10 sqrts total.

20 for the next. etc.

Since my algorithm starts with the *first* ITERATION and conditionally subdivides, it takes on the costs of each iteration as it subdivides the curve(s). So, my total cost is the sum of the costs of *every* iteration.

I.e., my code computes "sum of segments" and "endpoint length"; checks to see how close they are to each other (or does your magic Romberg estimate and compares *that* to these lengths to see how closely *it* fits); and, if the fit isn't good enough, *then* it subdivides the curve and does another iteration of chord length computations (but it has already incurred the cost of the computations leading up to that decision to recurse!).
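A minimal sketch of that control flow, with an invented tolerance parameter. The stopping rule here is a simple chord-vs-polygon comparison (both bound the true length: the chord from below, the control polygon from above), and the (2*chord + poly)/3 combination is Gravesen's estimate, standing in for the Romberg step discussed above; neither choice is necessarily what the code described actually does.

#include <math.h>

typedef struct { double x, y; } Pt;

static double dist(Pt a, Pt b) { return hypot(b.x - a.x, b.y - a.y); }
static Pt mid(Pt a, Pt b) { Pt m = { (a.x+b.x)/2, (a.y+b.y)/2 }; return m; }

static double bez_len(const Pt p[4], double tol)
{
    double chord = dist(p[0], p[3]);                 /* 1 sqrt  */
    double poly  = dist(p[0], p[1]) + dist(p[1], p[2])
                 + dist(p[2], p[3]);                 /* 3 sqrts */

    if (poly - chord <= tol)             /* fit good enough: stop here */
        return (2.0*chord + poly) / 3.0;

    /* Otherwise subdivide at t = 1/2 and recurse on both halves.
       (This naive version recomputes sqrts that a smarter one would
       pass down.)  Halving tol keeps the overall error bounded. */
    Pt M = mid(p[1], p[2]);
    Pt l[4], r[4];
    l[0] = p[0];            r[3] = p[3];
    l[1] = mid(p[0], p[1]); r[2] = mid(p[2], p[3]);
    l[2] = mid(l[1], M);    r[1] = mid(M, r[2]);
    l[3] = r[0] = mid(l[2], r[1]);

    return bez_len(l, tol/2) + bez_len(r, tol/2);
}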

Reply to
D Yuniskis

On Tue, 15 Jun 2010 03:23:44 +0200, D Yuniskis wrote:

Last week it dawned on me that any function, whether real or complex, traces a path whose length can be calculated (except at vertical asymptotes). After some dabbling I came up with the following definition.

The distance travelled by a function f(t) from t=t0 to t=t1 is given by:

length(f(t), t0, t1) = int(abs(diff(f(t),t)), t=t0..t1)

where int = integrate, abs = absolute value, diff = derivative.

Note 1: the use of abs() yields sqrt() operations if f(t) is complex.

Note 2: for a complex function z(t)=x(t)+j*y(t): length(z(t), t0, t1) != length(x(t), t0, t1) + length(y(t), t0, t1) but it is usually a good guess for the upper bound.

Note 3: for the same complex function z(t): length(z(t), t0, t1) != sqrt(length(x(t), t0, t1)^2 + length(y(t), t0, t1)^2) but it is usually a good ballpark estimate.

With the cubic Bezier curve function:

B[3](t) = (1-t)^3*P[0]+3*(1-t)^2*t*P[1]+3*(1-t)*t^2*P[2]+t^3*P[3]

We get the following results when combining:

length(B[3](t), 0, 1) = Int(abs(3*(1-t)^2*P[0]+6*(1-t)*t*P[1]-3*(1-t)^2*P[1]+3*t^2*P[2]-6*(1-t)*t*P[2]-3*t^2*P[3]), t=0..1)

Unfortunately my old copy of Maple wasn't able to simplify this any further. But the result can be calculated using numerical integration.

Note 4: if you know the roots of the x- and y-function (where the derivative is zero), you can set up interpolation points.
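A minimal numerical rendering of this definition for the cubic case (my sketch; the step count is arbitrary and no use is made of the roots from Note 4), integrating |B[3]'(t)| with composite Simpson's rule:

#include <math.h>

typedef struct { double x, y; } Pt;

/* |B[3]'(t)| for control points p[0..3]; one sqrt per evaluation.
   This is the abs() of the complex derivative from the definition. */
static double speed(const Pt p[4], double t)
{
    double u = 1.0 - t;
    double dx = 3*u*u*(p[1].x - p[0].x) + 6*u*t*(p[2].x - p[1].x)
              + 3*t*t*(p[3].x - p[2].x);
    double dy = 3*u*u*(p[1].y - p[0].y) + 6*u*t*(p[2].y - p[1].y)
              + 3*t*t*(p[3].y - p[2].y);
    return hypot(dx, dy);
}

/* Composite Simpson on t = 0..1; n must be even. */
static double bez_len_simpson(const Pt p[4], int n)
{
    double s = speed(p, 0.0) + speed(p, 1.0), h = 1.0 / n;
    for (int i = 1; i < n; i++)
        s += (i & 1 ? 4.0 : 2.0) * speed(p, i * h);
    return s * h / 3.0;
}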

--
Made with Opera's revolutionary e-mail program:
http://www.opera.com/mail/
(remove the obvious prefix to reply by mail)
Reply to
Boudewijn Dijkstra

5 points is 10-th order... I guess experiment trumps theory.

This is my simplistic analysis: I looked up some theory behind Bezier curves and I arrive at

L = integral( sqrt(p(t)) ) dt,  t = 0..1

where p(t) = A*t^4 + B*t^3 + C*t^2 + D*t + E (the sum of squares of the derivatives of two third-order polys).

Assuming a rather flat curve with E larger than all the others:

sqrt(p(t)) ~ sqrt(E) * ( 1 + D/(2E)*t + ... + A/(2E)*t^4 )

This should be good news for a Gauss that is exact for order 4, such as yours, or is it?

Numerical Recipes warns: tinkering with Gauss base points may "*deny* high order accuracy to integrands that are otherwise perfectly smooth and well-behaved".

Greetings, Albert

--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
Reply to
Albert van der Horst

Not sure that's so. You mean |SD|, I guess.

OK. L3 is the midpoint of L2 and R1. An awful lot of degeneracy here.

So you're saying that you need five additional sqrts to compute one row, not five sqrts in total. In my tables the sqrts column is the number of sqrts needed to calculate the second column, but in your case you need to include *some of* the counts higher up the table. Then, when it comes to using Romberg on your method, you need all the sqrts done for the previous row, whereas you only need some to compute the current one. So the two are not directly comparable. In particular, if we have some method to estimate the depth needed ahead of time, then my method would not need to start from the top (although it's a convenient way to calculate the sub-curve endpoints). So I would only do sqrts that are needed, whereas you'd need to do some earlier ones and pass them down. Still, the naive method plus Romberg is very similar work to yours without it, and both are O(h^4).

see above.

I'm just doing the calculation to a fixed depth to see how fast it converges. There is no reason why I couldn't compare a row against the previous one and stop when they are close enough. Close enough means meeting some nearness criterion for the non-Romberg results; that nearness will be much larger than the actual error after Romberg.

Peter

Reply to
Peter Dickerson

What did you do? I inserted ds = sqrt( x'*x' + y'*y' ) dt and got the following result for the 5-point (10th-order) Gaussian:

Control points (0.000000,0.000000) (0.000000,1.000000) (1.000000,1.000000) (1.000000,0.000000)

Derivative control points (0.000000,1.000000) (1.000000,0.000000) (0.000000,-1.000000)

#ival   length        Romberg       Gauss
    1   1.00000000    ------------  2.00000000
    2   1.80277564    2.07036752    2.00000000
    4   1.95071911    2.00003360    2.00000000
    8   1.98771584    2.00004808    2.00000000
   16   1.99693127    2.00000308    2.00000000
   32   1.99923296    2.00000019    2.00000000
   64   1.99980825    2.00000001    2.00000000
  128   1.99995206    2.00000000    2.00000000
  256   1.99998802    2.00000000    2.00000000
  512   1.99999700    2.00000000    2.00000000
 1024   1.99999925    2.00000000    2.00000000

Sorry if the above looks like a fake ;-) Run the program below if you don't believe it. So doing the integration over the whole interval at the expense of 10 sqrts gives 8 digits of precision, good enough for government work. Splitting up the interval doesn't improve or deteriorate the precision.

Tabulating the integrand shows a cusp: the sqrt causes the derivative to change sign. This should be bad for Gauss, but here the discontinuity falls on an interval boundary, except in the 1-interval case. Did we get lucky?

Well here is the code. No warranty of any kind.

---------------------8
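Since the listing was lost to the snip above, here is a stand-in sketch in the same spirit (not the original program): ds/dt = sqrt(x'^2 + y'^2) integrated over t = 0..1 with the standard 10-point Gauss-Legendre constants, stored as five symmetric pairs with a wasted 0th element, Numerical Recipes style.

#include <stdio.h>
#include <math.h>

typedef struct { double x, y; } Pt;

/* |B'(t)| for cubic control points p[0..3]; one sqrt per evaluation. */
static double speed(const Pt p[4], double t)
{
    double u = 1.0 - t;
    double dx = 3*u*u*(p[1].x-p[0].x) + 6*u*t*(p[2].x-p[1].x) + 3*t*t*(p[3].x-p[2].x);
    double dy = 3*u*u*(p[1].y-p[0].y) + 6*u*t*(p[2].y-p[1].y) + 3*t*t*(p[3].y-p[2].y);
    return hypot(dx, dy);
}

/* 10-point Gauss-Legendre on [a,b]: 5 abscissa/weight pairs used twice
   by symmetry, so 10 speed() calls = 10 sqrts for the whole curve. */
static double qgaus(const Pt p[4], double a, double b)
{
    static const double x[] = { 0.0, 0.1488743389816312, 0.4333953941292472,
        0.6794095682990244, 0.8650633666889845, 0.9739065285171717 };
    static const double w[] = { 0.0, 0.2955242247147529, 0.2692667193099963,
        0.2190863625159820, 0.1494513491505806, 0.0666713443086881 };
    double xm = 0.5*(b + a), xr = 0.5*(b - a), s = 0.0;
    for (int j = 1; j <= 5; j++) {
        double dx = xr * x[j];
        s += w[j] * (speed(p, xm + dx) + speed(p, xm - dx));
    }
    return s * xr;
}

int main(void)
{
    Pt p[4] = { {0, 0}, {0, 1}, {1, 1}, {1, 0} };
    printf("%.8f\n", qgaus(p, 0.0, 1.0));   /* ~2.00000000 for this curve */
    return 0;
}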

Reply to
Albert van der Horst

Reply to
Peter Dickerson
[snip]

Yes, sorry. (I was mentally using A, B, C, D as point names and had to adjust them to fit the previous names I had used. Obviously got my wires crossed.)

> However, we do have to compute |L1L2|. And |L0L3|.

I need 3 to compute one "half curve" -- or, 5 to compute *two* half curves.

I need to include *all* of the counts above. Sometimes, I might not have to use all of the counts on the "current line" (or on lines below) -- if I can assure myself that the estimate for that "half curve" is good enough. (i.e., stop recursing on that half curve)

Correct. If you don't know a priori how far down the table you are going to have to "start", then you have to work your way down to it, dynamically examining the goodness of the estimate to determine if you're far enough down.

Exactly! Hence my interest in being able to determine which curves are "hard" (require fine subdivision) vs. "easy" (can be approximated with simple line segments) by examining the control points!

Yes. Without knowledge of how good an estimate will be "a priori", I make a stab at computing the length (estimate) in the *hope* that it will be "good enough". The subdivision algorithm lends itself to a recursive divide-and-conquer approach whereby the initial problem is now two similar, but *smaller*, problems (which are tackled the same way).

Yes. But, if you do that, then you have to assume the costs of computing that "not-good-enough" estimate in the total cost of your final estimate.

E.g., if I "get lucky" and my first estimate fits "perfectly" (for sufficiently small values of "perfectly"), then I quit after 4 sqrts. If *not*, I get to salvage up to two of those sqrts in my next iteration (recursion).

Reply to
D Yuniskis

The above code from Numerical Recipes uses 10 points; there are just 5 constants because they take advantage of symmetry, the points being pairwise the same apart from sign. At each point I need to calculate a square root.

--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
Reply to
Albert van der Horst

Yes, I was using five points in total. If this works for the trickier cases that Don raised then I'd say that's the job done! Of course the static double arrays should be static const, and the 0th element is wasted, presumably due to translation from the Fortran version. An integerized version of this scheme would be more troublesome since the constants need high precision.

Peter

Reply to
Peter Dickerson

Should be no problem with fixed point, as long as the precision of the ints is sufficient. I would suspect overflow needs more consideration than precision.

Somehow I have the impression that it is not clear how different my approach is. Maybe an example: suppose we want to know the circumference of a circle. The parametrization is (sin(t), cos(t)), t = 0..2*pi.

L = int( len(t) ) dt. We know d sin(t) = cos(t) dt and d cos(t) = -sin(t) dt.

Here again ds = sqrt( dx^2 + dy^2 ) = sqrt( cos(t)^2 + sin(t)^2 ) dt = (1) dt

or L = int( 1 ) dt ==> L = t1 - t0 = 2*pi - 0 = 2*pi.

So as long as the length along the curve varies smoothly with t, the calculation is precise. There is no talk of dividing the curve itself into segments. (Gauss intervals could be subdivided where the *length* depends on t more wildly.)

--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
Reply to
Albert van der Horst

[snip]

Yes, I do understand how it works quite well, including the derivation. My problem was that I was only using five points. However, some experimentation with your code and more inconvenient control points leads to poor results in the same way that I saw with five points, except of course that the errors are smaller. The problem is that Gauss quadrature is a polynomial-fit procedure but the ds/dt has a square root in it. That is problematic for the Romberg procedure too. The reason you got such good results on the example set of control points is that the curve was symmetric. If ds/dt is significantly different in one region of the curve than another, then the sqrt term leads to a poor poly fit; if not, then the sqrt can be well approximated by a power-series expansion.

For the record, I have a masters in mathematics from Cambridge. It's just that 30 years of gravity since then has caused the contents of my neurons to drain to my feet and be lost in nail clippings!

Peter

Reply to
Peter Dickerson

I just wanted to demonstrate a basically sound approach and thought that the elementary explanation would be interesting for the readers at large. There is no way that just using Gauss means that error estimates and interval partitioning are no longer needed, but I was too lazy to apply Romberg to the calculated length.

I tried the third control point at (3,3) and got a one-to-ten variation in ds/dt with the same improvement over Romberg, so the variation in ds/dt alone can't be the whole story. Your argument about the square root is valid, but you are pessimistic: it is the square root of a 4th-order polynomial, and so it could well be a quadratic polynomial.

This is an interesting question and it merits a careful study and analysis. There is a lot of literature about splines, but I found little or nothing about length.

For me it was 40 years ago that I was a promising math student. I'm trying to keep up, currently at expert level on projecteuler.net. (I was in the Eulerians list, before they changed it from coverage to speed. ;-( )

Greetings, Albert

--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
Reply to
Albert van der Horst

It was indeed a valuable contribution. It shows I (we?) was being more simplistic than I thought: too much trying to match the curve and too little analysis (in the math sense).

To apply Romberg you need a power law of some sort for the convergence. I'm not sure that's what you'll see; in particular, the errors may change sign with each subdivision.

Maybe, but since the 4th-order polynomial is the sum of the squares of two quadratics, its square root is unlikely to end up as a quadratic unless one of the pair is 0.

Indeed. The sqrt makes it a non-linear problem. Gen Rel has a similar problem with sqrt(g).

I never claimed to be promising. In fact I'm very slow at everything.

Peter

Reply to
Peter Dickerson
