Point on Line Segment in 2D. Which code is faster ? Can you improve it ?

- G
- glen herrmannsfeldt
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Fri, Sep 16, 2005 3:18 PM

(snip)

Then don't use floating point math.

As you don't indicate the origin of the problem, we can't help any more than that. In some cases it can be done in fixed point, especially if the segment ends and point being checked are fixed point values.

Otherwise, if you select two end points somewhat randomly there is a high probability that no floating point values lie on the segment between them.

-- glen

- S
- Skybuck Flying
  
  Contact options for registered users
posted
18 years ago

Fri, Sep 16, 2005 3:28 PM

Hi,

I needed a method to determine if a point was on a line segment in 2D. So I googled for some help and so far I have evaluated two methods.

The first method was only a formula, the second method was a piece of C code which turned out to be incorrect and incomplete but by modifieing it would still be usuable.

The first method was this piece of text:

" if x is between StartX and EndX or if y is between StartY and EndY, then the point (x,y) is on the line.

Another way is the parametric form of a line,ie: X = StartX + t0 * (EndX - StartX) Y = StartY + t1 * (EndY - StartY) calculate t0 and t1, if t0 and t1 are equal, then the point is on the line if the parameter (t0 = t1) is between 0 and 1, the the point is on the segment. "

The second method was this piece of text:

" Very simple,

First test if the x-value of the point is within the interval StartX,EndX. If this is not true the point is not on the line.

int on_line(x, y, StartX, StartY, EndX, EndY) { int dx, dy, dx1, dy1;

if (xEndX) return 0; dx = EndX-StartX; dy = EndY-StartY; dx1 = x-StartX; dy1 = y-StartY;

return (dx*dy1 == dy*dx1); } "

So after reading the first method I needed to be able to calculate T0 and T1 since all other variables are known the formula's can be rewritten as to calculate T0 and T1.

Changing the formula's to get the final formula:

Step1 original: X = StartX + t0 * (EndX - StartX) Y = StartY + t1 * (EndY - StartY)

Step2 delta's introduced: DeltaX = EndX - StartX DeltaY = EndY - Starty X = StartX + t0 * (DeltaX) Y = StartY + t1 * (DeltaY)

Step3 switch StartX + t0 * (DeltaX) = X StartY + t1 * (DeltaY) = Y

Step4 start to isolate t0 and t1 by moving StartX and StartY to the otherside (thanks to my math teacher for those usefull lessons =D) t0 * (DeltaX) = X - StartX t1 * (DeltaY) = Y - StartY

Step5 isolate to and t1 further by moving delta's to the other side (thanks to my math teacher again for those usefull lessons =D) t0 = (X - StartX)/DeltaX t1 = (Y - StartY)/DeltaY

If you look at the original formula you will notice how t0 and t1 are sort of distances on the line segment itself ranging from 0.0 to 1.0 (like a percentage).

The second method looks a little bit similair but works totally different.

It multiplies the delta's with the line segment with the delta's with the start and point. Which kinda looks like a dot product.. I dont know what that's all about ;) Maybe this is just another way of calculating how much distance each component has. (There are two components x and y, to be on the line they both need to have the same distance from the start point for example ;) ofcourse this could mean a circle... but since you working with delta's there is only one possible direction)

First of all the C method is incorrect unless the coordinates are pre-sorted from left to right. If the coordinates are not sorted, for example (100,10)-(10,50) (going from right to left) the method will simply exit because of the first if statement which is just wrong. So this is easily fixed by removing that if statement.

However I also consider the C method to be incomplete. It's fine if one wants to calculate if a point is on an infinite line. It's impossible to tell if the original poster wanted a line segment or just a line... he does mention an start and end point... For my purposes however I need to be able to tell if a point is on a line segment. So this C method needs to be modified/expanded to accomadate for it.

The fascination starts with the C method looking very fast. Just a few lines of code and a multiplication. It also looks a bit dangerous because of the multiplication which could overflow... but all in all not to bad/shabby and definetly worth a try.

The first method looks kinda big and slower because of more variables/code and divisions etc, but it does look complete and stable and robust.

So now for a reality check.

I have implemented both versions in pascal/delphi which I believe to produce correct results everytime.

The question is which method is faster. I would place my money on the first method. Simply because the code is shorter and it has to do less floating point comparisions and less jumps than the second method.

The C method (second method) actually needs a bounding box to be constructed to be usefull (for line segment tests). The bounding box is necessary to determine if the point is inside the line segment. Therefore I believe the C method in this case to be slower. It requires more instructions, more comparisions and more jumps.

I think I have done a fair job of optimizing both implementations by trying to reduce the number of comparisions, jumps and optimizing for branch prediction (most likely case fall through etc).

I spent lot's of time optimizing method1 to reduce the number of compares by first checking one delta before checking the next etc.. that simply saved one or two compares and jumps compared to previous versions.

I also spent some time optimizing method2 but probably a little bit less time. I inlined the bounding box calculation and I also postponed it until after the if statement etc as to not execute if the if statement is false.

Finally the point needs to be checked to be inside or outside the bounding box. Checking for outside is easier/shorter for multiple bounding so I coded it like that. However in this case there is only one bounding box so in this case it doesn't matter and it could also be checked to be inside. The generated assembler would however be the same so it doesn't matter.

For my purposes the diagonal case will be the most likely and occuring case so it's most interesting to see which method would be faster.

The first method requires 53 instructions for the diagonal case (with point on line)

The second method requires 81 instructions for the diagonal case (with point on line)

Both methods require the same ammounts of pushes onto the stack so those pushes have not been counted.

The current implementations look like this:

v3 is the first method v5 is the second method

While writing this post I discovered a flaw in v3, fixed it, and even optimized it. I think v3 should now be much faster for case horizontal and case vertical than v5.

So the main focus is on the diagonal case.

My question to you the delphi, electronics and computer architecture community is:

Which code do you think is faster ?

Can you explain it purely based on theory without doing any performance measurements ? ;)

For example: number of instructions, instruction cycles, branch prediction, pipe line optimizations, pairing, etc.

I shall provide you with the current pascal/delphi and the assembler code:

If you think you really are an expert in the optimization field/instruction/hardware/assembler field then next question will be ofcourse:

Can you optimize these methods or come up with a faster method ?

Quite frankly I am obsessed with this code. It's such a trivial code that I dont want my PC to be spending to much time on it... currently this code is not being used in any algorithms. I could use alternative information from a line segment intersection test algorithm which can already determine if one of the 4 points is on a line segment yes/no etc... but that would make the routine's header much more complex and maybe I dont want that at first at least.... so I might choose to use a routine like below to keep things a bit more simple and pragmatic... etc.

So after each line segment intersection test this routine would also need to be called, so it would be called a lot, so I dont want it to be a lot of overhead if you know what I mean ;)

And even if this code will never be used it's still quite fascinating and educational to see/learn how floating point optimizations work etc... that's the main reason why I am investigating this and what you to have a look at this because I am puzzled =D

Ok now the subjective stuff:

My gutt feeling says v3 is faster than v5 because v3 uses less compares and less jumps which should be good for branch prediction etc.. v3 also has less instructions.

However v3 uses two divisions (fdiv) which might be a lot slower than multiplications (fmul) which version v5 uses two of. However my gutt feeling says v5 has the bounding box thing going on etc so it needs to do more stuff/instructions which could make it slow ;)

Add the unknown factors like branch prediction, pipe lining, pairing and god knows what else and you start to see the doubt hehehehehe.

Can you sort it out ? ;) =D

Or do you need performance measurements like me ?

And let's be honest about that... How are we going to do accurate performance measurements ?

Lot's of context switching, multi threading, harddisk activity going on, on windows xp, so I am not too sure if that is reliable but ok :)

Here is the pascal/delphi code:

function point_on_line_segment_in_2d_v3( StartX, StartY : double; EndX, EndY : double; PointX, PointY : double ) : boolean; var vDeltaX : double; vDeltaY : double;

vDistanceXcomponent : double; vDistanceYcomponent : double; begin // default is false result := false;

vDeltaX := EndX - StartX; vDeltaY := EndY - StartY;

// if vDeltaX is not zero then that means the line is horizontal or diagonal if (vDeltaX 0) then begin // if vDeltaY is not zero then that means the line is diagonal if (vDeltaY 0) then begin // line is diagonal

// calculate both components and look if they are the same // if yes then see if they both lie between 0 and 1. vDistanceXcomponent := (PointX - StartX) / vDeltaX; vDistanceYcomponent := (PointY - StartY) / vDeltaY;

if (vDistanceXcomponent = vDistanceYcomponent) then begin if (vDistanceXcomponent>0) and (vDistanceXcomponent StartX) and (PointX < EndX) then begin result := true; end; end else begin if (PointX > EndX) and (PointX < StartX) then begin result := true; end; end;

// not necessary simply check if between startx and endx or // vica versa depending on if endx is greater then startx etc // since fdiv seems to be slow let's do it with some compares // see above ;) { // calculate and look at the horizontal component if it lies between 0 and 1. vDistanceXcomponent := (PointX - StartX) / vDeltaX;

if (vDistanceXComponent>0) and (vDistanceXComponent StartY) and (PointY < EndY) then begin result := true; end; end else begin if (PointY > EndY) and (PointY < StartY) then begin result := true; end; end;

{ // calculate and see if the vertical component lies between 0 and 1 vDistanceYcomponent := (PointY - StartY) / vDeltaY;

if (vDistanceYComponent>0) and (vDistanceYComponent

- G
- glen herrmannsfeldt
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Fri, Sep 16, 2005 3:34 PM

(snip)

There is a story about when D.Knuth was working on TeX and DVI processors, and generating output on what was supposed to be a 5333 dpi device that the output looked wrong. When it was found to be 5333+1/3 dpi, and the code was changed, it all looked right.

The eye is very sensitive to some things, and not sensitive at all to others, but not always the ones we expect. For most curves, a half pixel is probably right, but you have to watch for other effects, mostly related to aliasing.

-- glen

- S
- Skybuck Flying
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Fri, Sep 16, 2005 4:26 PM

So I

Congratulations, you just reposted the C/second method ;)

Though the < 0.0000001 is new which I dont quite understand ;)

Bye, Skybuck.

- S
- Skybuck Flying
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Fri, Sep 16, 2005 4:29 PM

So I

Besides it says OnLine ?

I need OnLineSegment...

See the difference ?

Bye, Skybuck.

- G
- glen herrmannsfeldt
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Fri, Sep 16, 2005 4:35 PM

(snip)

You really need to tell us what "this case" is.

You expect everyone to help you, but don't seem interested in helping much yourself.

-- glen

- S
- Skybuck Flying
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Fri, Sep 16, 2005 7:26 PM

Actually I have a confesion to make as well ;)

The second method (the C method is flawed).

I am writing a test/measure program, I still need to input some more verification data ;)

Tomorrow or so it will be done and then I will post it so everybody can verify and test there goddamn methods lol =D

Bye, Skybuck.

- V
- Vincent Zweije
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Fri, Sep 16, 2005 10:55 PM

|| Nicholas Sherlock wrote: || > Skybuck Flying wrote: || >

|| >> I needed a method to determine if a point was on a line segment in 2D. || >> So I || >> googled for some help and so far I have evaluated two methods. || >

|| > function PointOnLine2D(x1,y1, x2,y2, x3,y3:double):boolean; || > begin || > result:=((x2 - x1) * (y3 - y1)) - ((x3 - x1) * (y2 - y1)) < 0.0000001; || > end; || || Sorry, I missed some of this code from my app: || || function PointOnLine2D(x1, y1, x2, y2, x3, y3: double): boolean; || begin || result := || (((x3 >= x1) and (x3 = x2) and (x3 = y1) and (y3 = y2) and (y3

- S
- Skybuck Flying
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Fri, Sep 16, 2005 11:26 PM

Yes I figured that much. So this would mean the function returns true if abs(blabla) is near zero ?

Why use 0.00001 why not 0.0000001 or 0.00000000000001 ?

Personally I don't like epsilons like this... it leaves room for error/malfunction ?

function PointOnLine2D(x1, y1, x2, y2, x3, y3: double): boolean; begin result := (((x3 >= x1) and (x3 = x2) and (x3 = y1) and (y3 = y2) and (y3 = and and

- M
- Maarten Wiltink
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Fri, Sep 16, 2005 11:55 PM

Some real life logic: once in a graphics rendering lab assignment, we were instructed to approximate Bezier curves. This is an iterative process; the stop condition was for the error to become less then half a pixel. For visualisation on a raster display, that is exactly what's required.

Groetjes, Maarten Wiltink

- S
- Skybuck Flying
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Sat, Sep 17, 2005 12:06 AM

That certainly won't do in this case etc.

It should be as exact/accurate as possible.

Bye, Skybuck.

- N
- Nicholas Sherlock
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Sat, Sep 17, 2005 2:12 AM

function PointOnLine2D(x1,y1, x2,y2, x3,y3:double):boolean; begin result:=((x2 - x1) * (y3 - y1)) - ((x3 - x1) * (y2 - y1)) < 0.0000001; end;

Cheers, Nicholas Sherlock

- N
- Nicholas Sherlock
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Sat, Sep 17, 2005 2:25 AM

Sorry, I missed some of this code from my app:

function PointOnLine2D(x1, y1, x2, y2, x3, y3: double): boolean; begin result := (((x3 >= x1) and (x3 = x2) and (x3 = y1) and (y3 = y2) and (y3

- N
- Nicholas Sherlock
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Sat, Sep 17, 2005 2:25 AM

The other code I posted does segment.

Cheers, Nicholas Sherlock

- N
- Nicholas Sherlock
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Sat, Sep 17, 2005 2:28 AM

Floating point math is always inaccurate as most numbers cannot be exactly represented. This bit basically takes care of that by adding a fudge factor.

Cheers, Nicholas Sherlock

- K
- Ketil Malde
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Sat, Sep 17, 2005 8:41 AM

Then use rational numbers?

-k

--
If I haven\'t seen further, it is by standing in the footprints of giants

- S
- Skybuck Flying
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Sat, Sep 17, 2005 12:05 PM

You ll see quite soon the test program is done ;)

Everybody should simply do his/her best at "as accurate" as possible.

Though it should also be fast.

So you can use single or double floating point format.

Whatever floats your boat ;)

Though I would suggest doubles since those can handle higher/better precision.. bigger/smaller numbers etc.

Using epsilon's and stuff like should probably be prevented as much as possible since those are common forms of errors etc and limitations etc... ;)

Bye, Skybuck.

- S
- Skybuck Flying
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Sat, Sep 17, 2005 12:07 PM

You are more then welcome to prove this very soon. I'll shall do or attempt to do two things:

"allow dll plugins" for the test program and "allow data/verification" files for the test program.

As to provide a binary and test files for those people who don't have a pascal compiler, or can't program or don't want to program or just want to supply some verification data =D

It's gonna be quite cool lol.

Bye, Skybuck.

- N
- nobody
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Sat, Sep 17, 2005 7:26 PM

You have a gross and dangerous misunderstanding of principles of doing numerical processing with the computer. I suggest you study the fundmentals carefully first before wasting your time writing what will undoubtedly be some ridicolously naive and rather useless code.

- G
- glen herrmannsfeldt
  
  Contact options for registered users
Vote on answer
posted
18 years ago

Sun, Sep 18, 2005 1:32 AM

(snip)

I pick the points (0,0) and (512,997) slightly easier to see as integers, but you can shift the binary point over and make them floating point. Find any point on the segment between them.

-- glen