Hi,
I needed a method to determine if a point was on a line segment in 2D. So I googled for some help and so far I have evaluated two methods.
The first method was only a formula, the second method was a piece of C code which turned out to be incorrect and incomplete but by modifieing it would still be usuable.
The first method was this piece of text:
" if x is between StartX and EndX or if y is between StartY and EndY, then the point (x,y) is on the line.
Another way is the parametric form of a line,ie: X = StartX + t0 * (EndX - StartX) Y = StartY + t1 * (EndY - StartY) calculate t0 and t1, if t0 and t1 are equal, then the point is on the line if the parameter (t0 = t1) is between 0 and 1, the the point is on the segment. "
The second method was this piece of text:
" Very simple,
First test if the x-value of the point is within the interval StartX,EndX. If this is not true the point is not on the line.
int on_line(x, y, StartX, StartY, EndX, EndY) { int dx, dy, dx1, dy1;
if (xEndX) return 0; dx = EndX-StartX; dy = EndY-StartY; dx1 = x-StartX; dy1 = y-StartY;
return (dx*dy1 == dy*dx1); } "
So after reading the first method I needed to be able to calculate T0 and T1 since all other variables are known the formula's can be rewritten as to calculate T0 and T1.
Changing the formula's to get the final formula:
Step1 original: X = StartX + t0 * (EndX - StartX) Y = StartY + t1 * (EndY - StartY)
Step2 delta's introduced: DeltaX = EndX - StartX DeltaY = EndY - Starty X = StartX + t0 * (DeltaX) Y = StartY + t1 * (DeltaY)
Step3 switch StartX + t0 * (DeltaX) = X StartY + t1 * (DeltaY) = Y
Step4 start to isolate t0 and t1 by moving StartX and StartY to the otherside (thanks to my math teacher for those usefull lessons =D) t0 * (DeltaX) = X - StartX t1 * (DeltaY) = Y - StartY
Step5 isolate to and t1 further by moving delta's to the other side (thanks to my math teacher again for those usefull lessons =D) t0 = (X - StartX)/DeltaX t1 = (Y - StartY)/DeltaY
If you look at the original formula you will notice how t0 and t1 are sort of distances on the line segment itself ranging from 0.0 to 1.0 (like a percentage).
The second method looks a little bit similair but works totally different.
It multiplies the delta's with the line segment with the delta's with the start and point. Which kinda looks like a dot product.. I dont know what that's all about ;) Maybe this is just another way of calculating how much distance each component has. (There are two components x and y, to be on the line they both need to have the same distance from the start point for example ;) ofcourse this could mean a circle... but since you working with delta's there is only one possible direction)
First of all the C method is incorrect unless the coordinates are pre-sorted from left to right. If the coordinates are not sorted, for example (100,10)-(10,50) (going from right to left) the method will simply exit because of the first if statement which is just wrong. So this is easily fixed by removing that if statement.
However I also consider the C method to be incomplete. It's fine if one wants to calculate if a point is on an infinite line. It's impossible to tell if the original poster wanted a line segment or just a line... he does mention an start and end point... For my purposes however I need to be able to tell if a point is on a line segment. So this C method needs to be modified/expanded to accomadate for it.
The fascination starts with the C method looking very fast. Just a few lines of code and a multiplication. It also looks a bit dangerous because of the multiplication which could overflow... but all in all not to bad/shabby and definetly worth a try.
The first method looks kinda big and slower because of more variables/code and divisions etc, but it does look complete and stable and robust.
So now for a reality check.
I have implemented both versions in pascal/delphi which I believe to produce correct results everytime.
The question is which method is faster. I would place my money on the first method. Simply because the code is shorter and it has to do less floating point comparisions and less jumps than the second method.
The C method (second method) actually needs a bounding box to be constructed to be usefull (for line segment tests). The bounding box is necessary to determine if the point is inside the line segment. Therefore I believe the C method in this case to be slower. It requires more instructions, more comparisions and more jumps.
I think I have done a fair job of optimizing both implementations by trying to reduce the number of comparisions, jumps and optimizing for branch prediction (most likely case fall through etc).
I spent lot's of time optimizing method1 to reduce the number of compares by first checking one delta before checking the next etc.. that simply saved one or two compares and jumps compared to previous versions.
I also spent some time optimizing method2 but probably a little bit less time. I inlined the bounding box calculation and I also postponed it until after the if statement etc as to not execute if the if statement is false.
Finally the point needs to be checked to be inside or outside the bounding box. Checking for outside is easier/shorter for multiple bounding so I coded it like that. However in this case there is only one bounding box so in this case it doesn't matter and it could also be checked to be inside. The generated assembler would however be the same so it doesn't matter.
For my purposes the diagonal case will be the most likely and occuring case so it's most interesting to see which method would be faster.
The first method requires 53 instructions for the diagonal case (with point on line)
The second method requires 81 instructions for the diagonal case (with point on line)
Both methods require the same ammounts of pushes onto the stack so those pushes have not been counted.
The current implementations look like this:
v3 is the first method v5 is the second method
While writing this post I discovered a flaw in v3, fixed it, and even optimized it. I think v3 should now be much faster for case horizontal and case vertical than v5.
So the main focus is on the diagonal case.
My question to you the delphi, electronics and computer architecture community is:
Which code do you think is faster ?
Can you explain it purely based on theory without doing any performance measurements ? ;)
For example: number of instructions, instruction cycles, branch prediction, pipe line optimizations, pairing, etc.
I shall provide you with the current pascal/delphi and the assembler code:
If you think you really are an expert in the optimization field/instruction/hardware/assembler field then next question will be ofcourse:
Can you optimize these methods or come up with a faster method ?
Quite frankly I am obsessed with this code. It's such a trivial code that I dont want my PC to be spending to much time on it... currently this code is not being used in any algorithms. I could use alternative information from a line segment intersection test algorithm which can already determine if one of the 4 points is on a line segment yes/no etc... but that would make the routine's header much more complex and maybe I dont want that at first at least.... so I might choose to use a routine like below to keep things a bit more simple and pragmatic... etc.
So after each line segment intersection test this routine would also need to be called, so it would be called a lot, so I dont want it to be a lot of overhead if you know what I mean ;)
And even if this code will never be used it's still quite fascinating and educational to see/learn how floating point optimizations work etc... that's the main reason why I am investigating this and what you to have a look at this because I am puzzled =D
Ok now the subjective stuff:
My gutt feeling says v3 is faster than v5 because v3 uses less compares and less jumps which should be good for branch prediction etc.. v3 also has less instructions.
However v3 uses two divisions (fdiv) which might be a lot slower than multiplications (fmul) which version v5 uses two of. However my gutt feeling says v5 has the bounding box thing going on etc so it needs to do more stuff/instructions which could make it slow ;)
Add the unknown factors like branch prediction, pipe lining, pairing and god knows what else and you start to see the doubt hehehehehe.
Can you sort it out ? ;) =D
Or do you need performance measurements like me ?
And let's be honest about that... How are we going to do accurate performance measurements ?
Lot's of context switching, multi threading, harddisk activity going on, on windows xp, so I am not too sure if that is reliable but ok :)
Here is the pascal/delphi code:
function point_on_line_segment_in_2d_v3( StartX, StartY : double; EndX, EndY : double; PointX, PointY : double ) : boolean; var vDeltaX : double; vDeltaY : double;
vDistanceXcomponent : double; vDistanceYcomponent : double; begin // default is false result := false;
vDeltaX := EndX - StartX; vDeltaY := EndY - StartY;
// if vDeltaX is not zero then that means the line is horizontal or diagonal if (vDeltaX 0) then begin // if vDeltaY is not zero then that means the line is diagonal if (vDeltaY 0) then begin // line is diagonal
// calculate both components and look if they are the same // if yes then see if they both lie between 0 and 1. vDistanceXcomponent := (PointX - StartX) / vDeltaX; vDistanceYcomponent := (PointY - StartY) / vDeltaY;
if (vDistanceXcomponent = vDistanceYcomponent) then begin if (vDistanceXcomponent>0) and (vDistanceXcomponent StartX) and (PointX < EndX) then begin result := true; end; end else begin if (PointX > EndX) and (PointX < StartX) then begin result := true; end; end;
// not necessary simply check if between startx and endx or // vica versa depending on if endx is greater then startx etc // since fdiv seems to be slow let's do it with some compares // see above ;) { // calculate and look at the horizontal component if it lies between 0 and 1. vDistanceXcomponent := (PointX - StartX) / vDeltaX;
if (vDistanceXComponent>0) and (vDistanceXComponent StartY) and (PointY < EndY) then begin result := true; end; end else begin if (PointY > EndY) and (PointY < StartY) then begin result := true; end; end;
{ // calculate and see if the vertical component lies between 0 and 1 vDistanceYcomponent := (PointY - StartY) / vDeltaY;
if (vDistanceYComponent>0) and (vDistanceYComponent