Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Broken KdNode equals and compareTo causes the KdTree not to return all valid results #159

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

maraswrona
Copy link

@maraswrona maraswrona commented Apr 22, 2021

By submitting this pull request I confirm I've read and complied with the below requirements.

  • I have read the Contribution guidelines and I am confident that my PR reflects them.
  • I have followed the coding guidelines for this project.
  • My code follows the skeleton code structure.
  • This pull request has a descriptive title. For example, Added {Algorithm/DS name} [{Language}], not Update README.md or Added new code.

I haven't done the above, so won't mark the checklist as the CONTRIBUTING.md file DOES NOT EXIST.


TL;DR; Calling kdTree.nearestNeighbourSearch(K, point) where K >= the size of the tree should return all the tree elements - but sometimes it doesn't (whether it works or not depends on the points coordinates). This PR fixes it. For more details, read below.


So first of all, thank you very much for this repository and all the implementations. I have been looking closely at the KdTree implementation to build my own and after extensive testing I noticed that this implementation is broken. Here is why:

  1. In KdTree while finding all nearest neighbours we go down the tree and then travel up and to other branches to seek for more candidates until we find K of them.
  2. The visited nodes are added to examined HashSet so we don't visit an already visited branch.
  3. The Set.contains method from java relies heavily on equals and hashcode implementations - objects equality must be checked carefully.
  4. The KdNode equals implementation uses its own compareTo implementation, which in turn determines two KdNodes to be equal when:
  • their two underlying points are on the same depth in the tree, and
  • the two points have the one coordinate on the axis "related" to that depth (axis = depth%k) equal to each other
    Which is already incorrect as those could be two completely different points
  1. The KdNode hashCode implementation uses also PointXYZ hashCode implementation which is extremely naive and basically calculates the sum of X, Y and Z. This makes it very easy to create two points that share same hashCode and because of the equals implementation mentioned above makes it possible that two KdNodes will collide and will appear to be "equal" from the perspective of a Set collection. Because of this collision a certain branch of the tree will not be visited at all and its nodes missed.
  2. As a result when adding certain set of points to the tree (e.g. 3 of them) and querying the nearestNeighbors with k = the size of the tree (or higher) we won't get all the 3 points in return - which would be the expected behavior.

I have added a test case demonstrating this behavior + this small PR fixing it. I'm only fixing the compareTo method of the KdNode, but frankly also the hashCode implementations should be changed.

Marek Woroniecki added 2 commits April 22, 2021 15:09
…ent 3D points to be interpreted as equal (and hence the tree not returning all valid results)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant