Use Each Neighbor Once In Sklearn NearestNeighbor
I am comparing two point clouds of different sizes. I don't want to cutoff the last points in the larger pointcloud pc1. For points in pc1 I would like to find the nearest neighbor in pc2. After using this point in pc1 and pc2 it should not be used again for any other comparison. Calculate distances from pc1 to pc2: Start from lowest distance. Take both points in pc1 and pc2 and flag the points as used (or delete). Get next pair with next higher distance ... Until all points in pc2 are used. Return List of distances.
Here is what I tried so far:
from sklearn.neighbors import NearestNeighbors
import numpy as np
pc1 = np.array([[-1, -1], [-2, -1], [1, 1], [2, 2], [3, 5]])
pc2 = np.array([[0, 0], [0, 0], [6,6]])
nbrs = NearestNeighbors(n_neighbors=1, algorithm='kd_tree', metric='euclidean').fit(pc2)
distances, indices = nbrs.kneighbors(pc1)
This is the output:
indices = [[0],[0],[0],[0],[2]]
But I want to have:
indices_wanted = [[0],[1],[2]]
which should refer to the points in pc1: [[-1, -1], [1, 1], [3, 5]]
Is there an efficient way of doing that? My pointclouds are in 3D and I have approximately 8000 points per cloud. It should be very fast, as I need to repeat this procedure for each frame in some "movie" with 200 frames. Create some sample data in 3D:
pc1 = np.random.randn(8000,3)
pc2 = np.random.randn(7990,3)
Here is an image to clarify what the situation is:
The red points are pc1 and the green ones are pc2. Ultimately, I have 3D points.
Edit:
I am not restricted to sklearn, I just know it is very efficient. So KDTree is also an option. I may have to include a maximum distance for two nodes for speeding up this approach. The cube I am using is of size 4m (I already delete points that are too far off before running the neighbor search)
Problem: The code provided by @user2874583 takes around 0.8s for one set with 8000 points. That is way too slow. I need less than 0.2s. Is there a way to modify the code to making use of the structure: Cube with no points outside? Maybe sorting the array?
Answer
You can use the scipy libary to calculate the distance between to points.
from scipy.spatial.distance import cdist
def closest_node_index(node, nodes):
index = cdist([node], nodes).argmin()
return index
final = []
for arr in pc2:
i = closest_node_index(arr, pc1)
final.append(pc1[i])
pc1 = np.delete(pc1, i, axis=0)
Related Questions
- → What are the pluses/minuses of different ways to configure GPIOs on the Beaglebone Black?
- → Django, code inside <script> tag doesn't work in a template
- → React - Django webpack config with dynamic 'output'
- → GAE Python app - Does URL matter for SEO?
- → Put a Rendered Django Template in Json along with some other items
- → session disappears when request is sent from fetch
- → Python Shopify API output formatted datetime string in django template
- → Can't turn off Javascript using Selenium
- → WebDriver click() vs JavaScript click()
- → Shopify app: adding a new shipping address via webhook
- → Shopify + Python library: how to create new shipping address
- → shopify python api: how do add new assets to published theme?
- → Access 'HTTP_X_SHOPIFY_SHOP_API_CALL_LIMIT' with Python Shopify Module