Find closest point in a 2D array of polygon points: answers

【Question title】: Find closest point in a 2D array of polygon points
【Posted】: 2019-12-07 18:56:08
【Question】:

I have a 2D array of geographic coordinates like this:

coords = np.array(
[[[54.496163, 21.770491],
  [54.495438, 21.755107],
  [54.494713, 21.739723],
  [54.493988, 21.724339],
  [54.493263, 21.708955]],
 [[54.504881, 21.769271],
  [54.504157, 21.753884],
  [54.503432, 21.738497],
  [54.502707, 21.72311 ],
  [54.501983, 21.707723]],
 [[54.5136, 21.768052],
  [54.512875, 21.752661],
  [54.512151, 21.737271],
  [54.511426, 21.72188 ],
  [54.510702, 21.70649 ]],
 [[54.522318, 21.766832],
  [54.521594, 21.751439],
  [54.52087, 21.736045],
  [54.520145, 21.720651],
  [54.519421, 21.705257]],
 [[54.531037, 21.765613],
  [54.530312, 21.750216],
  [54.529588, 21.734819],
  [54.528864, 21.719421],
  [54.52814, 21.704024]]]
)

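In space it defines a polygon.

I want to find the index of the point in coords that is closest to a given point, e.g. pt = [54.5, 21.7].

coords may look like a parallelogram here, but it is actually a polygon of shape (1200, 1500, 2). For obvious reasons I am only showing coords[0:5,0:5] here. The true shape of the polygon can be found in the linked question.

Right now I compute the Euclidean distance from every point in coords to the point pt in order to find the nearest point [r1,c1]: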

flidx = ((coords - pt) ** 2).sum(2).argmin()
r1 = int(flidx / coords.shape[1])
c1 = flidx % coords.shape[1]

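But this takes too much time.

I am thinking of implementing a binary search over the polygon: split it into 4 parts, check which part the point falls in, and repeat until I am left with a relatively small array of points, e.g. 16 x 16. Then apply the Euclidean distance method.

The problem is that I don't know how to check whether a point is inside a polygon. With a rectangle it would be fairly simple, but this is not one.

Any help with this method, or with any other way of finding the nearest point, would be much appreciated.

Thanks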

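【Comments】:

What exactly do you mean by polygon? It looks like a lattice under a linear transformation. Will the points always be grid-like?
You are already using array functions, so it may not give you a big speedup, but you could try the second version of the linked answer.
@VersBersch The real shape can be found in the linked question. See the green polygon in the image. Yes, it is grid-like.
@ErgiS In your link the data looks like a grid of latitude/longitude points, i.e. the raw data forms a square grid (perhaps rotated a little) and the "distortion" of the shape comes from mapping latitude/longitude coordinates onto a 2D surface. So is the distortion present in the underlying data itself, or does it come from the mapping?
@VersBersch The distortion comes from the data, not from its projection onto a 2D surface.

Tags: python numpy search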

【Solution 1】:

First note that the data is not a perfect grid, but it is "grid-like":

from netCDF4 import Dataset
import numpy as np
from matplotlib import pyplot as plt

group = Dataset('./coords.nc', 'r', format='NETCDF4')

# reverse the input so that the bottom left point is at [0, 0]
lat = np.array(group['latitude_in'])[::-1]
lon = np.array(group['longitude_in'])[::-1]

# plot a sub-grid
slat = np.array([arr[::100] for arr in lat[::100]]).flatten()
slon = np.array([arr[::100] for arr in lon[::100]]).flatten()
plt.scatter(slat, slon)
plt.show()

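To find the coordinates of the point in the set that is closest to some target point, you can get a reasonable approximation (an initial guess for the search) by a "change of basis". That is, if the vector from the bottom-left corner to the bottom-right corner is your x direction, and the vector from the bottom-left corner to the top-left corner is your y direction, then applying the change-of-basis matrix maps the points (imperfectly) onto the unit square. From there you can work out the relative coordinates, and scaling them by the grid size gives an approximate index.

To finish, you walk along the grid (starting from the initial guess) towards the target point, i.e. you repeatedly move to the neighbour that is nearest to the target.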

import itertools

class NearestIndex:
    def __init__(self, points):
        self.points = points 
        self.size = np.array(self.points.shape[:2]) - 1  # 1199 x 1499

        self.origin = points[0][0]  # origin must be at [0, 0]
        dX = points[-1, 0] - self.origin # the X-direction
        dY = points[0, -1] - self.origin # the Y-direction
        self.M = np.linalg.inv(np.array([dX, dY])) # change of basis matrix

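    def guess(self, target):
        """ guess the initial coordinates by transforming points to the unit square """
        # map() returns an iterator in Python 3, so truncate with astype(int) instead
        p = (self.size * np.matmul(target - self.origin, self.M)).astype(int)
        # ensure the initial guess is inside the grid; return a plain (i, j) tuple
        return tuple(int(v) for v in np.clip(p, 0, self.size))

    def in_grid(self, index):
        return (index == np.clip(index, 0, self.size)).all()

    def distance_to_target(self, index):
        # tuple(index) selects a single grid point rather than two whole rows
        return np.linalg.norm(self.points[tuple(index)] - self.target)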

    def neighbour_distances(self, index):
        i, j = index
        min_dist = np.inf
        min_index = None       
        for di, dj in itertools.product((-1, 0, 1), repeat=2):
            neighbour = (i + di, j + dj)
            if not (di == dj == 0) and self.in_grid(neighbour):
                dist = self.distance_to_target(neighbour)
                if dist < min_dist:
                    min_dist, min_index = dist, neighbour

        return min_index, min_dist

    def find_nearest(self, target):
        self.target = target
        index = self.guess(target)  # make an initial guess
        min_dist = self.distance_to_target(index)  # distance to initial guess
        while True:
            # check the distance to the target from each neighbour of index
            neighbour, dist = self.neighbour_distances(index)
            if dist < min_dist:
                index, min_dist = neighbour, dist
            else:
                return index, min_dist

Use it like this:

points = np.dstack([lat, lon])
indexer = NearestIndex(points)
index, dist = indexer.find_nearest(np.array([46, 15])) 

print(index, points[index], dist)  # (546, 556) [46.004955 14.999708] 0.004963596377623203

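It is already fairly fast, but there is plenty of room for optimization. You could memoize the function distance_to_target, or use different step sizes while walking towards the point.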
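One way the memoization mentioned above could look is functools.lru_cache keyed on the (i, j) grid index. This is only a minimal sketch: make_cached_distance is a hypothetical helper that is not part of the class in this answer, and it assumes the target stays fixed for the duration of one search.

```python
from functools import lru_cache

import numpy as np


def make_cached_distance(points, target):
    """Return a memoized distance function for one fixed target.

    Hypothetical helper for illustration; not part of the class above.
    """
    target = np.asarray(target, dtype=float)

    @lru_cache(maxsize=None)
    def distance(i, j):
        # Each grid index is evaluated at most once per target.
        return float(np.linalg.norm(points[i, j] - target))

    return distance


# Tiny 2 x 2 grid used only for illustration.
grid = np.array([[[0.0, 0.0], [0.0, 1.0]],
                 [[1.0, 0.0], [1.0, 1.0]]])
distance = make_cached_distance(grid, [0.0, 0.0])
first = distance(1, 1)   # computed
second = distance(1, 1)  # served from the cache
hits = distance.cache_info().hits
```

Neighbouring steps of the walk share most of their 3 x 3 neighbourhoods, so a cache like this avoids recomputing those overlapping distances. A fresh cache is needed for every new target.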

【Discussion】:

【Solution 2】:

If you rearrange your array of points, I think you could use shapely:

from shapely.geometry import Point
from shapely.geometry.polygon import Polygon

point = Point(0.5, 0.5)
polygon = Polygon([(0, 0), (0, 1), (1, 1), (1, 0)])
print(polygon.contains(point))

I'm not entirely sure about the speed, but it should be simple.
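"Rearranging" the point array could mean extracting the outer ring of the grid so that it can serve as the polygon's vertex list. The following is a minimal sketch using only NumPy; grid_boundary is a hypothetical helper name, and the small integer grid is made up for illustration.

```python
import numpy as np


def grid_boundary(points):
    """Return the outer ring of a (rows, cols, 2) grid as an (N, 2) array.

    The ring follows the first row, the last column, the last row reversed
    and the first column reversed, so consecutive vertices are grid neighbours.
    """
    top = points[0, :-1]          # first row, left to right
    right = points[:-1, -1]       # last column, top to bottom
    bottom = points[-1, :0:-1]    # last row, right to left
    left = points[:0:-1, 0]       # first column, bottom to top
    return np.concatenate([top, right, bottom, left])


# Made-up 3 x 4 grid whose point at [i, j] is simply (i, j).
rows, cols = 3, 4
ii, jj = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
grid = np.dstack([ii, jj]).astype(float)
ring = grid_boundary(grid)
```

With shapely installed, Polygon(ring).contains(Point(x, y)) would then test whether a point lies inside the outer boundary. On the full 1200 x 1500 grid the ring has 5396 vertices. Note that containment alone does not tell you which grid point is nearest.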

【Discussion】:

【Solution 3】:

You can also compute the Euclidean distance and use unravel_index to find the correct index:

import numpy as np

pt = [54.5, 21.7]

#Distance for each coordinate sqrt((ptx-x)^2+(pty-y)^2)
dis = ((pt[0]-coords[:,:,0])**2+(pt[1]-coords[:,:,1])**2)**0.5
#Get the x,y index
ind = np.unravel_index(dis.argmin(), dis.shape)
#Get the coordinate
val = coords[ind[0],ind[1],:]
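Since the square root is monotonic, taking argmin over squared distances should give the same index, so the square root can be skipped. The sketch below uses a small made-up grid (the values are illustrative, not the question's data) and checks that np.unravel_index agrees with the integer arithmetic used in the question.

```python
import numpy as np

# Small made-up 2 x 3 grid of (x, y) points, for illustration only.
grid = np.array([[[0.0, 0.0], [0.0, 1.0], [0.0, 2.0]],
                 [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]])
pt = np.array([0.9, 1.8])

# Squared distances; the square root is monotonic, so argmin is unchanged.
sq_dist = ((grid - pt) ** 2).sum(axis=2)
flat = sq_dist.argmin()

row_col = np.unravel_index(flat, sq_dist.shape)  # index via unravel_index
manual = divmod(int(flat), grid.shape[1])        # index via integer arithmetic

print(row_col, manual)
```

Both approaches still scan every point, so this tidies the index computation rather than changing the overall cost of the brute-force search.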

【Discussion】: