如何在 Python 中找到包含坐标的点的邻居？ [关闭]答案

【问题标题】：How do I find the neighbors of points containing coordinates in Python? [closed]如何在 Python 中找到包含坐标的点的邻居？ [关闭]
【发布时间】：2020-09-09 03:34:29
【问题描述】：

我有很多带有坐标的点。我想打印至少一个点的三个最近邻居及其到该点的距离。我怎样才能在 Python 中做到这一点？在 WGS84 系统中。

NAME    Latitude    Longitude
B   50.94029883 7.019146728
C   50.92073002 6.975268711
D   50.99807758 6.980865543
E   50.98074288 7.035060206
F   51.00696972 7.035993783
G   50.97369889 6.928538763
H   50.94133859 6.927878587
A   50.96712502 6.977825322

【问题讨论】：

这能回答你的问题吗？ Calculate distance between two latitude-longitude points? (Haversine formula)

标签： python coordinates wgs84

【解决方案1】：

最近邻技术对很多点更有效

蛮力（即遍历所有点）复杂度为 O(N^2)
最近邻算法复杂度为 O(N*log(N))

Python 中的最近邻

在您的问题上使用 BallTree 的说明（相关 Related Stackoverflow Post）

代码

import pandas as pd
import numpy as np

from sklearn.neighbors import BallTree
from io import StringIO

# Create DataFrame from you lat/lon dataset
data = """NAME Latitude Longitude
B 50.94029883 7.019146728
C 50.92073002 6.975268711
D 50.99807758 6.980865543
E 50.98074288 7.035060206
F 51.00696972 7.035993783
G 50.97369889 6.928538763
H 50.94133859 6.927878587
A 50.96712502 6.977825322"""

# Use StringIO to allow reading of string as CSV
df = pd.read_csv(StringIO(data), sep = ' ')

# Setup Balltree using df as reference dataset
# Use Haversine calculate distance between points on the earth from lat/long
# haversine - https://pypi.org/project/haversine/ 
tree = BallTree(np.deg2rad(df[['Latitude', 'Longitude']].values), metric='haversine')

# Setup distance queries (points for which we want to find nearest neighbors)
other_data = """NAME Latitude Longitude
B_alt 50.94029883 7.019146728
C_alt 50.92073002 6.975268711"""

df_other = pd.read_csv(StringIO(other_data), sep = ' ')

query_lats = df_other['Latitude']
query_lons = df_other['Longitude']

# Find closest city in reference dataset for each in df_other
# use k = 3 for 3 closest neighbors
distances, indices = tree.query(np.deg2rad(np.c_[query_lats, query_lons]), k = 3)

r_km = 6371 # multiplier to convert to km (from unit distance)
for name, d, ind in zip(df_other['NAME'], distances, indices):
  print(f"NAME {name} closest matches:")
  for i, index in enumerate(ind):
    print(f"\t{df['NAME'][index]} with distance {d[i]*r_km:.4f} km")

输出

NAME B_alt closest matches:
    B with distance 0.0000 km
    C with distance 3.7671 km
    A with distance 4.1564 km
NAME C_alt closest matches:
    C with distance 0.0000 km
    B with distance 3.7671 km
    H with distance 4.0350 km

【讨论】：