[Posted]: 2018-02-06 07:05:22
[Question]:
I have a DataFrame containing latitude/longitude tuples, as shown below (actual sample coordinates):
    id  latlon
67  79  (39.1791764701497, -96.5772313693982)
68  17  (39.1765194942359, -96.5677757455844)
69  76  (39.1751440428827, -96.5772939901891)
70  58  (39.175359525189, -96.5691986655256)
71  50  (39.1770962912298, -96.5668107589661)
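For each row, I want to find the id of the nearest latlon in the same DataFrame, along with the distance to it (for illustration only, the numbers in the nearest_id and nearest_dist columns below are made up):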
    id  latlon                                 nearest_id  nearest_dist
67  79  (39.1791764701497, -96.5772313693982)  17          37
68  17  (39.1765194942359, -96.5677757455844)  58          150
69  76  (39.1751440428827, -96.5772939901891)  50          900
70  58  (39.175359525189, -96.5691986655256)   17          12
71  50  (39.1770962912298, -96.5668107589661)  79          4
I have a large number of coordinates (45K+) on which I would like to perform this operation.
Below is my attempted solution, using great_circle from geopy.distance:
from geopy.distance import great_circle

def great_circle_dist(latlon1, latlon2):
    """Uses geopy to calculate the distance in meters between two coordinates"""
    return great_circle(latlon1, latlon2).meters

def find_nearest(x):
    """Finds the nearest neighbor of x; returns (id, distance in meters)"""
    # Distance from x to every row, including x itself (distance 0)
    df['distances'] = df.latlon.apply(great_circle_dist, args=(x,))
    df_sort = df.sort_values(by='distances')
    # Row 0 is x itself, so row 1 is the nearest other point;
    # columns at this stage are id, latlon, distances
    return (df_sort.values[1][0], df_sort.values[1][2])

df['nearest'] = df['latlon'].apply(find_nearest)
df['nearest_id'] = df.nearest.apply(lambda x: x[0])
df['nearest_dist'] = df.nearest.apply(lambda x: x[1])
del df['nearest']
del df['distances']
What can be done to perform this calculation efficiently?
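For reference, one possible direction is to replace the per-row geopy calls with a vectorized haversine computation in NumPy, processed in chunks so that the full 45K x 45K distance matrix never has to be held in memory. The sketch below is only an illustration under stated assumptions: the function name nearest_neighbors, the chunk_size parameter, and the Earth radius constant are hypothetical choices made here, and the hand-written haversine formula is a stand-in for geopy's great_circle, not a call into geopy.

```python
import numpy as np

# Assumed mean Earth radius in meters; results scale with this constant.
EARTH_RADIUS_M = 6_371_009.0


def nearest_neighbors(ids, latlons, chunk_size=256):
    """Return (nearest_ids, nearest_dists_m) for each (lat, lon) pair.

    Hypothetical helper: brute-force haversine distances, one chunk of
    rows at a time, so memory use is roughly chunk_size * n floats.
    """
    ids = np.asarray(ids)
    coords = np.radians(np.asarray(latlons, dtype=float))  # shape (n, 2)
    lat, lon = coords[:, 0], coords[:, 1]
    n = len(coords)
    nearest_idx = np.empty(n, dtype=np.intp)
    nearest_dist = np.empty(n, dtype=float)

    for start in range(0, n, chunk_size):
        stop = min(start + chunk_size, n)
        # Broadcast a (chunk, 1) column against a (1, n) row.
        dlat = lat[start:stop, None] - lat[None, :]
        dlon = lon[start:stop, None] - lon[None, :]
        a = (np.sin(dlat / 2) ** 2
             + np.cos(lat[start:stop, None]) * np.cos(lat[None, :])
             * np.sin(dlon / 2) ** 2)
        dist = 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))
        # Mask each point's zero distance to itself.
        rows = np.arange(stop - start)
        dist[rows, np.arange(start, stop)] = np.inf
        idx = dist.argmin(axis=1)
        nearest_idx[start:stop] = idx
        nearest_dist[start:stop] = dist[rows, idx]

    return ids[nearest_idx], nearest_dist


# The five sample rows from the question.
ids = [79, 17, 76, 58, 50]
latlons = [
    (39.1791764701497, -96.5772313693982),
    (39.1765194942359, -96.5677757455844),
    (39.1751440428827, -96.5772939901891),
    (39.175359525189, -96.5691986655256),
    (39.1770962912298, -96.5668107589661),
]
nearest_ids, nearest_dists = nearest_neighbors(ids, latlons)
```

With a DataFrame like the one in the question, the call would presumably look like `df['nearest_id'], df['nearest_dist'] = nearest_neighbors(df['id'], df['latlon'].tolist())`. This approach is still quadratic in the number of points; it only moves the inner loop into NumPy. A spatial index, such as scikit-learn's BallTree with the haversine metric, is a common way to avoid the quadratic scan altogether.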
[Comments]:
Tags: python pandas gis geopandas