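【Posted】: 2021-01-26 00:56:36
【Question】:
I have an osmnx graph and about 6 million query coordinates for which I need to find the nearest node in the graph. I want to parallelize the task, and I tried both joblib and the approach described for shortest paths in [https://github.com/gboeing/osmnx-examples/blob/v0.16.0/notebooks/02-routing-speed-time.ipynb] [1]. In both attempts, however, only 1 of my 15 available CPUs was used.
I don't know much about multiprocessing and was just trying to imitate the shortest-path solution. Am I doing something wrong, or is this approach not applicable to the osmnx.get_nearest_node(..) function? Are there any other suggestions for speeding this up?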
import pandas as pd
import osmnx as ox
ox.config(use_cache=True, log_console=True)
G = ox.graph_from_bbox(38.289, 36.898, -122.704, -121.214,
                       network_type='drive',
                       retain_all=False,
                       simplify=True)
origins = [(37.775, -122.216), (37.458, -121.913), (37.558, -122.258), (37.791, -122.413),
           (37.775, -122.219), (37.773, -121.952), (37.773, -121.926),
           (37.332, -122.003), (37.462, -122.228), (37.701, -122.089),
           (37.696, -122.143), (37.931, -122.323), (37.558, -122.273),
           (37.357, -121.902), (37.462, -122.228), (37.791, -122.416),
           (37.922, -122.368), (37.551, -122.291), (37.701, -122.088),
           (37.802, -122.267), (38.015, -122.015), (37.701, -122.088),
           (37.503, -122.267), (37.791, -122.416), (37.551, -122.301),
           (37.405, -122.061), (37.228, -121.877), (37.326, -121.814),
           (37.292, -122.032), (37.722, -122.15), (37.966, -122.507),
           (37.773, -121.989), (37.294, -121.895), (37.881, -122.127),
           (37.872, -122.14), (37.551, -122.307), (37.404, -121.976),
           (37.775, -122.209), (37.791, -122.413), (37.228, -121.878)]
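import multiprocessing as mp
import numpy as np

def nn(G, origin):
    # Return the node nearest to origin, or NaN if the lookup fails
    try:
        return ox.get_nearest_node(G, origin)
    except Exception:
        return np.nan

# Each task carries its own (G, origin) pair
params = ((G, orig) for orig in origins)
pool = mp.Pool(15)
sma = pool.starmap_async(nn, params)
routes = sma.get()
pool.close()
pool.join()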
Another interesting detail: when I time it, I get the following result:
CPU times: user 3min 58s, sys: 8.91 s, total: 4min 6s
Wall time: 4min 15s
When I use a simple for loop over the same graph and points, it is faster:
for origin in origins:
    r = ox.get_nearest_node(G, origin)
CPU times: user 10.8 s, sys: 39.9 ms, total: 10.8 s
Wall time: 10.8 s
[1]: https://github.com/gboeing/osmnx-examples/blob/v0.16.0/notebooks/02-routing-speed-time.ipynb
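One possible reading of the timings above, offered as an assumption rather than a confirmed diagnosis: every task in params includes G, so the pool has to pickle the whole graph and send it to a worker again and again, and that transfer may cost more than the lookup itself. A minimal sketch of an alternative pattern follows, in which the large object is handed to each worker once through the pool initializer and only the coordinates travel with each task. The names init_worker, nearest and nearest_parallel are hypothetical, and a plain dict of node id to (lat, lon) with a brute-force planar distance stands in for the osmnx graph and for ox.get_nearest_node.

```python
import math
import multiprocessing as mp

# Hypothetical stand-in for the graph: node id -> (lat, lon).
# Set once per worker process by init_worker.
_NODES = None


def init_worker(nodes):
    """Store the shared lookup table in a per-process global."""
    global _NODES
    _NODES = nodes


def nearest(origin):
    """Return the id of the node closest to origin (brute force, planar distance)."""
    lat, lon = origin
    return min(
        _NODES,
        key=lambda n: math.hypot(_NODES[n][0] - lat, _NODES[n][1] - lon),
    )


def nearest_parallel(nodes, origins, processes=2):
    """Map each origin to its nearest node id; only coordinates are sent per task."""
    # Larger chunks mean fewer round trips between the parent and the workers.
    chunksize = max(1, len(origins) // (processes * 4))
    with mp.Pool(processes, initializer=init_worker, initargs=(nodes,)) as pool:
        return pool.map(nearest, origins, chunksize=chunksize)
```

To try it, call nearest_parallel(nodes, origins) from under an if __name__ == "__main__": guard. Whether this pattern actually helps with a real osmnx graph would need to be measured, since the per-point lookup is cheap relative to process startup and data transfer.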
【Discussion】:
Tags: python parallel-processing multiprocessing joblib osmnx