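【Posted】: 2018-08-10 15:37:31
【Question】:
I am trying to build a Gremlin query for use in DSE Graph with geo search enabled (indexed in Solr). The problem is that the graph is so densely interconnected that cyclic-path traversals time out. The prototype graph I am working with right now has ~1600 vertices and ~35K edges. I have also summarised the number of triangles passing through each vertex: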
+--------------------+-----+
| gps|count|
+--------------------+-----+
|POINT (-0.0462032...| 1502|
|POINT (-0.0458048...| 405|
|POINT (-0.0460680...| 488|
|POINT (-0.0478356...| 1176|
|POINT (-0.0479465...| 5566|
|POINT (-0.0481031...| 9896|
|POINT (-0.0484724...| 433|
|POINT (-0.0469379...| 302|
|POINT (-0.0456595...| 394|
|POINT (-0.0450722...| 614|
|POINT (-0.0475904...| 3080|
|POINT (-0.0479464...| 5566|
|POINT (-0.0483400...| 470|
|POINT (-0.0511753...| 370|
|POINT (-0.0521901...| 1746|
|POINT (-0.0519999...| 1026|
|POINT (-0.0468071...| 1247|
|POINT (-0.0469636...| 1165|
|POINT (-0.0463685...| 526|
|POINT (-0.0465805...| 1310|
+--------------------+-----+
only showing top 20 rows
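I expect the graph to eventually grow very large, but I will restrict the cycle search to a bounded geographic area (e.g. a radius of ~300 meters).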
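A minimal sketch of that geographic restriction, reusing only the `Geo.inside`, `Geo.point` and `Unit.METERS` forms and the `image` label that appear elsewhere in this post. The centre coordinates are the ones from my console session; the 300 m radius is the target figure mentioned above, not something I have tuned. It assumes the console resolves `Geo` and `Unit` the same way it does in my session.

```groovy
// Count the candidate vertices within ~300 m of the centre point.
// Assumption: 'gps' is the Solr-indexed geo property on 'image' vertices.
g.V().hasLabel('image').
  has('gps', Geo.inside(Geo.point(-0.04813968113126384, 51.531259899256995), 300, Unit.METERS)).
  count()
```

Checking the count first gives a rough idea of how many vertices any cycle search inside that radius would have to consider.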
My best attempt so far has been some variation of the following:
g.V().has('gps',Geo.point(lon, lat)).as('P')
.repeat(both()).until(cyclicPath()).path().by('gps')
Script evaluation exceeded the configured threshold of realtime_evaluation_timeout at 180000 ms for the request
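For comparison, here is a hedged variant that caps the number of hops, following the `simplePath()` / `where(...as(...))` pattern from the TinkerPop cycle-detection recipe. The `maxHops` variable and the `limit(10)` are illustrative values I have introduced, not tuned settings, and `lon` / `lat` are placeholders as in the query above. It only finds cycles of exactly `maxHops + 1` edges, so it does not answer the longest-cycle question by itself, and I cannot say whether it finishes within the timeout on a graph this dense.

```groovy
// Hypothetical hop cap; the closing edge back to 'P' adds one more hop.
maxHops = 4

g.V().has('gps', Geo.point(lon, lat)).as('P').
  repeat(both().simplePath()).times(maxHops).  // walk maxHops steps without revisiting a vertex
  where(both().as('P')).                       // keep traversers whose current vertex is adjacent to the start
  limit(10).                                   // stop after the first few matches
  path().by('gps')
```

Each returned path lists the start vertex followed by `maxHops` further vertices; the edge that closes the cycle back to `P` is implied by the `where` filter rather than shown in the path.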
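For illustration, the map below shows the start vertex in green and the end vertex in red. Assume that all vertices are connected to each other. I am interested in the longest path between green and red, which would be the one that goes all the way around the block.
I have read several links, to no avail: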
1) http://tinkerpop.apache.org/docs/current/recipes/#cycle-detection
2) Longest acyclic path in a directed unweighted graph
3) https://groups.google.com/forum/#!msg/gremlin-users/tc8zsoEWb5k/9X9LW-7bCgAJ
EDIT
I created a subgraph as Daniel suggests below, but the traversal still times out:
gremlin> hood = g.V().hasLabel('image').has('gps', Geo.inside(point(-0.04813968113126384, 51.531259899256995), 100, Unit.METERS)).bothE().subgraph('hood').cap('hood').next()
==>tinkergraph[vertices:640 edges:28078]
gremlin> hg = hood.traversal()
==>graphtraversalsource[tinkergraph[vertices:640 edges:28078], standard]
gremlin> hg.V().has('gps', Geo.point(-0.04813968113126384, 51.531259899256995)).as('x')
==>v[{~label=image, partition_key=2507574903070261248, cluster_key=RFAHA095CLK-2017-09-14 12:52:31.613}]
gremlin> hg.V().has('gps', Geo.point(-0.04813968113126384, 51.531259899256995)).as('x').repeat(both().simplePath()).emit(where(both().as('x'))).both().where(eq('x')).tail(1).path()
Script evaluation exceeded the configured threshold of realtime_evaluation_timeout at 180000 ms for the request: [91b6f1fa-0626-40a3-9466-5d28c7b5c27c - hg.V().has('gps', Geo.point(-0.04813968113126384, 51.531259899256995)).as('x').repeat(both().simplePath()).emit(where(both().as('x'))).both().where(eq('x')).tail(1).path()]
Type ':help' or ':h' for help.
Display stack trace? [yN]n
【Comments】:
Tags: graph datastax gremlin tinkerpop