【Question title】: GraphX: get vertex label from vertex id
【Posted】: 2016-07-04 09:58:18
【Question】:

I have the following graph in GraphX:

graph.vertices.foreach(println)

(6109253945443866644,"Futurama"@en)
(7558506336564503178,"AccessibleComputing"@en)
(0,null)
(-2278222762001827643,"Programming languages"@en)
(-9007336571746445204,http://dbpedia.org/resource/Category:Presocratic_philosophers)
(-3236797006683951166,http://dbpedia.org/resource/Category:Programming_languages)
(-4159090027031366209,http://dbpedia.org/resource/Anaximenes_of_Miletus)
(7722304331424482609,http://dbpedia.org/resource/Category:Futurama)
(-323898215277667127,http://dbpedia.org/resource/AccessibleComputing)

I ran the connected components algorithm on this graph, and its output is as follows:

ccGraph.vertices.foreach(println)

(6109253945443866644,6109253945443866644)
(7558506336564503178,-323898215277667127)
(0,0)
(-2278222762001827643,-3236797006683951166)
(-9007336571746445204,-9007336571746445204)
(-3236797006683951166,-3236797006683951166)
(-4159090027031366209,-9007336571746445204)
(7722304331424482609,6109253945443866644)
(-323898215277667127,-323898215277667127)

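I cannot find a way to look up the vertex label/vertex name for each (vertexID, vertexID) pair in ccGraph, so that the output changes from
(vertexID,vertexID) => (vertexLabel,vertexLabel)

I tried the following, but it failed: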

    ccGraph.vertices.map({case arr =>  
val k1 = graph.vertices.lookup(arr(0))
val k2 = graph.vertices.lookup(arr(1))
(k1,k2)
})

<console>:51: error: (org.apache.spark.graphx.VertexId, org.apache.spark.graphx.VertexId) does not take parameters
                  ccGraph.vertices.map({case arr =>  val k1 = graph1.vertices.lookup(arr(0))
                                                                                        ^
<console>:52: error: (org.apache.spark.graphx.VertexId, org.apache.spark.graphx.VertexId) does not take parameters
                                                     val k2 = graph1.vertices.lookup(arr(1))

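【Comments】:

  • What do you mean by vertex label? When you print the vertices, they already show up as (vertex_id, vertex_value).
  • When you print the connected components graph ccGraph, it is in the form (vertexid, vertexid) rather than the usual (vertexid, vertexvalue).

Tags: scala apache-spark graph spark-graphx connected-components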


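【Answer 1】:

The connected components algorithm produces a new graph whose vertex values are set to the component ids. If you want the original values, you have to join them back with the original graph.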

ccGraph.joinVertices(graph.vertices) { (id, component_id, old_value) =>
  ...
}
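
One possible way to complete the idea is sketched below. It uses plain pair-RDD joins on the vertex RDDs instead of `joinVertices`, because the function passed to `joinVertices` has to return the graph's existing vertex type (here a `VertexId`), so it cannot turn ids into labels by itself. The sketch also avoids the two problems in the question's attempt: a tuple cannot be indexed as `arr(0)`, and one RDD cannot be queried from inside another RDD's `map`. The object name `ComponentLabels`, the helper `labelPairs`, and the three-vertex toy graph are hypothetical and only there for illustration; the code assumes string labels and a local Spark installation.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

object ComponentLabels {

  /** For every vertex, returns (own label, label of the vertex whose id is the component id). */
  def labelPairs(graph: Graph[String, String]): RDD[(String, String)] = {
    // In ccGraph the vertex value is the component id,
    // which GraphX sets to the lowest vertex id in the component.
    val ccGraph: Graph[VertexId, String] = graph.connectedComponents()

    // First join on vertex id: (vertexId, (componentId, ownLabel))
    val withOwnLabel = ccGraph.vertices.join(graph.vertices)

    // Re-key by component id so a second join can fetch that vertex's label.
    val byComponent = withOwnLabel.map {
      case (_, (componentId, ownLabel)) => (componentId, ownLabel)
    }

    // Second join on component id: (componentId, (ownLabel, componentLabel))
    byComponent.join(graph.vertices).map {
      case (_, (ownLabel, componentLabel)) => (ownLabel, componentLabel)
    }
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("cc-labels").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical toy data standing in for the DBpedia graph in the question.
      val vertices: RDD[(VertexId, String)] = sc.parallelize(Seq(
        (1L, "Futurama"), (2L, "Category:Futurama"), (3L, "AccessibleComputing")))
      val edges: RDD[Edge[String]] = sc.parallelize(Seq(Edge(1L, 2L, "subject")))

      labelPairs(Graph(vertices, edges)).collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

If the result should stay a graph rather than an RDD of pairs, `outerJoinVertices` is the variant that allows the vertex type to change; the two-join version above was chosen because the question asks for (vertexLabel, vertexLabel) pairs.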
