【Posted】: 2017-03-26 03:00:18
【Question】:
When I try to create labeled points from the output of a Vector Transformer, I run into the following problem:
val realout = output.select("label", "features").rdd.map(row => LabeledPoint(
  row.getAs[Double]("label"),
  row.getAs[org.apache.spark.mllib.linalg.SparseVector]("features")
))
The error I get is:
[error] (run-main-0) org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 13.0 failed 1 times, most recent failure: Lost task 0.0 in stage 13.0 (TID 13, localhost): java.lang.ClassCastException: org.apache.spark.ml.linalg.SparseVector cannot be cast to org.apache.spark.mllib.linalg.Vector
[error] at DataCleaning$$anonfun$1.apply(DataCleaning.scala:107)
[error] at DataCleaning$$anonfun$1.apply(DataCleaning.scala:105)
[error] at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
[error] at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:462)
[error] at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:213)
I checked the solution given in link 1, which explains how to convert vectors in Spark 2.0.0, but when I try it I get the following compile error:
object linalg is not a member of package org.apache.spark.ml
Please help. Thanks!
【Discussion】:
Tags: scala apache-spark machine-learning
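For context on the ClassCastException above: the "features" column holds the newer org.apache.spark.ml.linalg vector type, while the mllib LabeledPoint expects org.apache.spark.mllib.linalg.Vector. A cast cannot bridge the two, but Spark 2.0.0 and later provide Vectors.fromML for the conversion. The following is only a minimal sketch, assuming a Spark 2.x spark-mllib dependency is on the compile classpath; the object name FromMLSketch and the helper toLabeledPoint are hypothetical names used for illustration, not part of the original code.

```scala
import org.apache.spark.ml.linalg.{Vector => MLVector, Vectors => MLVectors}
import org.apache.spark.mllib.linalg.{Vectors => MLlibVectors}
import org.apache.spark.mllib.regression.LabeledPoint

object FromMLSketch {
  // Hypothetical helper: converts a new-style ml vector to the old mllib type
  // and wraps it in an mllib LabeledPoint.
  def toLabeledPoint(label: Double, features: MLVector): LabeledPoint =
    LabeledPoint(label, MLlibVectors.fromML(features))

  def main(args: Array[String]): Unit = {
    // Small sparse ml vector standing in for one row of the "features" column.
    val mlVec = MLVectors.sparse(4, Array(0, 3), Array(1.0, 2.0))
    println(toLabeledPoint(1.0, mlVec))

    // Applied to the DataFrame from the question (assuming `output` has a
    // Double "label" column and an ml-vector "features" column):
    // output.select("label", "features").rdd.map { row =>
    //   toLabeledPoint(row.getAs[Double]("label"), row.getAs[MLVector]("features"))
    // }
  }
}
```

The org.apache.spark.ml.linalg package was added in Spark 2.0.0, so the compile error "object linalg is not a member of package org.apache.spark.ml" may mean the build compiles against an older Spark version than the one used at runtime. That is a guess from the two error messages, so it is worth checking the Spark version declared in the build file.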