【Question】: Spark error "type RDD not found" when creating an RDD
【Posted】: 2014-12-25 09:49:35
【Description】:

I am trying to create an RDD of case class objects. For example,

// sqlContext from the previous example is used in this example.
// createSchemaRDD is used to implicitly convert an RDD to a SchemaRDD.
import sqlContext.createSchemaRDD

val people: RDD[Person] = ... // An RDD of case class objects, from the previous example.

// The RDD is implicitly converted to a SchemaRDD by createSchemaRDD, allowing it to be stored using Parquet.
people.saveAsParquetFile("people.parquet")

I am trying to reproduce the part labelled "from the previous example" by running:
    case class Person(name: String, age: Int)

    // Create an RDD of Person objects and register it as a table.
    val people: RDD[Person] = sc.textFile("/user/root/people.txt").map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt))
    people.registerAsTable("people")

I get the following error:

<console>:28: error: not found: type RDD
       val people: RDD[Person] =sc.textFile("/user/root/people.txt").map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt))

Any idea what is going wrong? Thanks in advance!

【Discussion】:

    Tags: scala apache-spark apache-spark-sql


    【Solution 1】:

    The problem here is the explicit `RDD[Person]` type annotation. It seems that `RDD` is not imported by default in `spark-shell`, which is why Scala complains that it cannot find the type `RDD`. Try running `import org.apache.spark.rdd.RDD` first.
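    A minimal sketch of the fixed session, assuming the Spark 1.x `spark-shell` from the question (where `sc` and `sqlContext` are predefined) and a `/user/root/people.txt` whose lines look like `name,age`:

    ```scala
    // Import the RDD type so the explicit annotation compiles in spark-shell.
    import org.apache.spark.rdd.RDD
    // Implicit RDD -> SchemaRDD conversion (Spark 1.x SQL API).
    import sqlContext.createSchemaRDD

    case class Person(name: String, age: Int)

    // Parse each "name,age" line into a Person object.
    val people: RDD[Person] = sc.textFile("/user/root/people.txt")
      .map(_.split(","))
      .map(p => Person(p(0), p(1).trim.toInt))

    people.registerAsTable("people")
    people.saveAsParquetFile("people.parquet")
    ```

    Note that `registerAsTable` and `saveAsParquetFile` belong to the Spark 1.x SchemaRDD API used in the question; later Spark versions replaced them with `createOrReplaceTempView` and `write.parquet` on DataFrames.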

    【Comments】:
