value _2 is not a member of Double in spark-shell: answers

【Question title】: value _2 is not a member of Double spark-shell
【Posted】: 2017-01-28 15:57:24
【Question description】:

I get an error when using aggregateByKey in the Spark Scala shell.

The code I am trying to run in the shell is:

val orderItemsMapJoinOrdersMapMapAgg = orderItemsMapJoinOrdersMapMap
  .aggregateByKey(0.0,0)(
      (a,b) => (a._1 + b , a._2 + 1),
      (a,b) => (a._1 + b._1 , a._2 + b._2 )
  )

But I get the following error:

<console>:39: error: value _1 is not a member of Double
         val orderItemsMapJoinOrdersMapMapAgg = orderItemsMapJoinOrdersMapMap.aggregateByKey( 0.0,0)( (a,b) => (a._1 + b , a._2 +1), (a,b) => (a._1 + b._1 , a._2 + b._2 ))


scala> orderItemsMapJoinOrdersMapMap
res8: org.apache.spark.rdd.RDD[(String, Float)] = MapPartitionsRDD[16] at map at <console>:37

Can someone help me understand the Double versus Float logic here, and how to fix this?

【Question comments】:

Tags: scala apache-spark

【Solution 1】:

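The problem is how you pass the first (curried) argument list. Writing aggregateByKey(0.0, 0) passes two separate arguments, so the call resolves to the overload that takes a zero value and a partition count: the zero value is the Double 0.0, and 0 is read as numPartitions. The accumulator therefore has type Double, which has no _1 or _2 member. Wrap the zero value in a tuple instead. It should look like this: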

import org.apache.spark.rdd.RDD

val orderItemsMapJoinOrdersMapMap: RDD[(String, Float)] = ...

// The elements of orderItemsMapJoinOrdersMapMap are (String, Float) pairs,
// so the per-key values handed to the first function are plain Floats.

// The accumulator is a (Double, Int) tuple: (running sum, element count).

// I believe you just want to accumulate, per key, the sum of the floats and the number of elements.

val orderItemsMapJoinOrdersMapMapAgg = orderItemsMapJoinOrdersMapMap
  .aggregateByKey((0.0, 0))(
      // seqOp: fold one Float value into the accumulator
      (acc, elem) => (acc._1 + elem, acc._2 + 1),
      // combOp: merge two partial accumulators
      (acc1, acc2) => (acc1._1 + acc2._1, acc1._2 + acc2._2)
  )
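
To see the tuple-shaped zero value in action, here is a minimal sketch that can be pasted into spark-shell. The sample data, the names `sample`, `sumAndCount` and `avgPerKey`, and the final averaging step are all assumptions made for illustration. The question only shows the sum and the count being accumulated.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Reuse the running context (as in spark-shell) or start a local one.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[2]").setAppName("aggregateByKey-sketch"))

// Hypothetical sample data standing in for orderItemsMapJoinOrdersMapMap.
val sample = sc.parallelize(Seq(("a", 1.5f), ("a", 2.5f), ("b", 4.0f)))

// Zero value is ONE argument: a (Double, Int) tuple of (sum, count).
val sumAndCount = sample.aggregateByKey((0.0, 0))(
  (acc, v) => (acc._1 + v, acc._2 + 1),   // fold a Float into the accumulator
  (a, b) => (a._1 + b._1, a._2 + b._2))   // merge accumulators across partitions

// Assumed follow-up step: turn (sum, count) into an average per key.
val avgPerKey = sumAndCount
  .mapValues { case (sum, count) => sum / count }
  .collect()
  .sortBy(_._1)   // collect() order is not guaranteed, so sort by key
  .toList

avgPerKey.foreach(println)
```

Because the accumulator type is fixed by the first argument list, the two functions need no type annotations. The sort is there only to make the printed order stable.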

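【Comments】:

Thanks @Sarvesh Kumar Singh for the quick and helpful answer. I would like to know why the output RDD has type RDD[(String, (Double, Int))] rather than RDD[(String, (Float, Int))], given that I used toFloat in an earlier map operation.
0.0 is a Double literal, not a Float, so the accumulator becomes (Double, Int). If you want a Float, use 0.0f.
Thanks, @Sarvesh Kumar Singh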
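
The effect of the zero-value literal on the result type can be checked without Spark. The sketch below uses `foldLeft` on a plain Scala collection as a stand-in for the first function passed to aggregateByKey. In both cases the accumulator type is inferred from the zero value. The names `values`, `asDouble` and `asFloat` are made up for this example.

```scala
// Float inputs, like the values of the RDD[(String, Float)] in the question.
val values: Seq[Float] = Seq(1.5f, 2.5f)

// 0.0 is a Double literal, so the accumulator is (Double, Int);
// adding a Float to a Double widens the Float to Double.
val asDouble: (Double, Int) =
  values.foldLeft((0.0, 0)) { (acc, v) => (acc._1 + v, acc._2 + 1) }

// 0.0f is a Float literal, so the accumulator stays (Float, Int).
val asFloat: (Float, Int) =
  values.foldLeft((0.0f, 0)) { (acc, v) => (acc._1 + v, acc._2 + 1) }

println(asDouble)
println(asFloat)
```

Keeping the Double accumulator is often a reasonable choice when summing many Floats, since it loses less precision. Switching to 0.0f mainly matters if the downstream code expects a Float.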