Writing a generic function that calls generic functions in Scala

【Question title】: write generic function that calls generic functions in scala
【Posted】: 2018-01-01 12:41:39
【Question】:

I am using Spark Datasets to read CSV files. I want to write a polymorphic function that does this for a number of files. Here is the function:

def loadFile[M](file: String):Dataset[M] = {
    import spark.implicits._
    val schema = Encoders.product[M].schema
    spark.read
      .option("header","false")
      .schema(schema)
      .csv(file)
      .as[M]
}

The errors I get are:

[error] <myfile>.scala:45: type arguments [M] do not conform to method product's type parameter bounds [T <: Product]
[error]     val schema = Encoders.product[M].schema
[error]                                  ^
[error] <myfile>.scala:50: Unable to find encoder for type stored in a Dataset.  Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._  Support for serializing other types will be added in future releases.
[error]       .as[M]
[error]          ^
[error] two errors found

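I don't know what to do about the first error. I tried adding the same type bound as in the definition of product (M <: Product), but then I get the error "No TypeTag available for M".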

If I pass in a schema that has already been generated from the encoder, I then get the error:

[error] Unable to find encoder for type stored in a Dataset
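
Both errors trace back to what Encoders.product asks of its type parameter: an upper bound of Product and an implicit TypeTag. The following is only a sketch of how the attempt described above could be made to compile, assuming the SparkSession is passed in as a parameter (that parameter and the object name LoadFileWithTypeTag are not part of the original code):

```scala
import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession}
import scala.reflect.runtime.universe.TypeTag

object LoadFileWithTypeTag {
  // M must be a Product (e.g. a case class) and the caller must supply a TypeTag for it.
  // Passing `spark` explicitly is an assumption made to keep the sketch self-contained.
  def loadFile[M <: Product : TypeTag](spark: SparkSession, file: String): Dataset[M] = {
    // Encoders.product needs both the Product bound and the TypeTag.
    val encoder: Encoder[M] = Encoders.product[M]
    spark.read
      .option("header", "false")
      .schema(encoder.schema)
      .csv(file)
      .as[M](encoder) // hand the encoder over explicitly instead of relying on implicits
  }
}
```

This variant only accepts Product types, whereas a plain Encoder requirement would also admit primitives and custom encoders.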

【Comments】:

Tags: scala generics apache-spark

【Solution 1】:

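You need to require that anyone calling loadFile[M] provides evidence that an encoder for M exists. You can do this with a context bound on M, which requires an implicit Encoder[M]: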

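import org.apache.spark.sql.{Dataset, Encoder}

// The context bound [M : Encoder] makes the caller supply an implicit Encoder[M].
def loadFile[M : Encoder](file: String): Dataset[M] = {
  import spark.implicits._
  // Derive the CSV schema from the encoder the caller provided.
  val schema = implicitly[Encoder[M]].schema
  spark.read
    .option("header", "false")
    .schema(schema)
    .csv(file)
    .as[M]
}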
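
To show what "providing the evidence" looks like at the call site, here is a minimal sketch. The Person case class, the local SparkSession settings, and the people.csv path are all assumptions made for illustration. The case class does not extend Encoder; the implicit Encoder[Person] comes from spark.implicits._ where loadFile is called.

```scala
import org.apache.spark.sql.{Dataset, Encoder, SparkSession}

// Hypothetical record type; the field names and types are assumptions.
case class Person(name: String, age: Int)

object LoadFileExample {
  // Assumed local session; a real job would configure this differently.
  val spark: SparkSession = SparkSession.builder()
    .master("local[*]")
    .appName("loadFile-example")
    .getOrCreate()

  // Same shape as the answer's function: the context bound asks the caller for an Encoder[M].
  def loadFile[M: Encoder](file: String): Dataset[M] =
    spark.read
      .option("header", "false")
      .schema(implicitly[Encoder[M]].schema)
      .csv(file)
      .as[M]

  def main(args: Array[String]): Unit = {
    // Brings the implicit encoders for case classes and primitives into scope here.
    import spark.implicits._

    // "people.csv" is a placeholder path.
    val people: Dataset[Person] = loadFile[Person]("people.csv")
    people.show()
    spark.stop()
  }
}
```

Because the encoder is resolved where M is a concrete type, the compiler can find it; inside the generic function M is abstract, which is why the original version failed.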

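【Discussion】:

Thanks! That definitely compiles, but I'm running into access issues and out-of-memory issues when I run my program, even though I'm not calling that function. I assume I could have my case classes extend Encoder, and that it should work if I didn't have these other runtime issues?
@kim This is a compile-time requirement; it shouldn't affect runtime at all. Something else is probably causing your code to OOM.
I've decided not to use Spark, which sidesteps the whole encoder issue, but I did find this question, which talks about encoders for custom objects. I'll come back and figure it out when I have time. I'm marking this as my answer, though, because it put me on the right track.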
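
On encoders for custom objects: one option Spark offers for classes that are not case classes is a Kryo-based encoder. The following is a rough sketch only, with a hypothetical Point class and object name. Such an encoder stores each object as a single binary column, so it would not fit the CSV schema approach used in loadFile above.

```scala
import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession}

// Hypothetical non-case class, so spark.implicits._ offers no encoder for it.
class Point(val x: Double, val y: Double) extends Serializable

object KryoEncoderExample {
  // Kryo-based encoder: each Point is serialized into one opaque binary column.
  implicit val pointEncoder: Encoder[Point] = Encoders.kryo[Point]

  // Builds a small Dataset[Point] using the implicit encoder defined above.
  def points(spark: SparkSession): Dataset[Point] =
    spark.createDataset(Seq(new Point(1.0, 2.0), new Point(3.0, 4.0)))
}
```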