【Question title】: How to dynamically instantiate ArrayBuffer[Type] in Scala
【Posted】: 2021-04-03 08:42:45
【Question】:

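I want to create an ArrayBuffer in Scala without fixing its element type up front. I want to check a condition first and then pass the type in dynamically. See the code below.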

def rowGen(startNumber:Int,tableIdentifier:String,NumRows:Int)={
var tmpArrayBuffer:collection.mutable.ArrayBuffer[_]=null  // I tried [T] here. That didn't work either.
tableIdentifier match {
case value if value==baseTable => tmpArrayBuffer= new collection.mutable.ArrayBuffer[(String,String,String,String)]()
case value if value==batchTable => tmpArrayBuffer= new collection.mutable.ArrayBuffer[(String,String)]()
}
for (currentNum <- startNumber to startNumber+NumRows)
tableIdentifier match {
case value if value==baseTable => tmpArrayBuffer+=(s"col1-${currentNum}",s"col2-${currentNum}",s"col3-${currentNum}",s"col4-${currentNum}")
case value if value==batchTable => tmpArrayBuffer+=(s"col1-${currentNum}",s"col2-${currentNum}")
}
tableIdentifier match {
case value if value==baseTable => tmpArrayBuffer.toSeq.toDF("col1","col2","col3","col4")
case value if value==batchTable => tmpArrayBuffer.toSeq.toDF("col1","col2")
}
}

Please help me solve this. Depending on the condition, I want to instantiate either ArrayBuffer[(String,String)] or ArrayBuffer[(String,String,String,String)].

【Comments】:

    Tags: scala apache-spark user-defined-functions scala-collections


    【Solution 1】:

    I would simply define the array buffer inside the match:

    import org.apache.spark.sql.DataFrame
    // toDF needs spark.implicits._ in scope (spark-shell imports it automatically)
    
    val baseTable = "baseTable"
    val batchTable = "batchTable"
    
    def rowGen(startNumber:Int, tableIdentifier:String, NumRows:Int) : DataFrame = {
        // Backticks match against the existing vals instead of binding new names
        tableIdentifier match {
            case `baseTable` => {
                // val suffices: the buffer is mutated in place, never reassigned
                val tmpArrayBuffer = new collection.mutable.ArrayBuffer[(String,String,String,String)]
                // `to` is inclusive, so this appends NumRows + 1 rows
                for (currentNum <- startNumber to startNumber+NumRows){
                    tmpArrayBuffer += ((s"col1-${currentNum}",s"col2-${currentNum}",s"col3-${currentNum}",s"col4-${currentNum}"))
                }
                tmpArrayBuffer.toSeq.toDF("col1","col2","col3","col4")
            }
            case `batchTable` => {
                val tmpArrayBuffer = new collection.mutable.ArrayBuffer[(String,String)]
                for (currentNum <- startNumber to startNumber+NumRows) {
                    tmpArrayBuffer += ((s"col1-${currentNum}",s"col2-${currentNum}"))
                }
                tmpArrayBuffer.toSeq.toDF("col1","col2")
            }
        }
    }
    
    scala> rowGen(1, "batchTable", 5).show
    +------+------+
    |  col1|  col2|
    +------+------+
    |col1-1|col2-1|
    |col1-2|col2-2|
    |col1-3|col2-3|
    |col1-4|col2-4|
    |col1-5|col2-5|
    |col1-6|col2-6|
    +------+------+
    
    scala> rowGen(1, "baseTable", 5).show
    +------+------+------+------+
    |  col1|  col2|  col3|  col4|
    +------+------+------+------+
    |col1-1|col2-1|col3-1|col4-1|
    |col1-2|col2-2|col3-2|col4-2|
    |col1-3|col2-3|col3-3|col4-3|
    |col1-4|col2-4|col3-4|col4-4|
    |col1-5|col2-5|col3-5|col4-5|
    |col1-6|col2-6|col3-6|col4-6|
    +------+------+------+------+
    
    

    Alternatively, as suggested in the comments, Seq.newBuilder is a better fit:

    import org.apache.spark.sql.DataFrame
    // toDF needs spark.implicits._ in scope (spark-shell imports it automatically)
    
    val baseTable = "baseTable"
    val batchTable = "batchTable"
    
    def rowGen(startNumber:Int, tableIdentifier:String, NumRows:Int) : DataFrame = {
        tableIdentifier match {
            case `baseTable` => {
                // The builder accumulates rows and yields an immutable Seq on result()
                val tmpArrayBuffer = Seq.newBuilder[(String,String,String,String)]
                for (currentNum <- startNumber to startNumber+NumRows){
                    tmpArrayBuffer += ((s"col1-${currentNum}",s"col2-${currentNum}",s"col3-${currentNum}",s"col4-${currentNum}"))
                }
                tmpArrayBuffer.result().toDF("col1","col2","col3","col4")
            }
            case `batchTable` => {
                val tmpArrayBuffer = Seq.newBuilder[(String,String)]
                for (currentNum <- startNumber to startNumber+NumRows) {
                    tmpArrayBuffer += ((s"col1-${currentNum}",s"col2-${currentNum}"))
                }
                tmpArrayBuffer.result().toDF("col1","col2")
            }
        }
    }
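
Since each branch only turns a range of numbers into tuples, the builder can arguably be dropped altogether by mapping over the range. The following is a minimal sketch of that idea, with Spark left out so it runs standalone; the helper names `batchRows` and `baseRows` are hypothetical and not part of the answer above.

```scala
// Hypothetical helpers: one per table shape, each returning an immutable Seq.
// `to` is inclusive, so numRows = 5 yields 6 rows, as in the output shown above.
def batchRows(startNumber: Int, numRows: Int): Seq[(String, String)] =
  (startNumber to startNumber + numRows).map(n => (s"col1-$n", s"col2-$n"))

def baseRows(startNumber: Int, numRows: Int): Seq[(String, String, String, String)] =
  (startNumber to startNumber + numRows).map { n =>
    (s"col1-$n", s"col2-$n", s"col3-$n", s"col4-$n")
  }
```

Inside `rowGen`, each `case` could then call the matching helper and apply `toDF` with the column names for that table.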
    

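    【Discussion】:

    • Just use Seq.newBuilder directly; I don't see what ArrayBuffer buys you here.
    • Thanks @cchantep and mck for the help. I thought we could use the same variable in different places, but Seq.newBuilder works fine too.
    • @Raptor0009 You can, but I don't think there is any point in instantiating it dynamically beforehand. I simply refactored the code to avoid the need, as I did in my answer.
    • @mck If possible, could you show me sample code that passes the dType to ArrayBuffer dynamically? I'm just curious.
    • @Raptor0009 It's not obvious to me how to do that... you might need some fancy polymorphism to achieve it.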
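
One possible reading of the "fancy polymorphism" mentioned in the last comment is a type parameter. The element type of an ArrayBuffer has to be known at compile time, so a runtime String such as `tableIdentifier` cannot select it directly, but a generic helper can leave the choice to each call site. The sketch below is an assumption about what that could look like, not code from the thread; `fillBuffer` and `mkRow` are hypothetical names, and Spark is omitted so the snippet runs on its own.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical helper: T is inferred from the row-building function,
// so the same code fills a buffer of pairs or of 4-tuples.
def fillBuffer[T](startNumber: Int, numRows: Int)(mkRow: Int => T): ArrayBuffer[T] = {
  val buf = ArrayBuffer.empty[T]
  // Inclusive range, mirroring the loops in the answer (numRows + 1 rows)
  for (n <- startNumber to startNumber + numRows) buf += mkRow(n)
  buf
}

// Each call site fixes T at compile time through the function it passes.
val pairs = fillBuffer(1, 5)(n => (s"col1-$n", s"col2-$n"))
val quads = fillBuffer(1, 5)(n => (s"col1-$n", s"col2-$n", s"col3-$n", s"col4-$n"))
```

The match on `tableIdentifier` is still needed to decide which call to make, which is why restructuring the code as in the answer is usually simpler. Calling `toDF` on the result would additionally require an implicit Spark Encoder for `T` to be in scope.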