【Question Title】:How to create a new column using withColumn to concatenate two numeric columns as a String? [duplicate]
【Posted】:2019-04-19 06:31:16
【Question】:

My DataFrame is as follows:

val employees = sc.parallelize(Array[(String, Int, BigInt)](
  ("Rafferty", 31, 222222222), ("Jones", 33, 111111111), ("Heisenberg", 33, 222222222), ("Robinson", 34, 111111111), ("Smith", 34, 333333333), ("Williams", 15, 222222222)
)).toDF("LastName", "DepartmentID", "Code")

employees.show()

+----------+------------+---------+
|  LastName|DepartmentID|     Code|
+----------+------------+---------+
|  Rafferty|          31|222222222|
|     Jones|          33|111111111|
|Heisenberg|          33|222222222|
|  Robinson|          34|111111111|
|     Smith|          34|333333333|
|  Williams|          15|222222222|
+----------+------------+---------+

I want to create another column, personal_id, that is the concatenation of DepartmentID and Code. Example: Rafferty => 31222222222

So I wrote the following code:

val anotherdf = employees.withColumn("personal_id", $"DepartmentID".cast("String") + $"Code".cast("String"))


+----------+------------+---------+------------+
|  LastName|DepartmentID|     Code| personal_id|
+----------+------------+---------+------------+
|  Rafferty|          31|222222222|2.22222253E8|
|     Jones|          33|111111111|1.11111144E8|
|Heisenberg|          33|222222222|2.22222255E8|
|  Robinson|          34|111111111|1.11111145E8|
|     Smith|          34|333333333|3.33333367E8|
|  Williams|          15|222222222|2.22222237E8|
+----------+------------+---------+------------+

But my personal_id comes out as a double instead of the concatenated string.

anotherdf.printSchema

root
 |-- LastName: string (nullable = true)
 |-- DepartmentID: integer (nullable = false)
 |-- Code: decimal(38,0) (nullable = true)
 |-- personal_id: double (nullable = true) 

【Comments】:

    Tags: scala apache-spark apache-spark-sql


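    【Solution 1】:

    I should have used concat. The + operator on columns is arithmetic addition, so Spark implicitly cast both strings back to numbers and added them (31 + 222222222 = 2.22222253E8), which is why personal_id came out as a double.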

    import org.apache.spark.sql.functions.concat
    val anotherdf2 = employees.withColumn("personal_id", concat($"DepartmentID".cast("String"), $"Code".cast("String")))
    
    
    +----------+------------+---------+-----------+
    |  LastName|DepartmentID|     Code|personal_id|
    +----------+------------+---------+-----------+
    |  Rafferty|          31|222222222|31222222222|
    |     Jones|          33|111111111|33111111111|
    |Heisenberg|          33|222222222|33222222222|
    |  Robinson|          34|111111111|34111111111|
    |     Smith|          34|333333333|34333333333|
    |  Williams|          15|222222222|15222222222|
    +----------+------------+---------+-----------+
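
The difference between the two operators can be checked directly on the schema. The following is a minimal sketch, not the original poster's code: it assumes a local SparkSession, uses a two-row sample with Code stored as Long rather than BigInt, and turns ANSI mode off so that the behaviour matches the output shown in the question. The names sample, added and joined are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.concat
import org.apache.spark.sql.types.{DoubleType, StringType}

// Assumption: a local session for experimentation. In spark-shell,
// getOrCreate() reuses the session that already exists.
// Assumption: ANSI mode is off, matching the implicit cast seen in the question.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("concat-sketch")
  .config("spark.sql.ansi.enabled", "false")
  .getOrCreate()
import spark.implicits._

// Two rows from the question; Code is a Long here for brevity.
val sample = Seq(("Rafferty", 31, 222222222L), ("Jones", 33, 111111111L))
  .toDF("LastName", "DepartmentID", "Code")

// `+` is arithmetic: the string operands are implicitly cast to double and summed.
val added = sample.withColumn(
  "personal_id", $"DepartmentID".cast("string") + $"Code".cast("string"))

// concat joins the string representations instead of adding them.
val joined = sample.withColumn(
  "personal_id", concat($"DepartmentID".cast("string"), $"Code".cast("string")))

added.printSchema()   // personal_id is reported as double
joined.printSchema()  // personal_id is reported as string
joined.show()
```

Comparing `added.schema("personal_id").dataType` with `joined.schema("personal_id").dataType` is a quick way to confirm which operator was applied before running the job on the full DataFrame.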
    

    【Comments】:
