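【Posted】: 2018-02-12 14:28:02
【Question】:
Spark: I want to explode several comma-separated columns and merge them into a single column, with each value on its own row next to the name of the column it came from.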
Input data:
+-----------+-----------+-----------+
| ASMT_ID   | WORKER    | LABOR     |
+-----------+-----------+-----------+
| 1         | A1,A2,A3  | B1,B2     |
+-----------+-----------+-----------+
| 2         | A1,A4     | B1        |
+-----------+-----------+-----------+
Expected Output:
+-----------+-----------+-----------+
| ASMT_ID   | WRK_CODE  | WRK_DETL  |
+-----------+-----------+-----------+
| 1         | A1        | WORKER    |
+-----------+-----------+-----------+
| 1         | A2        | WORKER    |
+-----------+-----------+-----------+
| 1         | A3        | WORKER    |
+-----------+-----------+-----------+
| 1         | B1        | LABOR     |
+-----------+-----------+-----------+
| 1         | B2        | LABOR     |
+-----------+-----------+-----------+
| 2         | A1        | WORKER    |
+-----------+-----------+-----------+
| 2         | A4        | WORKER    |
+-----------+-----------+-----------+
| 2         | B1        | LABOR     |
+-----------+-----------+-----------+
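To pin down the transformation being asked for, here is a minimal plain-Python sketch of the reshape. It is not Spark code and not an answer from the thread; the helper name `unpivot_and_explode` and its parameters are hypothetical, and the comma separator is taken from the sample input above.

```python
from typing import Dict, List, Tuple


def unpivot_and_explode(
    rows: List[Dict[str, object]],
    id_col: str,
    value_cols: List[str],
    sep: str = ",",
) -> List[Tuple[object, str, str]]:
    """Turn each delimited cell into one (id, code, source_column) row.

    Hypothetical helper for illustration only: it models the reshape
    in plain Python and does not use any Spark API.
    """
    out: List[Tuple[object, str, str]] = []
    for row in rows:
        for col in value_cols:
            raw = row.get(col)
            if not raw:
                # Skip missing or empty cells instead of emitting blank codes.
                continue
            for code in str(raw).split(sep):
                code = code.strip()
                if code:
                    # WRK_CODE is the split value, WRK_DETL is the column name.
                    out.append((row[id_col], code, col))
    return out


# Sample data copied from the question's input table.
rows = [
    {"ASMT_ID": 1, "WORKER": "A1,A2,A3", "LABOR": "B1,B2"},
    {"ASMT_ID": 2, "WORKER": "A1,A4", "LABOR": "B1"},
]

result = unpivot_and_explode(rows, "ASMT_ID", ["WORKER", "LABOR"])
for asmt_id, wrk_code, wrk_detl in result:
    print(asmt_id, wrk_code, wrk_detl)
```

On the sample input this yields the eight rows of the expected output, in the same order. In Spark itself, one plausible mapping of these steps (an assumption, not something confirmed in the thread) is to apply `split` and `explode` from `pyspark.sql.functions` to each source column, attach the column name with `lit`, and combine the per-column results with `DataFrame.union`.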
【Discussion】:
- Please post the code you have tried, at least the part that loads the data into Spark.
Tags: apache-spark apache-spark-sql spark-dataframe apache-spark-dataset