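【Posted】: 2020-11-30 21:08:50
【Question】:
I'm wondering whether anyone knows a quick way to pivot a dataframe in pandas to achieve the transformation shown below. It is a kind of wide-to-long pivot, but not exactly.
Input dataframe structure (needs to support N categories, not just the 3 shown below):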
+------+--------------+----------+----------+-----------+--------------+----------+----------+-----------+--------------+----------+----------+-----------+
| id | catA_present | catA_pos | catA_neg | catA_ntrl | catB_present | catB_pos | catB_neg | catB_ntrl | catC_present | catC_pos | catC_neg | catC_ntrl |
+------+--------------+----------+----------+-----------+--------------+----------+----------+-----------+--------------+----------+----------+-----------+
| 0001 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 |
+------+--------------+----------+----------+-----------+--------------+----------+----------+-----------+--------------+----------+----------+-----------+
| 0002 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 |
+------+--------------+----------+----------+-----------+--------------+----------+----------+-----------+--------------+----------+----------+-----------+
| 0003 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
+------+--------------+----------+----------+-----------+--------------+----------+----------+-----------+--------------+----------+----------+-----------+
| 0004 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 1 |
+------+--------------+----------+----------+-----------+--------------+----------+----------+-----------+--------------+----------+----------+-----------+
| 0005 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 |
+------+--------------+----------+----------+-----------+--------------+----------+----------+-----------+--------------+----------+----------+-----------+
Output transformed dataframe structure (needs to support N categories, not just the 3 shown in the example):
+------+------+-------+------+-------+------+-------+
| id | cat1 | sent1 | cat2 | sent2 | cat3 | sent3 |
+------+------+-------+------+-------+------+-------+
| 0001 | catA | pos | catC | neg | NULL | NULL |
+------+------+-------+------+-------+------+-------+
| 0002 | catB | pos | catC | pos | NULL | NULL |
+------+------+-------+------+-------+------+-------+
| 0003 | catA | ntrl | catB | ntrl | NULL | NULL |
+------+------+-------+------+-------+------+-------+
| 0004 | catA | pos | catB | pos | catC | ntrl |
+------+------+-------+------+-------+------+-------+
| 0005 | catC | neg | NULL | NULL | NULL | NULL |
+------+------+-------+------+-------+------+-------+
【Discussion】:
Tags: python pandas pivot transform melt
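
One possible approach is to melt the sentiment flag columns into long form, keep only the flags equal to 1, number each id's hits with `cumcount`, and pivot back to wide. The sketch below is not a definitive answer. It assumes that column names follow the `<category>_<sentiment>` pattern from the example, that at most one sentiment flag is set per category, that ids are unique, and that at least one flag is set somewhere in the frame. The function name `pivot_categories` and its parameters are hypothetical, and `pivot` with keyword arguments and a list of `values` needs a reasonably recent pandas.

```python
import pandas as pd


def pivot_categories(df, id_col="id", sentiments=("pos", "neg", "ntrl")):
    """Turn <cat>_<sentiment> flag columns into cat1/sent1, cat2/sent2, ... slots."""
    # Only the sentiment flags are needed; the *_present columns are implied by them.
    flag_cols = [c for c in df.columns
                 if c != id_col and c.rsplit("_", 1)[-1] in sentiments]
    parts = {c: c.rsplit("_", 1) for c in flag_cols}   # column -> [category, sentiment]
    n_cats = len({cat for cat, _ in parts.values()})   # N categories -> N output slots

    # Wide -> long: one row per (id, flag column), then keep only the set flags.
    long = df.melt(id_vars=id_col, value_vars=flag_cols,
                   var_name="col", value_name="flag")
    long = long[long["flag"] == 1].copy()
    long["cat"] = long["col"].map(lambda c: parts[c][0])
    long["sent"] = long["col"].map(lambda c: parts[c][1])

    # melt stacks columns in their original order, so cumcount numbers each
    # id's categories in left-to-right column order (1, 2, 3, ...).
    long["slot"] = long.groupby(id_col).cumcount() + 1

    # Long -> wide again, one (cat, sent) pair per slot.
    wide = long.pivot(index=id_col, columns="slot", values=["cat", "sent"])

    # Force exactly N slots, interleaved as cat1, sent1, cat2, sent2, ...
    order = [(kind, k) for k in range(1, n_cats + 1) for kind in ("cat", "sent")]
    wide = wide.reindex(columns=pd.MultiIndex.from_tuples(order))
    wide.columns = [f"{kind}{k}" for kind, k in wide.columns]

    # Restore the original id order and keep ids that have no category at all.
    return wide.reindex(pd.Index(df[id_col], name=id_col)).reset_index()


# Example data copied from the question.
columns = ["id"] + [f"cat{c}_{s}" for c in "ABC"
                    for s in ("present", "pos", "neg", "ntrl")]
rows = [
    ["0001", 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0],
    ["0002", 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0],
    ["0003", 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0],
    ["0004", 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1],
    ["0005", 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0],
]
df = pd.DataFrame(rows, columns=columns)
out = pivot_categories(df)
print(out)
```

Unused slots come back as NaN rather than the literal NULL shown in the desired output. If a category could carry more than one sentiment flag, it would occupy more than one slot and the fixed N-slot reindex could drop data, so that case would need separate handling.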