【Posted】:2021-11-21 17:26:36
【Question】:
I have an Excel spreadsheet that, once imported, looks something like this:
from datetime import datetime

import numpy as np
import pandas as pd

# One column per payment date, one row per record; NaN means no payment on that date.
df = pd.DataFrame({
    datetime(2021, 8, 1): [120, np.nan, np.nan, np.nan, 300],
    datetime(2021, 9, 1): [np.nan, np.nan, 50, np.nan, np.nan],
    datetime(2021, 10, 1): [np.nan, 40, np.nan, 100, np.nan],
    datetime(2021, 11, 1): [80, np.nan, 50, np.nan, np.nan],
    datetime(2021, 12, 1): [np.nan, 20, np.nan, np.nan, np.nan]})
| 2021-08-01 | 2021-09-01 | 2021-10-01 | 2021-11-01 | 2021-12-01 |
|---|---|---|---|---|
| 120 | NaN | NaN | 80 | NaN |
| NaN | NaN | 40 | NaN | 20 |
| NaN | 50 | NaN | 50 | NaN |
| NaN | NaN | 100 | NaN | NaN |
| 300 | NaN | NaN | NaN | NaN |
I am looking to convert it (in Python) into something like this:
# Missing second payments are written as pd.NaT / np.nan to match the table below.
shouldbe = pd.DataFrame({
    "PayDate1":
        [datetime(2021, 8, 1), datetime(2021, 10, 1), datetime(2021, 9, 1), datetime(2021, 10, 1), datetime(2021, 8, 1)],
    "Amount1": [120, 40, 50, 100, 300],
    "PayDate2":
        [datetime(2021, 11, 1), datetime(2021, 12, 1), datetime(2021, 11, 1), pd.NaT, pd.NaT],
    "Amount2": [80, 20, 50, np.nan, np.nan]})
| PayDate1 | Amount1 | PayDate2 | Amount2 |
|---|---|---|---|
| 2021-08-01 | 120 | 2021-11-01 | 80 |
| 2021-10-01 | 40 | 2021-12-01 | 20 |
| 2021-09-01 | 50 | 2021-11-01 | 50 |
| 2021-10-01 | 100 | NaT | NaN |
| 2021-08-01 | 300 | NaT | NaN |
I am looking for examples of how to do this conversion. Thanks in advance for your help.
【Comments】:
-
Take a look at pandas.DataFrame.pivot, or get the list of dates and build the data manually
-
@2e0byo. Using pivot is not as obvious as it looks. There is still a long way to go to reach the final dataframe. Check my answer if you like :)
-
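@Corralien Indeed there is; nice answer. I didn't have time to work it out, although looking at your answer, I would just loop and accept the execution time rather than fight with pandas. Very nice though!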
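For readers following the pivot suggestion above, here is a minimal sketch of one way it could go. It is not the answer referred to in the comments. It assumes pandas 1.1 or later (for `ignore_index` in `melt`), and the names `long`, `n`, `dates`, `amounts`, `order` and `result` are made up for illustration. The idea: melt to long format, drop the empty cells, number each row's payments in date order, then pivot that number back into columns.

```python
from datetime import datetime

import numpy as np
import pandas as pd

df = pd.DataFrame({
    datetime(2021, 8, 1): [120, np.nan, np.nan, np.nan, 300],
    datetime(2021, 9, 1): [np.nan, np.nan, 50, np.nan, np.nan],
    datetime(2021, 10, 1): [np.nan, 40, np.nan, 100, np.nan],
    datetime(2021, 11, 1): [80, np.nan, 50, np.nan, np.nan],
    datetime(2021, 12, 1): [np.nan, 20, np.nan, np.nan, np.nan],
})

# Long format: one line per (original row, date) cell, keeping the original row label.
long = (
    df.melt(var_name="PayDate", value_name="Amount", ignore_index=False)
      .rename_axis("row")
      .reset_index()
      .dropna(subset=["Amount"])          # keep only cells that hold a payment
)
long["PayDate"] = pd.to_datetime(long["PayDate"])
long = long.sort_values(["row", "PayDate"])

# Number the payments of each original row: 1 for the earliest, 2 for the next, ...
long["n"] = long.groupby("row").cumcount() + 1

# Pivot dates and amounts separately so each keeps its own dtype (NaT vs NaN).
dates = long.pivot(index="row", columns="n", values="PayDate").add_prefix("PayDate")
amounts = long.pivot(index="row", columns="n", values="Amount").add_prefix("Amount")

# Interleave the columns: PayDate1, Amount1, PayDate2, Amount2, ...
order = [f"{name}{i}" for i in sorted(long["n"].unique()) for name in ("PayDate", "Amount")]
result = pd.concat([dates, amounts], axis=1)[order]
print(result)
```

Note that a row with no payments at all would disappear from `result` in this sketch; `result.reindex(df.index)` would bring it back as an all-missing row if that matters.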
Tags: python pandas dataframe data-cleaning