【Posted】: 2020-08-03 22:25:48
【Question】:
df
          Id                timestamp  data        Date
30424  30665  2020-01-04 19:40:23.827  17.5  2020-01-04
31054  31295  2020-01-05 22:26:39.860  17.0  2020-01-05
32150  32391  2020-01-06 23:00:14.607  18.0  2020-01-06
33236  33477  2020-01-07 22:52:56.757  18.0  2020-01-07
34314  34555  2020-01-08 20:45:48.927  18.0  2020-01-08
35592  35833  2020-01-09 20:56:21.320  18.0  2020-01-09
36528  36769  2020-01-10 20:41:36.323  19.5  2020-01-10
37054  37295  2020-01-11 19:35:50.553  18.5  2020-01-11
37652  37893  2020-01-12 19:28:22.823  17.0  2020-01-12
38828  39069  2020-01-13 23:48:12.533  21.5  2020-01-13
40004  40245  2020-01-14 22:50:56.873  18.5  2020-01-14
df1
         Date  data
0  2020-01-04   NaN
1  2020-01-07   NaN
2  2020-01-08  19.0
3  2020-01-09   NaN
4  2020-01-11   NaN
5  2020-01-12   NaN
6  2020-01-16   NaN
7  2020-01-17   NaN
8  2020-01-24  18.5
Where a value in df1['data'] is not NaN, I want to use it to replace the data value in df for the row with the same Date.
Expected result:
          Id                timestamp  data        Date
30424  30665  2020-01-04 19:40:23.827  17.5  2020-01-04
31054  31295  2020-01-05 22:26:39.860  17.0  2020-01-05
32150  32391  2020-01-06 23:00:14.607  18.0  2020-01-06
33236  33477  2020-01-07 22:52:56.757  18.0  2020-01-07
34314  34555  2020-01-08 20:45:48.927  19.0  2020-01-08  # This row changed
35592  35833  2020-01-09 20:56:21.320  18.0  2020-01-09
36528  36769  2020-01-10 20:41:36.323  19.5  2020-01-10
37054  37295  2020-01-11 19:35:50.553  18.5  2020-01-11
37652  37893  2020-01-12 19:28:22.823  17.0  2020-01-12
38828  39069  2020-01-13 23:48:12.533  21.5  2020-01-13
40004  40245  2020-01-14 22:50:56.873  18.5  2020-01-14
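The replacement rule described above could be sketched with `DataFrame.update`, which by default copies only non-NaN values from the other frame. This is a minimal sketch, not a definitive solution: it assumes `Date` has the same dtype in both frames and is unique in each, and the sample frames are a trimmed copy of the data shown above. The names `by_date` and `new_values` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Trimmed copy of the sample data from the question.
df = pd.DataFrame(
    {
        "Id": [30665, 33477, 34555, 35833],
        "data": [17.5, 18.0, 18.0, 18.0],
        "Date": pd.to_datetime(
            ["2020-01-04", "2020-01-07", "2020-01-08", "2020-01-09"]
        ),
    },
    index=[30424, 33236, 34314, 35592],
)
df1 = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2020-01-04", "2020-01-08", "2020-01-24"]),
        "data": [np.nan, 19.0, 18.5],
    }
)

# Align both frames on Date (assumed unique in each frame).
by_date = df.set_index("Date")
new_values = df1.set_index("Date")[["data"]]

# update() works in place and skips NaN values in new_values;
# dates that exist only in df1 are ignored.
by_date.update(new_values)

# set_index() preserved the row order, so copy the values back by position.
df["data"] = by_date["data"].to_numpy()
```

With this trimmed sample, only the 2020-01-08 row changes (18.0 becomes 19.0), and df keeps its original index.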
This answer is similar to my question, but the situation is not quite the same.
I tried:
pd.merge(df, df1, how='left', on='Date')
which returns:
      Id                timestamp  data_x        Date  data_y
0  30665  2020-01-04 19:40:23.827    17.5  2020-01-04     NaN
1  31295  2020-01-05 22:26:39.860    17.0  2020-01-05     NaN
2  32391  2020-01-06 23:00:14.607    18.0  2020-01-06     NaN
3  33477  2020-01-07 22:52:56.757    18.0  2020-01-07     NaN
4  34555  2020-01-08 20:45:48.927    18.0  2020-01-08    19.0
5  35833  2020-01-09 20:56:21.320    18.0  2020-01-09     NaN
6  36769  2020-01-10 20:41:36.323    19.5  2020-01-10     NaN
7  37295  2020-01-11 19:35:50.553    18.5  2020-01-11     NaN
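The merge output above already carries both values side by side, and it has lost df's original index. One possible way to finish this attempt is to give the incoming column an explicit suffix, prefer it wherever it is not NaN, and then drop it. This is a sketch under assumptions: `Date` is unique in df1 (so the left merge keeps df's row count and order), the `_new` suffix is an arbitrary choice, and the sample frames are a trimmed copy of the question's data.

```python
import numpy as np
import pandas as pd

# Trimmed copy of the sample data from the question.
df = pd.DataFrame(
    {
        "Id": [30665, 33477, 34555, 35833],
        "data": [17.5, 18.0, 18.0, 18.0],
        "Date": pd.to_datetime(
            ["2020-01-04", "2020-01-07", "2020-01-08", "2020-01-09"]
        ),
    },
    index=[30424, 33236, 34314, 35592],
)
df1 = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2020-01-04", "2020-01-08", "2020-01-24"]),
        "data": [np.nan, 19.0, 18.5],
    }
)

# Keep df's column name as "data"; df1's column becomes "data_new".
merged = df.merge(df1, how="left", on="Date", suffixes=("", "_new"))

# merge() resets the index; restore df's index. This relies on the
# assumption that Date is unique in df1, so rows map one to one.
merged.index = df.index

# Take the new value where present, otherwise keep the old one.
merged["data"] = merged["data_new"].fillna(merged["data"])
result = merged.drop(columns="data_new")
```

If `Date` could repeat in df1, the left merge would duplicate rows of df and the index assignment would fail, so deduplicating df1 first would be needed in that case.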
Update:
I also tried:
df['data'] = df['Date'].map(df1.set_index('Date')['data']).fillna(df['Date'])
but the data column looks wrong:
          Id                timestamp          data        Date
30424  30665  2020-01-04 19:40:23.827  1.578096e+18  2020-01-04
31054  31295  2020-01-05 22:26:39.860  1.578182e+18  2020-01-05
32150  32391  2020-01-06 23:00:14.607  1.578269e+18  2020-01-06
33236  33477  2020-01-07 22:52:56.757  1.578355e+18  2020-01-07
34314  34555  2020-01-08 20:45:48.927  1.900000e+01  2020-01-08
35592  35833  2020-01-09 20:56:21.320  1.578528e+18  2020-01-09
36528  36769  2020-01-10 20:41:36.323  1.578614e+18  2020-01-10
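The large values in this output are consistent with dates expressed as nanoseconds since the Unix epoch (1.578096e+18 ns corresponds to 2020-01-04), which suggests the cause is the fallback: `fillna(df['Date'])` fills the unmatched rows with dates rather than with the original `data` values. A likely fix, sketched here on a trimmed copy of the sample data and assuming `Date` is unique in df1, is to fall back to `df['data']` instead. The name `lookup` is made up for illustration.

```python
import numpy as np
import pandas as pd

# Trimmed copy of the sample data from the question.
df = pd.DataFrame(
    {
        "Id": [30665, 33477, 34555, 35833],
        "data": [17.5, 18.0, 18.0, 18.0],
        "Date": pd.to_datetime(
            ["2020-01-04", "2020-01-07", "2020-01-08", "2020-01-09"]
        ),
    },
    index=[30424, 33236, 34314, 35592],
)
df1 = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2020-01-04", "2020-01-08", "2020-01-24"]),
        "data": [np.nan, 19.0, 18.5],
    }
)

# Date -> replacement value; mapping by a Series needs a unique index.
lookup = df1.set_index("Date")["data"]

# Unmatched dates and NaN replacements both come out as NaN,
# and fillna() then restores the original data value for those rows.
df["data"] = df["Date"].map(lookup).fillna(df["data"])
```

Because `map` keeps df's index, the result aligns with the original rows and no index repair is needed.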
【Discussion】:
Tags: python pandas numpy dataframe merge