【Question title】: Pandas: replace a NaN with a value from another DataFrame
【Posted】: 2020-08-21 19:58:37
【Question】:

I'm stuck on this problem and would appreciate some help. I have these two DataFrames:

import numpy as np
import pandas as pd

df1 = pd.DataFrame(data={'col1': ['a', 'b', 'c', 'd'],
                         'col2': [1, 2, np.nan, 4]})
df2 = pd.DataFrame(data={'col1': ['a', 'b', 'b', 'a', 'f', 'c', 'e', 'd', 'e', 'a'],
                         'col2': [1, 3, 2, 3, 6, 4, 1, 2, 5, 2]})

df1

  col1  col2
0    a   1.0
1    b   2.0
2    c   NaN
3    d   4.0

df2

  col1  col2
0    a     1
1    b     3
2    b     2
3    a     3
4    f     6
5    c     4
6    e     1
7    d     2
8    e     5
9    a     2

I tried:

df1[df1['col2'].isna()] = pd.merge(df1, df2, on=['col1'], how='left')

I expected:

  col1  col2
0    a   1.0
1    b   2.0
2    c   4.0
3    d   4.0

But I got this:

  col1  col2
0    a   1.0
1    b   2.0
2    a   NaN
3    d   4.0

Then I tried this:

for x in zip(df1,df2):
    if x in df1['col2'] == x in df2['col2']:
        df1['col1'][df1['col1'].isna()] = df2['col1'].where(df1['col2'][x] == df2['col2'][x])

But got this (df1 is unchanged):

  col1  col2
0    a   1.0
1    b   2.0
2    c   NaN
3    d   4.0

I also tried this answer, but that didn't work either.

【Question comments】:

    Tags: python pandas


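    【Solution 1】:

    Use Series.map to look up each col1 value of df1 in a Series built from df2. Make col1 unique in df2 first with DataFrame.drop_duplicates (by default it keeps the first row for each col1 value), then replace only the missing values with Series.fillna: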

    s = df2.drop_duplicates('col1').set_index('col1')['col2']
    print (s)
    col1
    a    1
    b    3
    f    6
    c    4
    e    1
    d    2
    Name: col2, dtype: int64
    
    print (df1['col1'].map(s))
    0    1
    1    3
    2    4
    3    2
    Name: col1, dtype: int64
    
    df1['col2'] = df1['col2'].fillna(df1['col1'].map(s))
    print (df1)
      col1  col2
    0    a   1.0
    1    b   2.0
    2    c   4.0
    3    d   4.0
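
As a minimal, self-contained sketch of the same map + fillna idea: the helper name fill_from_lookup and its parameters are hypothetical (they do not come from the answer), and the small frame df1_b is made up only to show that the keep argument of drop_duplicates decides which duplicate row of df2 supplies the value.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'col1': ['a', 'b', 'c', 'd'],
                    'col2': [1, 2, np.nan, 4]})
df2 = pd.DataFrame({'col1': ['a', 'b', 'b', 'a', 'f', 'c', 'e', 'd', 'e', 'a'],
                    'col2': [1, 3, 2, 3, 6, 4, 1, 2, 5, 2]})


def fill_from_lookup(target, source, key='col1', value='col2', keep='first'):
    """Hypothetical helper: fill NaNs in target[value] from source, matched on key."""
    # Build one value per key; `keep` picks which duplicate row of `source` wins.
    lookup = source.drop_duplicates(key, keep=keep).set_index(key)[value]
    out = target.copy()  # leave the caller's DataFrame untouched
    # fillna only touches missing entries, so existing values are preserved.
    out[value] = out[value].fillna(out[key].map(lookup))
    return out


filled = fill_from_lookup(df1, df2)
print(filled)

# Illustrative frame (an assumption, not from the question): the NaN sits on
# key 'b', which appears twice in df2 (values 3, then 2).
df1_b = pd.DataFrame({'col1': ['a', 'b'], 'col2': [1, np.nan]})
print(fill_from_lookup(df1_b, df2, keep='first'))
print(fill_from_lookup(df1_b, df2, keep='last'))
```

In the question's data the missing key 'c' occurs only once in df2, so the choice of keep does not matter there. It only matters when the key with the NaN is duplicated in df2, as in df1_b.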
    

    【Comments】:
