【Question title】: Pandas dataframe fillna by some value
【Posted】: 2019-01-05 07:58:15
【Question】:

I have this data:

import numpy as np
import pandas as pd
group = {'gender': ['male', 'female', 'female', 'male', 'female', 'male', 'male'],
        'height': [175, 168, np.nan, 170, 167, np.nan, 190],
        }
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
df = pd.DataFrame(group, index=labels)
df2 = df.groupby('gender')['height'].mean()

I want to fill each NaN in height with the mean for that row's gender, taken from df2.

【Discussion】:

    Tags: python pandas numpy dataframe pandas-groupby


    【Solution 1】:

    Code

    import pandas as pd
    import numpy as np
    
    group = {'gender': ['male', 'female', 'female', 'male', 'female', 'male', 'male'],
            'height': [175, 168, np.nan, 170, 167, np.nan, 190],
            }
    labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
    df = pd.DataFrame(group, index=labels)
    df2 = df.groupby('gender')['height'].mean()
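    # Map each row's gender to its group mean, then assign the filled column back.
    # (fillna(..., inplace=True) on a selected column may not update df in newer pandas.)
    df['height'] = df['height'].fillna(df['gender'].map(df2))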
    # print(df2)
    print(df)
    

    Output

       gender      height
    a    male  175.000000
    b  female  168.000000
    c  female  167.500000
    d    male  170.000000
    e  female  167.000000
    f    male  178.333333
    g    male  190.000000
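
    The key step above is the lookup done by map. As a minimal sketch (the names group_means, row_means and filled are illustrative and not part of the original answer), the intermediate Series can be inspected one at a time:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'gender': ['male', 'female', 'female', 'male', 'female', 'male', 'male'],
     'height': [175, 168, np.nan, 170, 167, np.nan, 190]},
    index=['a', 'b', 'c', 'd', 'e', 'f', 'g'],
)

# One mean per gender, indexed by the gender label (NaN is skipped by mean()).
group_means = df.groupby('gender')['height'].mean()

# map() looks up each row's gender in group_means, so the result
# has one value per row and shares df's index ('a' .. 'g').
row_means = df['gender'].map(group_means)

# fillna() with a Series replaces only the NaN entries, matched by index label.
filled = df['height'].fillna(row_means)

print(group_means)
print(row_means)
print(filled)
```

    Because fillna returns a new Series here, the result still has to be assigned back to df['height'] if the DataFrame itself should change.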
    

    【Comments】:

      【Solution 2】:

      You can use groupby + transform('mean'), then fillna with the resulting Series.

      means = df.groupby('gender')['height'].transform('mean')
      df['height'] = df['height'].fillna(means)
      
      print(df)
      
         gender      height
      a    male  175.000000
      b  female  168.000000
      c  female  167.500000
      d    male  170.000000
      e  female  167.000000
      f    male  178.333333
      g    male  190.000000
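
      As a rough check, assuming the same sample data as in the question, the sketch below compares this approach with the map-based one. The names via_map and via_transform are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'gender': ['male', 'female', 'female', 'male', 'female', 'male', 'male'],
     'height': [175, 168, np.nan, 170, 167, np.nan, 190]},
    index=['a', 'b', 'c', 'd', 'e', 'f', 'g'],
)

grouped = df.groupby('gender')['height']

# Approach 1: aggregate to one mean per gender, then look it up per row.
via_map = df['height'].fillna(df['gender'].map(grouped.mean()))

# Approach 2: transform broadcasts each group's mean back onto the
# original rows, so the result is already aligned with df's index.
via_transform = df['height'].fillna(grouped.transform('mean'))

# Raises an AssertionError if the two results differ (floats compared with tolerance).
pd.testing.assert_series_equal(via_map, via_transform)
print(via_transform)
```

      With either approach, a group whose heights are all NaN has a NaN mean, so its rows would stay NaN after fillna.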
      

      【Comments】:
