【Question title】: Replace values in column of Pandas DataFrame using a Series lookup table
【Posted】: 2016-06-17 00:31:52
【Question】:

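I want to replace a column of values in a DataFrame with a more accurate/complete set of values generated from a lookup table that I have prepared as a Series.

I thought I could do it like this, but the result is not what I expected.

Here is the DataFrame I want to fix: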

In [6]: df_normalised.head(10)
Out[6]: 
  code                                          name
0    8                             Human development
1   11                                              
2    1                           Economic management
3    6         Social protection and risk management
4    5                         Trade and integration
5    2                      Public sector governance
6   11  Environment and natural resources management
7    6         Social protection and risk management
8    7                   Social dev/gender/inclusion
9    7                   Social dev/gender/inclusion

(Note the missing name in the second row, index 1.)

Here is the lookup table I created to do the fix:

In [20]: names
Out[20]: 
1                              Economic management
10                               Rural development
11    Environment and natural resources management
2                         Public sector governance
3                                      Rule of law
4         Financial and private sector development
5                            Trade and integration
6            Social protection and risk management
7                      Social dev/gender/inclusion
8                                Human development
9                                Urban development
dtype: object

Here is how I thought I could do it:

In [21]: names[df_normalised.head(10).code]
Out[21]: 
code
8                                Human development
11    Environment and natural resources management
1                              Economic management
6            Social protection and risk management
5                            Trade and integration
2                         Public sector governance
11    Environment and natural resources management
6            Social protection and risk management
7                      Social dev/gender/inclusion
7                      Social dev/gender/inclusion
dtype: object

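However, I want the resulting Series to have the same index as df_normalised (i.e. 0, 1, 2, 3, ...) rather than an index based on the code values.

So I am not sure how to replace the original values in the 'name' column of df_normalised with these Series values, because the indexes do not match.

By the way, how can the index above contain duplicate values?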

【Comments】:

    Tags: python pandas dataframe


    【Answer 1】:

    You can use the map() function for this:

    In [38]: df_normalised['name'] = df_normalised['code'].map(names)
    
    In [39]: df_normalised
    Out[39]:
       code                                          name
    0     8                             Human development
    1    11  Environment and natural resources management
    2     1                           Economic management
    3     6         Social protection and risk management
    4     5                         Trade and integration
    5     2                      Public sector governance
    6    11  Environment and natural resources management
    7     6         Social protection and risk management
    8     7                   Social dev/gender/inclusion
    9     7                   Social dev/gender/inclusion
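
For readers who want to reproduce this, here is a minimal sketch on a hypothetical five-row cut of the question's data. It assumes the codes are stored as strings (the lookup table's index sorts as 1, 10, 11, 2, ..., which suggests so); the variable names `df`, `by_label` and `mapped` are illustrative only. It also touches on the side question: selecting by label returns one entry per requested label, and a pandas index is not required to be unique, so a code requested twice shows up twice in the result's index.

```python
import pandas as pd

# Hypothetical miniature of the question's data; codes are assumed to be strings.
df = pd.DataFrame({
    "code": ["8", "11", "1", "6", "11"],
    "name": [
        "Human development",
        "",  # the missing name to be repaired
        "Economic management",
        "Social protection and risk management",
        "Environment and natural resources management",
    ],
})
names = pd.Series({
    "1": "Economic management",
    "6": "Social protection and risk management",
    "8": "Human development",
    "11": "Environment and natural resources management",
})

# Label-based selection: the result is indexed by the codes that were looked up,
# so the code "11", requested twice, appears twice in the index.
by_label = names[df["code"]]
print(by_label.index.tolist())
print(by_label.index.is_unique)

# map() looks up each value in `names` but keeps the index of the Series it
# was called on, so the result lines up with df row by row.
mapped = df["code"].map(names)
df["name"] = mapped
print(df)
```

Because `mapped` carries the same index as `df`, the assignment needs no positional tricks.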
    

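    【Comments】:

    • Excellent. Thanks! I had looked at map, but thought it was only for applying functions.
    【Answer 2】:

    This works. However, I'm fairly sure there must be a simpler way to do it.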

    In [50]: df_normalised.name = pd.Series(names[df_normalised.code].values)
    
    In [51]: df_normalised.head(10)
    Out[51]: 
      code                                          name
    0    8                             Human development
    1   11  Environment and natural resources management
    2    1                           Economic management
    3    6         Social protection and risk management
    4    5                         Trade and integration
    5    2                      Public sector governance
    6   11  Environment and natural resources management
    7    6         Social protection and risk management
    8    7                   Social dev/gender/inclusion
    9    7                   Social dev/gender/inclusion
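
Two caveats about this approach may be worth spelling out. The sketch below uses hypothetical data (string codes, and a frame whose index is deliberately not 0..n-1) to show that wrapping the values in a new `pd.Series` only works when the DataFrame has the default index, and that label selection and `map()` treat unknown codes differently in current pandas versions.

```python
import pandas as pd

names = pd.Series({"1": "Economic management", "8": "Human development"})

# Hypothetical frame whose index is NOT the default 0..n-1.
df = pd.DataFrame({"code": ["8", "1"], "name": ["", ""]}, index=[10, 20])

# pd.Series(values) gets a fresh 0..n-1 index. Column assignment aligns on
# index labels, finds no match for 10 and 20, and fills the column with NaN.
misaligned = df.copy()
misaligned["name"] = pd.Series(names[misaligned["code"]].values)

# Assigning the bare array is positional and sidesteps alignment.
positional = df.copy()
positional["name"] = names[positional["code"]].values

# A code that is missing from the lookup table: map() yields NaN for it,
# while label-based selection raises KeyError.
unknown = pd.Series(["8", "99"])
mapped = unknown.map(names)
try:
    names[unknown]
    raised = False
except KeyError:
    raised = True

print(misaligned)
print(positional)
print(mapped)
print(raised)
```

With the default index shown in the question, both forms give the same result; the difference only appears once the frame has been filtered, sorted, or otherwise re-indexed.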
    

    【Comments】:
