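【Question title】: Replacing values in a pandas multi-index
【Posted】: 2016-07-06 19:42:39
【Question】:

I have a DataFrame with a MultiIndex. I want to change the values of the second index level wherever the first level meets certain conditions. I found a similar (but different) question here: Replace a value in MultiIndex (pandas). It does not answer my question, because it is about changing a single row, and its solution also passes the value of the first index level (which does not need to change). In my case I am dealing with multiple rows, and I have not been able to adapt that solution.

A minimal example of my data is below. Thanks!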

import pandas as pd
import numpy as np

consdf=pd.DataFrame()

for mylocation in ['North','South']:
    for scenario in np.arange(1,4):
        df= pd.DataFrame()
        df['mylocation'] = [mylocation]
        df['scenario']= [scenario]
        df['this'] = np.random.randint(10,100)
        df['that'] = df['this']  * 2
        df['something else']  = df['this'] * 3
        consdf=pd.concat((consdf, df ), axis=0, ignore_index=True)

mypiv = consdf.pivot(index='mylocation', columns='scenario').transpose()

level_list =['this','that']
# if level 0 is in level_list --> set level 1 to np.nan
mypiv.iloc[mypiv.index.get_level_values(0).isin(level_list)].index.set_levels([np.nan], level =1, inplace=True)

The last line does not work. I get:

ValueError: On level 1, label max (2) >= length of level  (1). NOTE: this index is in an inconsistent state

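【Comments】:

  • Do you have the option of resetting the index, or do you need to keep it?
  • I want to keep it. Could I reset it and then add it back?
  • You can use mypiv.loc[(level_list,)] instead of mypiv.iloc[mypiv.index.get_level_values(0).isin(level_list)]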
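
As a quick check of the last comment, the sketch below compares the two selections on a small stand-in for mypiv (the values are made up for illustration). It only shows that both select the same rows; neither changes the index.

```python
import pandas as pd

# Small stand-in for mypiv; the numbers are invented for illustration.
index = pd.MultiIndex.from_product(
    [["this", "that", "something else"], [1, 2, 3]],
    names=[None, "scenario"],
)
mypiv = pd.DataFrame({"North": range(9), "South": range(10, 19)}, index=index)
level_list = ["this", "that"]

# Boolean mask on the first index level, as in the question.
by_mask = mypiv[mypiv.index.get_level_values(0).isin(level_list)]

# Label-based selection on the first level, as suggested in the comment.
by_loc = mypiv.loc[(level_list,)]

# Sort both results so the comparison does not depend on row order.
same_rows = by_mask.sort_index().equals(by_loc.sort_index())
print(same_rows)
```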

Tags: python pandas dataframe multi-index


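【Solution 1】:

IIUC, you can add a new value to the level values and then change the labels of the index, using advanced indexing together with the get_level_values, set_levels, and set_labels methods: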

len_ind = len(mypiv.loc[(level_list,)].index.get_level_values(1))
mypiv.index.set_levels([1, 2, 3, np.nan], level=1, inplace=True)
mypiv.index.set_labels([3]*len_ind + mypiv.index.labels[1][len_ind:].tolist(), level=1, inplace=True)

In [219]: mypiv
Out[219]: 
mylocation               North  South
               scenario              
this           NaN          26     46
               NaN          32     67
               NaN          75     30
that           NaN          52     92
               NaN          64    134
               NaN         150     60
something else  1.0         78    138
                2.0         96    201
                3.0        225     90

Note that the remaining scenario values are converted to float: a level must have a single dtype, and np.nan is a float.
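
In more recent pandas releases, set_labels has been replaced by set_codes and the inplace argument is no longer accepted, so the snippet above may not run as written. The following is a minimal sketch of the same idea under those assumptions, not the answerer's original code. It builds a small stand-in for mypiv with made-up values, and it marks the selected rows as missing with the code -1 instead of appending np.nan to the level values. Because it uses a boolean mask, it does not rely on the matching rows coming first.

```python
import numpy as np
import pandas as pd

# Small stand-in for mypiv; the numbers are invented for illustration.
index = pd.MultiIndex.from_product(
    [["this", "that", "something else"], [1, 2, 3]],
    names=[None, "scenario"],
)
mypiv = pd.DataFrame({"North": range(9), "South": range(10, 19)}, index=index)
level_list = ["this", "that"]

# Rows whose first-level value is in level_list.
mask = mypiv.index.get_level_values(0).isin(level_list)

# Each level is stored as unique values (levels) plus integer positions
# into them (codes). A code of -1 marks a missing value.
codes = np.asarray(mypiv.index.codes[1]).copy()
codes[mask] = -1

# set_codes returns a new MultiIndex, so assign it back to the frame.
mypiv.index = mypiv.index.set_codes(codes, level=1)
print(mypiv)
```

The error in the question follows from the same levels/codes split: replacing the level values with a single entry leaves the existing codes pointing at positions that no longer exist.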


    【Solution 2】:

    Note: ix is deprecated as of pandas 0.20. Use the loc accessor instead.

    Here is a solution using the reset_index() method:

    In [95]: new = mypiv.reset_index()
    
    In [96]: new
    Out[96]:
    mylocation         level_0  scenario  North  South
    0                     this         1     32     64
    1                     this         2     18     40
    2                     this         3     76     56
    3                     that         1     64    128
    4                     that         2     36     80
    5                     that         3    152    112
    6           something else         1     96    192
    7           something else         2     54    120
    8           something else         3    228    168
    
    In [100]: new.loc[new.level_0.isin(level_list), 'scenario'] = np.nan
    
    In [101]: new
    Out[101]:
    mylocation         level_0  scenario  North  South
    0                     this       NaN     32     64
    1                     this       NaN     18     40
    2                     this       NaN     76     56
    3                     that       NaN     64    128
    4                     that       NaN     36     80
    5                     that       NaN    152    112
    6           something else       1.0     96    192
    7           something else       2.0     54    120
    8           something else       3.0    228    168
    
    In [103]: mypiv = new.set_index(['level_0', 'scenario'])
    
    In [104]: mypiv
    Out[104]:
    mylocation               North  South
    level_0        scenario
    this           NaN          32     64
                   NaN          18     40
                   NaN          76     56
    that           NaN          64    128
                   NaN          36     80
                   NaN         152    112
    something else 1.0          96    192
                   2.0          54    120
                   3.0         228    168
    

    But I suspect there are more elegant solutions.
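
One possibly more direct route, sketched here as an assumption and not as part of the original answer, is to rebuild the index from its level values with pd.MultiIndex.from_arrays. The stand-in frame and its values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Small stand-in for mypiv; the numbers are invented for illustration.
index = pd.MultiIndex.from_product(
    [["this", "that", "something else"], [1, 2, 3]],
    names=[None, "scenario"],
)
mypiv = pd.DataFrame({"North": range(9), "South": range(10, 19)}, index=index)
level_list = ["this", "that"]

first = mypiv.index.get_level_values(0)
mask = first.isin(level_list)

# Copy the second level as floats so that it can hold NaN.
second = np.array(mypiv.index.get_level_values(1), dtype=float)
second[mask] = np.nan

# Rebuild the index from the two arrays, keeping the original level names.
mypiv.index = pd.MultiIndex.from_arrays(
    [first, second], names=list(mypiv.index.names)
)
print(mypiv)
```

This keeps the index in place throughout, so no reset_index()/set_index() round trip is needed.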
