【Question title】: How to add a row to a pandas DataFrame without flattening the MultiIndex
【Posted】: 2017-07-06 21:18:17
【Question】:

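I cannot find an efficient way to add a single row to a MultiIndexed DataFrame. When I add a row, the MultiIndex is flattened into a plain index of tuples. Oddly, this is not a problem for MultiIndexed columns.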

System information:

Python 3.6.1 |Continuum Analytics, Inc.| (default, Mar 22 2017, 19:25:17) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> pd.__version__
'0.19.2'

Example data: a DataFrame with MultiIndexed rows and columns

import numpy as np
import pandas as pd

index = pd.MultiIndex(levels=[['bar', 'foo'], ['one', 'two']],
                      labels=[[0, 0, 1, 1], [0, 1, 0, 1]],
                      names=['row_0', 'row_1'])
columns = pd.MultiIndex(levels=[['dull', 'shiny'], ['a', 'b']],
                        labels=[[0, 0, 1, 1], [0, 1, 0, 1]],
                        names=['col_0', 'col_1'])
df = pd.DataFrame(np.ones((4, 4)), columns=columns, index=index)

print(df)

col_0       dull      shiny     
col_1          a    b     a    b
row_0 row_1                     
bar   one    1.0  1.0   1.0  1.0
      two    1.0  1.0   1.0  1.0
foo   one    1.0  1.0   1.0  1.0
      two    1.0  1.0   1.0  1.0
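
The constructor call above matches the pandas 0.19.2 version stated in the question. In pandas 0.24 and later the `labels` argument of `pd.MultiIndex` was renamed to `codes`, so the snippet may not run unchanged on a current install. A minimal sketch of an equivalent setup, assuming `pd.MultiIndex.from_product` is acceptable here (it is not what the question used, but it builds the same four row keys and four column keys):

```python
import numpy as np
import pandas as pd

# Same levels and names as in the question, built from the Cartesian product
# of the level values instead of explicit integer codes.
index = pd.MultiIndex.from_product([['bar', 'foo'], ['one', 'two']],
                                   names=['row_0', 'row_1'])
columns = pd.MultiIndex.from_product([['dull', 'shiny'], ['a', 'b']],
                                     names=['col_0', 'col_1'])
df = pd.DataFrame(np.ones((4, 4)), index=index, columns=columns)

print(df)
```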

Adding another column to the DataFrame works fine:

df['last_col'] = 42 #define a new column and assign a value

print(df)

col_0       dull      shiny      last_col
col_1          a    b     a    b         
row_0 row_1                              
bar   one    1.0  1.0   1.0  1.0       42
      two    1.0  1.0   1.0  1.0       42
foo   one    1.0  1.0   1.0  1.0       42
      two    1.0  1.0   1.0  1.0       42

However, if I do the same to add a row (using loc), the MultiIndex is flattened into a plain index of tuples:

df.loc['last_row'] = 43  #define a new row and assign a value

print(df)

col_0       dull       shiny       last_col
col_1          a     b     a     b         
(bar, one)   1.0   1.0   1.0   1.0       42
(bar, two)   1.0   1.0   1.0   1.0       42
(foo, one)   1.0   1.0   1.0   1.0       42
(foo, two)   1.0   1.0   1.0   1.0       42
last_row    43.0  43.0  43.0  43.0       43

Does anyone know a simple and efficient way to add a row without flattening the index? Many thanks!

【Comments】:

Tags: python pandas dataframe


【Solution 1】:

I think you need to specify a tuple with one value for each level of the MultiIndex:

df.loc[('last_row', 'a'), :] = 43
print(df)
col_0           dull       shiny      
col_1              a     b     a     b
row_0    row_1                        
bar      one     1.0   1.0   1.0   1.0
         two     1.0   1.0   1.0   1.0
foo      one     1.0   1.0   1.0   1.0
         two     1.0   1.0   1.0   1.0
last_row a      43.0  43.0  43.0  43.0
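
To check that the tuple key really leaves the row index as a MultiIndex, the assignment can be followed by a quick inspection of `df.index`. This is a minimal sketch, assuming a fresh frame built with `pd.MultiIndex.from_product` (my substitution for the question's constructor call); the variable names `new_key` and `index_kept` are only illustrative:

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product([['bar', 'foo'], ['one', 'two']],
                                   names=['row_0', 'row_1'])
columns = pd.MultiIndex.from_product([['dull', 'shiny'], ['a', 'b']],
                                     names=['col_0', 'col_1'])
df = pd.DataFrame(np.ones((4, 4)), index=index, columns=columns)

# One value per row level, plus ':' to address every column.
new_key = ('last_row', 'a')
df.loc[new_key, :] = 43

# Inspect the index type instead of relying on the printed layout.
index_kept = isinstance(df.index, pd.MultiIndex)
print(index_kept, df.index.nlevels)
print(df.index[-1])
```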

It works similarly for columns:

df[('last_col', 'a')] = 43
print(df)
col_0       dull      shiny      last_col
col_1          a    b     a    b        a
row_0 row_1                              
bar   one    1.0  1.0   1.0  1.0       43
      two    1.0  1.0   1.0  1.0       43
foo   one    1.0  1.0   1.0  1.0       43
      two    1.0  1.0   1.0  1.0       43
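
The difference between a full tuple key and a bare string key for columns can be seen by listing the resulting column keys. A small sketch under the same assumptions as the question's frame (built here with `from_product`); `other_col` is a hypothetical name added only for contrast:

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product([['bar', 'foo'], ['one', 'two']],
                                   names=['row_0', 'row_1'])
columns = pd.MultiIndex.from_product([['dull', 'shiny'], ['a', 'b']],
                                     names=['col_0', 'col_1'])
df = pd.DataFrame(np.ones((4, 4)), index=index, columns=columns)

# A full tuple names both column levels explicitly.
df[('last_col', 'a')] = 43

# A bare string leaves the second level unspecified; pandas pads it with ''.
df['other_col'] = 42

print(df.columns.tolist())
```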

Edit:

It seems you need to specify the columns as well, using : if you want all of them:

df.loc['last_row',:] = 43
print(df)
col_0           dull       shiny      
col_1              a     b     a     b
row_0    row_1                        
bar      one     1.0   1.0   1.0   1.0
         two     1.0   1.0   1.0   1.0
foo      one     1.0   1.0   1.0   1.0
         two     1.0   1.0   1.0   1.0
last_row        43.0  43.0  43.0  43.0

If a level is not specified, an empty string is added for it:

print(df.index)
MultiIndex(levels=[['bar', 'foo', 'last_row'], ['one', 'two', '']],
           labels=[[0, 0, 1, 1, 2], [0, 1, 0, 1, 2]],
           names=['row_0', 'row_1'])
df.loc['last_row','dull'] = 43
print(df)
col_0           dull       shiny     
col_1              a     b     a    b
row_0    row_1                       
bar      one     1.0   1.0   1.0  1.0
         two     1.0   1.0   1.0  1.0
foo      one     1.0   1.0   1.0  1.0
         two     1.0   1.0   1.0  1.0
last_row        43.0  43.0   NaN  NaN
df.loc['last_row', ('dull', 'a')] = 43
print(df)
col_0           dull      shiny     
col_1              a    b     a    b
row_0    row_1                      
bar      one     1.0  1.0   1.0  1.0
         two     1.0  1.0   1.0  1.0
foo      one     1.0  1.0   1.0  1.0
         two     1.0  1.0   1.0  1.0
last_row        43.0  NaN   NaN  NaN
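
Since the question also asks about efficiency, an alternative worth considering when several rows arrive at once is to build them as their own MultiIndexed frame and join with `pd.concat`. This is not the approach from the answer above, only a sketch under the assumption that all new rows are known up front; the row keys and the value 43.0 are made up for illustration:

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product([['bar', 'foo'], ['one', 'two']],
                                   names=['row_0', 'row_1'])
columns = pd.MultiIndex.from_product([['dull', 'shiny'], ['a', 'b']],
                                     names=['col_0', 'col_1'])
df = pd.DataFrame(np.ones((4, 4)), index=index, columns=columns)

# Hypothetical new rows, keyed by full tuples and sharing df's level names.
new_index = pd.MultiIndex.from_tuples([('last_row', 'a'), ('last_row', 'b')],
                                      names=df.index.names)
new_rows = pd.DataFrame(43.0, index=new_index, columns=df.columns)

# A single concat appends all rows; both inputs have a two-level index,
# so the result does too.
result = pd.concat([df, new_rows])
print(result)
```

Unlike the loc assignment, this returns a new frame and leaves df itself unchanged.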

【Comments】:
