Pythonic way to set a pandas DataFrame column's values based on index values

【Question title】: Pythonic way to set a pandas DataFrame column's value based on index values [duplicate]
【Posted】: 2019-06-30 05:44:28
【Question】:

I want to set a column of a pandas DataFrame to True/False depending on whether each row's index label is in a set.

I can do it like this:

import io

import pandas as pd

table = """
A,1,2
B,1,3
C,4,5
D,9,1
E,10,4
F,8,3
G,9,0
"""

df = pd.read_csv(io.StringIO(table), header=None, index_col=0)

fM7_notes = set(['F', 'A', 'C', 'E'])

df['in_maj_7'] = False
df.loc[list(fM7_notes), 'in_maj_7'] = True  # recent pandas versions reject a set as a .loc indexer, so pass a list

However, instead of the last two lines, I would like to write

df['in_maj_7'] = df.index in fM7_notes

This seems more expressive, concise, and Pythonic, but it does not work:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-81-851b1efe0c36> in <module>()
----> 1 df['in_maj_7'] = df.index in fM7_notes

~/anaconda/lib/python3.6/site-packages/pandas/core/indexes/base.py in __hash__(self)
   2060 
   2061     def __hash__(self):
-> 2062         raise TypeError("unhashable type: %r" % type(self).__name__)
   2063 
   2064     def __setitem__(self, key, value):

TypeError: unhashable type: 'Index'

Is there a cleaner way?

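【Comments】:

I don't think the linked duplicate is quite the same question. I know about using Series.isin for various operations, and about indexing correctly with a set (or list). It just hadn't occurred to me to use Index.isin on the RHS of the expression. Fortunately, the question stayed open long enough for me to get what I needed.

Tags: python pandas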

【Solution 1】:

Use the pandas.Index.isin() method:

In [31]: df['in_maj_7'] = df.index.isin(fM7_notes)

In [32]: df
Out[32]:
    1  2  in_maj_7
0
A   1  2      True
B   1  3     False
C   4  5      True
D   9  1     False
E  10  4      True
F   8  3      True
G   9  0     False
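
For reference, here is a self-contained sketch of the same approach, reusing the sample data from the question. It also illustrates why the original attempt fails: `df.index in fM7_notes` asks the set to hash the whole Index object, which is unhashable, whereas `Index.isin` checks each label separately. The names `in_raised` and `mask` are only illustrative and do not appear in the original post.

```python
import io

import pandas as pd

# Same sample data as in the question; the first column becomes the index.
table = """A,1,2
B,1,3
C,4,5
D,9,1
E,10,4
F,8,3
G,9,0
"""
df = pd.read_csv(io.StringIO(table), header=None, index_col=0)

fM7_notes = {"F", "A", "C", "E"}

# Set membership hashes its left operand. An Index is unhashable,
# so this expression raises TypeError instead of testing each label.
try:
    df.index in fM7_notes
    in_raised = False
except TypeError:
    in_raised = True

# Index.isin tests every label separately and returns a boolean array
# with one entry per row, which can be assigned directly as a column.
mask = df.index.isin(fM7_notes)
df["in_maj_7"] = mask

print(df)
```

Since `mask` is an ordinary boolean array, it can also be used to filter rows, for example `df[mask]`.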
