Pandas Dataframe Series: check if specific value exists [duplicate]

【Question title】: Pandas Dataframe Series : check if specific value exists [duplicate]
【Posted】: 2020-09-07 00:41:53
【Question description】:

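I need to iterate over a list and perform a specific operation if a value from the list exists in one of the pandas DataFrame columns. I tried the code below, but I get the following error:

'Error: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().'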

import pandas as pd

people = {
    'fname':['Alex','Jane','John'],
    'age':[20,15,25],
    'sal':[100,200,300]
}

df=pd.DataFrame(people)

check_list=['Alex','John']

for column in check_list:
    if (column == df['fname']):
        df['new_column']=df['sal']/df['age']
    else:
        df['new_column']=df['sal']

df

Desired output:

fname   age sal new_column
Alex    20  100  5      <<-- sal/age
Jane    15  200  200    <<-- sal as it is
John    25  300  12     <<-- sal/age

【Question comments】:

Tags: python pandas pandas-groupby

【Solution 1】:

Use np.where together with .isin to check, row by row, whether the column holds one of the listed values.

import numpy as np

# Where fname is in the list, use sal/age; otherwise keep sal.
df['new_column'] = np.where(
    df['fname'].isin(['Alex', 'John']),
    df['sal'] / df['age'],
    df['sal']
)

print(df)

  fname  age  sal  new_column
0  Alex   20  100         5.0
1  Jane   15  200       200.0
2  John   25  300        12.0
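
The error in the question comes from `column == df['fname']`: comparing a string with a column returns one boolean per row, not a single True/False, so `if` cannot evaluate it. The following is a minimal sketch of what that comparison and `.isin` produce, using the question's data; the names `mask_one` and `mask_any` are illustrative only and do not appear in the original posts.

```python
import pandas as pd

df = pd.DataFrame({
    'fname': ['Alex', 'Jane', 'John'],
    'age': [20, 15, 25],
    'sal': [100, 200, 300],
})

# Comparing a scalar with a column gives one boolean per row, not a single bool.
mask_one = 'Alex' == df['fname']
print(mask_one.tolist())   # [True, False, False]

# Using that Series directly in `if` raises the ValueError from the question.
try:
    if mask_one:
        pass
except ValueError as exc:
    print(type(exc).__name__)   # ValueError

# .isin tests every row against the whole list at once.
mask_any = df['fname'].isin(['Alex', 'John'])
print(mask_any.tolist())   # [True, False, True]

# To ask "does this value appear anywhere in the column?", reduce with .any().
print(bool(mask_one.any()))   # True
```

The per-row mask from `.isin` is what np.where consumes above; `.any()` is only appropriate when a single yes/no answer for the whole column is wanted.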

A pure pandas version:

df['new_column'] = (df['sal'] / df['age']).where(
    df['fname'].isin(['Alex', 'John']), other=df['sal'])

print(df)
  fname  age  sal  new_column
0  Alex   20  100         5.0
1  Jane   15  200       200.0
2  John   25  300        12.0
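
Another way to avoid np.where, not taken from the answers here, is boolean indexing with `.loc`: fill the column with the default first, then overwrite only the matching rows. This is a rough sketch on the question's data; the name `mask` and the float cast are choices made for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'fname': ['Alex', 'Jane', 'John'],
    'age': [20, 15, 25],
    'sal': [100, 200, 300],
})
check_list = ['Alex', 'John']

# Start from sal, cast to float so the divided values fit without a dtype change.
df['new_column'] = df['sal'].astype(float)

# Overwrite only the rows whose fname is in the list.
mask = df['fname'].isin(check_list)
df.loc[mask, 'new_column'] = df.loc[mask, 'sal'] / df.loc[mask, 'age']

print(df['new_column'].tolist())   # [5.0, 200.0, 12.0]
```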

【Comments】:

Thanks Datanovice. Is there another way to do this without using np.where?
@steve See the edit.

【Solution 2】:

Try using df.apply

import pandas as pd

people = {
    'fname':['Alex','Jane','John'],
    'age':[20,15,25],
    'sal':[100,200,300]
}

df=pd.DataFrame(people)

def checker(item):
    check_list=['Alex','John']
    if item["fname"] in check_list:
        return item['sal']/item['age']
    else:
        return item['sal']

df["Exists"] = df.apply(checker, axis=1)

df

【Comments】:

【Solution 3】:

for index, row in df.iterrows():
    if row['fname'] in check_list:
        df.at[index, 'new_column'] = row['sal'] / row['age']
    else:
        df.at[index, 'new_column'] = row['sal']

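Explanation: to iterate over a DataFrame, use iterrows(). The row variable holds the values of all columns for that row, and index is that row's index label.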
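
To make the explanation concrete, here is a small sketch of what iterrows() yields on the question's data. The helper names `labels` and `row_types` are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'fname': ['Alex', 'Jane', 'John'],
    'age': [20, 15, 25],
    'sal': [100, 200, 300],
})

labels = []
row_types = set()
for index, row in df.iterrows():
    # index is the row label; row is a Series keyed by column name.
    labels.append((index, row['fname']))
    row_types.add(type(row).__name__)

print(labels)      # [(0, 'Alex'), (1, 'Jane'), (2, 'John')]
print(row_types)   # {'Series'}
```

Because each row is built as a separate Series, this loop tends to be noticeably slower than the vectorized .isin approach on large frames, so it is probably best kept for small data or logic that is hard to vectorize.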

【Comments】: