Creating a new pandas column based on a Series conditional: answers

【Question title】: Creating new pandas column based on Series conditional
【Posted】: 2017-06-23 19:19:34
【Question description】:

Coming from R to Python, I can't seem to figure out the simple case of creating a new column by conditionally checking other columns.

# In R, create a 'z' column based on values in x and y columns
df <- data.frame(x=rnorm(100),y=rnorm(100))
df$z <- ifelse(df$x > 1.0 | df$y < -1.0, 'outlier', 'normal')
table(df$z)
# output below
normal outlier 
     66      34

My attempt at the equivalent statement in Python:

import numpy as np
import pandas as pd
df = pd.DataFrame({'x': np.random.standard_normal(100), 'y': np.random.standard_normal(100)})
df['z'] = 'outlier' if df.x > 1.0 or df.y < -1.0 else 'normal'

However, this raises the following exception: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

What is the pythonic way to achieve this? Many thanks :)

【Question discussion】:

Tags: python r pandas dataframe

【Solution 1】:

Try this:

df['z'] = np.where((df.x > 1.0) | (df.y < -1.0), 'outlier', 'normal')
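For reference, here is a self-contained sketch of the line above. The seeded generator is an assumption added only to make the run reproducible, and value_counts() is used as a rough counterpart of R's table(); neither is part of the original answer.

```python
import numpy as np
import pandas as pd

# Seeded generator so the run is reproducible; the seed value is an arbitrary assumption.
rng = np.random.default_rng(0)
df = pd.DataFrame({'x': rng.standard_normal(100), 'y': rng.standard_normal(100)})

# Element-wise OR of the two boolean Series. The parentheses are required
# because | binds more tightly than the comparison operators.
mask = (df.x > 1.0) | (df.y < -1.0)

# np.where picks 'outlier' where mask is True and 'normal' elsewhere.
df['z'] = np.where(mask, 'outlier', 'normal')

# Rough counterpart of R's table(df$z).
counts = df['z'].value_counts()
print(counts)
```

The exact counts depend on the random draw, so they will not match the R output in the question.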

【Discussion】:

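【Solution 2】:

If you want an element-wise operation on columns, you can't treat them like this: Python's `if ... else` and `or` expect a single True/False value, not a whole Series. Use numpy's `where` instead.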
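A small sketch of why the original attempt fails, using hypothetical toy data (the three rows below are made up for illustration). It also shows a pandas-only alternative to np.where, which is one possible approach rather than something the answer itself proposes.

```python
import pandas as pd

# Hypothetical toy data, chosen so each row's label is easy to check by eye.
df = pd.DataFrame({'x': [0.2, 1.5, -0.3], 'y': [0.1, 0.0, -2.0]})

# `or` and `if` ask for one True/False from a whole Series, which pandas refuses.
try:
    bool(df.x > 1.0)
    raised = False
except ValueError:
    raised = True

# The element-wise operator | returns a boolean Series instead.
mask = (df.x > 1.0) | (df.y < -1.0)

# pandas-only alternative: start from a default label, then overwrite the masked rows.
df['z'] = 'normal'
df.loc[mask, 'z'] = 'outlier'
print(df)
```

With more than two labels, np.select with a list of conditions may be a cleaner fit than nesting np.where calls.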

【Discussion】: