Adding a new column of category values based on conditions

【Question title】: Adding a new column of category values based on conditions
【Posted】: 2021-06-09 15:19:53
【Question description】:

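I want to add a new column whose string values depend on col1 and col2. If the value in col1 is greater than or equal to 4 and the value in col2 is also greater than or equal to 4, then "High" should be written to col3 in the same row, as shown in the picture below.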

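【Question comments】:

Is that Excel? A pandas DataFrame? Have you tried anything? There are many answers on SO; have you searched?
Does this answer your question? pandas create new column based on values from other columns / apply a function of multiple columns, row-wise

Tags: python pandas numpy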

【Solution 1】:

Something like this should work, but it depends on the format of your data. I am assuming it is a pandas DataFrame.

import numpy as np

# 'High' only when both col1 and col2 are >= 4, otherwise 'Low'
df['col3'] = np.where((df['col1'] >= 4) & (df['col2'] >= 4), 'High', 'Low')
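
To try this end to end, here is a minimal sketch of the np.where approach. The sample values are copied from the output table posted further down in this thread; how the DataFrame is built is an assumption, since the question does not show how the data was loaded.

```python
import numpy as np
import pandas as pd

# Sample data assumed from the output table in this thread.
df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [4, 5, 6, 7, 2]})

# Boolean mask: True only where both columns are >= 4.
mask = (df['col1'] >= 4) & (df['col2'] >= 4)

# np.where picks 'High' where the mask is True and 'Low' elsewhere.
df['col3'] = np.where(mask, 'High', 'Low')

print(df)
```

The parentheses around each comparison matter, because & binds more tightly than >= in Python.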

【Comments】:

【Solution 2】:

Try using min() and a comparison:

df['col3'] = np.where(df[['col1', 'col2']].min(axis=1) >= 4, 'High', 'Low')

Or, since you only have two columns, you can compare them directly:

df['col3'] = np.where(df['col1'].ge(4) & df['col2'].ge(4), 'High', 'Low')
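
The min() form works because the row-wise minimum is >= 4 exactly when every column in the row is >= 4. The sketch below checks that on sample data assumed from the output table in this thread. The `cols` list and the `.ge(4).all(axis=1)` variant are illustrative additions, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data assumed from the output table in this thread.
df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [4, 5, 6, 7, 2]})

# Hypothetical list of columns to test; extend it to check more columns.
cols = ['col1', 'col2']

via_min = df[cols].min(axis=1) >= 4           # row minimum meets the threshold
via_all = df[cols].ge(4).all(axis=1)          # every column meets the threshold
via_and = df['col1'].ge(4) & df['col2'].ge(4) # explicit two-column comparison

# All three masks should agree row by row.
assert via_min.equals(via_all) and via_all.equals(via_and)

df['col3'] = np.where(via_all, 'High', 'Low')
print(df)
```

With only two columns the explicit & comparison is probably the easiest to read; the list-based forms become more convenient as the number of columns grows.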

Alternatively, use a lambda function for this:

df['col3'] = df.apply(lambda row: 'High' if row['col1'] >= 4 and row['col2'] >= 4 else 'Low', axis=1)

Output:

   col1  col2  col3
0     1     4   Low
1     2     5   Low
2     3     6   Low
3     4     7  High
4     5     2   Low

Or, another way:

labels = []
# Walk the two columns in parallel instead of relying on column positions
for c1, c2 in zip(df['col1'], df['col2']):
    if c1 >= 4 and c2 >= 4:
        labels.append('High')
    else:
        labels.append('Low')

df['col3'] = labels

【Comments】:

【Solution 3】:

def test(row):
    # Return the label for a single row
    if row['col1'] >= 4 and row['col2'] >= 4:
        return 'High'
    else:
        return 'Low'

# axis=1 passes each row to test(); no lambda wrapper is needed
df['col3'] = df.apply(test, axis=1)

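This follows a suggestion from @Tomerikoo, and I hope it is correct. However, @Leonardo Viotti's answer is faster. Thanks! I have now also learned about the np.where function.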
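
To see the speed difference for yourself, a rough timing sketch along these lines may help. The random test data, the row count, and the helper names `with_where` and `with_apply` are all assumptions made for illustration, and the timings will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data: 10,000 rows of random integers in [0, 10).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 10, size=(10_000, 2)), columns=['col1', 'col2'])

def with_where():
    # Vectorized: the comparison runs over whole columns at once.
    return np.where((df['col1'] >= 4) & (df['col2'] >= 4), 'High', 'Low')

def with_apply():
    # Row-wise: a Python function is called once per row.
    return df.apply(
        lambda row: 'High' if row['col1'] >= 4 and row['col2'] >= 4 else 'Low',
        axis=1,
    )

t_where = timeit.timeit(with_where, number=5)
t_apply = timeit.timeit(with_apply, number=5)
print(f"np.where: {t_where:.4f}s  apply: {t_apply:.4f}s")
```

Both functions produce the same labels; the gap usually comes from apply with axis=1 running Python code for every row, while np.where operates on whole arrays.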

【Comments】: