【Question title】: I am trying to categorize data based on multiple columns in pandas?
【Posted】: 2020-04-04 14:06:12
【Question description】:

I have the following new_correlation DataFrame, which contains this input:

| Engagement Index | High Impact |
|------------------|-------------|
| 3.14             | 48.0        |
| 4.15             | 31.0        |
| 4.20             | 40.0        |

My conditions are:

def priority_driver(corr, high_impact):
    if corr > 0.4 & high_impact > 40:
        return 'Sustenance'
    elif corr > 0.4 & high_impact < 40:
        return 'Improvement'
    elif corr < 0.4 & high_impact > 40:
        return 'Distraction'
    elif corr < 0.4 & high_impact < 40:
        return 'Low Focus'

I tried:

new_correlation['Priority of action'] = new_correlation.apply(lambda x: priority_driver(x['Engagement Index'], x['High Impact']), axis=1)

This gives me:

TypeError: ("unsupported operand type(s) for &: 'float' and 'float'", 'occurred at index 0')

Desired output:

| Engagement Index | High Impact | Priority of action |
|------------------|-------------|--------------------|
| 0.72             | 48.0        | Sustenance         |
| 0.74             | 31.0        | Improvement        |
| 0.78             | 40.0        | Sustenance         |

【Comments】:

  • You need parentheses to separate the conditions.

Tags: python python-3.x pandas


【Solution 1】:

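You should write:

if (corr > 0.4) & (high_impact > 40):

The parentheses matter because `&` binds more tightly than the comparison operators. Without them, `corr > 0.4 & high_impact > 40` is parsed as `corr > (0.4 & high_impact) > 40`, and `&` is not defined for two floats.

Alternatively, this should also work (and IMO is more readable):

if corr > 0.4 and high_impact > 40: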
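
For reference, here is a minimal sketch of the whole function rewritten with `and`, applied to the three rows from the desired output. The boundary handling is an assumption rather than something the question states: `High Impact` of exactly 40 is treated as high (the desired output labels the 40.0 row 'Sustenance'), and a correlation of exactly 0.4 is treated as low.

```python
import pandas as pd


def priority_driver(corr, high_impact):
    """Classify one row from two scalar values."""
    # `and` works on plain Python scalars, so no parentheses are needed.
    # Assumed boundaries: high_impact >= 40 is "high", corr <= 0.4 is "low".
    if corr > 0.4 and high_impact >= 40:
        return 'Sustenance'
    elif corr > 0.4 and high_impact < 40:
        return 'Improvement'
    elif corr <= 0.4 and high_impact >= 40:
        return 'Distraction'
    else:
        return 'Low Focus'


# Values taken from the desired-output table in the question.
new_correlation = pd.DataFrame({
    'Engagement Index': [0.72, 0.74, 0.78],
    'High Impact': [48.0, 31.0, 40.0],
})

new_correlation['Priority of action'] = new_correlation.apply(
    lambda x: priority_driver(x['Engagement Index'], x['High Impact']),
    axis=1,
)
print(new_correlation)
```

With the original strict comparisons (`> 40` and `< 40`), a row whose value is exactly 40 matches no branch and the function returns `None`, which is why the sketch closes the gaps.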

【Comments】:

    【Solution 2】:

    Note that you can also do this with numpy `select`, which would look something like this:

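    import numpy as np
    import pandas as pd

    # Random sample data: A stands in for the correlation, B for the impact score
    df = pd.DataFrame({'A': np.random.choice([.2, .3, .4, .5, .6, .7], 200),
                       'B': np.random.randint(30, 50, 200)})

    # np.select takes the first matching condition, so A == .4 lands in the first two
    conds = [(df['A'] >= .4) & (df['B'] >= 40),
             (df['A'] >= .4) & (df['B'] < 40),
             (df['A'] <= .4) & (df['B'] >= 40),
             (df['A'] <= .4) & (df['B'] < 40)]

    cond_resp = ['Sustenance', 'Improvement', 'Distraction', 'Low Focus']

    # A string default keeps the fallback the same type as the labels (the built-in default is 0)
    df['C'] = np.select(conds, cond_resp, default='')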
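
The same idea applied to the column names from the question might look like the sketch below. The thresholds and the `'Unclassified'` fallback label are assumptions for illustration. The data are the three rows from the desired-output table.

```python
import numpy as np
import pandas as pd

# Values taken from the desired-output table in the question.
new_correlation = pd.DataFrame({
    'Engagement Index': [0.72, 0.74, 0.78],
    'High Impact': [48.0, 31.0, 40.0],
})

corr = new_correlation['Engagement Index']
impact = new_correlation['High Impact']

# Element-wise comparisons on Series need `&` plus parentheses.
# Assumed boundaries: impact >= 40 is "high", corr <= 0.4 is "low".
conds = [
    (corr > 0.4) & (impact >= 40),
    (corr > 0.4) & (impact < 40),
    (corr <= 0.4) & (impact >= 40),
    (corr <= 0.4) & (impact < 40),
]
labels = ['Sustenance', 'Improvement', 'Distraction', 'Low Focus']

# Rows matching no condition (for example, rows with NaN) get the fallback label.
new_correlation['Priority of action'] = np.select(
    conds, labels, default='Unclassified'
)
print(new_correlation)
```

Unlike `apply` with `axis=1`, this evaluates each condition once over the whole column instead of calling a Python function per row.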
    

    【Comments】:
