【Question title】: Filter a dataframe and add the newly created columns to the original df
【Posted】: 2020-11-17 05:52:10
【Question description】:

Is there a simple way to run a calculation on each fruit in turn and add the newly created column to the original df?

df
 concatted  score      fruit        status   date              
 apple_bana  0.500      apple       high    2010-02-20         
      apple  0.600      apple      low     2010-02-21          
     banana  0.530      pear       low     2010-01-12        
Expected output:
 concatted  score      fruit        status   date              first_diff  
 apple_bana  0.500      apple       high    2010-02-20                     
      apple  0.600      apple      low     2010-02-21            0.1
     banana  0.530      pear       low     2010-01-12        
I tried:
fruits = ['apple', 'banana', 'pair']
for fruit in fruits :
    selected_rows = df[(df['fruit'] == fruit)]
    selected_rows['first_diff']= df.score.diff().dropna()
    df = df.append(selected_rows)

【Comments】:

  • What is your expected output?
  • It is shown in the middle of the question.
  • df.groupby('fruit').score.diff() ?

Tags: python python-3.x pandas dataframe for-loop


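【Solution 1】:

Use groupby() on fruit and apply .diff() to score. The result keeps the original index, so it can be assigned straight back to df as a new column.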

df['first_diff']=df[['concatted', 'score', 'fruit', 'status', 'date']].groupby('fruit')['score'].diff().fillna('')

If you need something more general that does not hard-code the column names, try:

df['first_diff']=df[[x for x in df.columns]].groupby('fruit')['score'].diff().fillna('')

     concatted  score  fruit status    date       first_diff
0  apple_bana   0.50  apple   high  2010-02-20           
1       apple   0.60  apple    low  2010-02-21        0.1
2      banana   0.53   pear    low  2010-01-12   
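
For reference, here is a minimal self-contained sketch of the same approach, rebuilding the sample frame from the question. It assumes the column subset is not needed before grouping, and it keeps the missing values as NaN so that first_diff stays numeric; the `shown` variable is only an illustrative name for a display copy with blanks, as in the output above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "concatted": ["apple_bana", "apple", "banana"],
    "score": [0.500, 0.600, 0.530],
    "fruit": ["apple", "apple", "pear"],
    "status": ["high", "low", "low"],
    "date": pd.to_datetime(["2010-02-20", "2010-02-21", "2010-01-12"]),
})

# Difference between consecutive scores within each fruit group.
# The result is aligned to df's index, so no loop or append is needed.
df["first_diff"] = df.groupby("fruit")["score"].diff()

# Optional, for display only: replace NaN with an empty string.
# This turns the values into Python objects, so keep df["first_diff"]
# itself numeric if further calculations are planned.
shown = (
    df["first_diff"]
    .round(3)
    .astype(object)
    .where(df["first_diff"].notna(), "")
)

print(df)
print(shown.tolist())
```

The first row of each fruit has no previous row in its group, which is why it comes out as NaN (or blank in the display copy). Floating-point subtraction means 0.6 - 0.5 is only approximately 0.1, hence the rounding before display.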
