【Question title】: Sum column values in a dataframe if values in another column are next to each other
【Posted】: 2020-10-29 23:25:36
【Question description】:

Hello, I have a dataframe:

import pandas as pd

df1 = {'name': ["x","x","x","x","x","x","x","y","y","y","y","y","y","y"],
       'a': [3,4,5,11,14,15,16,2,3,4,10,13,14,15],
       'b': [9,8,7,12,23,22,21,8,7,6,11,22,21,20],
       'val': [2,1,3,4,5,6,3,21,11,31,41,51,61,31]    
        }

df1 = pd.DataFrame (df1, columns = ['name','a','b','val'])

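I want to sum the numbers in the 'val' column whenever the numbers in the 'a' column are next to each other (consecutive). For example, 'a' contains 3, 4, 5 (next to each other), so their corresponding numbers in the 'val' column should be added (i.e. 2+1+3), and a new column should hold the summed value. The harder part for me is doing this grouped by 'name'.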

I'm not sure how well I've explained this, but here is the dataframe I hope to end up with:

df2 = {'name': ["x","x","x","x","x","x","x","y","y","y","y","y","y","y"],
       'a': [3,4,5,11,14,15,16,2,3,4,10,13,14,15],
       'b': [9,8,7,12,23,22,21,8,7,6,11,22,21,20],
       'val': [2,1,3,4,5,6,3,21,11,31,41,51,61,31],
       'sum_val': [6,6,6,4,14,14,14,63,63,63,41,143,143,143]
        }

df2 = pd.DataFrame (df2, columns = ['name','a','b','val','sum_val'])

【Question comments】:

    Tags: python python-3.x pandas dataframe dataset


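    【Solution 1】:

    Create the groups inside a lambda function, per 'name', by comparing each difference with 1 and taking the cumulative sum, then pass the resulting Series to GroupBy.transform with 'sum':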

    # group_keys=False keeps the original row index on g (needed on pandas >= 2.0)
    g = df1.groupby('name', group_keys=False)['a'].apply(lambda x: x.diff().ne(1).cumsum())
    
    df1['sum_val'] = df1.groupby([g, 'name'])['val'].transform('sum')
    print (df1)
    
       name   a   b  val  sum_val
    0     x   3   9    2        6
    1     x   4   8    1        6
    2     x   5   7    3        6
    3     x  11  12    4        4
    4     x  14  23    5       14
    5     x  15  22    6       14
    6     x  16  21    3       14
    7     y   2   8   21       63
    8     y   3   7   11       63
    9     y   4   6   31       63
    10    y  10  11   41       41
    11    y  13  22   51      143
    12    y  14  21   61      143
    13    y  15  20   31      143
    
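To see how the grouping key is built, the steps can be spelled out one at a time. This is only a sketch of the same idea, not the answer's exact code: it uses `groupby(...).diff()` and a grouped `cumsum()` instead of `apply` with a lambda, and the names `step`, `new_run` and `run_id` are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    'name': ['x'] * 7 + ['y'] * 7,
    'a': [3, 4, 5, 11, 14, 15, 16, 2, 3, 4, 10, 13, 14, 15],
    'val': [2, 1, 3, 4, 5, 6, 3, 21, 11, 31, 41, 51, 61, 31],
})

# Difference to the previous 'a' within each name; the first row of each name is NaN.
step = df1.groupby('name')['a'].diff()

# A new run starts wherever the step is not exactly 1 (NaN also counts as "not 1").
new_run = step.ne(1)

# Counting the run starts cumulatively, per name, gives every row a run id.
run_id = new_run.astype(int).groupby(df1['name']).cumsum()

# Sum 'val' inside each (name, run) pair and broadcast the sum back to every row.
df1['sum_val'] = df1.groupby(['name', run_id])['val'].transform('sum')
print(df1)
```

For name 'x' the run ids come out as 1, 1, 1, 2, 3, 3, 3, so the rows with 'a' equal to 3, 4, 5 share one sum, 11 stands alone, and 14, 15, 16 share another.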

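    【Comments】:

    • This works great. Can I ask: is 'g' basically grouping adjacent 'a' values together? And in the second line of code, what is 'sum' doing?
    • @DanynPatel - Exactly, it compares consecutive values of a and keeps rows in the same group while the difference is 1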
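The second part of the question, what 'sum' is doing, is not answered in the thread. As a rough illustration on a toy frame (the column name `run` and the variables `per_group` and `per_row` are made up for this example): `transform('sum')` computes the same per-group sum as a plain `.sum()`, but returns one value per original row instead of one per group, which is why it can be assigned straight back as a new column.

```python
import pandas as pd

# Toy data: three rows in run 1, one row in run 2.
df = pd.DataFrame({'run': [1, 1, 1, 2], 'val': [2, 1, 3, 4]})

# Plain aggregation: one result per group, indexed by the group key.
per_group = df.groupby('run')['val'].sum()

# transform: the group's sum repeated for each row, aligned to df's index.
per_row = df.groupby('run')['val'].transform('sum')

print(per_group.tolist())  # [6, 4]
print(per_row.tolist())    # [6, 6, 6, 4]
```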