【Question title】:Last row of some column in dataframe not included
【Posted】:2023-02-10 17:57:39
【Question】:

I am trying to find the average of each run of rows with index 0 before it switches to another index. Example of the dataframe:

column_a  value_b  sum_c  count_d_  avg_e
0         10       10     1
0         20       30     2
0         30       60     3         20
1         10       10     1
1         20       30     2
1         30       60     3         20
0         10       10     1
0         20       30     2         15
1         10       10     1
1         20       30     2
1         30       60     3         20
0         10       10     1
0         20

However, sum and count are missing for the last row only, so avg cannot be calculated for it.

part of the code...
#sum and avg for each section

for i, row in df.iloc[0:-1].iterrows():
  if df['column_a'][i] == 0:
    sum = sum + df['value_b'][i]
    df['sum_c'][i] = sum
    count = count + 1
    df['count_d'][i] = count
  else:
    sum = 0 
    count = 0
    df['sum_c'][i] = sum
    df['count_d'][i] = count

totcount = 0
for m, row in df.iloc[0:-1].iterrows():
  if df.loc[m, 'column_a'] == 0 :
    if (df.loc[m+1, 'sum_c'] == 0) :
      totcount = df.loc[m, 'count_d']
      avg_e = (df.loc[m, 'sum_c']) / totcount
      df.loc[m, 'avg_e'] = avg_e

I tried using just df.iloc[0:].iterrows() instead, but it raises an error.
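
A minimal sketch of why looping over every row probably fails, assuming the dataframe has the default integer index: the second loop reads df.loc[m+1, 'sum_c'], and for the last row the label m+1 does not exist, so the .loc lookup raises a KeyError. The sample data below is reconstructed from the table above, and closes_run is a hypothetical helper that treats the last row as the end of a run instead of looking past it.

```python
import pandas as pd

# Sample data reconstructed from the question's table (an assumption).
df = pd.DataFrame({
    "column_a": [0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0],
    "value_b": [10, 20, 30, 10, 20, 30, 10, 20, 10, 20, 30, 10, 20],
})

last_label = df.index[-1]

# Looking up the label one past the last row with .loc raises KeyError.
try:
    df.loc[last_label + 1, "column_a"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Hypothetical guard: a row closes its run if it is the last row,
# or if the next row holds a different column_a value.
def closes_run(frame, label):
    if label == frame.index[-1]:
        return True
    return frame.loc[label + 1, "column_a"] != frame.loc[label, "column_a"]

# Labels of the rows where an average could be written.
run_ends = [label for label in df.index if closes_run(df, label)]
```

With a guard like this, both loops could iterate over all rows rather than df.iloc[0:-1], so the last row would also get its sum, count and average.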

【Question discussion】:

    Tags: python pandas dataframe


    【Solution 1】:

    You can rewrite the full code using groupby.cumsum, groupby.cumcount and groupby.transform('mean'), and mask the result with where.

    # compute a mask with True for the last value per successive group
    m = df['column_a'].ne(df['column_a'].shift(-1))[::-1]
    # make a grouper
    group = m.cumsum()
    
    # for each group
    g = df.groupby(group)['value_b']
    # compute the cumsum
    df['sum_c'] = g.cumsum()
    # compute the cumcount
    df['count_d_'] = g.cumcount().add(1)
    # compute the mean and assign to the last row per group
    df['avg_e'] = g.transform('mean').where(m)
    

    Output:

        column_a  value_b  sum_c  count_d_  avg_e
    0          0       10     10         1    NaN
    1          0       20     30         2    NaN
    2          0       30     60         3   20.0
    3          1       10     10         1    NaN
    4          1       20     30         2    NaN
    5          1       30     60         3   20.0
    6          0       10     10         1    NaN
    7          0       20     30         2   15.0
    8          1       10     10         1    NaN
    9          1       20     30         2    NaN
    10         1       30     60         3   20.0
    11         0       10     10         1    NaN
    12         0       20     30         2   15.0
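
To make the mask and the grouper easier to inspect, here is a self-contained sketch on the sample data (the DataFrame construction is reconstructed from the question's table and is an assumption). It uses a variant of the grouper that numbers the runs from the top by counting value changes, rather than the reversed cumsum shown above; the names is_run_end, run_id and run_means are illustrative only.

```python
import pandas as pd

# Sample data reconstructed from the question's table (an assumption).
df = pd.DataFrame({
    "column_a": [0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0],
    "value_b": [10, 20, 30, 10, 20, 30, 10, 20, 10, 20, 30, 10, 20],
})

# True on the last row of each run: the value differs from the next row.
# shift(-1) leaves NaN after the final row, and 0 != NaN, so the final
# row is also marked True without any special handling.
is_run_end = df["column_a"].ne(df["column_a"].shift(-1))

# Variant grouper: a new run starts wherever the value differs from the
# previous row, so the cumulative sum of those changes labels each run.
run_id = df["column_a"].ne(df["column_a"].shift()).cumsum()

# Mean of value_b per run, kept only on the last row of each run.
run_means = df.groupby(run_id)["value_b"].transform("mean")
avg_e = run_means.where(is_run_end)
```

Because the comparison against the shifted column already marks the final row as the end of a run, the last group gets its average without looking past the end of the dataframe.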
    

    【Discussion】:
