【Question title】: pandas reset cumsum when the previous value is negative
【Posted】: 2019-07-22 20:40:58
【Question description】:

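I need to compute a cumulative sum on a grouped DataFrame, but the sum has to be reset whenever the previous value is negative and the current value is positive.

In R, I could use the ave() function to apply a condition within the grouping, but I can't find an equivalent in Python, so I'm having trouble coming up with a solution. Can anyone help me?

Here is an example: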

import pandas as pd

df = pd.DataFrame({'PRODUCT': ['A'] * 40, 'GROUP': ['1'] * 40, 'FORECAST': [100, -40, -40, -40]*10, })

df['CS'] = df.groupby(['GROUP', 'PRODUCT']).FORECAST.cumsum()

# Reset cumsum if
# condition: (df.FORECAST > 0) & (df.groupby(['GROUP', 'PRODUCT']).FORECAST.shift(-1).fillna(0) <= 0)

【Question comments】:

  • What is your expected output? That way we can check our solutions against it.

Tags: python pandas dataframe


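【Solution 1】:

This solution resets the sum wherever the value being summed goes from negative to positive, and it works for any such example (whether or not the dataset is as tidy and periodic as in yours). Note that the test it uses, diff() > 0, fires on any increase from one row to the next; in data like the example, that coincides with the negative-to-positive change.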

import numpy as np
import pandas as pd

df = pd.DataFrame({'PRODUCT': ['A'] * 40, 'GROUP': ['1'] * 40, 'FORECAST': [100, -40, -40, -40]*10, })

cumsum = np.cumsum(df['FORECAST'])

# Array of indices where sum should be reset
reset_ind = np.where(df['FORECAST'].diff() > 0)[0]

# Sums that need to be subtracted at resets
subs = cumsum[reset_ind-1].values

# Repeat subtraction values for every entry BETWEEN resets and values after final reset
rep_subs = np.repeat(subs, np.hstack([np.diff(reset_ind), df['FORECAST'].size - reset_ind[-1]]))

# Stack together values before first reset and resetted sums
df['CS'] = np.hstack([cumsum[:reset_ind[0]], cumsum[reset_ind[0]:] - rep_subs])

Alternatively, based on this solution to a similar question (and my realization of how useful groupby is):

import pandas as pd
import numpy as np

df = pd.DataFrame({'PRODUCT': ['A'] * 40, 'GROUP': ['1'] * 40, 'FORECAST': [100, -40, -40, -40]*10, })

# Create indices to group sums together
df['cumsum'] = (df['FORECAST'].diff() > 0).cumsum()

# Perform group-wise cumsum
df['CS'] = df.groupby(['cumsum'])['FORECAST'].cumsum()

# Remove intermediary cumsum column
df = df.drop(['cumsum'], axis=1)
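If the reset should happen only on a true sign change (previous value negative, current value positive), as the question words it, the mask can be built from shift() instead of diff(). The following is a minimal sketch under that assumption, not part of the original answer; the six-row sample data and the names prev, reset, and block are made up for illustration.

```python
import pandas as pd

# Hypothetical sample: -40 -> -30 is an increase, but not a sign change
df = pd.DataFrame({'FORECAST': [100, -40, -30, -40, 50, -10]})

# Value of the previous row (NaN for the first row, which compares as False)
prev = df['FORECAST'].shift(1)

# True only where the previous value is negative and the current one is positive
reset = (prev < 0) & (df['FORECAST'] > 0)

# Each reset starts a new block; renamed so it does not clash with the column name
block = reset.cumsum().rename('block')

# Cumulative sum restarted at the beginning of every block
df['CS'] = df['FORECAST'].groupby(block).cumsum()

print(df)
```

On this sample, CS comes out as 100, 60, 30, -10, 50, 40. The diff() > 0 test would also restart the sum at the -30 row.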

【Comments】:

  • Thanks, it seems to work once I add the GROUP and PRODUCT columns to the groupby; without them, the cumsum can run over from one product into the next.
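What the comment describes can be sketched roughly as follows. This is an assumed reading of the commenter's change, not code they posted: the two-product sample data and the helper column name block are hypothetical. The diff is taken per group so the first row of a product is never compared with the last row of the previous one, and the group keys are added to the final groupby.

```python
import pandas as pd

# Hypothetical sample with two products in the same group
df = pd.DataFrame({
    'GROUP': ['1'] * 8,
    'PRODUCT': ['A'] * 4 + ['B'] * 4,
    'FORECAST': [100, -40, -40, 100, 50, -40, -40, 100],
})

keys = ['GROUP', 'PRODUCT']

# Per-group diff: the first row of each (GROUP, PRODUCT) pair is NaN, hence False
increase = df.groupby(keys)['FORECAST'].diff() > 0

# Running counter of resets; combined with the keys it identifies each block
df['block'] = increase.cumsum()

# Cumulative sum within each (GROUP, PRODUCT, block) combination
df['CS'] = df.groupby(keys + ['block'])['FORECAST'].cumsum()

# Remove the intermediary column
df = df.drop(columns='block')

print(df)
```

With these rows, grouping on the block counter alone would put the last row of product A and the first three rows of product B into the same block, which is the overlap the comment mentions.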