【Question title】: Pandas Custom Cumulative Calculation Over Group By in DataFrame
【Posted】: 2021-07-12 23:41:23
【Question】:

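I'm trying to run a simple calculation on each row within the groups of a DataFrame, but I'm having trouble with the syntax. In particular, I'm confused about which kind of object I should return, i.e. a DataFrame vs. a Series, etc.

For context, I have a series of stock levels for each product I track, and I want to estimate the number of units sold with a custom function that essentially does the following: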

# Because stock can go up and down, I'm looking to record the difference 
# when the stock is less than the previous stock number from the previous row.
# How do I access each row of the dataframe and then return the series I need?

def get_stock_sold(x):
    # Pseudocode: previous_stock_no and current_stock_no stand for the stock in the previous and current row
    stock_sold = previous_stock_no - current_stock_no if current_stock_no < previous_stock_no else 0
    return pd.Series(stock_sold)

Then I have the following DataFrame:

import pandas as pd

# 'order' is a date in the real dataset.

data = { 
    'id'            : ['1', '1', '1', '2', '2', '2'],
    'order'         : [1, 2, 3, 1, 2, 3],
    'current_stock' : [100, 150, 90, 50, 48, 30]
}

df = pd.DataFrame(data)
df = df.sort_values(by=['id', 'order'])
df['previous_stock'] = df.groupby('id')['current_stock'].shift(1)

I'd like to create a new column (stock_sold) and apply the logic above to each row in the grouped DataFrame object:

df['stock_sold'] = df.groupby('id').apply(get_stock_sold)

The desired output looks like this:

| id | order | current_stock | previous_stock | stock_sold |
|----|-------|---------------|----------------|------------|
| 1  | 1     | 100           | NaN            | 0          |
|    | 2     | 150           | 100.0          | 0          |
|    | 3     | 90            | 150.0          | 60         |
| 2  | 1     | 50            | NaN            | 0          |
|    | 2     | 48            | 50.0           | 2          |
|    | 3     | 30            | 48.0           | 18         |

【Comments】:

    Tags: python pandas dataframe pandas-groupby custom-function


    【Solution 1】:

    Try:

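    import numpy as np

    # Previous row's stock within each id (NaN on the first row of each group).
    df["previous_stock"] = df.groupby("id")["current_stock"].shift()
    # A rise counts as 0 sold; fillna(0) gives first rows with positive stock 0 as well.
    df["stock_sold"] = np.where(
        df["current_stock"] > df["previous_stock"].fillna(0),
        0,
        df["previous_stock"] - df["current_stock"],
    )
    print(df)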
    

    Prints:

      id  order  current_stock  previous_stock  stock_sold
    0  1      1            100             NaN         0.0
    1  1      2            150           100.0         0.0
    2  1      3             90           150.0        60.0
    3  2      1             50             NaN         0.0
    4  2      2             48            50.0         2.0
    5  2      3             30            48.0        18.0
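
The same result can also be sketched without the helper column or `np.where`, assuming integer output is wanted: `groupby(...).diff()` gives the change from the previous row within each `id`, and `clip` discards the increases. This is an alternative illustration, not part of the answer's original code.

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["1", "1", "1", "2", "2", "2"],
    "order": [1, 2, 3, 1, 2, 3],
    "current_stock": [100, 150, 90, 50, 48, 30],
}).sort_values(by=["id", "order"])

# Change from the previous row within each id (current - previous);
# the first row of every group has no previous row and is NaN.
change = df.groupby("id")["current_stock"].diff()

# A drop in stock is a negative change: flip the sign, floor at 0,
# and treat the first row of each group as 0 sold.
df["stock_sold"] = (-change).clip(lower=0).fillna(0).astype(int)
print(df)
```

Unlike the `fillna(0)` comparison, this version also yields 0 for a group whose first row has a stock of 0.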
    

