【Question title】: Calculate Duration based on date and events
【Posted】: 2018-05-09 20:37:27
【Question】:

I have a DataFrame, and I want to calculate the number of days elapsed since the most recent earnings date.

DF

Date          Earnings Reported           
2018-04-02    1          
2018-04-03    0
2018-04-04    0

DF - desired

Date          Earnings Reported       DaySinceEarnings      
2018-04-02    1                       0
2018-04-03    0                       1
2018-04-04    0                       2 

I tried to write a lambda function, but couldn't get it to work:

df['DaySinceEarnings'] = df.groupby['Earnings Reported'].apply(lambda x: (x == '1') * (x == '1').cumsum())

【Comments】:

  • df.Date.diff().dt.days.cumsum() ?
  • What would that do?
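
To illustrate what the expression suggested in the first comment computes, here is a minimal sketch on the question's three-row sample (the frame is rebuilt here from the question, and the `fillna`/`astype` step is an addition for readability). Note that it accumulates days from the first row of the frame and would not reset on a later earnings date.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Date": pd.to_datetime(["2018-04-02", "2018-04-03", "2018-04-04"]),
    "Earnings Reported": [1, 0, 0],
})

# diff() gives the gap to the previous row (NaT for the first row),
# .dt.days turns each gap into a day count (NaN for the first row),
# cumsum() accumulates those gaps from the top of the frame.
days = df["Date"].diff().dt.days.cumsum()

# Added step (not in the comment): fill the leading NaN so the first row reads 0
df["DaySinceEarnings"] = days.fillna(0).astype(int)
print(df)
```

This matches the desired output only because the sample contains a single earnings date in its first row.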

Tags: python pandas lambda


【Solution 1】:
import pandas as pd

df = pd.DataFrame(
    {'Date': ['2018-04-02',
              '2018-04-03',
              '2018-04-04',
              '2018-04-05',
              '2018-04-06',
              '2018-04-07', ],
     'Earnings Reported': [1, 0, 0, 1, 1, 0]}
)

df['Date'] = pd.to_datetime(df['Date'])


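def only_include_reported_days(x):
    # x is a single row of df; an earnings day itself counts as day 0
    x['DaySinceEarnings'] = 0
    if x['Earnings Reported'] == 1:
        return x

    # Earlier rows on which earnings were reported
    sub = df[(df['Date'] < x['Date']) &
             (df['Earnings Reported'] == 1)]

    # Days elapsed since the most recent earlier earnings date
    x['DaySinceEarnings'] = (x['Date'] - max(sub['Date'])).days
    return x


# Assumed call (missing from this copy of the answer): apply the function row by row
df = df.apply(only_include_reported_days, axis=1)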
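
As an alternative to the row-wise function, the same column can probably be built without a Python-level loop. The following is a sketch and not the answerer's method: it carries the most recent earnings date forward with `where` and `ffill`, then subtracts. It assumes the rows are sorted by `Date`; any rows before the first earnings date would come out as NaN.

```python
import pandas as pd

# Same sample data as in the answer
df = pd.DataFrame({
    "Date": pd.to_datetime(["2018-04-02", "2018-04-03", "2018-04-04",
                            "2018-04-05", "2018-04-06", "2018-04-07"]),
    "Earnings Reported": [1, 0, 0, 1, 1, 0],
})

# Keep the date only on earnings rows (NaT elsewhere), then carry it forward
last_earnings = df["Date"].where(df["Earnings Reported"] == 1).ffill()

# Whole days between each row and the latest earnings date at or before it
df["DaySinceEarnings"] = (df["Date"] - last_earnings).dt.days
print(df)
```

Because the counter is derived from the forward-filled date, it resets to 0 on every row where `Earnings Reported` is 1.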

【Comments】:

  • Hey Caitlin, I tried the code above. After the groupby line of code, DaySinceEarnings is missing 2 values. Then, after the lambda apply, it is missing 3 values for DaySinceEarnings.