[Posted]: 2018-04-25 22:05:21
[Question]:
             A                   B       C   D   E
0   2002-01-12 2018-04-25 10:00:00    John  19  19
1   2002-01-12 2018-04-25 11:00:00    John   6  25
2   2002-01-13 2018-04-25 09:00:00    John   5  30
3   2002-01-13 2018-04-25 11:00:00    John -25   5
4   2002-01-14 2018-04-25 11:00:00    John   1   6
5   2002-01-14 2018-04-25 12:00:00    John  44  50
6   2002-01-25 2018-04-25 11:00:00  George  18  18
7   2002-01-25 2018-04-25 12:00:00  George  12  30
8   2002-01-26 2018-04-25 11:00:00  George  -8  22
9   2002-01-26 2018-04-25 12:00:00  George -10  12
10  2002-01-27 2018-04-25 10:00:00  George  13  25
11  2002-01-27 2018-04-25 11:00:00  George   1  26
df['A'] = df['A'].apply(pd.to_datetime)
df['B'] = df['B'].apply(pd.to_datetime)
df["E"] = df.groupby("C")["D"].cumsum()
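I want to select one row for each C group, subject to the following conditions:
- The first row where E>=20 and B==11:00:00, considering only rows from each C group's second A day onward.
- If no row satisfies that condition, take the first row of that C group.
The output should be: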
            A                   B       C   D   E
0  2002-01-12 2018-04-25 10:00:00    John  19  19
8  2002-01-26 2018-04-25 11:00:00  George  -8  22
I tried:
def eleven(g):
    cond = g[g.B==time(11)].E.ge(20)
    if cond.any():
        return g[cond].iloc[0]
    else:
        return g.iloc[1]

r = df.groupby('C', as_index=False).apply(eleven)
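
For comparison with the attempt above, here is a minimal sketch of the stated rule. It is one possible reading, not a definitive answer. Two things in the attempt differ from the description: it never restricts the search to rows from the second A day onward, and its fallback returns g.iloc[1] (the second row) rather than the group's first row. The helper name pick_row is hypothetical. "Second A day" is assumed to mean the second distinct date in A within each group, and the time of day is compared through B.dt.time.

```python
from datetime import time

import pandas as pd

# Sample data copied from the question (E is recomputed below).
rows = [
    ("2002-01-12", "2018-04-25 10:00:00", "John", 19),
    ("2002-01-12", "2018-04-25 11:00:00", "John", 6),
    ("2002-01-13", "2018-04-25 09:00:00", "John", 5),
    ("2002-01-13", "2018-04-25 11:00:00", "John", -25),
    ("2002-01-14", "2018-04-25 11:00:00", "John", 1),
    ("2002-01-14", "2018-04-25 12:00:00", "John", 44),
    ("2002-01-25", "2018-04-25 11:00:00", "George", 18),
    ("2002-01-25", "2018-04-25 12:00:00", "George", 12),
    ("2002-01-26", "2018-04-25 11:00:00", "George", -8),
    ("2002-01-26", "2018-04-25 12:00:00", "George", -10),
    ("2002-01-27", "2018-04-25 10:00:00", "George", 13),
    ("2002-01-27", "2018-04-25 11:00:00", "George", 1),
]
df = pd.DataFrame(rows, columns=["A", "B", "C", "D"])
df["A"] = pd.to_datetime(df["A"])
df["B"] = pd.to_datetime(df["B"])
df["E"] = df.groupby("C")["D"].cumsum()


def pick_row(g):
    """Return the index label of the row to keep for one C group (hypothetical helper)."""
    # Assumption: "second A day" = second distinct date in A for this group.
    days = g["A"].drop_duplicates().sort_values()
    if len(days) > 1:
        second_day = days.iloc[1]
        mask = (
            (g["A"] >= second_day)
            & (g["B"].dt.time == time(11))
            & (g["E"] >= 20)
        )
        if mask.any():
            # idxmax on a boolean Series gives the label of the first True.
            return mask.idxmax()
    # Fallback: no qualifying row, so keep the group's first row.
    return g.index[0]


# sort=False keeps the groups in their order of appearance (John, then George).
keep = [pick_row(g) for _, g in df.groupby("C", sort=False)]
result = df.loc[keep]
print(result)
```

On the sample data this keeps rows 0 and 8, which matches the expected output given in the question. Collecting index labels and selecting with df.loc avoids depending on how groupby.apply reshapes returned rows.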
[Discussion]:
Tags: python pandas conditional