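【Posted】: 2025-12-07 05:20:05
【Question】:
I want to add a new column to my current data frame that assigns a new sequence number based on the run of events in a football match.
This is my current data frame: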
head(test_P)
   index         team.name possession_team.name minute second period possession     type.name
1      5      Cardiff City         Cardiff City      0      0      1          2          Pass
2      6      Cardiff City         Cardiff City      0      2      1          2 Ball Receipt*
3      7      Cardiff City         Cardiff City      0      2      1          2         Carry
4      8      Cardiff City         Cardiff City      0      3      1          2          Pass
5      9      Cardiff City         Cardiff City      0      6      1          2 Ball Receipt*
6     10 Preston North End         Cardiff City      0      6      1          2          Duel
7     11 Preston North End         Cardiff City      0      6      1          2          Pass
8     12 Preston North End         Cardiff City      0      8      1          2    Miscontrol
9     13      Cardiff City         Cardiff City      0      8      1          2          Pass
10    14      Cardiff City         Cardiff City      0      9      1          2 Ball Receipt*
11    15      Cardiff City         Cardiff City      0      9      1          2         Cross
12    16 Preston North End         Cardiff City      0     10      1          2     Clearance
13    17      Cardiff City         Cardiff City      0     11      1          2          Pass
14    18      Cardiff City         Cardiff City      0     13      1          2 Ball Receipt*
15    19 Preston North End    Preston North End      0     13      1          3 Ball Recovery
16    20 Preston North End    Preston North End      0     13      1          3         Carry
17    21 Preston North End    Preston North End      0     21      1          3          Pass
18    22 Preston North End    Preston North End      0     22      1          3 Ball Receipt*
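However, I want to add an extra column named sequence after possession that labels the sequence number within each possession.
Every new possession should start at sequence value 1.
If the opponent breaks the sequence with one or more events while the possession value stays the same, then the next time the team in possession touches the ball it should get a new sequence number, e.g. 2, or 3, 4, etc. if the sequence is broken several times.
The opposing team's events should take the same sequence number as the sequence they broke.
For example, the data below: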
   index         team.name possession_team.name minute second period possession     type.name sequence
1      5      Cardiff City         Cardiff City      0      0      1          2          Pass        1
2      6      Cardiff City         Cardiff City      0      2      1          2 Ball Receipt*        1
3      7      Cardiff City         Cardiff City      0      2      1          2         Carry        1
4      8      Cardiff City         Cardiff City      0      3      1          2          Pass        1
5      9      Cardiff City         Cardiff City      0      6      1          2 Ball Receipt*        1
6     10 Preston North End         Cardiff City      0      6      1          2          Duel        1
7     11 Preston North End         Cardiff City      0      6      1          2          Pass        1
8     12 Preston North End         Cardiff City      0      8      1          2    Miscontrol        1
9     13      Cardiff City         Cardiff City      0      8      1          2          Pass        2
10    14      Cardiff City         Cardiff City      0      9      1          2 Ball Receipt*        2
11    15      Cardiff City         Cardiff City      0      9      1          2         Cross        2
12    16 Preston North End         Cardiff City      0     10      1          2     Clearance        2
13    17      Cardiff City         Cardiff City      0     11      1          2          Pass        3
14    18      Cardiff City         Cardiff City      0     13      1          2 Ball Receipt*        3
15    19 Preston North End    Preston North End      0     13      1          3 Ball Recovery        1
16    20 Preston North End    Preston North End      0     13      1          3         Carry        1
17    21 Preston North End    Preston North End      0     21      1          3          Pass        1
18    22 Preston North End    Preston North End      0     22      1          3 Ball Receipt*        1
I have tried combining the lead and lag functions with ifelse statements, but I can't seem to get the result I want:
test <- test %>%
  mutate(P = ifelse(dplyr::lag(team.id) != team.id & dplyr::lag(possession) == possession, dplyr::lag(seq_id) + 1,
             ifelse(dplyr::lead(team.id) != team.id & dplyr::lead(possession) != possession, seq_id, 1)))
Any help would be greatly appreciated, and apologies for the untidiness of this question.
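The rule described in the question can be read as "within each possession, start at 1 and add 1 every time the team in possession gets the ball back after an opponent event". Below is a minimal sketch of that reading using dplyr, not a definitive answer. It assumes the rows are already in event order, rebuilds only the columns it needs from the sample data, and uses two helper column names (`in_poss`, `regained`) that are made up for illustration.

```r
library(dplyr)

# Sample data rebuilt from the question (only the columns the rule needs).
test_P <- data.frame(
  index = 5:22,
  team.name = c(rep("Cardiff City", 5), rep("Preston North End", 3),
                rep("Cardiff City", 3), "Preston North End",
                rep("Cardiff City", 2), rep("Preston North End", 4)),
  possession_team.name = c(rep("Cardiff City", 14),
                           rep("Preston North End", 4)),
  possession = c(rep(2, 14), rep(3, 4)),
  stringsAsFactors = FALSE
)

result <- test_P %>%
  group_by(possession) %>%                 # numbering restarts per possession
  mutate(
    # TRUE when the event belongs to the team in possession (hypothetical helper)
    in_poss  = team.name == possession_team.name,
    # TRUE when the possession team acts right after an opponent event
    regained = in_poss & !dplyr::lag(in_poss, default = TRUE),
    # start at 1 and add 1 at every regain; opponent rows keep the current value
    sequence = 1L + cumsum(regained)
  ) %>%
  ungroup() %>%
  select(-in_poss, -regained)

print(as.data.frame(result))
```

One open point in this sketch: if a possession begins with an opponent event, that event gets sequence 1 and the possession team's first touch gets 2. The question does not say what should happen in that case, so this behaviour is an assumption.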
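【Comments】:
- It would be easier to help if you provided a reproducible example that is simpler to copy. Read how to give a reproducible example. How do you know whether a sequence has been broken?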
Tags: r time-series lag dplyr lead