[Posted]: 2018-03-12 02:08:30
[Question]:
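I want a running count of an item in ColumnA based on how many times it has previously appeared in ColumnB. Ideally, the count could also be restricted to a subset defined by ColumnC.
For example, here I want the running total of each winner's previous LOSSES, or of each loser's previous WINS: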
# Create the data frame
library(data.table)
library(dplyr)
year <- c(2017, 2017, 2017, 2017, 2017, 2016, 2016, 2016, 2016, 2016)
winner <- c('sam', 'ryan', 'sally', 'sally', 'ryan', 'sally', 'mike', 'ryan', 'mike', 'sam')
loser <- c('mike', 'mike', 'ryan', 'sam', 'sam', 'mike', 'sally', 'mike', 'ryan', 'sally')
# Keep names as character vectors so winner and loser can be compared directly
df <- data.frame(year, winner, loser, stringsAsFactors = FALSE)
# Successful methods for getting the winner's cumulative wins or the loser's cumulative losses
df <- as.data.table(df)[, winner_wins := seq(.N), by = "winner"][]
df <- as.data.table(df)[, loser_losses := seq(.N), by = "loser"][]
# Successful methods for getting the winner's cumulative wins or the loser's cumulative losses by year
df <- df %>% group_by(year, winner) %>% mutate(winner_wins = row_number())
df <- df %>% group_by(year, loser) %>% mutate(loser_losses = row_number())
# Failed attempt to get the winner's cumulative losses by year
# (winner == loser compares the two columns row by row, so it is FALSE on every row here)
df <- df %>% group_by(year) %>% mutate(winner_losses = cumsum(winner == loser & year == year))
I would like the output to be my original data frame with four new columns: winner_cum_wins, winner_cum_losses, loser_cum_wins, loser_cum_losses.
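To make the desired output concrete, here is a minimal base R sketch of one way the four columns could be computed. It rests on assumptions the question does not confirm: rows are already in the order the games should be counted in, counts are per year, and each count includes the current row (matching the `seq(.N)` behaviour shown earlier). The helper name `count_so_far` is hypothetical, and the row-by-row loop is meant to be easy to read rather than fast.

```r
year <- c(2017, 2017, 2017, 2017, 2017, 2016, 2016, 2016, 2016, 2016)
winner <- c('sam', 'ryan', 'sally', 'sally', 'ryan', 'sally', 'mike', 'ryan', 'mike', 'sam')
loser <- c('mike', 'mike', 'ryan', 'sam', 'sam', 'mike', 'sally', 'mike', 'ryan', 'sally')
df <- data.frame(year, winner, loser, stringsAsFactors = FALSE)

# Hypothetical helper: for each row i, count the rows 1..i in the same year
# where the column `who` contains the player named in `player[i]`.
count_so_far <- function(player, who, year) {
  vapply(seq_along(player), function(i) {
    idx <- seq_len(i)
    sum(who[idx] == player[i] & year[idx] == year[i])
  }, integer(1))
}

df$winner_cum_wins   <- count_so_far(df$winner, df$winner, df$year)
df$winner_cum_losses <- count_so_far(df$winner, df$loser,  df$year)
df$loser_cum_wins    <- count_so_far(df$loser,  df$winner, df$year)
df$loser_cum_losses  <- count_so_far(df$loser,  df$loser,  df$year)

print(df)
```

Under these assumptions, row 5 (ryan beats sam in 2017) gets winner_cum_losses = 1, because ryan lost to sally in row 3 of the same year. To count only strictly earlier games, subtract 1 from winner_cum_wins and loser_cum_losses; the other two columns are unaffected as long as no player faces themselves.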
[Discussion]: