Group together two columns with ID, do the cumulative for two columns: answers

【Question title】: Group together two columns with ID, do the cumulative for two columns
【Posted】: 2017-06-16 14:17:28
【Question description】:

Edit: I wrote the question in an unstructured way, so let me try again.

I want to create two new columns, winner_total_points and loser_total_points, for the dataset below.

winner <- c(1,2,3,4,1,2)
loser <- c(2,3,1,3,3,1)
winner_points <- c(5,4,12,2,1,6)
loser_points <- c(5,2,2,6,6,2)
test_data <- data.frame(winner, loser, winner_points, loser_points)

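What I want these two columns to do: winner_total_points should sum all the points the winner has earned (excluding the current match), both as a winner and as a loser.

loser_total_points does the same, but for the loser.

Note that the winner and loser columns contain the respective player IDs.

This is fairly easy with the ave() function, but that only works when grouping by a single column and taking the cumulative sum of a single column.

Desired output: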

winner loser winner_points loser_points winner_total loser_total
1      2     5             5            5            5
2      3     4             2            9 (5+4)      2
3      1     12            2            14 (2+12)    7 (5+2)
4      3     2             6            2            20 (2+12+6)
1      3     1             6            8 (5+2+1)    26 (2+12+6+6)
2      1     6             2            15 (5+4+6)   10 (5+2+1+2)

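【Question discussion】:

Please provide a reproducible example.
I can provide a sample dataset that can be loaded.
I don't follow the logic.
@Sotos I have rewritten the question now. I hope it makes more sense.
It's still not clear to me.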

Tags: r cumsum

【Solution 1】:

I also had trouble understanding this, but maybe it's this...?

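library(dplyr)

# Points each player earned in the matches they won
as.winner <- test_data %>% group_by(winner) %>% summarise(winner_sum = sum(winner_points))
# Points each player earned in the matches they lost
as.loser <- test_data %>% group_by(loser) %>% summarise(loser_sum = sum(loser_points))
names(as.winner)[1] <- 'player'
names(as.loser)[1] <- 'player'
# Full outer join so that players who only won or only lost are kept
totals <- merge(as.winner, as.loser, by = 'player', all = TRUE)
# A player missing from one side has 0 points in that role
totals[is.na(totals)] <- 0
totals <- transform(totals, total_points = winner_sum + loser_sum)
totals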
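
For comparison, the same overall totals per player can be sketched in base R by stacking the ID and points columns and summing by ID. This is only an illustration, not part of the original answer; the names id, points and total_points are chosen here for the example. Like the dplyr code above, it gives one total per player rather than a running total per match.

```r
# Example data from the question
test_data <- data.frame(
  winner        = c(1, 2, 3, 4, 1, 2),
  loser         = c(2, 3, 1, 3, 3, 1),
  winner_points = c(5, 4, 12, 2, 1, 6),
  loser_points  = c(5, 2, 2, 6, 6, 2)
)

# Stack player IDs and their points, regardless of role
id     <- c(test_data$winner, test_data$loser)
points <- c(test_data$winner_points, test_data$loser_points)

# Overall points per player across all matches
total_points <- tapply(points, id, sum)
total_points
```

For this data the totals for players 1 to 4 are 10, 15, 26 and 2, which are also the last running totals each player reaches in the desired output.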

【Discussion】:

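【Solution 2】:

If I understand the OP's requirement correctly, he wants to compute the cumulative sum of points by player id, regardless of whether they are winner_points or loser_points. The important point here is that the winner and loser columns contain the respective player IDs.

The solution below reshapes the data from wide to long format, with both value variables reshaped simultaneously, computes the cumulative sum for each player id, and finally reshapes from long back to wide format.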

library(data.table)
cols <- c("winner", "loser")
setDT(test_data)[
  # append row id column required for subsequent reshaping
  , rn := .I][
    # reshape multiple value variables simultaneously from wide to long format
    , melt(.SD, id.vars = "rn", 
           measure.vars = list(cols, paste0(cols, "_points")), 
           value.name = c("id", "points"))][
             # rename variable column
             , variable := forcats::lvls_revalue(variable, cols)][
               # order by row id and compute cumulative points by id
               order(rn), total := cumsum(points), by = id][
                 # reshape multiple value variables simultaneously from long to wide format
                 , dcast(.SD, rn ~ variable, value.var = c("id", "points", "total"))]

   rn id_winner id_loser points_winner points_loser total_winner total_loser
1:  1         1        2             5            5            5           5
2:  2         2        3             4            2            9           2
3:  3         3        1            12            2           14           7
4:  4         4        3             2            6            2          20
5:  5         1        3             1            6            8          26
6:  6         2        1             6            2           15          10

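Edit: the data.table result above matches the expected output posted by the OP. It counts the points scored in the actual match. Meanwhile, the OP has posted a similar question whose expected result excludes the actual match.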
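
If the totals should exclude the actual match, one possible adjustment is to subtract each row's own points from the running total. The following is a minimal base R sketch of that idea using ave(), under the assumption that "excluding the match" means "points earned strictly before it". It is not the answer's data.table code, and the column names total_before, winner_total_before and loser_total_before are made up for this illustration.

```r
# Example data from the question
test_data <- data.frame(
  winner        = c(1, 2, 3, 4, 1, 2),
  loser         = c(2, 3, 1, 3, 3, 1),
  winner_points = c(5, 4, 12, 2, 1, 6),
  loser_points  = c(5, 2, 2, 6, 6, 2)
)
n <- nrow(test_data)

# Long format: one row per (match, player), sorted into match order
long <- data.frame(
  rn     = rep(seq_len(n), times = 2),
  role   = rep(c("winner", "loser"), each = n),
  id     = c(test_data$winner, test_data$loser),
  points = c(test_data$winner_points, test_data$loser_points)
)
long <- long[order(long$rn), ]

# Running total per player, including the current match
long$total <- ave(long$points, long$id, FUN = cumsum)
# Subtract the current match to get the points earned before it
long$total_before <- long$total - long$points

# Back to one row per match
is_winner <- long$role == "winner"
result <- cbind(
  test_data,
  winner_total        = long$total[is_winner],
  loser_total         = long$total[!is_winner],
  winner_total_before = long$total_before[is_winner],
  loser_total_before  = long$total_before[!is_winner]
)
result
```

Here winner_total and loser_total reproduce the inclusive totals from the question's desired output, while the two *_before columns start at 0 for each player's first match.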

【Discussion】: