R: Matrix counting matches when 2 teams interacted from schedule with 3 participants per match

【Question title】: R: Matrix counting matches when 2 teams interacted from schedule with 3 participants per match
【Posted】: 2016-03-19 11:49:33
【Question description】:

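I want to do some calculations on FIRST Robotics teams and need to build a binary interaction matrix, for lack of a better term: it records when two teams were on the same alliance. Each alliance has three teams, so each alliance adds 9 values to the matrix once (i,j), (j,i) and (i,i) are all counted: 3 diagonal entries and 6 off-diagonal ones.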

The full data I am using is here: http://frc-events.firstinspires.org/2016/MOKC/qualifications

But for simplicity, here is an example with 9 teams that each play 1 match.

> data.frame(Team.1=1:3,Team.2=4:6,Team.3=7:9)
  Team.1 Team.2 Team.3
1      1      4      7
2      2      5      8
3      3      6      9

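The matrix should count every pairwise interaction, (1,4), (4,7), (3,6), (6,3), (9,9), etc., and will be an N x N matrix, with N=9 in the example above. Here is the matrix that represents the list above: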

> matrix(data=c(1,0,0,1,0,0,1,0,0,
+ 0,1,0,0,1,0,0,1,0,
+ 0,0,1,0,0,1,0,0,1,
+ 1,0,0,1,0,0,1,0,0,
+ 0,1,0,0,1,0,0,1,0,
+ 0,0,1,0,0,1,0,0,1,
+ 1,0,0,1,0,0,1,0,0,
+ 0,1,0,0,1,0,0,1,0,
+ 0,0,1,0,0,1,0,0,1),9,9)
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
 [1,]    1    0    0    1    0    0    1    0    0
 [2,]    0    1    0    0    1    0    0    1    0
 [3,]    0    0    1    0    0    1    0    0    1
 [4,]    1    0    0    1    0    0    1    0    0
 [5,]    0    1    0    0    1    0    0    1    0
 [6,]    0    0    1    0    0    1    0    0    1
 [7,]    1    0    0    1    0    0    1    0    0
 [8,]    0    1    0    0    1    0    0    1    0
 [9,]    0    0    1    0    0    1    0    0    1
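
One way to read the matrix above, as a minimal sketch and not necessarily how it should be built for the real data: if each match is encoded as a 0/1 row marking which teams played in it, the interaction matrix is that incidence matrix multiplied by its own transpose. The names `sched`, `teams`, `inc` and `inter` are made up for this illustration.

```r
# Example schedule from the question (one alliance per row)
sched <- data.frame(Team.1 = 1:3, Team.2 = 4:6, Team.3 = 7:9)
teams <- sort(unique(unlist(sched)))

# inc[m, t] is 1 when team t plays in match m, otherwise 0
inc <- matrix(0, nrow = nrow(sched), ncol = length(teams),
              dimnames = list(NULL, teams))
for (m in seq_len(nrow(sched))) {
  inc[m, match(unlist(sched[m, ]), teams)] <- 1
}

# crossprod(inc) is t(inc) %*% inc:
# entry (i, j) counts the matches in which teams i and j both appear
inter <- crossprod(inc)
print(inter)
```

The diagonal of `inter` then holds the number of matches each team played, which is 1 for every team in this small example.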

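In the real data the team numbers are not consecutive but look more like 5732, 1345, 3451, etc., and each team plays more matches, so the matrix values will range from 0 up to the maximum number of matches any team plays. This can be seen in the real data.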
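
To illustrate the point about non-consecutive team numbers, here is a small sketch that indexes the matrix by team name instead of by position. The schedule is hypothetical: only 5732, 1345 and 3451 come from the question, while team 254 and the pairings are invented for the example.

```r
# Hypothetical two-match schedule with non-consecutive team numbers
sched <- data.frame(Team.1 = c(5732, 1345),
                    Team.2 = c(1345, 3451),
                    Team.3 = c(3451, 254))

# Use the team numbers as row and column names
teams <- as.character(sort(unique(unlist(sched))))
counts <- matrix(0, nrow = length(teams), ncol = length(teams),
                 dimnames = list(teams, teams))

for (m in seq_len(nrow(sched))) {
  ids <- as.character(unlist(sched[m, ]))
  # Character indices select by dimnames, so team numbers need not be 1..N
  counts[ids, ids] <- counts[ids, ids] + 1
}

counts["1345", "3451"]  # 2: these two teams share both matches
```

The matrix stays N x N for N distinct teams, however large the team numbers themselves are.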

Thanks to anyone who can help.

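【Question comments】:

Could you elaborate on what the rows and columns of the matrix mean? I read it as "Team 1 has met Team 4 and Team 7". Is that right?
That's right. In the match schedule, 1, 4 and 7 are together, so one is added to (1,1), (1,4), (4,7), (1,7), (4,1), (7,4) and (7,1) in the matrix, where the notation is (row,column).

Tags: r algorithm matrix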

【Solution 1】:

There may be a more elegant way, but here is an approach using data.table.

library(data.table)
dat <- data.table(Team.1=1:3,Team.2=4:6,Team.3=7:9)

#add match ID
dat[,match:=1:.N]
#turn to long
mdat <- melt(dat,id="match",value.name="team")[,variable:=NULL]

#merge with itself
dat2 <- merge(mdat, mdat, by="match", all=TRUE, allow.cartesian=TRUE)

# reshape
dcast(dat2, team.x~team.y, fun.aggregate=length)

   team.x 1 2 3 4 5 6 7 8 9
1:      1 1 0 0 1 0 0 1 0 0
2:      2 0 1 0 0 1 0 0 1 0
3:      3 0 0 1 0 0 1 0 0 1
4:      4 1 0 0 1 0 0 1 0 0
5:      5 0 1 0 0 1 0 0 1 0
6:      6 0 0 1 0 0 1 0 0 1
7:      7 1 0 0 1 0 0 1 0 0
8:      8 0 1 0 0 1 0 0 1 0
9:      9 0 0 1 0 0 1 0 0 1
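
The same long-format self-join idea can also be sketched with base R only, which may help when data.table is not available. This is an illustrative variant rather than the answerer's code; `sched`, `long`, `pairs`, `tab` and `m` are names chosen for the example.

```r
sched <- data.frame(Team.1 = 1:3, Team.2 = 4:6, Team.3 = 7:9)

# Long format: one row per (match, team).
# unlist() walks the columns in order, so the match ids repeat once per column.
long <- data.frame(match = rep(seq_len(nrow(sched)), times = ncol(sched)),
                   team  = unlist(sched, use.names = FALSE))

# Self-join on match: every ordered pair of teams within a match,
# including each team paired with itself (columns team.x and team.y)
pairs <- merge(long, long, by = "match")

# Cross-tabulate the pairs to get the count matrix
tab <- table(pairs$team.x, pairs$team.y)
m <- unclass(tab)  # plain matrix with team numbers as dimnames
print(m)
```

Because `table()` counts rows, a pair of teams that shares several matches is counted once per shared match.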

And, because I can, here is one in base R. I think this is a case where a for loop is reasonable (since you keep modifying the same object).

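#make matrix to put results in
#(dat gained a match column above, so keep only the team columns)
teamdat <- as.data.frame(dat)[, c("Team.1","Team.2","Team.3")]
teams <- sort(unique(unlist(teamdat)))
nteams <- length(teams)
res <- matrix(0, nrow=nteams, ncol=nteams, dimnames=list(teams, teams))

#split data by row, generate combinations for each row and add to matrix
for(i in seq_len(nrow(teamdat))){
  #map team numbers to row/column positions so non-consecutive numbers work
  x <- match(unlist(teamdat[i,]), teams)
  coords <- as.matrix(expand.grid(x,x))
  res[coords] <- res[coords]+1
}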
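
A couple of sanity checks can be run on a matrix built this way. The sketch below repeats the loop on the four-match data used later in this thread (where teams 1 and 4 meet twice) and checks two properties one would expect: the matrix is symmetric, and its diagonal equals the number of matches each team played. The names `sched` and `played` are introduced only for this illustration.

```r
# Four-match schedule in which teams 1 and 4 are allied twice
sched <- data.frame(Team.1 = c(1:3, 1), Team.2 = c(4:6, 4), Team.3 = c(7:9, 5))
teams <- sort(unique(unlist(sched)))

res <- matrix(0, nrow = length(teams), ncol = length(teams))
for (i in seq_len(nrow(sched))) {
  x <- match(unlist(sched[i, ]), teams)     # positions of this row's teams
  coords <- as.matrix(expand.grid(x, x))    # all (row, column) pairs
  res[coords] <- res[coords] + 1
}

# Check 1: interactions are mutual, so the matrix should be symmetric
isSymmetric(res)

# Check 2: the diagonal should equal the number of matches per team
played <- as.vector(table(factor(unlist(sched), levels = teams)))
all(diag(res) == played)
```

Both checks should return TRUE here; a FALSE on real data would suggest a team appearing twice in one row or a mismatch between team numbers and matrix positions.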

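【Discussion】:

【Solution 2】:

Here is my suggestion using base functions. I tried to create a matrix; my approach was to find the position indexes of the 1s.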

library(magrittr)

mydf <- data.frame(Team.1 = 1:3, Team.2 = 4:6,Team.3 = 7:9)

### Create a matrix with position indexes

lapply(1:nrow(mydf), function(x){

       a <- t(combn(mydf[x, ], 2)) # Get some combination
       b <- a[, 2:1] # Get other combination by reversing columns
       foo <- rbind(a, b)
       foo

     }) %>%
do.call(rbind, .) -> ana

ana <- matrix(unlist(ana), nrow = nrow(ana))


### Another set: Get indexes for self (e.g., (1,1), (2,2), (3,3))

foo <- rep(1:max(mydf), times = 2)
matrix(foo, nrow = length(foo) / 2) -> bob


### A matrix with all position indexes
cammy <- rbind(ana, bob)


### Create a plain matrix
mat <- matrix(0, nrow = max(mydf), ncol = max(mydf))

### Fill in the matrix with 1
mat[cammy] <- 1

#     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
# [1,]    1    0    0    1    0    0    1    0    0
# [2,]    0    1    0    0    1    0    0    1    0
# [3,]    0    0    1    0    0    1    0    0    1
# [4,]    1    0    0    1    0    0    1    0    0
# [5,]    0    1    0    0    1    0    0    1    0
# [6,]    0    0    1    0    0    1    0    0    1
# [7,]    1    0    0    1    0    0    1    0    0
# [8,]    0    1    0    0    1    0    0    1    0
# [9,]    0    0    1    0    0    1    0    0    1
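
The assignment `mat[cammy] <- 1` relies on R's matrix indexing: a two-column matrix used as an index is read as a list of (row, column) pairs. The small sketch below, with made-up values, shows that behaviour and why it marks cells but cannot count repeated pairs on its own.

```r
mat <- matrix(0, nrow = 3, ncol = 3)

# Each row of idx is one (row, column) position; (1,2) is listed twice
idx <- rbind(c(1, 2), c(1, 2), c(3, 3))

# The right-hand side is evaluated once, so a duplicated position
# is still only incremented by 1
mat[idx] <- mat[idx] + 1
mat[1, 2]  # 1, not 2

# To count repeats, tabulate the pairs first
cnt <- table(factor(idx[, 1], levels = 1:3),
             factor(idx[, 2], levels = 1:3))
cnt[1, 2]  # 2
```

This is why a version that only assigns 1 works for the single-match example but needs an explicit counting step once teams meet more than once.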

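Edit

Here is a revised version based on the previous idea. It is not as concise as Heroka's base-function approach. In my modified data, teams 1 and 4 share two matches. The idea is to count how many times each pair appears in the data set; the dplyr part does that. In the for loop, I fill the matrix mat by iterating over each row of cammy.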

mydf <- data.frame(Team.1=c(1:3,1),Team.2=c(4:6,4),Team.3=c(7:9,5))


#  Team.1 Team.2 Team.3
#1      1      4      7
#2      2      5      8
#3      3      6      9
#4      1      4      5

library(dplyr)

lapply(1:nrow(mydf), function(x){

       a <- t(combn(mydf[x, ], 2)) # Get some combination
       b <- a[, 2:1] # Get other combination by reversing columns
       foo <- rbind(a, b)
       foo

     }) %>%
do.call(rbind, .) -> ana

ana <- data.frame(matrix(unlist(ana), nrow = nrow(ana)))


### Another set: Get indexes for self (e.g., (1,1), (2,2), (3,3))
foo <- rep(1:max(mydf), times = 2)
data.frame(matrix(foo, nrow = length(foo) / 2)) -> bob


cammy <- bind_rows(ana, bob) %>%
         group_by(X1, X2) %>%
         mutate(total = n()) %>%
         as.matrix


### Create a plain matrix
mat <- matrix(0, nrow = max(mydf), ncol = max(mydf))



for(i in 1:nrow(cammy)){

    mat[cammy[i, 1], cammy[i, 2]] <- cammy[i, 3]
}

print(mat)

#      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
# [1,]    1    0    0    2    1    0    1    0    0
# [2,]    0    1    0    0    1    0    0    1    0
# [3,]    0    0    1    0    0    1    0    0    1
# [4,]    2    0    0    1    1    0    1    0    0
# [5,]    1    1    0    1    1    0    0    1    0
# [6,]    0    0    1    0    0    1    0    0    1
# [7,]    1    0    0    1    0    0    1    0    0
# [8,]    0    1    0    0    1    0    0    1    0
# [9,]    0    0    1    0    0    1    0    0    1

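【Discussion】:

Nice. But how would this work if a team appears more than once? (As in the OP's description.)
@Heroka Thanks for your comment. I need to think about that.
@Heroka I did what I could. My idea is not as concise as yours, but I tried to build on my original idea and add more to it to solve the problem. Now it is time for bed. If you can give me any suggestions to improve my idea, that would help me.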