【Posted】:2018-09-08 16:33:36
【Question】:
DT:
HomeTeam      AwayTeam        Season     Htpoints  Atpoints
Mattersburg   Salzburg        2015/2016  3         0
Salzburg      Rapid Vienna    2015/2016  0         3
Admira        Mattersburg     2015/2016  3         0
Admira        Salzburg        2015/2016  1         1
Mattersburg   Ried            2015/2016  3         0
Ried          Salzburg        2015/2016  0         3
Altach        Mattersburg     2015/2016  3         0
Austria Vie   Mattersburg     2015/2016  3         0
Salzburg      Altach          2015/2016  3         0
Mattersburg   AC Wolfsberger  2015/2016  3         0
Salzburg      Austria Vienna  2015/2016  1         1
Rapid Vienna  Mattersburg     2015/2016  0         3
Sturm Graz    Salzburg        2015/2016  0         3
Salzburg      Grodig          2015/2016  3         0
Computing a team's average points over its last 3 home matches:
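library(zoo)

# Mean of the previous n values (current value excluded); NaN until n earlier values exist
roll <- function(x, n) {
  if (length(x) <= n) NaN
  else rollapply(x, list(-seq(n)), mean, fill = NaN)
}

# Apply the rolling mean within each Season / HomeTeam group
transform(DT, last3.HT.av.points = ave(Htpoints, Season, HomeTeam, FUN = function(x) roll(x, 3)))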
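To make the offset trick in `roll` easier to follow, here is a small sketch on a made-up points vector (the vector `pts` and the name `res` are illustrative, not from my data). Passing `list(-seq(n))` as the width asks `rollapply` to use the offsets -1, -2, ..., -n, so each result is the mean of the n values before the current one.

```r
library(zoo)

# Same helper as above: mean of the previous n values, current value excluded
roll <- function(x, n) {
  if (length(x) <= n) NaN
  else rollapply(x, list(-seq(n)), mean, fill = NaN)
}

pts <- c(3, 0, 3, 1, 3)   # hypothetical points from five consecutive matches
res <- roll(pts, 3)

# Positions 1-3 have fewer than 3 earlier matches, so they are NaN.
# Position 4 is mean(3, 0, 3) = 2; position 5 is mean(0, 3, 1) = 4/3.
print(res)

# With 3 or fewer values the helper returns a single NaN
print(roll(c(3, 0), 3))
```

Inside `ave`, that single NaN is recycled over the whole group, which is why teams with too few home matches get NaN in every row.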
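None of the above is a problem. On the other hand...
Is it possible to compute the average points over a team's last 3 matches, regardless of whether the team played at home or away?
Desired output (values shown for Salzburg only):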
HomeTeam      AwayTeam        Season     Htpoints  Atpoints  HT.av.last3  AT.av.last3
Mattersburg   Salzburg        2015/2016  3         0                      NA
Salzburg      Rapid Vienna    2015/2016  0         3         NA
Admira        Mattersburg     2015/2016  3         0
Admira        Salzburg        2015/2016  1         1                      NA
Mattersburg   Ried            2015/2016  3         0
Ried          Salzburg        2015/2016  0         3                      0.33
Altach        Mattersburg     2015/2016  3         0
Austria Vie   Mattersburg     2015/2016  3         0
Salzburg      Altach          2015/2016  3         0         1.33
Mattersburg   AC Wolfsberger  2015/2016  3         0
Salzburg      Austria Vienna  2015/2016  1         1         2.33
Rapid Vienna  Mattersburg     2015/2016  0         3
Sturm Graz    Salzburg        2015/2016  0         3                      2.33
Salzburg      Grodig          2015/2016  3         0         2.33
Preference: data.table
Reproducible dataset (different from the one above):
library(data.table)

DT <- fread("HomeTeam,AwayTeam,Season,Htpoints,Atpoints
Grodig,Salzburg,2015/2016,0,3
Rapid Vienna,Altach,2015/2016,1,1
Ried,Austria Vienna,2015/2016,3,0
Sturm Graz,Mattersburg,2015/2016,3,0
Admira,Rapid Vienna,2015/2016,1,1
Altach,Ried,2015/2016,0,3
Austria Vienna,Sturm Graz,2015/2016,1,1
Mattersburg,Grodig,2015/2016,3,0
Salzburg,AC Wolfsberger,2015/2016,3,0")
numTeams <- DT[, uniqueN(c(HomeTeam, AwayTeam))]

# Odd matchdays: keep the home teams and rotate the away column by n places
firstHalf <- lapply(seq_len(DT[, .N]),
  function(n) data.table(
    Matchday = n * 2L - 1L,
    HomeTeam = DT[["HomeTeam"]],
    AwayTeam = c(DT[["AwayTeam"]][-seq_len(n)], DT[["AwayTeam"]][seq_len(n)]),
    Season   = DT[["Season"]],
    Htpoints = DT[["Htpoints"]],
    Atpoints = DT[["Atpoints"]]
  ))

# Even matchdays: the original away teams host and the home column is rotated
secondHalf <- lapply(seq_len(DT[, .N]),
  function(n) data.table(
    Matchday = n * 2L,
    HomeTeam = DT[["AwayTeam"]],
    AwayTeam = c(DT[["HomeTeam"]][-seq_len(n)], DT[["HomeTeam"]][seq_len(n)]),
    Season   = DT[["Season"]],
    Htpoints = DT[["Htpoints"]],
    Atpoints = DT[["Atpoints"]]
  ))

# Drop self-pairings, keep the first occurrence of each fixture, sort by matchday
DT <- rbindlist(c(firstHalf, secondHalf))[
  HomeTeam != AwayTeam][,
  .SD[1L], by = .(HomeTeam, AwayTeam)]
setorder(DT, Matchday, HomeTeam)
DT <- DT[, -c("Matchday")]
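For illustration, here is a rough data.table sketch of the kind of result I am after, not a definitive solution. It uses the 14 matches from the first table so the numbers can be compared with the desired output, and it assumes the rows are already in chronological order. The names `match_id`, `long`, `n_last` and `av.last3` are made up for this sketch. The idea is to stack home and away rows into one row per team per match, lag the points with `shift`, average the lags, and join the result back twice.

```r
library(data.table)

# First table from the question; rows are ASSUMED to be in chronological order
DT <- fread("HomeTeam,AwayTeam,Season,Htpoints,Atpoints
Mattersburg,Salzburg,2015/2016,3,0
Salzburg,Rapid Vienna,2015/2016,0,3
Admira,Mattersburg,2015/2016,3,0
Admira,Salzburg,2015/2016,1,1
Mattersburg,Ried,2015/2016,3,0
Ried,Salzburg,2015/2016,0,3
Altach,Mattersburg,2015/2016,3,0
Austria Vie,Mattersburg,2015/2016,3,0
Salzburg,Altach,2015/2016,3,0
Mattersburg,AC Wolfsberger,2015/2016,3,0
Salzburg,Austria Vienna,2015/2016,1,1
Rapid Vienna,Mattersburg,2015/2016,0,3
Sturm Graz,Salzburg,2015/2016,0,3
Salzburg,Grodig,2015/2016,3,0")

n_last <- 3L            # window size (hypothetical name)
DT[, match_id := .I]    # helper key that records the match order (hypothetical name)

# One row per team per match, whether the team played at home or away
long <- rbind(
  DT[, .(match_id, Season, Team = HomeTeam, points = Htpoints)],
  DT[, .(match_id, Season, Team = AwayTeam, points = Atpoints)]
)
setorder(long, match_id)

# Sum the 1st..n_last lags of points and divide by n_last;
# the result is NA until a team has n_last earlier matches in the season
long[, av.last3 := Reduce(`+`, shift(points, seq_len(n_last))) / n_last,
     by = .(Season, Team)]

# Update joins: attach each team's value to the match row, once per side
DT[long, on = .(match_id, HomeTeam = Team), HT.av.last3 := i.av.last3]
DT[long, on = .(match_id, AwayTeam = Team), AT.av.last3 := i.av.last3]
DT[, match_id := NULL]

print(DT)
```

For Salzburg this gives NA for the first three matches and then 0.33, 1.33, 2.33, 2.33, 2.33 (rounded), in the home or away column depending on where the team played. Unlike the desired output above, the sketch fills in the values for every other team as well. Grouping by `Season` means the window restarts each season; dropping `Season` from `by` would let it run across seasons instead.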
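【Comments】:
-
Can you add a reproducible dataset?
-
@Salman Added. It differs from the one behind the desired output, but it is fine for testing.
-
Thanks, but the matches are all in the same season, so "3 recent matches" doesn't make sense. Do you agree?
-
@Salman Why not? I want this information to gauge each team's form. The example above has only one season. Later I will have to group by season on the real dataset.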
Tags: r