【Posted】: 2020-02-21 12:19:48
【Question】:
Here is the dput output for my dataset df1:
structure(list(Name = c("A.J. Ellis", "A.J. Ellis", "A.J. Pierzynski",
"A.J. Pierzynski", "Aaron Boone", "Adam Kennedy", "Adam Melhuse",
"Adrian Beltre", "Adrian Beltre", "Adrian Gonzalez", "Alan Zinter",
"Albert Pujols", "Albert Pujols"), Age = c(37, 36, 37, 36, 36,
36, 36, 37, 36, 36, 36, 37, 36), Year = c(2018, 2017, 2014, 2013,
2009, 2012, 2008, 2016, 2015, 2018, 2004, 2017, 2016), Tm = c("SDP",
"MIA", "TOT", "TEX", "HOU", "LAD", "TOT", "TEX", "TEX", "NYM",
"ARI", "LAA", "LAA"), Lg = c("NL", "NL", "ML", "AL", "NL", "NL",
"ML", "AL", "AL", "NL", "NL", "AL", "AL"), G = c(66, 51, 102,
134, 10, 86, 15, 153, 143, 54, 28, 149, 152), PA = c(183, 163,
362, 529, 14, 201, 32, 640, 619, 187, 40, 636, 650)), row.names = c(NA,
13L), class = "data.frame")
Here is the code from my previous question, which correctly selects the matched pairs:
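library(dplyr)

# Sort by player and age, then keep players whose G at the older age is lower than at the younger age
df1 %>%
  arrange(Name, Age) %>%
  group_by(Name) %>%
  filter(last(G) < first(G))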
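Each matched pair has two observations, one at age 36 and one at age 37. Each observation also has a column named G and a Year column.
Here is what the data looks like after grouping with the code above: https://www.dropbox.com/s/hh2qgkbn4cy4k4l/Data%20after%20grouping.png?dl=0
Now, for each matched pair, I want the difference in the G column between the age-36 value and the age-37 value: (value at age 36) - (value at age 37). Negative results are fine.
In addition, I want the sum of these differences across all matched pairs in the dataset.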
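The per-pair difference and the overall total described above could be computed along the following lines. This is only a minimal sketch, assuming dplyr is available and that every matched player has exactly one row at age 36 and one at age 37. The names `matched`, `pair_diffs`, `G_diff` and `total_diff` are illustrative, and `df1` is rebuilt here with only the Name, Age and G columns so that the block runs on its own.

```r
library(dplyr)

# Same Name, Age and G values as df1 above (other columns omitted for brevity)
df1 <- data.frame(
  Name = c("A.J. Ellis", "A.J. Ellis", "A.J. Pierzynski", "A.J. Pierzynski",
           "Aaron Boone", "Adam Kennedy", "Adam Melhuse", "Adrian Beltre",
           "Adrian Beltre", "Adrian Gonzalez", "Alan Zinter",
           "Albert Pujols", "Albert Pujols"),
  Age = c(37, 36, 37, 36, 36, 36, 36, 37, 36, 36, 36, 37, 36),
  G = c(66, 51, 102, 134, 10, 86, 15, 153, 143, 54, 28, 149, 152)
)

# Keep the matched pairs with the same filter as in the question
matched <- df1 %>%
  arrange(Name, Age) %>%
  group_by(Name) %>%
  filter(last(G) < first(G))

# One row per player: G at age 36 minus G at age 37
# (assumes exactly one row per age within each kept player)
pair_diffs <- matched %>%
  summarise(G_diff = G[Age == 36] - G[Age == 37])

# Sum of the differences over all matched pairs
total_diff <- sum(pair_diffs$G_diff)
```

With the sample data, `pair_diffs` has one row each for A.J. Pierzynski (134 - 102 = 32) and Albert Pujols (152 - 149 = 3), so `total_diff` is 35. Indexing by `Age` rather than by position keeps the sign of the difference explicit even if the row order changes.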
【Comments】: