Answers: find max of factor and index that max in r

【Question title】: find max of factor and index that max in r
【Posted】: 2015-08-21 06:33:14
【Question】:

This should be really simple, but I can't figure it out. I want to get the maximum value for each group, which I do as follows.

ddply(dd,~group,summarise,max=max(value))

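But besides returning the value and the group, I'd like to return the value, the group, and another column, date, indexed as below (which obviously doesn't work). How can I do this? Thanks.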

ddply(dd,~group,summarise,max=max(value))['date']

【Comments】:

Tags: r dataframe plyr

【Solution 1】:

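If you are after the date corresponding to the row with the maximum value, try subset to get the rows at the maximum, together with select to get the columns you are after.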

# reproducible example using `iris`
library(plyr)

# your original
ddply(iris, ~Species, summarise, max=max(Sepal.Length))
#      Species max
# 1     setosa 5.8
# 2 versicolor 7.0
# 3  virginica 7.9


# now we want to get the Sepal.Width that corresponds to max sepal.length too.
ddply(iris, ~Species, subset, Sepal.Length==max(Sepal.Length),
      select=c('Species', 'Sepal.Length', 'Sepal.Width'))
#      Species Sepal.Length Sepal.Width
# 1     setosa          5.8         4.0
# 2 versicolor          7.0         3.2
# 3  virginica          7.9         3.8

(Or skip select in the subset call and use [, c('columns', 'I', 'want')] after the ddply call instead.) If several rows of the same species reach the maximum, this returns all of them.
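The alternative in the parenthetical, picking the columns after ddply rather than inside subset, could be sketched as follows. The variable name res is only illustrative.

```r
library(plyr)

# Keep every row whose Sepal.Length equals its species maximum.
res <- ddply(iris, ~Species, subset, Sepal.Length == max(Sepal.Length))

# Pick the columns afterwards with ordinary data.frame indexing.
res <- res[, c('Species', 'Sepal.Length', 'Sepal.Width')]
res
```

Both forms should give the same table. Selecting afterwards just keeps the ddply call a little shorter.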

You can also do this with summarise by adding your date definition to the call, though it is slightly less efficient (the maximum is computed twice):

ddply(iris, ~Species, summarise,
      max=max(Sepal.Length),
      width=Sepal.Width[which.max(Sepal.Length)])

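This returns only one row per species. If several flowers share the maximum sepal length for their species, only the first is returned (which.max returns the first of the matching indices).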
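To make the difference in tie handling concrete, here is a small sketch on made-up data. The data frame dd and its columns group, value and date are hypothetical stand-ins for the asker's data, and the dates are stored as plain strings to keep the example simple.

```r
library(plyr)

# Hypothetical toy data: group "a" reaches its maximum (5) on two dates.
dd <- data.frame(
  group = c('a', 'a', 'a', 'b', 'b'),
  value = c(5, 3, 5, 2, 4),
  date  = c('2015-01-01', '2015-01-02', '2015-01-03',
            '2015-01-04', '2015-01-05'),
  stringsAsFactors = FALSE
)

# subset keeps every row at the maximum, so group "a" appears twice.
all_ties <- ddply(dd, ~group, subset, value == max(value))

# summarise with which.max keeps only the first maximal row per group.
first_only <- ddply(dd, ~group, summarise,
                    max  = max(value),
                    date = date[which.max(value)])
```

Which behaviour is preferable depends on whether tied rows matter for the analysis.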

【Comments】:

Works perfectly. Thanks for offering several solutions. For future reference, this is what I ended up going with: ddply(dd, ~group, subset, value==max(value), select=c('date2', 'value'))

【Solution 2】:

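Using data.table (with the iris dataset): convert the data.frame to a data.table, group by the grouping variable ('Species'), get the index of the max value of one variable ('Sepal.Length'), and use it to subset the columns specified in .SDcols.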

library(data.table)
dt <- as.data.table(iris)
dt[, .SD[which.max(Sepal.Length)]  , by = Species, 
                 .SDcols= c('Sepal.Length', 'Sepal.Width')]
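Like the summarise version, which.max here keeps only the first row at the maximum in each group. If tied rows should be kept, one possible variation (a sketch, not part of the original answer) is to compare against max() instead. The names first_only and all_ties are illustrative.

```r
library(data.table)

dt <- as.data.table(iris)

# One row per species: which.max returns the first index at the maximum.
first_only <- dt[, .SD[which.max(Sepal.Length)], by = Species,
                 .SDcols = c('Sepal.Length', 'Sepal.Width')]

# Every row at the maximum: a logical comparison keeps ties as well.
all_ties <- dt[, .SD[Sepal.Length == max(Sepal.Length)], by = Species,
               .SDcols = c('Sepal.Length', 'Sepal.Width')]
```

On iris the two results contain the same three rows, because no species has a tied maximum Sepal.Length. They would differ on data with ties.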

【Comments】: