Using n() with summarise_all

【Question title】: Using n() with summarise_all
【Posted】: 2019-09-14 10:06:03
【Question description】:

This works:

library(dplyr)
stats = c('mean', 'median', 'sd', 'max', 'min')
sumtable = iris %>% select(-Species) %>% summarise_all(.funs = stats)

This does not work:

stats = c('mean', 'median', 'sd', 'max', 'min', 'n')
sumtable = iris %>% select(-Species) %>% summarise_all(.funs = stats)
Error in summarise_impl(.data, dots) : `n()` does not take arguments

Please advise.

【Question comments】:

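You can use length instead of n.
Note that for each function named in the vector, summarize_all calls that function with the data as its first argument. So while mean(x) (or, more precisely, mean(c(1,2,5,22,...))) makes sense, n(...) does not, because n() takes no arguments. You can always define my_n <- function(...) n() (explicitly accepting and ignoring all arguments) and then set stats <- c(..., "my_n"), but you can also use length, as akrun suggested.
What is the point of using n() with summarize_all()? n() returns the number of rows, which I assume is the same for every column.
Got it. Thanks.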
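
The length suggestion from the comments can be sketched as follows. This is a minimal example that reuses the iris pipeline from the question and assumes a dplyr version in which summarise_all still accepts a character vector of function names. The only change is that 'length' replaces 'n'.

```r
library(dplyr)

# Same statistics as in the question, with 'length' standing in for 'n'.
# Unlike n(), length() accepts the column as its argument.
stats <- c("mean", "median", "sd", "max", "min", "length")

sumtable <- iris %>%
  select(-Species) %>%
  summarise_all(.funs = stats)

# The result has one row; each column is named <column>_<function>.
length_cols <- grep("_length$", names(sumtable), value = TRUE)
print(sumtable[length_cols])
```

Because length() counts every element, each of these columns should equal the number of rows in the data, including any rows that hold NA.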

Tags: r dplyr tidyverse summarize

【Solution 1】:

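I wanted this because I need to count the non-missing observations. As Rohit pointed out, length counts all rows, including those with missing observations. So this is what I ended up doing: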

not.na = function(x) {sum(!is.na(x))}
stats = c('mean', 'median', 'sd', 'max', 'min', 'not.na')
sum.acs = acs %>% group_by(year) %>% summarise_all(.funs = stats)
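
Since the acs data frame is not shown, the same idea can be tried on a built-in dataset. The sketch below uses airquality, which contains NA values in Ozone and Solar.R, in place of acs, with Month standing in for year. Both substitutions are assumptions made for illustration only, as is the name sum.aq. It places not.na next to length so the difference is visible.

```r
library(dplyr)

# Count the non-missing values in a vector
not.na <- function(x) sum(!is.na(x))

# 'length' counts every row in the group; 'not.na' skips missing values
stats <- c("not.na", "length")

# airquality and Month are stand-ins for the author's acs and year
sum.aq <- airquality %>%
  select(Month, Ozone, Solar.R) %>%
  group_by(Month) %>%
  summarise_all(.funs = stats)

print(sum.aq)
```

In months with missing Ozone readings, Ozone_not.na should be smaller than Ozone_length. Note that mean, sd, and similar functions return NA for a group containing missing values unless they are wrapped to pass na.rm = TRUE.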

【Comments】: