How can I pass a column name as a function argument using dplyr and ggplot2? (Answers)

【Question title】: How can I pass a column name as a function argument using dplyr and ggplot2?
【Posted】: 2017-07-26 17:07:52
【Question】:

I am trying to write a function that produces model diagnostic plots.

to_plot <- function(df, model, response_variable, indep_variable) {
  resp_plot <- 
    df %>%
    mutate(model_resp = predict.glm(model, df, type = 'response')) %>%
    group_by(indep_variable) %>%
    summarize(actual_response = mean(response_variable),
              predicted_response = mean(model_resp)) %>%
    ggplot(aes(indep_variable)) + 
    geom_line(aes(x = indep_variable, y = actual_response, colour = "actual")) + 
    geom_line(aes(x = indep_variable, y = predicted_response, colour = "predicted")) +
    ylab(label = 'Response')

}

When I run it on a dataset, dplyr throws an error that I don't understand:

fit <- glm(data = mtcars, mpg ~ wt + qsec + am, family = gaussian(link = 'identity'))
to_plot(mtcars, fit, mpg, wt)

 Error in grouped_df_impl(data, unname(vars), drop) : 
  Column `indep_variable` is unknown

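Based on some rough debugging, I found that the error occurs in the group_by step, so it is probably related to how I refer to the columns inside the function. Thanks!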

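【Comments】:

You need an extra layer of indirection to handle standard evaluation, i.e. to use the column that indep_variable stands for rather than looking for a column literally named indep_variable.
This happens because dplyr uses non-standard evaluation (NSE). Hadley explains NSE here: dplyr.tidyverse.org/articles/programming.html and there is also a very good webinar: rstudio.com/resources/webinars/whats-new-in-dplyr-0-7-0
Thanks. Based on your replies I have added a proposed answer below, but I would appreciate feedback to make it cleaner.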
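
To make the comments above concrete, here is a minimal sketch that uses dplyr only. The helper names mean_by_broken and mean_by_quoted are hypothetical and exist just for this illustration; the second one assumes dplyr 0.7.0 or later, where enquo() and !! are available.

```r
library(dplyr)

# Hypothetical helper: group_col is used as-is inside group_by(), so the bare
# name supplied by the caller is never resolved to a column of df
mean_by_broken <- function(df, group_col, value_col) {
  df %>%
    group_by(group_col) %>%
    summarize(avg = mean(value_col))
}

# Hypothetical helper: capture the arguments with enquo(), then unquote with !!
mean_by_quoted <- function(df, group_col, value_col) {
  group_col <- enquo(group_col)
  value_col <- enquo(value_col)
  df %>%
    group_by(!!group_col) %>%
    summarize(avg = mean(!!value_col))
}

# The first version stops with an error; try() keeps the script running
broken <- try(mean_by_broken(mtcars, cyl, mpg), silent = TRUE)
inherits(broken, "try-error")

# The second version groups by cyl and averages mpg within each group
mean_by_quoted(mtcars, cyl, mpg)
```

The exact wording of the error from the first helper depends on the dplyr version, so it is not reproduced here.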

Tags: r ggplot2 dplyr

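【Solution 1】:

This code seems to fix it. As the commenters above mentioned, the variables passed into the function must be captured with enquo() and then unquoted with !!. Note that aes() is replaced by aes_(), which accepts quoted expressions (here the captured quosure and quote() calls) instead of bare column names.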

library(tidyverse)

to_plot <- function(df, model, response_variable, indep_variable) {
  # Capture the bare column names as quosures so they can be unquoted with !!
  response_variable <- enquo(response_variable)
  indep_variable <- enquo(indep_variable)

  resp_plot <-
    df %>%
    # Add the model's predictions on the response scale
    mutate(model_resp = predict.glm(model, df, type = 'response')) %>%
    group_by(!!indep_variable) %>%
    summarize(actual_response = mean(!!response_variable),
              predicted_response = mean(model_resp)) %>%
    # aes_() takes quoted expressions: the quosure, or quote() for fixed column names
    ggplot(aes_(indep_variable)) +
    geom_line(aes_(x = indep_variable, y = quote(actual_response)), colour = "blue") +
    geom_line(aes_(x = indep_variable, y = quote(predicted_response)), colour = "red") +
    ylab(label = 'Response')

  return(resp_plot)
}

fit <- glm(data = mtcars, mpg ~ wt + qsec + am, family = gaussian(link = 'identity'))
to_plot(mtcars, fit, mpg, wt)
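
If the column names arrive as character strings instead of bare names, one possible variant is sketched below. The function name to_plot_chr is hypothetical, and the sketch assumes dplyr 1.0.0 or later (for across() and all_of()) and ggplot2 3.0.0 or later (for the .data pronoun inside aes()). It maps colour inside aes(), as in the question, so that a legend is drawn.

```r
library(dplyr)
library(ggplot2)

# Hypothetical variant: response_variable and indep_variable are strings
to_plot_chr <- function(df, model, response_variable, indep_variable) {
  df %>%
    mutate(model_resp = predict(model, newdata = df, type = "response")) %>%
    # all_of() selects the column whose name is stored in the string
    group_by(across(all_of(indep_variable))) %>%
    summarize(actual_response = mean(.data[[response_variable]]),
              predicted_response = mean(model_resp)) %>%
    # .data[[...]] looks the column up by name in the plot data
    ggplot(aes(x = .data[[indep_variable]])) +
    geom_line(aes(y = actual_response, colour = "actual")) +
    geom_line(aes(y = predicted_response, colour = "predicted")) +
    xlab(indep_variable) +
    ylab(label = "Response")
}

fit_chr <- glm(mpg ~ wt + qsec + am, data = mtcars,
               family = gaussian(link = "identity"))
p_chr <- to_plot_chr(mtcars, fit_chr, "mpg", "wt")
print(p_chr)
```

The explicit xlab() is there because the default axis label would otherwise be the unevaluated .data[[indep_variable]] expression.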

【Comments】:

This works, but it is not very elegant. Feel free to edit it with improvements; I don't think I fully understand this yet.
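
One possibly tidier version, offered as a sketch rather than as the definitive fix: newer releases let you wrap the argument in {{ }} (the "embrace" operator), which combines enquo() and !! in one step, and aes() itself understands it, so aes_() and quote() are no longer needed. This assumes rlang 0.4.0 or later and ggplot2 3.0.0 or later; the name to_plot_embraced is hypothetical.

```r
library(dplyr)
library(ggplot2)

# Hypothetical rewrite of to_plot() using {{ }} instead of enquo() and !!
to_plot_embraced <- function(df, model, response_variable, indep_variable) {
  df %>%
    mutate(model_resp = predict(model, newdata = df, type = "response")) %>%
    group_by({{ indep_variable }}) %>%
    summarize(actual_response = mean({{ response_variable }}),
              predicted_response = mean(model_resp)) %>%
    ggplot(aes(x = {{ indep_variable }})) +
    # Mapping colour inside aes() produces a legend, as in the original question
    geom_line(aes(y = actual_response, colour = "actual")) +
    geom_line(aes(y = predicted_response, colour = "predicted")) +
    ylab(label = "Response")
}

fit_emb <- glm(mpg ~ wt + qsec + am, data = mtcars,
               family = gaussian(link = "identity"))
p_emb <- to_plot_embraced(mtcars, fit_emb, mpg, wt)
print(p_emb)
```

The call site stays the same as in the question: bare column names, no quoting by the caller.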