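If/else condition in a dplyr 0.7 function: answers

【Question title】: If/else condition in dplyr 0.7 function
【Posted】: 2018-12-09 02:42:02
【Question】:

I want to create a simple if/else condition inside a dplyr function. I have looked at some helpful posts (e.g., How to parametrize function calls in dplyr 0.7?), but I am still running into trouble.

Below is a toy example. The function works when I call it without a grouping variable, but fails as soon as I supply one.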

library(dplyr)

# example dataset
test <- tibble(
  A = c(1:5,1:5),
  B = c(1,2,1,2,3,3,3,3,3,3),
  C = c(1,1,1,1,2,3,4,5,4,3)
)

# begin function, set default for group var to NULL.
prop_tab <- function(df, column, group = NULL) {

  col_name <- enquo(column)
  group_name <- enquo(group)

  # if group_by var is NOT null, then...
  if(!is.null(group)) {
      temp <- df %>%
        select(!!col_name, !!group_name) %>% 
        group_by(!!group_name) %>% 
        summarise(Percentages = 100 * length(!!col_name) / nrow(df))

  } else {
  # if group_by var is null, then...
      temp <- df %>%
        select(!!col_name) %>% 
        group_by(col_name = !!col_name) %>% 
        summarise(Percentages = 100 * length(!!col_name) / nrow(df)) 

  }

  temp
}

test %>% prop_tab(column = C)  # works

test %>% prop_tab(column = A, group = B)  # fails
# Error in prop_tab(., column = A, group = B) : object 'B' not found

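【Question comments】:

Tags: r function if-statement dplyr

【Solution 1】:

The problem is that is.null(group) evaluates its argument. Because you passed the bare name B, R tries to look up an object called B, which does not exist in that scope, so the call errors before the check can return anything. You can instead use missing() to test whether the argument was supplied to the function at all, as in the code below. There may be a cleaner way, but this at least works, as the output at the bottom shows.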

library(tidyverse)
test <- tibble(
  A = c(1:5,1:5),
  B = c(1,2,1,2,3,3,3,3,3,3),
  C = c(1,1,1,1,2,3,4,5,4,3)
)

# begin function; group has no default, missing() detects when it is omitted.
prop_tab <- function(df, column, group) {

  col_name <- enquo(column)
  group_name <- enquo(group)

  # if a grouping variable was supplied, then...
  if(!missing(group)) {
    temp <- df %>%
      select(!!col_name, !!group_name) %>%
      group_by(!!group_name) %>%
      summarise(Percentages = 100 * length(!!col_name) / nrow(df))

  } else {
    # if no grouping variable was supplied, then...
    temp <- df %>%
      select(!!col_name) %>% 
      group_by(col_name = !!col_name) %>% 
      summarise(Percentages = 100 * length(!!col_name) / nrow(df)) 

  }

  temp
}

test %>% prop_tab(column = C)  # works
#> # A tibble: 5 x 2
#>   col_name Percentages
#>      <dbl>       <dbl>
#> 1        1          40
#> 2        2          10
#> 3        3          20
#> 4        4          20
#> 5        5          10

test %>% prop_tab(column = A, group = B)
#> # A tibble: 3 x 2
#>       B Percentages
#>   <dbl>       <dbl>
#> 1     1          20
#> 2     2          20
#> 3     3          60

Created on 2018-06-29 by the reprex package (v0.2.0).
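The difference between the two checks can be seen without dplyr at all. The following is a minimal base-R sketch, not part of the original answer; f_null, f_missing and not_a_real_object are hypothetical names chosen for illustration, and the sketch assumes no object called not_a_real_object exists in the calling environment.

```r
# Base R only. f_null and f_missing are hypothetical helpers for illustration.
f_null <- function(group = NULL) {
  # is.null() needs the value of `group`, so the argument is evaluated here
  is.null(group)
}

f_missing <- function(group) {
  # missing() only asks whether the caller supplied the argument;
  # the argument expression itself is never evaluated
  missing(group)
}

# Assumption: `not_a_real_object` is not defined anywhere.
# Capture the error message instead of stopping the script.
res_null <- tryCatch(
  f_null(not_a_real_object),
  error = function(e) conditionMessage(e)
)
print(res_null)  # an "object not found" style message (wording depends on locale)

res_supplied <- f_missing(not_a_real_object)
print(res_supplied)  # FALSE: something was supplied, and it was never evaluated

res_omitted <- f_missing()
print(res_omitted)   # TRUE: nothing was supplied
```

This mirrors what happens in prop_tab: the bare column name is only meaningful inside the data frame, so any check that evaluates it in the function body will fail.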

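【Comments】:

Beat you to it ;)
I don't think it is accurate to say "non-tidyverse functions don't know how to handle them". NSE was around before the tidyverse.
Yes, that's fair, I've edited the answer to reflect that.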

【Solution 2】:

You can use missing instead of is.null, so that your argument is not evaluated (evaluating it is what causes the error):

prop_tab <- function(df, column, group = NULL) {

  col_name <- enquo(column)
  group_name <- enquo(group)

  # if a grouping variable was supplied, then...
  if(!missing(group)) {
    temp <- df %>%
      select(!!col_name, !!group_name) %>% 
      group_by(!!group_name) %>% 
      summarise(Percentages = 100 * length(!!col_name) / nrow(df))

  } else {
    # if no grouping variable was supplied, then...
    temp <- df %>%
      select(!!col_name) %>% 
      group_by(col_name = !!col_name) %>% 
      summarise(Percentages = 100 * length(!!col_name) / nrow(df)) 

  }

  temp
}

test %>% prop_tab(column = C) 
# # A tibble: 5 x 2
#   col_name Percentages
#      <dbl>       <dbl>
# 1        1          40
# 2        2          10
# 3        3          20
# 4        4          20
# 5        5          10

test %>% prop_tab(column = A, group = B)
# # A tibble: 3 x 2
#       B Percentages
#   <dbl>       <dbl>
# 1     1          20
# 2     2          20
# 3     3          60

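You could also use length(substitute(group)) instead of !missing(group). It is more robust, because it does not fail in the unlikely case that someone explicitly passes NULL as the group argument (the missing() version would break in that case).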
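To make the difference concrete, here is a small base-R sketch comparing the two tests. check_group is a hypothetical helper that only reports what each test sees; it is not part of the answer's code. Note that this relies on the group = NULL default: with a signature that has no default, substitute(group) on an omitted argument yields the empty symbol, whose length is 1, so the test would not distinguish that case.

```r
# check_group is a hypothetical helper for illustration (base R only).
# It never evaluates `group`, so a bare column name such as B is safe.
check_group <- function(group = NULL) {
  c(
    not_missing = !missing(group),
    has_expr    = length(substitute(group)) > 0
  )
}

r_omitted  <- check_group()              # nothing supplied
r_column   <- check_group(group = B)     # bare name supplied, never evaluated
r_explicit <- check_group(group = NULL)  # NULL supplied explicitly

print(r_omitted)   # both FALSE
print(r_column)    # both TRUE
print(r_explicit)  # not_missing is TRUE, has_expr is FALSE
```

Only the last call separates the two tests: missing() reports that an argument was supplied, while the substitute() test sees that the supplied expression is NULL and takes the ungrouped branch.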

【Comments】:

【Solution 3】:

One option is to check "group_name" (the captured quosure) rather than "group":

prop_tab <- function(df, column, group = NULL) {

  col_name <- enquo(column)
  group_name <- enquo(group)

  # if group_by var is NOT null, then...
  if(as.character(group_name)[2] != "NULL") {
      temp <- df %>%
        select(!!col_name, !!group_name) %>% 
        group_by(!!group_name) %>% 
        summarise(Percentages = 100 * length(!!col_name) / nrow(df))

  } else {
  # if group_by var is null, then...
      temp <- df %>%
        select(!!col_name) %>% 
        group_by(col_name = !!col_name) %>% 
        summarise(Percentages = 100 * length(!!col_name) / nrow(df)) 

  }

  temp
}

-Checking

prop_tab(test, column = C, group = B)
# A tibble: 3 x 2
#      B Percentages
# <dbl>       <dbl>
#1     1          20
#2     2          20
#3     3          60  



prop_tab(test, column = C)
# A tibble: 5 x 2
#  col_name Percentages
#     <dbl>       <dbl>
#1        1          40
#2        2          10
#3        3          20
#4        4          20
#5        5          10
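The same idea of testing the captured quosure can be written with rlang's quo_is_null() predicate instead of converting the quosure to a character vector. This is a sketch rather than the answer author's code: prop_tab2 is a hypothetical name, and the motivation is that, as far as I know, as.character() on a quosure was deprecated in later rlang releases, so the string comparison may not work with a current installation.

```r
library(dplyr)
library(rlang)

test <- tibble(
  A = c(1:5, 1:5),
  B = c(1, 2, 1, 2, 3, 3, 3, 3, 3, 3),
  C = c(1, 1, 1, 1, 2, 3, 4, 5, 4, 3)
)

# prop_tab2 is a hypothetical variant of prop_tab; only the condition differs.
prop_tab2 <- function(df, column, group = NULL) {
  col_name <- enquo(column)
  group_name <- enquo(group)

  # quo_is_null() is TRUE when the captured expression is NULL, which covers
  # both an omitted argument (the default) and an explicit group = NULL
  if (!quo_is_null(group_name)) {
    temp <- df %>%
      select(!!col_name, !!group_name) %>%
      group_by(!!group_name) %>%
      summarise(Percentages = 100 * length(!!col_name) / nrow(df))
  } else {
    temp <- df %>%
      select(!!col_name) %>%
      group_by(col_name = !!col_name) %>%
      summarise(Percentages = 100 * length(!!col_name) / nrow(df))
  }

  temp
}

grouped   <- prop_tab2(test, column = C, group = B)
ungrouped <- prop_tab2(test, column = C)
explicit  <- prop_tab2(test, column = C, group = NULL)

print(grouped)
print(ungrouped)
```

Because the check looks at the captured expression and never evaluates it, the bare column name B causes no lookup error, and an explicit group = NULL falls through to the ungrouped branch.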

【Comments】: