【Question title】: Using `rlang` quasiquotation with `dplyr::_join` functions
【Posted】: 2020-03-09 18:47:30
【Question description】:

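I am trying to write a custom function that uses rlang's quasiquotation. The function also uses dplyr join functions internally. Below is a minimal working example that illustrates my problem.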

# needed libraries 
library(tidyverse)

# function definition
df_combiner <- function(data, x, group.by) {
  # check how many variables were entered for this grouping variable
  group.by <- as.list(rlang::quo_squash(rlang::enquo(group.by)))

  # based on number of arguments, select `group.by` in cases like `c(cyl)`,
  # the first list element after `quo_squash` will be `c` which we don't need,
  # but if we pass just `cyl`, there is no `c`, this will take care of that
  # issue
  group.by <-
    if (length(group.by) == 1) {
      group.by
    } else {
      group.by[-1]
    }

  # creating internal dataframe
  df <- dplyr::group_by(.data = data, !!!group.by, .drop = TRUE)

  # creating dataframes to be joined: one with tally, one with summary
  df_tally <- dplyr::tally(df)
  df_mean <- dplyr::summarise(df, mean = mean({{ x }}, na.rm = TRUE))

  # without specifying `by` argument, this works but prints a message I want to avoid
  print(dplyr::left_join(x = df_tally, y = df_mean))

  # joining by specifying `by` argument (my failed attempt)
  dplyr::left_join(x = df_tally, y = df_mean, by = !!!group.by)
}

# using the function
df_combiner(diamonds, carat, c(cut, clarity))

#> Joining, by = c("cut", "clarity")

#> # A tibble: 40 x 4
#> # Groups:   cut [5]
#>    cut   clarity     n  mean
#>    <ord> <ord>   <int> <dbl>
#>  1 Fair  I1        210 1.36 
#>  2 Fair  SI2       466 1.20 
#>  3 Fair  SI1       408 0.965
#>  4 Fair  VS2       261 0.885
#>  5 Fair  VS1       170 0.880
#>  6 Fair  VVS2       69 0.692
#>  7 Fair  VVS1       17 0.665
#>  8 Fair  IF          9 0.474
#>  9 Good  I1         96 1.20 
#> 10 Good  SI2      1081 1.04 
#> # ... with 30 more rows

#> Error in !group.by: invalid argument type

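As you can see, I want to avoid the `#> Joining, by = c("cut", "clarity")` message, so I want to pass the `by` argument to the `_join` function explicitly, but I can't figure out how to do that. (I have tried `rlang::as_string`, `rlang::quo_name`, etc.)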

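【Comments】:

  • Can you wrap it in suppressMessages, i.e. suppressMessages(head(mtcars) %>% left_join(data.frame(carb = 4)))
  • I can of course use suppressMessages, but that only sidesteps the problem, which is sure to come up again in a situation where that workaround doesn't work. I would really like to learn how to do this the rlang way.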

Tags: r dplyr rlang quasiquotes


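【Solution 1】:

We can convert the symbols to strings with `as_string`. The `by` argument expects a character vector of column names: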

dplyr::left_join(x = df_tally, y = df_mean,
            by = map_chr(group.by, rlang::as_string))

df_combiner <- function(data, x, group.by) {
  # check how many variables were entered for this grouping variable
  group.by <- as.list(rlang::quo_squash(rlang::enquo(group.by)))

  # based on number of arguments, select `group.by` in cases like `c(cyl)`,
  # the first list element after `quo_squash` will be `c` which we don't need,
  # but if we pass just `cyl`, there is no `c`, this will take care of that
  # issue
  group.by <-
    if (length(group.by) == 1) {
      group.by
    } else {
      group.by[-1]
    }

  # creating internal dataframe
  df <- dplyr::group_by(.data = data, !!!group.by, .drop = TRUE)

  # creating dataframes to be joined: one with tally, one with summary
  df_tally <- dplyr::tally(df)
  df_mean <- dplyr::summarise(df, mean = mean({{ x }}, na.rm = TRUE))

  # without specifying `by` argument, this works but prints a message I want to avoid
  #print(dplyr::left_join(x = df_tally, y = df_mean))

  # joining with an explicit `by`: convert each symbol to a string first
  dplyr::left_join(x = df_tally, y = df_mean, by = map_chr(group.by, rlang::as_string))

}

-checking

df_combiner(diamonds, carat, c(cut, clarity))
# A tibble: 40 x 4
# Groups:   cut [5]
#   cut   clarity     n  mean
#   <ord> <ord>   <int> <dbl>
# 1 Fair  I1        210 1.36 
# 2 Fair  SI2       466 1.20 
# 3 Fair  SI1       408 0.965
# 4 Fair  VS2       261 0.885
# 5 Fair  VS1       170 0.880
# 6 Fair  VVS2       69 0.692
# 7 Fair  VVS1       17 0.665
# 8 Fair  IF          9 0.474
# 9 Good  I1         96 1.20 
#10 Good  SI2      1081 1.04 
# … with 30 more rows
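
To make the failure and the fix concrete, here is a minimal sketch outside the function. It assumes `group.by` has already been reduced to a plain list of symbols, as the function above does. `!!!` is only given its splicing meaning by functions that capture their arguments for tidy evaluation (such as `group_by()`); `by` is an ordinary argument, so R evaluates `!!!group.by` as three applications of logical negation to a list, which is where the error in the question comes from.

```r
library(rlang)

# Stand-in for what the function holds after stripping the leading `c`
group.by <- list(quote(cut), quote(clarity))

# Evaluated as a regular value, `!!!group.by` is `!` applied three times to a
# list, which fails; capture the error message instead of stopping
err <- tryCatch(!!!group.by, error = function(e) conditionMessage(e))
err

# Converting each symbol to a string yields the character vector `by` expects
by_cols <- vapply(group.by, as_string, character(1))
by_cols
```

`vapply()` is used here only to keep the sketch free of a purrr dependency; `map_chr()` from the answer does the same job.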

【Comments】:

    【Solution 2】:

    The join functions take a character vector as their `by` argument. Use `deparse` to go from expressions to strings:

    dplyr::left_join(x = df_tally, y = df_mean, by = map_chr(group.by, deparse))
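
As a rough comparison of the two conversions suggested so far (a sketch, assuming the grouping variables are bare column names): for plain symbols, `deparse()` and `rlang::as_string()` give the same strings. They differ on anything that is not a symbol: `deparse()` turns any expression into text (and can split a long expression across several strings), while `as_string()` refuses non-symbols, which can be a useful early check when only column names are expected.

```r
library(rlang)

syms <- list(quote(cut), quote(clarity))

# For plain symbols both approaches produce the same character vector
by_deparse <- vapply(syms, deparse, character(1))
by_string <- vapply(syms, as_string, character(1))
identical(by_deparse, by_string)

# deparse() accepts any expression and returns its text
call_text <- deparse(quote(log(x)))

# as_string() only accepts symbols (or strings) and errors on a call
res <- tryCatch(
  as_string(quote(log(x))),
  error = function(e) "not a symbol"
)
```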
    

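    【Comments】:

      【Solution 3】:

      As the previous authors mentioned, `by` needs a character vector. stanwood, in the RStudio Community thread "Should tidyeval be abandoned?", illustrates a simple way to go from a list of quosures to strings:

      ...tidyr::left_join still requires a list of strings: by = c("Species", "Sepal.Length"). If I want to supply those programmatically the best solution I've found is by = sapply(sepaldims, quo_text). Consider this a plug for abstracting quo_text to lists of quosures.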

      sepaldims <- quos(Species, Sepal.Length)
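
The quoted approach can be sketched end to end as follows. This is an illustration under assumptions rather than the thread author's code: it uses the built-in `iris` data, swaps `quo_text()` for `rlang::as_name()` (which also accepts quosures wrapping a symbol), and relies on the `.groups` argument of `summarise()`, which requires dplyr 1.0.0 or later. The names `by_cols`, `left`, `right`, and `joined` are hypothetical.

```r
library(rlang)
library(dplyr)

sepaldims <- quos(Species, Sepal.Length)

# Convert the list of quosures to a plain, unnamed character vector
by_cols <- unname(vapply(sepaldims, as_name, character(1)))

# Two summaries over the same grouping columns
left <- count(iris, !!!sepaldims)
right <- iris %>%
  group_by(!!!sepaldims) %>%
  summarise(mean_width = mean(Sepal.Width), .groups = "drop")

# An explicit `by` means dplyr has nothing to guess, so no "Joining" message
joined <- left_join(left, right, by = by_cols)
```

The same quosures serve both purposes: spliced with `!!!` where the function captures its arguments, and converted to strings where a plain value is required.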
      

      【Comments】:
