[Question title]: Dplyr: case_when - how to use it to select a column?
[Posted]: 2021-05-21 10:01:49
[Question description]:

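I want to use case_when from dplyr to select a column so that I can change its role in a tidymodels recipe.

What am I doing wrong? In the following MWE, the ID role should be assigned to column "b":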

library(tidyverse)
library(tidymodels)

# dummy data
a = 1:3
b = 4:6
c = 7:9
df <- data.frame(a,b,c)

# filter variable
col_name = "foo"

rec <- recipe(a ~., data = df) %>%
  update_role(
              case_when(
                col_name == "foo" ~ b, # Also not working: .$b, df$b
                col_name == "foo2" ~ c), 
              new_role = "ID")
rec

[Comments]:

    Tags: r dplyr tidymodels


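    [Solution 1]:

    Unfortunately, case_when is not suited to the kind of dynamic variable selection you want to achieve: it is a vectorised if-else that returns values, not a column selection. Instead, I suggest wrapping an if (...) in a function to perform the selection dynamically: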

    library(tidyverse)
    library(tidymodels)
    
    # dummy data
    a = 1:3
    b = 4:6
    c = 7:9
    df <- data.frame(a,b,c)
    
    # filter variable
    col_name = "foo"
    
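    update_select <- function(recipe, col_name) {
      # Pick the column to re-role based on the value of col_name
      if (col_name == "foo") {
        update_role(recipe, b, new_role = "ID")
      } else if (col_name == "foo2") {
        update_role(recipe, c, new_role = "ID")
      } else {
        # Fail loudly instead of silently returning NULL
        stop("Unknown col_name: ", col_name)
      }
    }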
    
    rec <- recipe(a ~., data = df) %>%
      update_select(col_name)
    rec
    #> Data Recipe
    #> 
    #> Inputs:
    #> 
    #>       role #variables
    #>         ID          1
    #>    outcome          1
    #>  predictor          1
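
A related variant, shown here only as a minimal sketch: instead of branching around update_role(), map the key to a column name held as a string and pass it through all_of(). The helper name pick_column() is hypothetical and not part of the original answer; the data frame mirrors the dummy data above.

```r
library(recipes)

# Same shape as the dummy data in the question
df <- data.frame(a = 1:3, b = 4:6, c = 7:9)

# Hypothetical helper: translate a key into a column *name* (a string).
# The last, unnamed argument of switch() acts as the default branch.
pick_column <- function(col_name) {
  switch(col_name,
    foo  = "b",
    foo2 = "c",
    stop("Unknown col_name: ", col_name)
  )
}

id_col <- pick_column("foo")

# all_of() turns the character vector into a column selection
rec <- recipe(a ~ ., data = df) %>%
  update_role(all_of(id_col), new_role = "ID")

summary(rec)
```

Keeping the key-to-column mapping in one small function means the recipe code itself does not change when new keys are added.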
    

    [Comments]:

      [Solution 2]:

      There are a few different ways to do this. For the example you show here, I would use a named vector that holds the column names:

      library(recipes)
      
      # dummy data
      a = 1:3
      b = 4:6
      c = 7:9
      df <- data.frame(a,b,c)
      
      selector_vec <- c("foo" = "b", "foo2" = "c")
      
      ## could select more than one term here
      my_terms <- selector_vec[["foo"]]
      rec1 <- recipe(a ~ ., data = df) %>%
        update_role(all_of(my_terms), new_role = "ID")
      prep(rec1)$term_info
      #> # A tibble: 3 x 4
      #>   variable type    role      source  
      #>   <chr>    <chr>   <chr>     <chr>   
      #> 1 b        numeric ID        original
      #> 2 c        numeric predictor original
      #> 3 a        numeric outcome   original
      
      my_terms <- selector_vec[["foo2"]]
      rec2 <- recipe(a ~ ., data = df) %>%
        update_role(all_of(my_terms), new_role = "ID")
      prep(rec2)$term_info
      #> # A tibble: 3 x 4
      #>   variable type    role      source  
      #>   <chr>    <chr>   <chr>     <chr>   
      #> 1 b        numeric predictor original
      #> 2 c        numeric ID        original
      #> 3 a        numeric outcome   original
      

      Created on 2021-05-24 by the reprex package (v2.0.0)
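
The comment in the code above notes that more than one term could be selected. A minimal sketch of that case, assuming a named list in place of the named vector; the extra column d, the name selector_list, and the variable key are illustrative and not from the original answer:

```r
library(recipes)

# Dummy data with one extra column so that two columns can share a role
df <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12)

# A list lets each key map to one or several column names
selector_list <- list(foo = c("b", "c"), foo2 = "d")

key <- "foo"
# `[[` on a list returns NULL for an unknown name, so check the key first
stopifnot(key %in% names(selector_list))
my_terms <- selector_list[[key]]

rec <- recipe(a ~ ., data = df) %>%
  update_role(all_of(my_terms), new_role = "ID")

summary(rec)
```

With key set to "foo", both b and c receive the ID role and d stays a predictor.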

      In what might be considered a more realistic situation, I would use across() as shown here

      [Comments]:
