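【Question title】: "Collapsing" multiple factors into a single character variable
【Posted】: 2021-04-02 20:21:56
【Question】:

I have several non-overlapping factor variables and want to collapse them into a single character variable. For example, I have this: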

library(tibble)
df <- tibble(var1 = factor(c(1,2,3,NA,NA,NA,NA,NA,NA)),
             var2 = factor(c(NA,NA,NA,4,5,6,NA,NA,NA)),
             var3 = factor(c(NA,NA,NA,NA,NA,NA,"seven","eight","nine")))
df

# A tibble: 9 x 3
  var1  var2  var3 
  <fct> <fct> <fct>
1 1     NA    NA   
2 2     NA    NA   
3 3     NA    NA   
4 NA    4     NA   
5 NA    5     NA   
6 NA    6     NA   
7 NA    NA    seven
8 NA    NA    eight
9 NA    NA    nine 

I would like to generate var4:

# A tibble: 9 x 4
  var1  var2  var3  var4 
  <fct> <fct> <fct> <chr>
1 1     NA    NA    1    
2 2     NA    NA    2    
3 3     NA    NA    3    
4 NA    4     NA    4    
5 NA    5     NA    5    
6 NA    6     NA    6    
7 NA    NA    seven seven
8 NA    NA    eight eight
9 NA    NA    nine  nine

    Tags: r


    【Solution 1】:

    We can use coalesce from dplyr, which takes the first non-missing value at each position:

    library(dplyr)
    df %>%
        mutate(var4 = coalesce(!!! .))
        # or, equivalently:
        # mutate(var4 = purrr::reduce(., coalesce))
    

    Output:

    # A tibble: 9 x 4
    #  var1  var2  var3  var4 
    #  <fct> <fct> <fct> <fct>
    #1 1     <NA>  <NA>  1    
    #2 2     <NA>  <NA>  2    
    #3 3     <NA>  <NA>  3    
    #4 <NA>  4     <NA>  4    
    #5 <NA>  5     <NA>  5    
    #6 <NA>  6     <NA>  6    
    #7 <NA>  <NA>  seven seven
    #8 <NA>  <NA>  eight eight
    #9 <NA>  <NA>  nine  nine 
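
Note that the output above has var4 as a factor, while the question asks for a character column. A minimal sketch of one way to get `<chr>`, assuming the same three columns as in the question, is to convert each factor with `as.character()` before coalescing. The name `df_chr` is only illustrative.

```r
library(dplyr)

# Same data as in the question
df <- tibble(
  var1 = factor(c(1, 2, 3, NA, NA, NA, NA, NA, NA)),
  var2 = factor(c(NA, NA, NA, 4, 5, 6, NA, NA, NA)),
  var3 = factor(c(NA, NA, NA, NA, NA, NA, "seven", "eight", "nine"))
)

# Convert each factor to character, then keep the first non-NA value per row
df_chr <- df %>%
  mutate(var4 = coalesce(as.character(var1),
                         as.character(var2),
                         as.character(var3)))

class(df_chr$var4)
```

Converting first also avoids having to combine factors whose level sets differ.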
    

      【Solution 2】:

      We can use unite from tidyr:

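      library(dplyr)
      library(tidyr)
      df %>%
        # na.rm = TRUE drops missing values before pasting; remove = FALSE keeps var1:var3
        unite(var4, var1:var3, remove = FALSE, na.rm = TRUE) %>%
        select(var1, var2, var3, var4)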
      

      Output:

        var1  var2  var3  var4 
        <fct> <fct> <fct> <chr>
      1 1     NA    NA    1    
      2 2     NA    NA    2    
      3 3     NA    NA    3    
      4 NA    4     NA    4    
      5 NA    5     NA    5    
      6 NA    6     NA    6    
      7 NA    NA    seven seven
      8 NA    NA    eight eight
      9 NA    NA    nine  nine 
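
One thing to keep in mind with `unite()` is that it pastes values together rather than picking the first one, so it only matches `coalesce()` when the columns really are non-overlapping. A small sketch with made-up data (not the question's), where the first row deliberately has two non-missing values:

```r
library(tidyr)

# Hypothetical data: row 1 has values in both var1 and var3
df_overlap <- tibble::tibble(
  var1 = factor(c(1, NA, NA)),
  var2 = factor(c(NA, 4, NA)),
  var3 = factor(c("x", NA, "nine"))
)

res <- unite(df_overlap, var4, var1:var3,
             remove = FALSE, na.rm = TRUE, sep = "_")

# Row 1 becomes "1_x" because both non-missing values are joined with sep
res$var4
```

If overlaps are possible in your data, a coalesce-style approach is probably the safer choice.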
      

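        【Solution 3】:

        Perhaps this helps. Note that it relies on the non-missing values appearing in row order once the columns are stacked, which holds for this example: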

        df$var4 <- na.omit(unlist(df))
        

        This gives:

        > df
        # A tibble: 9 x 4
          var1  var2  var3  var4
          <fct> <fct> <fct> <fct>
        1 1     NA    NA    1
        2 2     NA    NA    2
        3 3     NA    NA    3
        4 NA    4     NA    4    
        5 NA    5     NA    5
        6 NA    6     NA    6
        7 NA    NA    seven seven
        8 NA    NA    eight eight
        9 NA    NA    nine  nine
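
If the non-missing values are not laid out in that convenient order, stacking the columns and dropping NAs will shuffle them. A base R sketch that does not depend on the layout, using made-up data and the illustrative names `m` and `first_col`, picks the first non-NA column per row by matrix indexing:

```r
# Hypothetical data where the values are NOT in column-then-row order
df2 <- data.frame(
  var1 = factor(c(NA, 2, NA)),
  var2 = factor(c(4, NA, NA)),
  var3 = factor(c(NA, NA, "nine"))
)

# Character matrix with one column per variable
m <- sapply(df2, as.character)

# For each row, the index of the first column that is not NA
first_col <- max.col(!is.na(m), ties.method = "first")

# Pick that cell from each row
df2$var4 <- m[cbind(seq_len(nrow(m)), first_col)]

# For comparison: stacking and dropping NAs returns the values in column order
stacked <- as.character(na.omit(unlist(df2[c("var1", "var2", "var3")])))
```

Here `df2$var4` follows the rows ("4", "2", "nine"), while `stacked` comes out as "2", "4", "nine".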
        

        A data.table option with fcoalesce:

        > setDT(df)[, var4 := do.call(fcoalesce, Map(as.charac .... [TRUNCATED]
           var1 var2  var3  var4
        1:    1 <NA>  <NA>     1
        2:    2 <NA>  <NA>     2
        3:    3 <NA>  <NA>     3
        4: <NA>    4  <NA>     4
        5: <NA>    5  <NA>     5
        6: <NA>    6  <NA>     6
        7: <NA> <NA> seven seven
        8: <NA> <NA> eight eight
        9: <NA> <NA>  nine  nine
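
The call above is truncated in the source, so the following is not a reconstruction of it, only a sketch of one way `fcoalesce` could be applied here. The data, the object name `dt`, and the use of `.SDcols` with `lapply` are all assumptions.

```r
library(data.table)

# Hypothetical data in the same shape as the question's
dt <- data.table(
  var1 = factor(c(1, 2, NA, NA)),
  var2 = factor(c(NA, NA, 4, NA)),
  var3 = factor(c(NA, NA, NA, "nine"))
)

# Convert the chosen columns to character, then coalesce them position by position
dt[, var4 := do.call(fcoalesce, unname(lapply(.SD, as.character))),
   .SDcols = c("var1", "var2", "var3")]

dt$var4
```

Converting to character first means `fcoalesce` receives vectors of the same type, and var4 ends up as a character column as the question asks.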
        
