【Question title】: pivot_wider a dataframe with complex names in R
【Posted】: 2021-04-30 22:15:07
【Question description】:

I have a data frame that looks like this:


library(tibble)

datInput <- tibble(id = 1:2,
                   c.0.opt = c("a,b", "c,d"),
                   c.0.optI = c("1,2", "3,4"),
                   c.0.sel = c("a", "c"),
                   c.1.opt = c("e,f", "g,h"),
                   c.1.optI = c("5,6", "7,8"),
                   c.1.sel = c("e", "g"))

datInput
#     id c.0.opt c.0.optI c.0.sel c.1.opt c.1.optI c.1.sel
 
#1     1 a,b     1,2      a       e,f     5,6      e      
#2     2 c,d     3,4      c       g,h     7,8      g 



I need it to look like this:


datOutput <- tibble(id = c(1,1,2,2),
                   c_opt = c("a,b", "e,f", "c,d", "g,h"),
                   c_optI = c("1,2", "5,6", "3,4", "7,8"),
                   c_sel = c("a", "e", "c", "g"))

#     id c_opt c_optI c_sel

#1     1 a,b   1,2    a    
#2     1 e,f   5,6    e    
#3     2 c,d   3,4    c    
#4     2 g,h   7,8    g 

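I usually use tidyr::pivot_longer for this kind of task, but I don't know how to handle these complex column names, where the row identifier sits in the middle of the name. Is there a way to do this?

Thanks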

【Question comments】:

    Tags: r dataframe dplyr tidy


    【Solution 1】:

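    We can also use pivot_longer with names_sep given as a regex lookbehind, so that the split happens only at the . that follows a digit in the column name.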

    library(dplyr)
    library(tidyr)
    library(stringr)
    pivot_longer(datInput, cols = -id, names_to = c("grp", ".value"), 
             names_sep = "(?<=\\d)\\.") %>%
        select(-grp) %>%
        rename_with(~ str_c('c_', .), -id)
    # A tibble: 4 x 4
    #      id c_opt c_optI c_sel
    #   <int> <chr> <chr>  <chr>
    # 1     1 a,b   1,2    a
    # 2     1 e,f   5,6    e
    # 3     2 c,d   3,4    c
    # 4     2 g,h   7,8    g
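
To see what the lookbehind pattern in names_sep does, here is a minimal base R sketch (not part of the original answer) that applies the same pattern to the column names with strsplit. The perl = TRUE argument is needed here because strsplit's default regex engine does not support lookbehind. The variable names cols, parts, grp and value are only for illustration.

```r
# Column names from datInput, excluding id
cols <- c("c.0.opt", "c.0.optI", "c.0.sel", "c.1.opt", "c.1.optI", "c.1.sel")

# Split only at a "." that is immediately preceded by a digit
parts <- strsplit(cols, "(?<=\\d)\\.", perl = TRUE)

# The first piece plays the role of "grp", the second of ".value"
grp   <- vapply(parts, `[`, character(1), 1)
value <- vapply(parts, `[`, character(1), 2)

print(grp)    # "c.0" "c.0" "c.0" "c.1" "c.1" "c.1"
print(value)  # "opt" "optI" "sel" "opt" "optI" "sel"
```

Because the second piece is mapped to ".value", the distinct values opt, optI and sel become the output columns, and the first piece (c.0, c.1) ends up in grp, which the answer then drops.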
    

    【Comments】:

      【Solution 2】:
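      library(dplyr)
      library(tidyr)

      datInput %>% 
        # reshape every column except the first (id) into name/value pairs
        gather(colname, val,-1 ) %>% 
        # drop the ".<digit>." index so c.0.opt and c.1.opt both become c_opt
        mutate(colname = gsub("\\.\\d\\.","_",colname)) %>% 
        pivot_wider(id_cols = id, names_from = colname, values_from = val, values_fn = list) %>% 
        unnest(cols = c(colnames(.)))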
      
      
      # A tibble: 4 x 4
           id c_opt c_optI c_sel
        <int> <chr> <chr>  <chr>
      1     1 a,b   1,2    a    
      2     1 e,f   5,6    e    
      3     2 c,d   3,4    c    
      4     2 g,h   7,8    g 
      

      【Comments】:

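      • You can add values_fn = list to the pivot_wider call to suppress that warning. gather has been superseded, so you should replace it with pivot_longer(-id, names_to = "colname", values_to = "val")
      • Thanks, everyone! This is exactly what I needed.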
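
The warning mentioned in the comment comes from the renaming step: once the index is stripped, each id has two values for every new column name, so pivot_wider has to fall back to list-columns. The base R sketch below (an illustration, not taken from the answers) shows the collision. It also shows that \\d matches a single digit only; the \\d+ variant and the column name c.10.opt are assumptions for data with indices of 10 or more and are not part of the original thread.

```r
cols <- c("c.0.opt", "c.0.optI", "c.0.sel", "c.1.opt", "c.1.optI", "c.1.sel")

# Same substitution as in the answer: remove the ".<digit>." index
renamed <- gsub("\\.\\d\\.", "_", cols)
print(renamed)

# Every new name now occurs twice, once per original index
counts <- table(renamed)
print(counts)

# "\\d" matches exactly one digit, so a two-digit index is left untouched
one_digit  <- gsub("\\.\\d\\.", "_", "c.10.opt")
# "\\d+" (assumed variant) matches one or more digits
multi_digit <- gsub("\\.\\d+\\.", "_", "c.10.opt")
print(one_digit)
print(multi_digit)
```

With values_fn = list the list-columns are requested explicitly, and the final unnest expands them back into one row per value.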
      【Solution 3】:

      I modified Akrun's answer using zimia's comments, as follows:

      library(dplyr)
      library(tidyr)

      datOutput <- datInput %>% 
        pivot_longer(-id, names_to = "colname", values_to = "val") %>%
        mutate(colname = gsub("\\.\\d\\.","_",colname)) %>% 
        pivot_wider(id_cols = id, names_from = colname, values_from = val, values_fn = list) %>% 
        unnest(cols = c(colnames(.)))
      
      

      Works perfectly. Thank you both.
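
For comparison, here is a sketch of a single-pivot variant that avoids the pivot_wider and unnest round trip. It is not one of the posted answers: it assumes every column other than id follows the pattern c.<index>.<name>, and it uses the names_pattern argument of pivot_longer with two capture groups instead of names_sep.

```r
library(dplyr)
library(tidyr)

datInput <- tibble(id = 1:2,
                   c.0.opt = c("a,b", "c,d"),
                   c.0.optI = c("1,2", "3,4"),
                   c.0.sel = c("a", "c"),
                   c.1.opt = c("e,f", "g,h"),
                   c.1.optI = c("5,6", "7,8"),
                   c.1.sel = c("e", "g"))

datOutput <- datInput %>%
  pivot_longer(
    cols = -id,
    # first capture group: the numeric index; second: the output column name
    names_to = c("grp", ".value"),
    names_pattern = "^c\\.(\\d+)\\.(.+)$"
  ) %>%
  select(-grp) %>%
  # restore the "c_" prefix on everything except id
  rename_with(~ paste0("c_", .x), -id)

print(datOutput)
```

Keeping grp instead of dropping it would preserve which index each row came from, which can be useful if the order of the rows matters later.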

      【Comments】:
