【Question title】: R: format a wide table as a long table
【Posted】: 2021-09-28 18:14:49
【Question】:
cutoff        KM KM_lo KM_hi  rstm rstm_lo rstm_hi           
   <chr>      <dbl> <dbl> <dbl> <dbl>   <dbl>   <dbl>           
 1 2017-01-01   2.1   1.4   4.9   7.2     3.9    10.2 
 2 2017-04-01   3.5   2.1   4.7   8.9     6.6    10.8 
 3 2017-07-01   3.7   2.8   4.2   7.2     6.2     8.4 

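How can I convert this to a long table? I am struggling to get it into the format I want. I have tried the gather and melt functions. The output table should look like this: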

      cutoff        VAR    Val   Val-hi Val-lo
       <chr>        <chr>  <dbl> <dbl> <dbl>       
     1 2017-01-01   KM     2.1   4.9   1.4     
     2 2017-01-01   rstm   7.2   4.7   3.9     
     3 2017-07-01   KM     3.7   4.2   2.8   

Sample data:

structure(list(cutoff = c("2017-01-01", "2017-04-01", "2017-07-01"
), KM = c(2.1, 3.5, 3.7), KM_lo = c(1.4, 2.1, 2.8), KM_hi = c(4.9, 
4.7, 4.2), rstm = c(7.2, 8.9, 7.2), rstm_lo = c(3.9, 6.6, 6.2
), rstm_hi = c(10.2, 10.8, 8.4)), row.names = c(NA, -3L), class = c("tbl_df", 
"tbl", "data.frame"))

【Comments on the question】:

  • Your expected output values do not match the input.

Tags: r dplyr tidyr longtable


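【Solution 1】:

We can give the unsuffixed columns a placeholder suffix so that every column name splits into two pieces, then let pivot_longer turn the second piece into columns through ".value":

library(dplyr)
library(tidyr)
library(stringr)
# df1 is the tibble built from the sample data in the question
df1 %>% 
  # KM -> KM_none, rstm -> rstm_none, so every name reads <VAR>_<suffix>
  rename_with(~ str_c(., "_none"), c("KM", "rstm")) %>%
  # first piece goes to VAR; second piece (none/lo/hi) becomes a column
  pivot_longer(cols = -cutoff, names_to = c("VAR", ".value"), 
               names_sep = "_") %>% 
  # columns 3:5 are none, lo, hi, in that order
  rename_with(~ c("Val", "Val-lo", "Val-hi"), 3:5)

Output: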

# A tibble: 6 × 5
  cutoff     VAR     Val `Val-lo` `Val-hi`
  <chr>      <chr> <dbl>    <dbl>    <dbl>
1 2017-01-01 KM      2.1      1.4      4.9
2 2017-01-01 rstm    7.2      3.9     10.2
3 2017-04-01 KM      3.5      2.1      4.7
4 2017-04-01 rstm    8.9      6.6     10.8
5 2017-07-01 KM      3.7      2.8      4.2
6 2017-07-01 rstm    7.2      6.2      8.4
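
The positional rename in the last step relies on the value columns coming out in a particular order. A minimal sketch of the same idea that renames by name instead is given here; the object name `wide` and the placeholder suffix `est` are assumptions made for illustration and do not come from the answer.

```r
library(dplyr)
library(tidyr)

# Sample data from the question; the object name `wide` is an assumption.
wide <- tibble(
  cutoff  = c("2017-01-01", "2017-04-01", "2017-07-01"),
  KM      = c(2.1, 3.5, 3.7),
  KM_lo   = c(1.4, 2.1, 2.8),
  KM_hi   = c(4.9, 4.7, 4.2),
  rstm    = c(7.2, 8.9, 7.2),
  rstm_lo = c(3.9, 6.6, 6.2),
  rstm_hi = c(10.2, 10.8, 8.4)
)

long <- wide %>%
  # Tag the point estimates with a placeholder suffix ("est" is arbitrary).
  rename_with(~ paste0(.x, "_est"), c(KM, rstm)) %>%
  pivot_longer(
    cols = -cutoff,
    names_to = c("VAR", ".value"),
    names_sep = "_"
  ) %>%
  # Rename by name, so the result does not depend on column positions.
  rename(Val = est, `Val-lo` = lo, `Val-hi` = hi)

print(long)
```

Renaming by name fails loudly if one of the expected columns is missing, whereas a positional rename would silently relabel whatever happens to sit in columns 3 to 5.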

【Comments】:

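【Solution 2】:

Here is another pivot_longer approach, this time with names_pattern. The pattern "(.+)_(.+)" only matches names that contain an underscore, so, as the output shows, the unsuffixed KM and rstm values are not part of the result; only lo and hi are returned:

library(dplyr)
library(tidyr)

df %>% 
  pivot_longer(
    -cutoff,
    names_to = c("VAR", ".value"),
    names_pattern = "(.+)_(.+)"
  ) %>% 
  na.omit()

  cutoff     VAR      lo    hi
  <chr>      <chr> <dbl> <dbl>
1 2017-01-01 KM      1.4   4.9
2 2017-01-01 rstm    3.9  10.2
3 2017-04-01 KM      2.1   4.7
4 2017-04-01 rstm    6.6  10.8
5 2017-07-01 KM      2.8   4.2
6 2017-07-01 rstm    6.2   8.4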
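
If the point estimates are needed alongside lo and hi, one option is to make the unsuffixed names match the same pattern before pivoting. The sketch below is a variation on this answer and not the answerer's own code; the object name `wide` and the suffix `est` are assumptions. It selects the columns to rename programmatically (everything other than cutoff whose name has no underscore) instead of listing them.

```r
library(dplyr)
library(tidyr)

# Sample data from the question; the object name `wide` is an assumption.
wide <- tibble(
  cutoff  = c("2017-01-01", "2017-04-01", "2017-07-01"),
  KM      = c(2.1, 3.5, 3.7),
  KM_lo   = c(1.4, 2.1, 2.8),
  KM_hi   = c(4.9, 4.7, 4.2),
  rstm    = c(7.2, 8.9, 7.2),
  rstm_lo = c(3.9, 6.6, 6.2),
  rstm_hi = c(10.2, 10.8, 8.4)
)

long <- wide %>%
  # Add "_est" to every column except `cutoff` whose name has no underscore,
  # so that all measure columns match "(.+)_(.+)".
  rename_with(~ paste0(.x, "_est"), !cutoff & !contains("_")) %>%
  pivot_longer(
    -cutoff,
    names_to = c("VAR", ".value"),
    names_pattern = "(.+)_(.+)"
  )

print(long)
```

Because every measure column now matches the pattern, no rows need to be dropped afterwards and the na.omit() step is not required in this variant.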
    

【Comments】:

【Solution 3】:

library(tidyverse)
df <-
  structure(
    list(
      cutoff = c("2017-01-01", "2017-04-01", "2017-07-01"),
      KM = c(2.1, 3.5, 3.7),
      KM_lo = c(1.4, 2.1, 2.8),
      KM_hi = c(4.9, 4.7, 4.2),
      rstm = c(7.2, 8.9, 7.2),
      rstm_lo = c(3.9, 6.6, 6.2),
      rstm_hi = c(10.2, 10.8, 8.4)
    ),
    row.names = c(NA,-3L),
    class = c("tbl_df",
              "tbl", "data.frame")
  )

df %>% 
  # stack all measure columns into name/value pairs
  pivot_longer(cols = -cutoff) %>% 
  # "KM_lo" -> name = "KM", suffix = "lo"; unsuffixed names get suffix = NA
  separate(col = name, into = c("name", "suffix"), sep = "_", remove = TRUE) %>% 
  # run-length id over name, used as an extra row key when widening
  mutate(id = data.table::rleid(name)) %>% 
  pivot_wider(id_cols = c(id, cutoff, name), names_from = suffix, names_prefix = "VAL_", values_from = value) %>% 
  select(-id) %>% 
  # the NA suffix produces a column called VAL_NA
  rename(VAL = VAL_NA)
#> Warning: Expected 2 pieces. Missing pieces filled with `NA` in 6 rows [1, 4, 7,
#> 10, 13, 16].
#> # A tibble: 6 x 5
#>   cutoff     name    VAL VAL_lo VAL_hi
#>   <chr>      <chr> <dbl>  <dbl>  <dbl>
#> 1 2017-01-01 KM      2.1    1.4    4.9
#> 2 2017-01-01 rstm    7.2    3.9   10.2
#> 3 2017-04-01 KM      3.5    2.1    4.7
#> 4 2017-04-01 rstm    8.9    6.6   10.8
#> 5 2017-07-01 KM      3.7    2.8    4.2
#> 6 2017-07-01 rstm    7.2    6.2    8.4

Created on 2021-09-28 by the reprex package (v2.0.1)
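
The warning in the output comes from separate() padding the names that have no underscore. A possible simplification, offered as a sketch and not as the answerer's method, is to ask separate() to pad on the right explicitly with fill = "right" and to label the missing suffix before widening. The object name `wide` and the label `est` are assumptions. In this variant cutoff and name already identify each output row, so the data.table::rleid() helper is not used.

```r
library(dplyr)
library(tidyr)

# Sample data from the question; the object name `wide` is an assumption.
wide <- tibble(
  cutoff  = c("2017-01-01", "2017-04-01", "2017-07-01"),
  KM      = c(2.1, 3.5, 3.7),
  KM_lo   = c(1.4, 2.1, 2.8),
  KM_hi   = c(4.9, 4.7, 4.2),
  rstm    = c(7.2, 8.9, 7.2),
  rstm_lo = c(3.9, 6.6, 6.2),
  rstm_hi = c(10.2, 10.8, 8.4)
)

long <- wide %>%
  pivot_longer(cols = -cutoff) %>%
  # fill = "right" pads names without "_" with NA and does not warn.
  separate(col = name, into = c("name", "suffix"), sep = "_", fill = "right") %>%
  # Label the point estimates explicitly ("est" is an arbitrary choice).
  mutate(suffix = replace_na(suffix, "est")) %>%
  # cutoff and name are the remaining columns and act as the row key.
  pivot_wider(names_from = suffix, values_from = value, names_prefix = "VAL_")

print(long)
```

The value columns come out as VAL_est, VAL_lo and VAL_hi, so the final rename of VAL_NA is replaced by a column with a meaningful name from the start.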

【Comments】:
