R: format a wide table to a long table

【Question title】: R format the wide table to long table
【Posted】: 2021-09-28 18:14:49
【Question description】:

cutoff        KM KM_lo KM_hi  rstm rstm_lo rstm_hi           
   <chr>      <dbl> <dbl> <dbl> <dbl>   <dbl>   <dbl>           
 1 2017-01-01   2.1   1.4   4.9   7.2     3.9    10.2 
 2 2017-04-01   3.5   2.1   4.7   8.9     6.6    10.8 
 3 2017-07-01   3.7   2.8   4.2   7.2     6.2     8.4

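How can I convert this to a long table? I am struggling to get it into the format I want. I tried the gather and melt functions. The output table should look like this: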

      cutoff        VAR    Val   Val-hi Val-lo
       <chr>        <chr>  <dbl> <dbl> <dbl>       
     1 2017-01-01   KM     2.1   4.9   1.4     
     2 2017-01-01   rstm   7.2   4.7   3.9     
     3 2017-07-01   KM     3.7   4.2   2.8

Sample data

structure(list(cutoff = c("2017-01-01", "2017-04-01", "2017-07-01"
), KM = c(2.1, 3.5, 3.7), KM_lo = c(1.4, 2.1, 2.8), KM_hi = c(4.9, 
4.7, 4.2), rstm = c(7.2, 8.9, 7.2), rstm_lo = c(3.9, 6.6, 6.2
), rstm_hi = c(10.2, 10.8, 8.4)), row.names = c(NA, -3L), class = c("tbl_df", 
"tbl", "data.frame"))

【Comments】:

Your expected output values do not match the input.

Tags: r dplyr tidyr longtable

【Solution 1】:

We can do

library(dplyr)
library(tidyr)
library(stringr)
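df1 %>%
  # give the unsuffixed columns a suffix so every name splits into two parts
  rename_with(~ str_c(., "_none"), c("KM", "rstm")) %>%
  # prefix goes to VAR; each suffix (none, lo, hi) becomes its own value column
  pivot_longer(cols = -cutoff, names_to = c("VAR", ".value"),
               names_sep = "_") %>%
  # rename the three value columns by position
  rename_with(~ c("Val", "Val-lo", "Val-hi"), 3:5)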

-output

# A tibble: 6 × 5
  cutoff     VAR     Val `Val-lo` `Val-hi`
  <chr>      <chr> <dbl>    <dbl>    <dbl>
1 2017-01-01 KM      2.1      1.4      4.9
2 2017-01-01 rstm    7.2      3.9     10.2
3 2017-04-01 KM      3.5      2.1      4.7
4 2017-04-01 rstm    8.9      6.6     10.8
5 2017-07-01 KM      3.7      2.8      4.2
6 2017-07-01 rstm    7.2      6.2      8.4
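
The role of the `_none` suffix is easier to see by looking at the column names before and after the pivot. The following is a minimal sketch, assuming the sample data from the question is stored as `df1` (it is rebuilt here with `tibble::tibble()` only so the block runs on its own; the names `renamed` and `long` are illustrative):

```r
library(dplyr)
library(tidyr)
library(stringr)

# Sample data from the question, rebuilt so the example is self-contained
df1 <- tibble::tibble(
  cutoff  = c("2017-01-01", "2017-04-01", "2017-07-01"),
  KM      = c(2.1, 3.5, 3.7),
  KM_lo   = c(1.4, 2.1, 2.8),
  KM_hi   = c(4.9, 4.7, 4.2),
  rstm    = c(7.2, 8.9, 7.2),
  rstm_lo = c(3.9, 6.6, 6.2),
  rstm_hi = c(10.2, 10.8, 8.4)
)

# Step 1: after renaming, every measure column has the form <VAR>_<suffix>
renamed <- df1 %>%
  rename_with(~ str_c(., "_none"), c("KM", "rstm"))
names(renamed)

# Step 2: names_sep = "_" splits each name in two; the second piece
# (".value") becomes the name of an output column
long <- renamed %>%
  pivot_longer(cols = -cutoff, names_to = c("VAR", ".value"),
               names_sep = "_")
names(long)
```

After step 2 the value columns are named `none`, `lo` and `hi`, in the order they first appear, which is why the answer can rename positions 3:5 to `Val`, `Val-lo` and `Val-hi`.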

【Discussion】:

【Solution 2】:

Here is another pivot_longer approach:

library(dplyr)
library(tidyr)

df %>% 
  pivot_longer(
    -cutoff,
    names_to = c("VAR", ".value"),
    names_pattern = "(.+)_(.+)"
  ) %>% 
  na.omit()

  cutoff     VAR      lo    hi
  <chr>      <chr> <dbl> <dbl>
1 2017-01-01 KM      1.4   4.9
2 2017-01-01 rstm    3.9  10.2
3 2017-04-01 KM      2.1   4.7
4 2017-04-01 rstm    6.6  10.8
5 2017-07-01 KM      2.8   4.2
6 2017-07-01 rstm    6.2   8.4
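
The output above contains only the `lo` and `hi` columns; the unsuffixed `KM` and `rstm` values are not carried over. One possible way to keep them, shown here as a sketch rather than as part of the original answer, is to pivot the suffixed and unsuffixed columns separately and join the results. The names `bounds`, `centre` and `res` are illustrative, and `df` is assumed to hold the sample data from the question:

```r
library(dplyr)
library(tidyr)

# Sample data from the question, rebuilt so the example is self-contained
df <- tibble::tibble(
  cutoff  = c("2017-01-01", "2017-04-01", "2017-07-01"),
  KM      = c(2.1, 3.5, 3.7),
  KM_lo   = c(1.4, 2.1, 2.8),
  KM_hi   = c(4.9, 4.7, 4.2),
  rstm    = c(7.2, 8.9, 7.2),
  rstm_lo = c(3.9, 6.6, 6.2),
  rstm_hi = c(10.2, 10.8, 8.4)
)

# Suffixed columns: same names_pattern as in the answer above
bounds <- df %>%
  select(cutoff, matches("_(lo|hi)$")) %>%
  pivot_longer(-cutoff, names_to = c("VAR", ".value"),
               names_pattern = "(.+)_(.+)")

# Unsuffixed columns: a plain pivot into a single value column
centre <- df %>%
  select(cutoff, KM, rstm) %>%
  pivot_longer(-cutoff, names_to = "VAR", values_to = "Val")

# Combine on the two key columns
res <- left_join(centre, bounds, by = c("cutoff", "VAR"))
res
```

This costs an extra pivot and a join, but it avoids renaming columns up front.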

【Discussion】:

【Solution 3】:

library(tidyverse)
df <-
  structure(
    list(
      cutoff = c("2017-01-01", "2017-04-01", "2017-07-01"),
      KM = c(2.1, 3.5, 3.7),
      KM_lo = c(1.4, 2.1, 2.8),
      KM_hi = c(4.9, 4.7, 4.2),
      rstm = c(7.2, 8.9, 7.2),
      rstm_lo = c(3.9, 6.6, 6.2),
      rstm_hi = c(10.2, 10.8, 8.4)
    ),
    row.names = c(NA,-3L),
    class = c("tbl_df",
              "tbl", "data.frame")
  )

df %>% 
  pivot_longer(cols = -cutoff) %>% 
  separate(col = name, into = c("name", "suffix"), sep = "_", remove = TRUE) %>% 
  mutate(id = data.table::rleid(name)) %>% 
  pivot_wider(id_cols = c(id, cutoff, name), names_from = suffix, names_prefix = "VAL_", values_from = value) %>% 
  select(-id) %>% 
  rename(VAL = VAL_NA)
#> Warning: Expected 2 pieces. Missing pieces filled with `NA` in 6 rows [1, 4, 7,
#> 10, 13, 16].
#> # A tibble: 6 x 5
#>   cutoff     name    VAL VAL_lo VAL_hi
#>   <chr>      <chr> <dbl>  <dbl>  <dbl>
#> 1 2017-01-01 KM      2.1    1.4    4.9
#> 2 2017-01-01 rstm    7.2    3.9   10.2
#> 3 2017-04-01 KM      3.5    2.1    4.7
#> 4 2017-04-01 rstm    8.9    6.6   10.8
#> 5 2017-07-01 KM      3.7    2.8    4.2
#> 6 2017-07-01 rstm    7.2    6.2    8.4

Created on 2021-09-28 by the reprex package (v2.0.1)
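
The warning in the output above comes from `separate()` finding no `_` in `KM` and `rstm`. A hedged variant of the same idea, not the original answer's code, passes `fill = "right"` so the missing piece is filled with `NA` silently, and then labels it explicitly. The label `"mid"` and the name `res` are placeholders chosen for this sketch, and the `rleid()` helper column is left out on the assumption that `cutoff` and `name` already identify each output row:

```r
library(dplyr)
library(tidyr)

# Sample data from the question, rebuilt so the example is self-contained
df <- tibble::tibble(
  cutoff  = c("2017-01-01", "2017-04-01", "2017-07-01"),
  KM      = c(2.1, 3.5, 3.7),
  KM_lo   = c(1.4, 2.1, 2.8),
  KM_hi   = c(4.9, 4.7, 4.2),
  rstm    = c(7.2, 8.9, 7.2),
  rstm_lo = c(3.9, 6.6, 6.2),
  rstm_hi = c(10.2, 10.8, 8.4)
)

res <- df %>%
  pivot_longer(cols = -cutoff) %>%
  # fill = "right" pads a missing suffix with NA without a warning
  separate(col = name, into = c("name", "suffix"), sep = "_",
           fill = "right") %>%
  # "mid" is a placeholder label for the unsuffixed columns
  mutate(suffix = if_else(is.na(suffix), "mid", suffix)) %>%
  pivot_wider(id_cols = c(cutoff, name), names_from = suffix,
              names_prefix = "VAL_", values_from = value) %>%
  rename(VAL = VAL_mid)
res
```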

【Discussion】: