【Question title】: R: Computing difference in values for multiple groups/variables in R
【Posted】: 2020-11-03 13:56:31
【Question description】:

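Is there an efficient way to compute the difference between every pair of groups? Ideally, I would like to use the mutate() function to create a new column that holds the differences (in a single column, in long format). I don't want to compute the difference for each pair of groups separately.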

I want to find the difference in values between each pair of groups at a given date and hour:
arc1045 - arc1046,
arc1045 - arc1047,
arc1045 - arc1048,
arc1045 - arc1050,
arc1046 - arc1047,
arc1046 - arc1048,
.
.
.

The data frame can be recreated with the code below.

structure(list(date = structure(c(18215, 18215, 18215, 18215, 
18215), class = "Date"), hour = 9:13, arc1045 = c(15.2933333333333, 
16.1275, 17.0366666666667, 18.36, 19.2725), arc1046 = c(14.8133333333333, 
15.615, 16.3733333333333, 17.405, 18.4), arc1047 = c(15.0233333333333, 
15.93, 16.8133333333333, 18.17, 18.6125), arc1048 = c(14.45, 
15.31, 15.9333333333333, 16.965, 18.06), arc1050 = c(14.45, 15.2875, 
15.9466666666667, 16.89, 18.1025)), row.names = c(NA, -5L), class = c("tbl_df", 
"tbl", "data.frame"))

#>         date hour  arc1045  arc1046  arc1047  arc1048  arc1050
#> 1 2019-11-15    9 15.29333 14.81333 15.02333 14.45000 14.45000
#> 2 2019-11-15   10 16.12750 15.61500 15.93000 15.31000 15.28750
#> 3 2019-11-15   11 17.03667 16.37333 16.81333 15.93333 15.94667
#> 4 2019-11-15   12 18.36000 17.40500 18.17000 16.96500 16.89000
#> 5 2019-11-15   13 19.27250 18.40000 18.61250 18.06000 18.10250

Created on 2020-11-04 by the reprex package (v0.3.0)

devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.2 (2020-06-22)
#>  os       macOS Catalina 10.15.7      
#>  system   x86_64, darwin17.0          
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_AU.UTF-8                 
#>  ctype    en_AU.UTF-8                 
#>  tz       Australia/Melbourne         
#>  date     2020-11-04                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date       lib source                     
#>  assertthat    0.2.1   2019-03-21 [1] CRAN (R 4.0.2)             
#>  backports     1.1.10  2020-09-15 [1] CRAN (R 4.0.2)             
#>  callr         3.5.1   2020-10-13 [1] CRAN (R 4.0.2)             
#>  cli           2.1.0   2020-10-12 [1] CRAN (R 4.0.2)             
#>  crayon        1.3.4   2017-09-16 [1] CRAN (R 4.0.2)             
#>  desc          1.2.0   2018-05-01 [1] CRAN (R 4.0.2)             
#>  devtools      2.3.2   2020-09-18 [1] CRAN (R 4.0.2)             
#>  digest        0.6.27  2020-10-24 [1] CRAN (R 4.0.2)             
#>  ellipsis      0.3.1   2020-05-15 [1] CRAN (R 4.0.2)             
#>  evaluate      0.14    2019-05-28 [1] CRAN (R 4.0.1)             
#>  fansi         0.4.1   2020-01-08 [1] CRAN (R 4.0.2)             
#>  fs            1.5.0   2020-07-31 [1] CRAN (R 4.0.2)             
#>  glue          1.4.2   2020-08-27 [1] CRAN (R 4.0.2)             
#>  highr         0.8     2019-03-20 [1] CRAN (R 4.0.2)             
#>  htmltools     0.5.0   2020-06-16 [1] CRAN (R 4.0.2)             
#>  knitr         1.30    2020-09-22 [1] CRAN (R 4.0.2)             
#>  magrittr      1.5     2014-11-22 [1] CRAN (R 4.0.2)             
#>  memoise       1.1.0   2017-04-21 [1] CRAN (R 4.0.2)             
#>  pkgbuild      1.1.0   2020-07-13 [1] CRAN (R 4.0.2)             
#>  pkgload       1.1.0   2020-05-29 [1] CRAN (R 4.0.2)             
#>  prettyunits   1.1.1   2020-01-24 [1] CRAN (R 4.0.2)             
#>  processx      3.4.4   2020-09-03 [1] CRAN (R 4.0.2)             
#>  ps            1.4.0   2020-10-07 [1] CRAN (R 4.0.2)             
#>  R6            2.5.0   2020-10-28 [1] CRAN (R 4.0.2)             
#>  remotes       2.2.0   2020-07-21 [1] CRAN (R 4.0.2)             
#>  rlang         0.4.8   2020-10-08 [1] CRAN (R 4.0.2)             
#>  rmarkdown     2.5     2020-10-21 [1] CRAN (R 4.0.2)             
#>  rprojroot     1.3-2   2018-01-03 [1] CRAN (R 4.0.2)             
#>  sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.0.2)             
#>  stringi       1.5.3   2020-09-09 [1] CRAN (R 4.0.2)             
#>  stringr       1.4.0   2019-02-10 [1] CRAN (R 4.0.2)             
#>  testthat      2.3.2   2020-03-02 [1] CRAN (R 4.0.2)             
#>  usethis       1.6.3   2020-09-17 [1] CRAN (R 4.0.2)             
#>  withr         2.3.0   2020-09-22 [1] CRAN (R 4.0.2)             
#>  xfun          0.19.1  2020-10-31 [1] Github (yihui/xfun@621896e)
#>  yaml          2.2.1   2020-02-01 [1] CRAN (R 4.0.2)             
#> 
#> [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library

Thanks.

【Question comments】:

Tags: r dataframe


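【Solution 1】:

You could use pivot_longer to put the data frame into long format, then do a full_join of the long table with itself to get all combinations of groups for each date, hour and row number. With distinct you can keep the unique combinations and remove duplicates (e.g. arc1045 - arc1046 and arc1046 - arc1045).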

library(tidyverse)

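# Reshape to long format: one row per date/hour/group, keeping the row number
df_long <- df %>%
  mutate(rn = row_number()) %>%
  pivot_longer(cols = starts_with("arc"))

# Self-join so each row holds a pair of groups, then keep one row per unordered pair
df_long %>%
  full_join(df_long, by = c("date", "hour", "rn")) %>%
  filter(name.x != name.y) %>%   # drop a group paired with itself
  distinct(date, hour, rn,
           comb_name = paste0(pmin(name.x, name.y), pmax(name.x, name.y)),
           .keep_all = TRUE) %>% # arc1045/arc1046 and arc1046/arc1045 collapse to one row
  mutate(diff = value.x - value.y) %>%
  select(date, hour, comb_name, diff)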

Output

   date        hour comb_name        diff
   <date>     <int> <chr>           <dbl>
 1 2019-11-15     9 arc1045arc1046  0.480
 2 2019-11-15     9 arc1045arc1047  0.270
 3 2019-11-15     9 arc1045arc1048  0.843
 4 2019-11-15     9 arc1045arc1050  0.843
 5 2019-11-15     9 arc1046arc1047 -0.210
 6 2019-11-15     9 arc1046arc1048  0.363
 7 2019-11-15     9 arc1046arc1050  0.363
 8 2019-11-15     9 arc1047arc1048  0.573
 9 2019-11-15     9 arc1047arc1050  0.573
10 2019-11-15     9 arc1048arc1050  0  
...
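
As a cross-check on the join-based approach, the same pairwise differences can probably also be built by enumerating the column pairs directly with base R's combn(). This is only a sketch and not part of the original answer: it rebuilds the question's data as a plain data frame called `df` (the question never assigns the structure to a name, so that name is an assumption), and `arc_cols`, `pairs` and `diff_long` are names made up for illustration.

```r
# Rebuild the question's data as a plain data frame (the name `df` is assumed)
df <- data.frame(
  date    = as.Date("2019-11-15"),
  hour    = 9:13,
  arc1045 = c(15.2933333333333, 16.1275, 17.0366666666667, 18.36, 19.2725),
  arc1046 = c(14.8133333333333, 15.615, 16.3733333333333, 17.405, 18.4),
  arc1047 = c(15.0233333333333, 15.93, 16.8133333333333, 18.17, 18.6125),
  arc1048 = c(14.45, 15.31, 15.9333333333333, 16.965, 18.06),
  arc1050 = c(14.45, 15.2875, 15.9466666666667, 16.89, 18.1025)
)

# Names of the group columns
arc_cols <- grep("^arc", names(df), value = TRUE)

# Every unordered pair of group columns: a 2 x 10 character matrix
pairs <- combn(arc_cols, 2)

# One block of rows per pair (first column minus second), stacked into long format
diff_long <- do.call(rbind, lapply(seq_len(ncol(pairs)), function(i) {
  a <- pairs[1, i]
  b <- pairs[2, i]
  data.frame(
    date      = df$date,
    hour      = df$hour,
    comb_name = paste0(a, b),
    diff      = df[[a]] - df[[b]]
  )
}))

# Order by date and hour so the rows line up with the join-based output
diff_long <- diff_long[order(diff_long$date, diff_long$hour), ]
rownames(diff_long) <- NULL

head(diff_long, 10)
```

Because combn() only generates each pair once, there is no self-pair to filter out and no duplicate to remove with distinct. The trade-off is that the result is assembled outside a single mutate() pipeline.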

【Comments】:
