【Question title】: Mean of a variable from multiple data.frame in R?
【Posted】: 2022-01-13 15:08:22
【Question】:

I have the following data.frames, and I would like to create another data.frame (say M) that stores the mean of A1 from M1, M2, M3, the mean of B1 from M1, M2, M3, and likewise for C1.

library(tidyverse)


set.seed(123)

M1 <- data.frame(Date = seq(as.Date("2001-01-01"), to = as.Date("2005-12-31"), by = "month"),
           A1 = runif(60,1,5),
           B1 = runif(60,1,5),
           C1 = runif(60,1,5))

M2 <- data.frame(Date = seq(as.Date("2001-01-01"), to = as.Date("2005-12-31"), by = "month"),
                 A1 = runif(60,1,5),
                 B1 = runif(60,1,5),
                 C1 = runif(60,1,5))

M3 <- data.frame(Date = seq(as.Date("2001-01-01"), to = as.Date("2005-12-31"), by = "month"),
                 A1 = runif(60,1,5),
                 B1 = runif(60,1,5),
                 C1 = runif(60,1,5))

Desired output

The output would be a data.frame (M) with the variables A (mean of A1 from M1, M2, M3), B (mean of B1 from M1, M2, M3), and C (mean of C1 from M1, M2, M3).

【Question comments】:

    Tags: r dataframe tidyverse mean lubridate


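    【Solution 1】:

    We get the datasets in a list (mget), subset the columns of interest (dropping the first column, Date), take the elementwise sum (+), and divide by 3 (as there are only 3 datasets). This assumes the rows of all three datasets are in the same Date order.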

    M <- Reduce(`+`, lapply(mget(ls(pattern = "^M\\d+")), `[`, 
         -1))/3
    names(M) <- sub("\\d+", "", names(M))
    

    -output

    > head(M)
             A        B        C
    1 2.694883 3.345868 3.196847
    2 2.724759 2.868531 2.524246
    3 3.685341 2.535909 2.859400
    4 2.941540 2.639169 3.182535
    5 3.530815 3.690165 3.576402
    6 2.747724 3.399104 3.107880
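
The hard-coded 3 can also be derived from the list itself, and the Date column can be reattached afterwards. The following is a minimal sketch under some assumptions, not the answerer's code: it uses an explicit named list (`dfs`, a hypothetical name) instead of `mget`, a hypothetical helper `make_df`, and smaller 12-row example data, and it assumes all inputs share the same row order.

```r
set.seed(123)

# Hypothetical helper that builds one small example data frame (12 months)
make_df <- function() {
  data.frame(Date = seq(as.Date("2001-01-01"), by = "month", length.out = 12),
             A1 = runif(12, 1, 5),
             B1 = runif(12, 1, 5),
             C1 = runif(12, 1, 5))
}

M1 <- make_df()
M2 <- make_df()
M3 <- make_df()

# Explicit list of inputs (an assumption; the answer collects them with mget)
dfs <- list(M1 = M1, M2 = M2, M3 = M3)

# Drop the Date column, add elementwise, divide by the number of data frames
M <- Reduce(`+`, lapply(dfs, `[`, -1)) / length(dfs)

# Strip the trailing digits from the column names: A1 -> A, B1 -> B, C1 -> C
names(M) <- sub("\\d+$", "", names(M))

# Reattach the dates, assuming every input has the same row order
M <- cbind(Date = dfs[[1]]$Date, M)
head(M)
```

Dividing by `length(dfs)` keeps the result correct if a fourth data frame is added to the list later.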
    

    Or another option is to bind the datasets together and then take the mean grouped by Date

    library(dplyr)
    library(stringr)
    bind_rows(M1, M2, M3) %>%
       group_by(Date) %>%
       summarise(across(everything(), ~ mean(.x, na.rm = TRUE), 
          .names = "{str_remove(.col, '[0-9]+')}"))
    # A tibble: 60 × 4
       Date           A     B     C
       <date>     <dbl> <dbl> <dbl>
     1 2001-01-01  2.69  3.35  3.20
     2 2001-02-01  2.72  2.87  2.52
     3 2001-03-01  3.69  2.54  2.86
     4 2001-04-01  2.94  2.64  3.18
     5 2001-05-01  3.53  3.69  3.58
     6 2001-06-01  2.75  3.40  3.11
     7 2001-07-01  2.32  3.02  1.58
     8 2001-08-01  2.97  3.89  2.04
     9 2001-09-01  3.49  3.56  2.36
    10 2001-10-01  3.46  3.71  3.69
    # … with 50 more rows
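
For comparison, the same bind-then-group idea can be sketched in base R with `rbind` and `aggregate`. This is an alternative to the dplyr code, not the answerer's method; the 12-row example data, the helper `make_df`, and the name `stacked` are assumptions for illustration.

```r
set.seed(123)

# Hypothetical helper that builds one small example data frame (12 months)
make_df <- function() {
  data.frame(Date = seq(as.Date("2001-01-01"), by = "month", length.out = 12),
             A1 = runif(12, 1, 5),
             B1 = runif(12, 1, 5),
             C1 = runif(12, 1, 5))
}

M1 <- make_df()
M2 <- make_df()
M3 <- make_df()

# Stack the data frames row-wise; every Date now appears three times
stacked <- rbind(M1, M2, M3)

# Average every non-Date column within each Date
# (na.rm = TRUE is passed through to mean)
M <- aggregate(stacked[-1], by = list(Date = stacked$Date),
               FUN = mean, na.rm = TRUE)

# Strip the trailing digits from the column names: A1 -> A, B1 -> B, C1 -> C
names(M) <- sub("\\d+$", "", names(M))
head(M)
```

Because the grouping is done on Date, this version does not rely on the inputs being in the same row order.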
    

    【Comments】:

    • Thanks Akrun, I like the tidy solution.