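【Question Title】: Extracting the same column from multiple data frames and cbind them into a new data frame in R
【Posted】: 2019-02-28 22:10:00
【Question Description】:

I'm looking for code that lets me extract a column with the same name from multiple data frames and bind the results into a single data frame. I would also like each column in the new data frame to be named after the data frame it came from.

Below is the code I have been working with, along with reproducible data. I have been trying do.call, but I can't get it to work: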

Asset   <- structure(c(63.281303433027, 63.3979720475464, 63.6714334032718, 
            62.9559893597375, 63.0078420773017, 62.8893215800121, 31.6989860237732, 
            31.8357167016359, 31.4779946798687, 31.5039210386508, 31.4446607900061, 
            31.0492838185792, 63.3979720475464, 63.6714334032718, 62.9559893597375, 
            63.0078420773017, 62.8893215800121, 62.0985676371584), 
            class = c("xts","zoo"), .indexCLASS = "Date", .indexTZ = "UTC", tclass = "Date", tzone = "UTC", 
            index = structure(c(1550534400, 1550620800, 1550707200, 1550793600, 1551052800, 1551139200),tzone = "UTC", tclass = "Date"), .Dim = c(6L, 3L), 
            .Dimnames = list(NULL, c("Beginning.Value", "Unit.Price", "Ending.Value")))

Register<- structure(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 212.156319855224, 
            213.718845942538, 211.63547782612, 211.809091835821, 211.63547782612, 
            207.989583622389),
            class = c("xts", "zoo"), .indexCLASS = "Date", .indexTZ = "UTC", tclass = "Date", tzone = "UTC", 
            index = structure(c(1550534400,1550620800, 1550707200, 1550793600, 1551052800, 1551139200), tzone = "UTC", tclass = "Date"), .Dim = c(6L, 3L), 
            .Dimnames = list(NULL, c("Amount", "Taxes", "Ending.Value")))

Ledger<- structure(c(0.994402284972246, 1.00685740995534, 0.991497559782253, 
            1.00156143848816, 1.00071020618011, 0.995451606923588, 161.592601088027, 
            160.688051756542, 161.789955602362, 160.414346177021, 160.664823311196, 
            160.778928461638, 160.688051756542, 161.789955602362, 160.414346177021, 
            160.664823311196, 160.778928461638, 160.04764269659), class = c("xts", "zoo"), .indexCLASS = "Date", .indexTZ = "UTC", tclass = "Date", tzone = "UTC", 
            index = structure(c(1550534400, 1550620800, 1550707200, 1550793600, 1551052800, 1551139200), tzone = "UTC", tclass = "Date"), .Dim = c(6L, 3L), 
            .Dimnames = list(NULL, c("Discount_Proxy", "Beginning.Value","Ending.Value")))

dfs <- data.frame(c("Asset","Register","Ledger"))
names(dfs) <- "Data Frame"

Values <- do.call('cbind', list(dfs[,1]$Ending.Value))

【Question Discussion】:

    Tags: r


    【Solution 1】:

    If you don't mind naming the data.frames in a list:

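    # Collect the objects in a named list
    list_ls <- list("Asset" = Asset, "Register" = Register, "Ledger" = Ledger)
    
    # Extract Ending.Value from each element and bind the results column-wise
    foo <- do.call(cbind, lapply(list_ls, function(x) x$Ending.Value))
    
    # Build the same result by hand for comparison
    test <- cbind(Asset$Ending.Value, Register$Ending.Value, Ledger$Ending.Value)
    colnames(test) <- c("Asset", "Register", "Ledger")
    
    # Count the cells that differ between the two results (0 means they match)
    length(which(foo != test))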
    

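    【Discussion】:

    • Works brilliantly, thank you. I tried several ways of automating list_ls; is there a way to code it so that I don't have to create the list manually? The work I'm doing will involve a growing number of data frames. Thanks again.
    • Maybe this post can help? Depending on exactly where your data comes from, you could name each object that way as it is created and then add it to the list. Not sure if that makes sense!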
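
On the follow-up about not building list_ls by hand: one option in base R is mget(), which looks up objects by name and returns them as a named list. Below is a minimal sketch, assuming the object names are available as a character vector. The small data frames are hypothetical stand-ins for the xts objects in the question, and df_names and values are illustrative names only.

```r
# Hypothetical stand-ins for the real objects (plain data frames, not xts)
Asset    <- data.frame(Beginning.Value = c(1, 2), Ending.Value = c(10, 20))
Register <- data.frame(Amount = c(0, 0),          Ending.Value = c(30, 40))
Ledger   <- data.frame(Discount_Proxy = c(1, 1),  Ending.Value = c(50, 60))

# Names of the objects to combine (assumed to be known up front)
df_names <- c("Asset", "Register", "Ledger")

# mget() fetches the objects by name and returns a list named after them
list_ls <- mget(df_names, envir = environment())

# Extract the shared column from each element and bind column-wise;
# with plain vectors, cbind() takes the column names from the list names
values <- as.data.frame(
  do.call(cbind, lapply(list_ls, function(x) x$Ending.Value))
)

print(values)
```

If the objects share a naming pattern, ls(pattern = ...) might be used to build df_names instead of typing it out.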
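    【Solution 2】:

    Here's a tidyverse-based idea that may not have existed when this question was asked, but it should scale well. The only place you need to specify the dataset names is when binding the data frames into a list; if you already get a list of data frames back (e.g., from reading files in a directory), you don't even need to do it there. tibble::lst is a wrapper around base list that names the values after their original object names. {{ ds }} := Ending.Value is tidyeval notation for dynamically renaming the Ending.Value column based on the value of the variable ds, which in this case is the name of the dataset.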

    library(dplyr)
    library(xts)
    
    tibble::lst(Asset, Register, Ledger) %>%
      purrr::map(as.data.frame) %>%
      purrr::imap(function(dat, ds) select(dat, {{ ds }} := Ending.Value)) %>%
      bind_cols()
    #>               Asset Register   Ledger
    #> 2019-02-19 63.39797 212.1563 160.6881
    #> 2019-02-20 63.67143 213.7188 161.7900
    #> 2019-02-21 62.95599 211.6355 160.4143
    #> 2019-02-22 63.00784 211.8091 160.6648
    #> 2019-02-25 62.88932 211.6355 160.7789
    #> 2019-02-26 62.09857 207.9896 160.0476
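
For the case mentioned above where the data frames come from files in a directory, the list can be named from the file names, so no dataset name has to be typed out. Here is a rough base R sketch, assuming one CSV file per dataset, each with an Ending.Value column. The temporary directory, file names, and values are made up for illustration.

```r
# Hypothetical setup: write two small CSV files into a fresh temporary directory
dir_path <- tempfile("ending_value_demo")
dir.create(dir_path)
write.csv(data.frame(Ending.Value = c(10, 20)),
          file.path(dir_path, "Asset.csv"), row.names = FALSE)
write.csv(data.frame(Ending.Value = c(30, 40)),
          file.path(dir_path, "Ledger.csv"), row.names = FALSE)

# Read every CSV in the directory into a list
files   <- list.files(dir_path, pattern = "\\.csv$", full.names = TRUE)
list_ls <- lapply(files, read.csv)

# Name each element after its file (without the extension)
names(list_ls) <- tools::file_path_sans_ext(basename(files))

# Extract the shared column; the list names become the column names
values <- as.data.frame(lapply(list_ls, function(x) x$Ending.Value))

print(values)
```

A list built this way could also be passed straight to the purrr::map() step in place of tibble::lst(...).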
    

    【Discussion】:
