【Question Title】: Merging cross-sectional data to panel data without NA rows [duplicate]
【Posted】: 2023-12-31 09:16:01
【Question】:

I have 15 data tables covering 2005 - 2020, like the following:

library(data.table)

DT_2005 = data.table(
  ID = c("1","2","3","4","5","6"),
  year = c("2005","2005","2005","2005","2005","2005"),
  score = c("98","89","101","78","97","86")
)

# Data tables for every year...

DT_2020 = data.table(
  ID = c("1","2","4","6","7","8"),
  year = c("2020","2020","2020","2020","2020","2020"),
  score = c("89","79","110","98","74","88")
)

# DT_2020 output
ID, year, score
1, 2020, 89
2, 2020, 79
4, 2020, 110
6, 2020, 98
7, 2020, 74
8, 2020, 88

That is, some IDs do not appear in some years.

I want to combine the tables into a "long" format like this:

ID, year, score
1, 2005, 98
1, 2006, 95
1, 2007, 97
...
1, 2019, 90
1, 2020, 89
2, 2005, 79
2, 2006, 81
...
2, 2019, 83
2, 2020, 79

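Is there a way to do this in data.table so that the rows are ordered by ID, with years in ascending order within each ID, and with no NA rows for IDs that are absent in a particular year?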

【Comments】:

    Tags: r merge data.table panel-data longitudinal


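    【Solution 1】:

    You can stack all the data tables from the global environment into one combined table and sort the result. Because rbindlist() only appends existing rows, no NA rows are created for IDs that are missing in a given year.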

    library(data.table)
    dt <- rbindlist(mget(paste0('DT_', c(2005:2020))))
    dt <- dt[order(ID)]
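
    One thing to watch for: in the question, ID, year and score are stored as character strings, so sorting on ID is alphabetical ("10" comes before "2"). The following is a minimal sketch, assuming the columns really hold whole numbers stored as text; it uses only two of the yearly tables (values copied from the question) and sorts by ID and then year explicitly. The name `panel` is just an illustrative choice.

```r
library(data.table)

# Two toy tables standing in for the DT_2005 ... DT_2020 series
# (values copied from the question)
DT_2005 <- data.table(
  ID    = c("1", "2", "3", "4", "5", "6"),
  year  = rep("2005", 6),
  score = c("98", "89", "101", "78", "97", "86")
)
DT_2020 <- data.table(
  ID    = c("1", "2", "4", "6", "7", "8"),
  year  = rep("2020", 6),
  score = c("89", "79", "110", "98", "74", "88")
)

# Stack the tables; rows are only appended, so no NA rows appear
panel <- rbindlist(list(DT_2005, DT_2020))

# Assumption: every column holds whole numbers stored as text
cols <- c("ID", "year", "score")
panel[, (cols) := lapply(.SD, as.integer), .SDcols = cols]

# Sort in place by ID, then by year within each ID
setorder(panel, ID, year)
print(panel)
```

    IDs that occur in only one of the tables (3, 5, 7 and 8 here) simply contribute one row each; nothing is padded with NA.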
    

    The equivalent dplyr and base R alternatives are -

    #dplyr
    library(dplyr)
    res <- bind_rows(mget(paste0('DT_', c(2005:2020)))) %>% arrange(ID)
    
    
    #Base R
    res <- do.call(rbind, mget(paste0('DT_', c(2005:2020))))
    res <- res[order(res$ID), ]
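
    All three variants above call mget() on a hard-coded 2005:2020 range, which stops with an error if a table for one of those years does not exist. A hedged alternative, sketched below, collects the tables by name pattern instead. The toy tables, their scores, and the names `tab_names`, `tabs`, `panel2` and `source` are made up for illustration; the only real assumption is that the tables are named DT_ followed by a four-digit year.

```r
library(data.table)

# Toy tables with made-up scores; there is deliberately no DT_2006
DT_2005 <- data.table(ID = c("1", "2"), year = "2005", score = c("98", "89"))
DT_2007 <- data.table(ID = c("2", "3"), year = "2007", score = c("81", "77"))

# Collect every object whose name looks like DT_<four digits>
tab_names <- ls(pattern = "^DT_[0-9]{4}$")
tabs <- mget(tab_names)

# idcol adds a column recording which table each row came from
panel2 <- rbindlist(tabs, use.names = TRUE, idcol = "source")

# Sort by ID, then year (both still character here, so the order is alphabetical)
setorder(panel2, ID, year)
print(panel2)
```

    The extra `source` column is optional, but it makes it easy to check which yearly table each row originated from.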
    

    【Comments】:
