【Question title】: Reshaping a data frame --- changing rows to columns
【Posted】: 2012-07-28 09:13:44
【Question】:

Suppose we have a data frame that looks like this:

set.seed(7302012)

county         <- rep(letters[1:4], each=2)
state          <- rep(LETTERS[1], times=8)
industry       <- rep(c("construction", "manufacturing"), 4)
employment     <- round(rnorm(8, 100, 50), 0)
establishments <- round(rnorm(8, 20, 5), 0)

data <- data.frame(state, county, industry, employment, establishments)

  state county      industry employment establishments
1     A      a  construction        146             19
2     A      a manufacturing        110             20
3     A      b  construction        121             10
4     A      b manufacturing         90             27
5     A      c  construction        197             18
6     A      c manufacturing         73             29
7     A      d  construction         98             30
8     A      d manufacturing        102             19

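We would like to reshape it so that each row represents a (state and) county rather than a county-industry pair, with columns construction.employment, construction.establishments, and the analogous manufacturing versions. What is an efficient way to do this?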

One way is to subset:

construction <- data[data$industry == "construction", ]
names(construction)[4:5] <- c("construction.employment", "construction.establishments")

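and likewise for manufacturing, then merge the two. That is not too bad with only two industries, but imagine there are 14; the process becomes tedious (though a for loop over the levels of industry would make it less so).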
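
For reference, the subset-and-merge idea can be generalized to any number of industries roughly as follows. This is only a sketch: it uses lapply() and Reduce() in place of the for loop mentioned above, the helper names id_cols, value_cols, pieces and wide are made up for illustration, and the data values are typed in from the printed table rather than regenerated from the seed.

```r
# Data copied from the printed table in the question (no random draws needed).
data <- data.frame(
  state          = "A",
  county         = rep(letters[1:4], each = 2),
  industry       = rep(c("construction", "manufacturing"), 4),
  employment     = c(146, 110, 121, 90, 197, 73, 98, 102),
  establishments = c(19, 20, 10, 27, 18, 29, 30, 19)
)

id_cols    <- c("state", "county")            # one output row per combination
value_cols <- c("employment", "establishments")

# Build one renamed subset per industry, e.g. construction.employment.
pieces <- lapply(unique(as.character(data$industry)), function(ind) {
  piece <- data[data$industry == ind, c(id_cols, value_cols)]
  names(piece)[match(value_cols, names(piece))] <- paste(ind, value_cols, sep = ".")
  piece
})

# Merge the subsets on the id columns, two at a time.
wide <- Reduce(function(x, y) merge(x, y, by = id_cols), pieces)
wide
```

This works for 14 industries as well as for 2, but it still performs one merge per industry, which is part of why a dedicated reshaping function is attractive.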

Any other ideas?

【Comments】:

    Tags: r reshape


    【Answer 1】:

    Also using the reshape package:

    library(reshape) 
    m <- reshape::melt(data) 
    cast(m, state + county~...) 
    

    This yields:

    > cast(m, state + county~...) 
      state county construction_employment construction_establishments manufacturing_employment manufacturing_establishments
    1     A      a                     146                          19                      110                           20
    2     A      b                     121                          10                       90                           27
    3     A      c                     197                          18                       73                           29
    4     A      d                      98                          30                      102                           19
    

    I personally use base reshape, so I probably should have shown this with reshape2 (Wickham), but I forgot there was a reshape2 package. It is slightly different:

    library(reshape2) 
    m <- reshape2::melt(data) 
    dcast(m, state + county~...) 
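
To make the role of `...` in the casting formula more concrete, here is a small sketch with reshape2. It assumes reshape2 is installed, names the id columns explicitly rather than letting melt() guess them, and again types the data in from the question's printed table; the names wide, wide_explicit and collapsed are only for illustration.

```r
library(reshape2)

# Data copied from the printed table in the question.
data <- data.frame(
  state          = "A",
  county         = rep(letters[1:4], each = 2),
  industry       = rep(c("construction", "manufacturing"), 4),
  employment     = c(146, 110, 121, 90, 197, 73, 98, 102),
  establishments = c(19, 20, 10, 27, 18, 29, 30, 19)
)

# Long format: one row per state/county/industry/variable.
m <- melt(data, id.vars = c("state", "county", "industry"))

# `...` stands for every variable not named in the formula (here industry
# and variable), so each industry/variable pair becomes its own column.
wide <- dcast(m, state + county ~ ...)

# The same layout with the column side spelled out.
wide_explicit <- dcast(m, state + county ~ industry + variable)

# `.` stands for no variable at all: the four values per county collapse
# into a single column, so an aggregation function is needed.
collapsed <- dcast(m, state + county ~ ., fun.aggregate = length)
```

With `.` on the right-hand side there is nothing to spread into columns, which is why that variant does not produce the wide table.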
    

    【Comments】:

    • Ah, OK, I was using . instead of ..., so it didn't work. Thanks!
    【Answer 2】:

    If I understand your question correctly, this can be done with base R's reshape:

    reshape(data, direction="wide", idvar=c("state", "county"), timevar="industry")
    #   state county employment.construction establishments.construction
    # 1     A      a                     146                          19
    # 3     A      b                     121                          10
    # 5     A      c                     197                          18
    # 7     A      d                      98                          30
    #   employment.manufacturing establishments.manufacturing
    # 1                      110                           20
    # 3                       90                           27
    # 5                       73                           29
    # 7                      102                           19 
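
As a possible follow-up, the column names produced by reshape() put the measure first (employment.construction), whereas the question asked for the industry first (construction.employment). The sketch below swaps the two parts with a regular expression and resets the leftover row names. It assumes that no measure name contains a dot, and it uses the values from the question's printed table instead of the seeded random draws.

```r
# Data copied from the printed table in the question.
data <- data.frame(
  state          = "A",
  county         = rep(letters[1:4], each = 2),
  industry       = rep(c("construction", "manufacturing"), 4),
  employment     = c(146, 110, 121, 90, 197, 73, 98, 102),
  establishments = c(19, 20, 10, 27, 18, 29, 30, 19)
)

wide <- reshape(data, direction = "wide",
                idvar = c("state", "county"), timevar = "industry")

# reshape() names the new columns <measure>.<industry>; swap the two parts
# to get the <industry>.<measure> names asked for in the question.
value_cols <- setdiff(names(wide), c("state", "county"))
names(wide)[match(value_cols, names(wide))] <-
  sub("^([^.]+)\\.(.+)$", "\\2.\\1", value_cols)

# Row names are carried over from the long data (1, 3, 5, 7); reset them.
rownames(wide) <- NULL
wide
```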
    
