【Question title】: Spreading one column to multiple columns in R (spread? reshape?)
【Posted】: 2018-09-26 20:22:50
【Question】:

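I want to eliminate the Station.ID column in the table below by distributing it (spread? reshape?) across the month columns, so that I end up with columns such as Jan_323, Feb_323, ..., Jan_452, Feb_452, and so on.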

 Have:

 Station.ID  Year    Jan   Feb    Mar   Apr  May   Jun
 323         1995    31.3  25.2   19.0   15.0 ..    .. 
 323         1996    28.0  20.2   17.5   14.0 ..    ..
 323         ...     ..    ..     ..     ..   ..    ..
 ...         ...     ..    ..     ..     ..   ..    ..
 452         1995    16.2  ..     ..     ..   ..    ..
 452         1996    14.3  ..     ..     ..   ..    ..
 452         ...     ..    ..     ..     ..   ..    ..
 ..          ...     ..    ..     ..     ..

 Want:

 Year    Jan_323  Feb_323   Mar_323  ...  Jan_452   Feb_452   Mar_452   ...
 1995     31.3     25.2     19.0     ...   16.2       ...       ...     ...
 1996     28.0      ...       ...    ...   14.3       ...       ...     ...
 1997     ...       ...       ...    ...   ...        ...       ...     ...    
 1998     ...       ...       ...    ...   ...        ...       ...     ...
 ....     ...       ...       ...    ...   ...        ...       ...     ...

【Comments on the question】:

  • Hi - could you provide a reproducible dataset? (See *.com/help/mcve.) Thanks!

Tags: r subset reshape spread


【Solution 1】:

We can use spread after first gathering the data into "long" format.

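library(tidyverse)
# Reshape to long format: one row per Station.ID / Year / month
gather(df1, key, value, -Station.ID, -Year) %>% 
      # Paste month and station together, e.g. "Jan_323"
      unite(Station.ID, key, Station.ID) %>% 
      # Fix the level order so spread() keeps the columns in order of appearance
      mutate(Station.ID = factor(Station.ID, levels = unique(Station.ID))) %>%
      # Back to wide format: one column per month/station pair
      spread(Station.ID, value)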
#   Year Jan_323 Jan_452 Feb_323 Feb_452 Mar_323 Mar_452 Apr_323 Apr_452 May_323 May_452 Jun_323 Jun_452
#1 1995    31.3    16.2    25.2    20.1    19.0    31.2    15.0    12.2    12.0    12.5    10.1    22.1
#2 1996    28.0    14.3    20.2    23.1    17.5    34.2    14.0    14.2    10.5    13.5    12.2    19.1
#3 1997    20.2    20.1    22.1    22.1    31.2    32.2    12.2    13.2    12.5    11.5    22.1    18.1

Data

df1 <- structure(list(Station.ID = c(323L, 323L, 323L, 452L, 452L, 452L
), Year = c(1995L, 1996L, 1997L, 1995L, 1996L, 1997L), Jan = c(31.3, 
28, 20.2, 16.2, 14.3, 20.1), Feb = c(25.2, 20.2, 22.1, 20.1, 
23.1, 22.1), Mar = c(19, 17.5, 31.2, 31.2, 34.2, 32.2), Apr = c(15, 
14, 12.2, 12.2, 14.2, 13.2), May = c(12, 10.5, 12.5, 12.5, 13.5, 
11.5), Jun = c(10.1, 12.2, 22.1, 22.1, 19.1, 18.1)), .Names = c("Station.ID", 
"Year", "Jan", "Feb", "Mar", "Apr", "May", "Jun"), class = "data.frame", row.names = c(NA, 
 -6L))
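
For readers looking for a shorter route, the following is a sketch of two alternatives that are not part of the original answer. It assumes tidyr 1.2.0 or later (for `pivot_wider()` with the `names_vary` argument) and uses only the Jan to Mar columns of the sample data to stay compact. Both calls should produce the station-by-station column order shown under "Want" in the question (Jan_323, Feb_323, ..., Jan_452, ...), rather than the month-by-month order of the pipeline above.

```r
library(tidyr)

# Sample data from the answer, restricted to three months for brevity
df1 <- data.frame(
  Station.ID = c(323L, 323L, 323L, 452L, 452L, 452L),
  Year       = c(1995L, 1996L, 1997L, 1995L, 1996L, 1997L),
  Jan        = c(31.3, 28, 20.2, 16.2, 14.3, 20.1),
  Feb        = c(25.2, 20.2, 22.1, 20.1, 23.1, 22.1),
  Mar        = c(19, 17.5, 31.2, 31.2, 34.2, 32.2)
)

# tidyr: one call, new names are built as <month>_<Station.ID>
wide <- pivot_wider(
  df1,
  names_from  = Station.ID,
  values_from = Jan:Mar,
  names_vary  = "slowest"  # group columns by station (tidyr >= 1.2.0)
)

# Base R: reshape() to wide format with "_" as the name separator
wide_base <- reshape(
  df1,
  idvar     = "Year",
  timevar   = "Station.ID",
  direction = "wide",
  sep       = "_"
)

print(names(wide))
print(names(wide_base))
```

Dropping `names_vary = "slowest"` should give the month-by-month order (Jan_323, Jan_452, Feb_323, ...) instead, which matches the output of the gather/spread pipeline.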

【Comments】:

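  • It worked! I was hoping for a simpler solution using just spread, but a few more lines of code won't kill me. Thanks!
  • The mutate line is only there to create the columns in order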
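
The effect of the mutate line can be checked with a small experiment. This is a sketch on a cut-down, two-row version of the sample data (an assumption made for brevity): with a plain character key, `spread()` is expected to order the new columns alphabetically, while a factor key whose levels follow the order of appearance keeps the calendar order.

```r
library(dplyr)
library(tidyr)

# Cut-down sample: two stations, one year, two months
df_small <- data.frame(
  Station.ID = c(323L, 452L),
  Year       = c(1995L, 1995L),
  Jan        = c(31.3, 16.2),
  Feb        = c(25.2, 20.1)
)

# Long format with the combined month/station key, e.g. "Jan_323"
long <- gather(df_small, key, value, -Station.ID, -Year) %>%
  unite(Station.ID, key, Station.ID)

# Without mutate: the key is a character vector
cols_plain <- names(spread(long, Station.ID, value))

# With mutate: the key is a factor with levels in order of appearance
cols_factor <- long %>%
  mutate(Station.ID = factor(Station.ID, levels = unique(Station.ID))) %>%
  spread(Station.ID, value) %>%
  names()

print(cols_plain)
print(cols_factor)
```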