【Question title】: Making long data wide and collapsing rows [duplicate]
【Posted】: 2021-02-12 01:18:31
【Question】:

I am trying to convert a data frame from long format to wide format. At the moment df is set up as follows:

dput(head(df,10))
structure(list(TECH_ID = c("14050154", "14050154", "13835650", 
"13835650", "13469601", "13469601", "13782883", "13782883", "12342837", 
"12342837"), MNSCU_QUES = c("What language did you learn to speak first?", 
"Which language do you speak most often at home?", "What language did you learn to speak first?", 
"Which language do you speak most often at home?", "What language did you learn to speak first?", 
"Which language do you speak most often at home?", "What language did you learn to speak first?", 
"Which language do you speak most often at home?", "What language did you learn to speak first?", 
"Which language do you speak most often at home?"), MNSCU_RESP = c("English and another language", 
"Another", "English only", "English", "English only", "English", 
"English and another language", "English", "English only", "English"
)), row.names = c(NA, 10L), class = "data.frame")

I am trying to set up the data frame so that it looks like this:

I have been trying this code:

df_wide <- dcast(df, TECH_ID+MNSCU_RESP~MNSCU_QUES)

But the resulting data frame looks like this:

Code:

dput(head(df_wide,10))
structure(list(TECH_ID = c("00007179", "00007179", "00008201", 
"00008201", "00020900", "00020900", "00021757", "00021757", "00031227", 
"00031227"), MNSCU_RESP = c("English", "English only", "English", 
"English only", "English", "English only", "English", "English only", 
"English", "English only"), `What language did you learn to speak first?` = c(0L, 
1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L), `Which language do you speak most often at home?` = c(1L, 
0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L)), row.names = c(NA, 10L), class = "data.frame")

【Comments】:

  • tidyr::pivot_wider(df, names_from = MNSCU_QUES, values_from = MNSCU_RESP)
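  • @RonakShah I get the following error: Warning message: Values in `MNSCU_RESP` are not uniquely identified; output will contain list-cols. * Use `values_fn = list(MNSCU_RESP = list)` to suppress this warning. * Use `values_fn = list(MNSCU_RESP = length)` to identify where the duplicates arise * Use `values_fn = list(MNSCU_RESP = summary_fun)` to summarise duplicates
  • That does not happen with the data you shared, but you can try this answer: stackoverflow.com/questions/58837773/…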
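
The warning quoted above suggests that, in the full data, at least one TECH_ID has more than one row for the same question. The following is a minimal sketch of how one might locate those rows and then pivot anyway. The `toy` data frame, its short question labels `Q1`/`Q2`, and the `"; "` separator are made up for illustration; they are not from the asker's data.

```r
library(tidyr)

# Hypothetical toy data: ID "A" answers question "Q1" twice.
toy <- data.frame(
  TECH_ID    = c("A", "A", "A", "B", "B"),
  MNSCU_QUES = c("Q1", "Q1", "Q2", "Q1", "Q2"),
  MNSCU_RESP = c("English only", "English only", "English",
                 "English only", "Another"),
  stringsAsFactors = FALSE
)

# Step 1: list every row whose ID/question pair occurs more than once.
key   <- toy[, c("TECH_ID", "MNSCU_QUES")]
dupes <- toy[duplicated(key) | duplicated(key, fromLast = TRUE), ]

# Step 2a: if the repeats are exact copies, drop them before pivoting.
wide_unique <- pivot_wider(
  unique(toy),
  names_from  = MNSCU_QUES,
  values_from = MNSCU_RESP
)

# Step 2b: if the repeats may differ, collapse them into a single string.
wide_collapsed <- pivot_wider(
  toy,
  names_from  = MNSCU_QUES,
  values_from = MNSCU_RESP,
  values_fn   = list(MNSCU_RESP = function(x) paste(unique(x), collapse = "; "))
)
```

Inspecting `dupes` first is probably worthwhile: whether to drop, collapse, or keep only one of the repeated answers depends on why they are in the data.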

Tags: r dataframe dplyr tidyr plyr


【Solution 1】:
library(reshape2)

df <- structure(list(
  TECH_ID = c("14050154", "14050154", "13835650", "13835650", "13469601",
              "13469601", "13782883", "13782883", "12342837", "12342837"),
  MNSCU_QUES = c("What language did you learn to speak first?",
                 "Which language do you speak most often at home?",
                 "What language did you learn to speak first?",
                 "Which language do you speak most often at home?",
                 "What language did you learn to speak first?",
                 "Which language do you speak most often at home?",
                 "What language did you learn to speak first?",
                 "Which language do you speak most often at home?",
                 "What language did you learn to speak first?",
                 "Which language do you speak most often at home?"),
  MNSCU_RESP = c("English and another language", "Another",
                 "English only", "English",
                 "English only", "English",
                 "English and another language", "English",
                 "English only", "English")
), row.names = c(NA, 10L), class = "data.frame")

df_wide <- reshape2::dcast(df, TECH_ID~MNSCU_QUES, value.var = "MNSCU_RESP")

> df_wide
   TECH_ID What language did you learn to speak first? Which language do you speak most often at home?
1 12342837                                English only                                         English
2 13469601                                English only                                         English
3 13782883                English and another language                                         English
4 13835650                                English only                                         English
5 14050154                English and another language                                         Another

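【Comments】:

  • I get the following error: Aggregation function missing: defaulting to length, and the output looks like this: TECH_ID What language did you learn to speak first? Which language do you speak most often at home? 7179, 1,1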
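
The "Aggregation function missing: defaulting to length" message usually appears when some ID/question pair has more than one row, so `dcast` counts the rows instead of returning the text. Below is a minimal sketch of one possible workaround: passing an explicit `fun.aggregate`. The `toy` data frame and the `collapse_unique` helper are hypothetical, and joining answers with `"; "` is only one of several reasonable choices.

```r
library(reshape2)

# Hypothetical toy data: ID "A" has two different answers for question "Q1".
toy <- data.frame(
  TECH_ID    = c("A", "A", "A", "B", "B"),
  MNSCU_QUES = c("Q1", "Q1", "Q2", "Q1", "Q2"),
  MNSCU_RESP = c("English only", "English and another language",
                 "English", "English only", "Another"),
  stringsAsFactors = FALSE
)

# Without fun.aggregate, dcast falls back to length() and returns counts.
counts <- dcast(toy, TECH_ID ~ MNSCU_QUES, value.var = "MNSCU_RESP")

# Hypothetical helper: join the distinct answers into a single string.
collapse_unique <- function(x) paste(unique(x), collapse = "; ")

# With an explicit aggregation function, the cells hold text again.
wide <- dcast(toy, TECH_ID ~ MNSCU_QUES,
              value.var     = "MNSCU_RESP",
              fun.aggregate = collapse_unique)
```

Cells of `counts` greater than 1 mark the ID/question pairs that are duplicated, which can help decide whether collapsing is the right fix or whether the repeated rows should be cleaned up first.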