【Posted】: 2017-07-28 11:15:01
【Question】:
library(tidyr)
library(dplyr)
library(tidyverse)
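Below is the code for a simple data frame. I have some messy exported data in which the categories of each factor are spread across separate columns.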
Client<-c("Client1","Client2","Client3","Client4","Client5")
Sex_M<-c("Male","NA","Male","NA","Male")
Sex_F<-c(" ","Female"," ","Female"," ")
Satisfaction_Satisfied<-c("Satisfied"," "," ","Satisfied","Satisfied")
Satisfaction_VerySatisfied<-c(" ","VerySatisfied","VerySatisfied"," "," ")
CommunicationType_Email<-c("Email"," "," ","Email","Email")
CommunicationType_Phone<-c(" ","Phone ","Phone "," "," ")
DF<-tibble(Client,Sex_M,Sex_F,Satisfaction_Satisfied,Satisfaction_VerySatisfied,CommunicationType_Email,CommunicationType_Phone)
I want to use tidyr's `unite` to recombine the categories into a single column.
DF<-DF%>%unite(Sat,Satisfaction_Satisfied,Satisfaction_VerySatisfied,sep=" ")%>%
unite(Sex,Sex_M,Sex_F,sep=" ")
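However, I have to write a separate `unite` call for each group, which I feel violates the rule of three, so there must be an easier way, especially since my real data contains dozens of columns that need to be combined. Is there a way to call `unite` once but somehow refer to matching column names, so that all columns with similar names (for example, "Sex_M" and "Sex_F" both contain "Sex", and "CommunicationType_Email" and "CommunicationType_Phone" both contain "CommunicationType") are combined as in the code above?
I was also thinking of a function that lets me pass in the column names, but that is too difficult for me because it involves complex standard evaluation.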
【Discussion】:
DF %>% unite(Sat, contains("Sat"))?
DF %>% unite(Sat, matches("^Sat"))
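The two suggestions above each handle one group of columns. One possible way to cover every group in a single pass is sketched below. It assumes the text before the first underscore in a column name identifies the group, and it uses `purrr::reduce` to apply `unite` once per prefix. The names `prefixes` and `united` and the shortened two-row data frame are illustrative, not taken from the question.

```r
library(tidyr)
library(dplyr)
library(purrr)

# Shortened, hypothetical version of the question's data
DF <- tibble(
  Client = c("Client1", "Client2"),
  Sex_M = c("Male", " "),
  Sex_F = c(" ", "Female"),
  Satisfaction_Satisfied = c("Satisfied", " "),
  Satisfaction_VerySatisfied = c(" ", "VerySatisfied")
)

# Group prefixes: the part of each column name before the first underscore
# (assumption: every column that contains "_" belongs to a group to unite)
prefixes <- unique(sub("_.*$", "", grep("_", names(DF), value = TRUE)))

# Apply unite() once per prefix, carrying the data frame along;
# !!p injects the string as the name of the new column
united <- reduce(
  prefixes,
  function(df, p) unite(df, !!p, starts_with(paste0(p, "_")), sep = " "),
  .init = DF
)

# Remove the padding left behind by the blank placeholder cells
united[prefixes] <- lapply(united[prefixes], trimws)
```

If the real column names use a different separator, or a group name itself contains underscores, the `sub()` pattern would need adjusting. Cells that hold the literal string "NA", as in `Sex_M` in the question, would also have to be cleaned separately.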