【Posted】:2017-05-04 03:43:46
【Question】:
library(dplyr)  # devel version, soon-to-be released 0.6
library(tidyr)
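Below is a simple dataset. The Q1Sat-Q3Sat variables are satisfaction ratings, and the Q1Used-Q3Used variables record whether the respondent has used the item they are rating. The two questions were asked together in the survey. The real dataset contains at least 50 Sat variables and 50 Used variables.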
Q1Sat<-c("Neutral","Neutral","VSat","Sat","Neutral","Sat","VDis","Sat","Sat","VSat")
Q2Sat<-c("Neutral","VSat","Dis","Dis","VDis","Sat","Sat","VSat","Neutral","Dis")
Q3Sat<-c("Sat","Sat","Diss","Neutral","VSat","VDis","Sat","Sat","Sat","Neutral")
Q3Used<-c("Yes","No","Yes","Yes","Yes","Yes","Yes","Yes","Yes","No")
Q2Used<-c("Yes","Yes","Yes","Yes","No","No","Yes","Yes","Yes","Yes")
Q1Used<-c("Yes","Yes","Yes","No","No","Yes","Yes","Yes","No","Yes")
House<-c("Yes","No","Unsure","Yes","Yes","No","Unsure","Unsure","Yes","Yes")
Test<-data_frame(Q1Sat,Q2Sat,Q3Sat,Q1Used,Q2Used,Q3Used,House)
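I'd like to restructure the data into a table of percentages using the code below. However, I need to filter so that the Q1Used-Q3Used variables only include "Yes" and the House variable only includes "Yes". As mentioned, Q1Sat is tied to Q1Used, so Q1Sat should only be included when Q1Used is "Yes" and House is "Yes". I need to do the same for Q2Sat and Q3Sat.
However, I don't know how to achieve this. I tried the scoped filters in the development version of dplyr, but I'm not sure how to use them with several variables at once: Q1Used:Q3Used as well as House.
So how do I add a filter for House != "Yes" to the scoped filter in the code below?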
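Test%>%
  # keep rows where every Used column is "Yes" (the sample data stores "Yes"/"No", and only has Q1-Q3)
  filter_at(vars(Q1Used:Q3Used),all_vars(. == "Yes"))%>%
  select(Q1Sat:Q3Sat)%>%
  gather()%>%
  count(key,value)%>%
  mutate(perc=round(n/sum(n),2))%>%
  select(-n)%>%
  spread(value,perc)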
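One possible way to combine the scoped filter with the House condition, offered as a sketch rather than a definitive answer: either chain an ordinary `filter()` after `filter_at()`, or, when the condition is the same for every column, list House inside `vars()`. The names `kept`, `kept_alt` and `result` are made up for illustration, the data is rebuilt with `tibble()` so the block runs on its own, and the explicit `group_by(key)` is an assumption that the percentages are meant to be computed within each question.

```r
library(dplyr)
library(tidyr)

# Sample data copied from the question (values unchanged).
Test <- tibble(
  Q1Sat  = c("Neutral","Neutral","VSat","Sat","Neutral","Sat","VDis","Sat","Sat","VSat"),
  Q2Sat  = c("Neutral","VSat","Dis","Dis","VDis","Sat","Sat","VSat","Neutral","Dis"),
  Q3Sat  = c("Sat","Sat","Diss","Neutral","VSat","VDis","Sat","Sat","Sat","Neutral"),
  Q1Used = c("Yes","Yes","Yes","No","No","Yes","Yes","Yes","No","Yes"),
  Q2Used = c("Yes","Yes","Yes","Yes","No","No","Yes","Yes","Yes","Yes"),
  Q3Used = c("Yes","No","Yes","Yes","Yes","Yes","Yes","Yes","Yes","No"),
  House  = c("Yes","No","Unsure","Yes","Yes","No","Unsure","Unsure","Yes","Yes")
)

# Option 1: scoped filter on the Used columns, then a plain filter on House.
kept <- Test %>%
  filter_at(vars(Q1Used:Q3Used), all_vars(. == "Yes")) %>%
  filter(House == "Yes")

# Option 2: same condition for every column, so House can simply join vars().
kept_alt <- Test %>%
  filter_at(vars(Q1Used:Q3Used, House), all_vars(. == "Yes"))

# Reshape the surviving rows into a table of percentages per question.
result <- kept %>%
  select(Q1Sat:Q3Sat) %>%
  gather() %>%
  count(key, value) %>%
  group_by(key) %>%                       # assumption: percentages within each question
  mutate(perc = round(n / sum(n), 2)) %>%
  ungroup() %>%
  select(-n) %>%
  spread(value, perc)
```

Note that `all_vars()` drops a whole row as soon as any one Used column is not "Yes", which is stricter than the per-question pairing described in the question text.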
【Discussion】:
-
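If you select only the "Sat" variables, how would you get the "Used" variables for filter? Also, given your conditions (Q1Used - Q3Used variables to only include "Yes", and the House variable to only include "No"), there will be 0 rows after filtering, because no row satisfies them.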
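I guess I should include the "Used" variables in the select then... That is part of the problem too; I was just hoping to find a simpler way to write the code above with pipes and the tidyverse. As for no rows meeting the conditions, I changed the "House" variable from No to Yes. It doesn't really matter; this is more about learning how to use scoped filters with different kinds of variables together...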
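A minimal sketch of what keeping the "Used" variables through the reshape could look like, assuming the goal is the per-question pairing from the question (each QnSat counted only when its own QnUsed is "Yes" and House is "Yes"). The helper columns `id`, `question` and `type`, and the objects `paired` and `result`, are hypothetical names introduced here, and the regex assumes every column follows the `Q<number>Sat` / `Q<number>Used` naming pattern.

```r
library(dplyr)
library(tidyr)

# Sample data copied from the question (values unchanged).
Test <- tibble(
  Q1Sat  = c("Neutral","Neutral","VSat","Sat","Neutral","Sat","VDis","Sat","Sat","VSat"),
  Q2Sat  = c("Neutral","VSat","Dis","Dis","VDis","Sat","Sat","VSat","Neutral","Dis"),
  Q3Sat  = c("Sat","Sat","Diss","Neutral","VSat","VDis","Sat","Sat","Sat","Neutral"),
  Q1Used = c("Yes","Yes","Yes","No","No","Yes","Yes","Yes","No","Yes"),
  Q2Used = c("Yes","Yes","Yes","Yes","No","No","Yes","Yes","Yes","Yes"),
  Q3Used = c("Yes","No","Yes","Yes","Yes","Yes","Yes","Yes","Yes","No"),
  House  = c("Yes","No","Unsure","Yes","Yes","No","Unsure","Unsure","Yes","Yes")
)

paired <- Test %>%
  filter(House == "Yes") %>%                  # House applies to the whole row
  mutate(id = row_number()) %>%               # respondent id so rows can be re-paired
  gather(key, value, -id, -House) %>%         # long format: one row per respondent/column
  extract(key, into = c("question", "type"),
          regex = "^(Q[0-9]+)(Sat|Used)$") %>% # split "Q1Sat" into "Q1" and "Sat"
  spread(type, value) %>%                     # one row per respondent/question: Sat, Used
  filter(Used == "Yes")                       # keep a rating only if that item was used

# Percentage of each satisfaction level within each question.
result <- paired %>%
  count(question, Sat) %>%
  group_by(question) %>%
  mutate(perc = round(n / sum(n), 2)) %>%
  ungroup() %>%
  select(-n) %>%
  spread(Sat, perc)
```

Because the filter runs on the long data, a respondent who used only some items still contributes ratings for those items, so the same pipeline should carry over to 50 question pairs without listing columns by hand.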
-
I edited the code... Should it be better now?