[Posted]: 2020-03-16 00:48:10
[Question]:
data1=data.frame("group1"=c(1,1,1,1,2,2,2,2,3,3,3,3,1,1,1,1,2,2,2,2,3,3,3,3),
"group2"=c(1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2),
"var1"=c(1,0,0,1,0,0,0,1,1,1,1,1,0,0,1,0,0,1,0,1,1,0,0,0),
"var2"=c(1,0,1,1,0,0,1,0,0,0,0,0,0,0,1,1,0,0,1,0,1,0,0,1),
"var3"=c(1,1,4,3,3,1,1,2,4,1,4,4,4,2,1,2,1,2,2,2,3,1,2,4))
data2=data.frame("group1"=rep(c(rep(1:3,2)),2),
"group2"=rep(c(rep(1:2,3))),
"var1"=sort(rep(0:1,6)),
"svar1" = c(2,2,0,3,3,3,1,2,4,1,1,1),
"var2"=sort(rep(0:1,6)),
"svar2" = c(rep(NA,12)))
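I have 'data1' and want to produce 'data2'. The idea is to collapse the actual counts of 'var1' and 'var2' into 'svar1' and 'svar2' in 'data2'.
To create 'svar1', we go through every combination of 'group1' and 'group2' in 'data1' and store how many times each of '0' and '1' occurs; these are the response options of 'var1'. I want to do the same for 'var2' to produce 'svar2'.
Because the real data is large, I would prefer a data.table solution. 'var3' can be ignored for now.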
[Discussion]:
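One way the collapse described in the question might be sketched with data.table is shown below. It is only a sketch under some assumptions: the response options are exactly 0 and 1, combinations that never occur should appear with a count of 0, and the helper names `grid`, `counts1`, `counts2`, `res` and the temporary column `value` are made up for illustration. 'var3' is left out, as the question allows.

```r
library(data.table)

# data1 as posted in the question (var3 omitted)
dt <- data.table(
  group1 = c(1,1,1,1,2,2,2,2,3,3,3,3,1,1,1,1,2,2,2,2,3,3,3,3),
  group2 = c(1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2),
  var1   = c(1,0,0,1,0,0,0,1,1,1,1,1,0,0,1,0,0,1,0,1,1,0,0,0),
  var2   = c(1,0,1,1,0,0,1,0,0,0,0,0,0,0,1,1,0,0,1,0,1,0,0,1)
)

# Every group1 x group2 x response-option combination, so that
# combinations absent from the data can still be reported with count 0.
# (Assumption: the response options are 0 and 1.)
grid <- CJ(group1 = unique(dt$group1),
           group2 = unique(dt$group2),
           value  = c(0, 1))

# Row counts per group combination and response value, one table per variable
counts1 <- dt[, .(svar1 = .N), by = .(group1, group2, value = var1)]
counts2 <- dt[, .(svar2 = .N), by = .(group1, group2, value = var2)]

# Left-join both count tables onto the full grid
keys <- c("group1", "group2", "value")
res <- merge(grid, counts1, by = keys, all.x = TRUE)
res <- merge(res,  counts2, by = keys, all.x = TRUE)

# Combinations with no rows come back as NA after the join; make them 0
res[is.na(svar1), svar1 := 0L]
res[is.na(svar2), svar2 := 0L]

# Mirror the column layout of data2 (var1 and var2 hold the same 0/1 value)
res[, `:=`(var1 = value, var2 = value)]
res <- res[, .(group1, group2, var1, svar1, var2, svar2)]
print(res)
```

The rows come out sorted by group1, group2 and the response value, so the order differs from the posted 'data2'. The counts are taken directly from 'data1'; for group1 = 1, group2 = 1, var1 = 1 the posted 'svar1' lists 1, whereas 'data1' contains two such rows, so the sketch returns 2 for that cell.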
Tags: r data.table