【Posted】: 2015-06-30 11:58:19
【Question】:
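I have several tables with the same structure (two, specifically). I want to join them on ID_Position and ID_Name and produce the sums of jan and feb in the output table (both columns may contain some NAs).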
ID_Position<-c(1,2,3,4,5,6,7,8,9,10)
Position<-c("A","B","C","D","E","H","I","J","X","W")
ID_Name<-c(11,12,13,14,15,16,17,18,19,20)
Name<-c("Michael","Tobi","Chris","Hans","Likas","Martin","Seba","Li","Sha","Susi")
jan<-c(10,20,30,22,23,2,22,24,26,28)
feb<-c(10,30,20,12,NA,3,NA,22,24,26)
df1 <- data.frame(ID_Position,Position,ID_Name,Name,jan,feb)
ID_Position<-c(1,2,3,4,5,6,7,8,9,10)
Position<-c("A","B","C","D","E","H","I","J","X","W")
ID_Name<-c(11,12,13,14,15,16,17,18,19,20)
Name<-c("Michael","Tobi","Chris","Hans","Likas","Martin","Seba","Li","Sha","Susi")
jan<-c(10,20,30,22,NA,NA,22,24,26,28)
feb<-c(10,30,20,12,23,3,3,22,24,26)
df2 <- data.frame(ID_Position,Position,ID_Name,Name,jan,feb)
I tried an inner join and a full join, but neither seems to do what I want:
library(plyr)
test<-join(df1, df2, by =c("ID_Position","ID_Name") , type = "inner", match = "all")
Desired output:
ID_Position Position ID_Name Name jan feb
1 A 11 Michael 20 20
2 B 12 Tobi 40 60
3 C 13 Chris 60 40
4 D 14 Hans 44 24
5 E 15 Likas 23 23
6 H 16 Martin 2 6
7 I 17 Seba 44 22
8 J 18 Li 48 44
9 X 19 Sha 52 48
10 W 20 Susi 56 52
【Comments】:
-
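So do you want an inner join or a full join? Also, your data sets are identical. Can you provide your desired output? For example, does the following work?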
library(data.table); setkey(setDT(df1), ID_Position, ID_Name); setkey(setDT(df2), ID_Position, ID_Name); df2[df1, .(jan = sum(jan, i.jan, na.rm = TRUE), feb = sum(feb, i.feb, na.rm = TRUE)), by = .EACHI]
-
Your data set has no information in feb for row six.
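For reference, here is a minimal base R sketch of one way to get per-row sums across the two tables. It is not the approach from the question (plyr) or from the comment above (data.table); it swaps in `merge()` plus `rowSums()`. The names `keys`, `merged`, and `result` and the suffixes `.1`/`.2` are chosen here for illustration, and the sketch assumes that all four identifying columns match between the tables and that an NA should count as 0 in the sum.

```r
# Rebuild the two example tables from the question.
df1 <- data.frame(
  ID_Position = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  Position    = c("A", "B", "C", "D", "E", "H", "I", "J", "X", "W"),
  ID_Name     = c(11, 12, 13, 14, 15, 16, 17, 18, 19, 20),
  Name        = c("Michael", "Tobi", "Chris", "Hans", "Likas",
                  "Martin", "Seba", "Li", "Sha", "Susi"),
  jan         = c(10, 20, 30, 22, 23, 2, 22, 24, 26, 28),
  feb         = c(10, 30, 20, 12, NA, 3, NA, 22, 24, 26)
)
df2 <- df1
df2$jan <- c(10, 20, 30, 22, NA, NA, 22, 24, 26, 28)
df2$feb <- c(10, 30, 20, 12, 23, 3, 3, 22, 24, 26)

# Assumption: rows are identified by all four descriptive columns.
keys <- c("ID_Position", "Position", "ID_Name", "Name")

# Inner join; the month columns get the suffixes .1 (df1) and .2 (df2).
merged <- merge(df1, df2, by = keys, suffixes = c(".1", ".2"))

# Sum each month across the two tables, treating NA as 0.
result <- merged[keys]
result$jan <- rowSums(merged[c("jan.1", "jan.2")], na.rm = TRUE)
result$feb <- rowSums(merged[c("feb.1", "feb.2")], na.rm = TRUE)

# Sort explicitly by ID_Position rather than relying on merge()'s ordering.
result <- result[order(result$ID_Position), ]
rownames(result) <- NULL
print(result)
```

Passing `all = TRUE` to `merge()` would turn this into a full join, which only matters when a key appears in just one of the tables. Note that with `na.rm = TRUE` a month that is NA in both tables sums to 0 rather than NA. With the data exactly as posted, the feb value for ID_Position 7 comes out as 3 (NA + 3), not the 22 shown in the desired output.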
Tags: r