【Posted】: 2019-04-03 14:32:15
【Question】:
This follows up on my previous question, How do I return multiple columns without consider Na values and group by other columns name in R?
# Load the packages first: as.data.table() and dcast() come from data.table, %>% from magrittr
library(data.table)
library(magrittr)

Mexico_01 <- c(1, 2, 5, 1, NA, 1)
Mexico_02 <- c(3, NA, 2, 0, 4, 1)
Argentina_01 <- c(2, 1, 5, 2, NA, 2)
Argentina_02 <- c(2, 3, NA, 2, 2, 2)
Italy <- c(NA, 10, 10, 10, NA, 10)
Spain_01 <- c(2, NA, 4, 6, 8, 11)
Spain_02 <- c(3, 4, NA, 11, 11, 11)
England <- c(5, NA, 10, NA, NA, 12)
Germany <- c(1, NA, NA, NA, NA, 10)
Data_Risk <- data.frame(Mexico_01, Mexico_02, Argentina_01, Argentina_02,
                        Italy, Spain_01, Spain_02, England, Germany)
Data_Risk <- as.data.table(Data_Risk)

# Row/column positions of every non-NA cell
all_variable <- as.data.table(which(!is.na(Data_Risk), arr.ind = TRUE))

# For each row, list the names of its non-NA columns, then reshape to wide format
all_variable[, .(colnm = names(Data_Risk)[col],
                 col = paste0('var', order(col))), by = row] %>%
  dcast(row ~ col, value.var = 'colnm')
This gives:
   row var1      var2         var3         var4         var5     var6     var7     var8    var9
1: 1   Mexico_01 Mexico_02    Argentina_01 Argentina_02 Spain_01 Spain_02 England  Germany <NA>
2: 2   Mexico_01 Argentina_01 Argentina_02 Italy        Spain_02 <NA>     <NA>     <NA>    <NA>
3: 3   Mexico_01 Mexico_02    Argentina_01 Italy        Spain_01 England  <NA>     <NA>    <NA>
4: 4   Mexico_01 Mexico_02    Argentina_01 Argentina_02 Italy    Spain_01 Spain_02 <NA>    <NA>
5: 5   Mexico_02 Argentina_02 Spain_01     Spain_02     <NA>     <NA>     <NA>     <NA>    <NA>
6: 6   Mexico_01 Mexico_02    Argentina_01 Argentina_02 Italy    Spain_01 Spain_02 England Germany
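In this case, I need to count only a single variable for all variables that share the same prefix. For example, instead of mexico_01 or mexico_02, pick just mexico.
So the final table must look like this: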
var1 var2 var3 var4 var5 var6
mexico argentina england germany null null
mexico argentina italy null null null
mexico argentina italy spain england null
mexico argentina italy spain null null
spain null null null null null
mexico argentina italy spain england germany
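
The prefix collapse described above could be sketched roughly as follows. This is only a minimal illustration on a two-row subset of the data, and it assumes that a suffix is always an underscore followed by digits at the end of the column name. The names `prefix`, `pos`, `country` and `result` are made up for this sketch.

```r
library(data.table)

# Small subset of the question's data, for illustration only
Data_Risk <- data.table(
  Mexico_01 = c(1, NA),
  Mexico_02 = c(3, NA),
  Italy     = c(NA, 10),
  Spain_01  = c(2, NA),
  Spain_02  = c(3, 4)
)

# Assumption: a suffix is "_" followed by digits at the end of the name
prefix <- tolower(sub("_[0-9]+$", "", names(Data_Risk)))

# Row/column positions of every non-NA cell
pos <- as.data.table(which(!is.na(as.matrix(Data_Risk)), arr.ind = TRUE))
pos[, country := prefix[col]]

# Keep each prefix once per row, preserving the original column order
pos <- unique(pos[order(row, col)], by = c("row", "country"))
pos[, var := paste0("var", seq_len(.N)), by = row]

# One row per original row, one column per retained prefix
result <- dcast(pos, row ~ var, value.var = "country")
print(result)
```

Empty slots come out as `NA` rather than the literal `null`, and a row whose values are all `NA` would not appear in `result` at all, so it may need to be merged back in separately.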
【Comments】: