[Posted]: 2015-05-08 03:13:23
[Question]:
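Using the example below, I would like to know whether there is a more efficient package or function for conditionally counting and tabulating matching string elements, for example with the data.table package, the dplyr package, or a function such as lapply()?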
produce = c("apple", "blueberry", "blueberry", "corn",
"horseradish", "rutabega", "rutabega", "tomato") # Long list
veggies = c("carrot", "corn", "horseradish", "rutabega") # Short list
# Zero-filled matrix: one row per unique veggie, one column per unique produce item
basket = matrix(0, nrow = length(unique(veggies)),
                ncol = length(unique(produce)))
rownames(basket) <- unique(veggies)
colnames(basket) <- unique(produce)
basket
Output:
#             apple blueberry corn horseradish rutabega tomato
# carrot          0         0    0           0        0      0
# corn            0         0    0           0        0      0
# horseradish     0         0    0           0        0      0
# rutabega        0         0    0           0        0      0
Finding the counts of the shared instances:
for (i in seq_along(veggies)) {
  for (j in seq_along(produce)) {
    if (veggies[i] == produce[j]) {
      # Increment the cell in row i under the column named after the matched item
      col <- which(colnames(basket) == produce[j])
      basket[i, col] <- basket[i, col] + 1
    }
  }
}
basket
The result I am looking for, ideally via a faster or more elegant approach:
#             apple blueberry corn horseradish rutabega tomato
# carrot          0         0    0           0        0      0
# corn            0         0    1           0        0      0
# horseradish     0         0    0           1        0      0
# rutabega        0         0    0           0        2      0
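For comparison, here is a minimal base-R sketch of one vectorized way to build the same matrix without the nested loop. It is not from the original post: the helper names `cols` and `counts` are introduced only for illustration, and it assumes `veggies` contains no duplicates (as in the example).

```r
produce <- c("apple", "blueberry", "blueberry", "corn",
             "horseradish", "rutabega", "rutabega", "tomato")
veggies <- c("carrot", "corn", "horseradish", "rutabega")

# Column labels, in order of first appearance (hypothetical helper name)
cols <- unique(produce)

# How many times each column label occurs in produce
counts <- as.vector(table(factor(produce, levels = cols)))

# outer() marks the cells where a veggie name equals a column name;
# multiplying by the per-column counts fills those cells with the tally
basket <- outer(veggies, cols, "==") * rep(counts, each = length(veggies))
dimnames(basket) <- list(veggies, cols)
basket
```

This replaces the element-by-element comparison with a single call to `table()` and a single call to `outer()`. Whether it is faster in practice on long vectors would need to be measured.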
[Comments]:
-
This may be a good reference for the question.
Tags: r data.table dplyr lapply