【Posted】: 2015-03-04 00:34:24
【Question】:
Here is some dummy data:
class<-c("ab","ab","ad","ab","ab","ad","ab","ab","ad","ab","ad","ab","av")
otu<-c("ab","ac","ad","ab","ac","ad","ab","ac","ad","ab","ad","ac","av")
value<-c(0,1,12,13,300,1,2,3,4,0,0,2,4)
type<-c("b","c","d","a","b","c","d","d","d","c","b","a","a")
location<-c("b","c","d","a","b","d","d","d","d","c","b","a","a")
datafr1<-data.frame(class,otu,value,type,location)
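If any replicate within a "location" and "type" group has a value of 0, I want to remove that OTU, because I am only interested in OTUs that are shared across all replicates within a group.
I want to calculate two things. One: the percentage abundance of "value" for all OTUs shared within each "location" and "type" group (abundance). Two: the number of shared OTUs in each class (otu.freq).
Note that I want the OTUs grouped by "class", not by OTU name (the name itself carries no meaning).
Expected output: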
class location type abundance otu.freq
ab a a 79 2
av a a 21 1
ab b b 100 1
ab c c 100 1
ad d c 100 1
ab d d 24 2
ad d d 76 2
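For reference, the rule described above can be written out as a short dplyr pipeline. This is only a sketch under two assumptions that are read off the expected table rather than stated outright: an OTU counts as "shared" in a location/type group when none of its rows in that group has value 0, and otu.freq counts the rows that remain for each class. The names `shared` and `group_total` are made up for illustration, and memory use on a much larger data frame has not been checked.

```r
library(dplyr)

# Dummy data from the question
class    <- c("ab","ab","ad","ab","ab","ad","ab","ab","ad","ab","ad","ab","av")
otu      <- c("ab","ac","ad","ab","ac","ad","ab","ac","ad","ab","ad","ac","av")
value    <- c(0,1,12,13,300,1,2,3,4,0,0,2,4)
type     <- c("b","c","d","a","b","c","d","d","d","c","b","a","a")
location <- c("b","c","d","a","b","d","d","d","d","c","b","a","a")
datafr1  <- data.frame(class, otu, value, type, location)

shared <- datafr1 %>%
  # Assumption: drop an OTU from a location/type group if any of its rows there is 0
  group_by(location, type, otu) %>%
  filter(all(value != 0)) %>%
  # Total value of the remaining rows in each location/type group
  group_by(location, type) %>%
  mutate(group_total = sum(value)) %>%
  # Summarise per class within each location/type group
  group_by(class, location, type) %>%
  summarise(abundance = round(sum(value) / first(group_total) * 100),
            otu.freq  = n()) %>%   # assumption: count remaining rows per class
  ungroup()

print(shared)
```

On the dummy data this gives the same seven rows as the expected output, sorted by class, location and type rather than in the order shown above.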
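I have a much larger data frame and tried the suggestions here using dplyr, but I ran out of RAM, so I don't know whether they work.
The solution @Akron provided below does not count occurrences with an abundance of 0, but it does not remove that OTU from the other replicates within the same group. If any OTU has an abundance of 0, it is not shared across that group, and I need to exclude it entirely from both the abundance and the otu.freq calculations.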
library(dplyr)
so_many_shared3<-datafr1 %>%
group_by(class, location, type) %>%
summarise(abundance=sum(value)/sum(datafr1[['value']])*100, otu.freq=sum(value !=0))
class location type abundance otu.freq
1 ab a a 4.3859649 2
2 ab b b 87.7192982 1
3 ab c c 0.2923977 1
4 ab d d 1.4619883 2
5 ad b b 0.0000000 0
6 ad d c 0.2923977 1
7 ad d d 4.6783626 2
8 av a a 1.1695906 1
【Comments】: