[Posted]: 2019-08-16 02:25:06
[Question]:
I have a data.frame that specifies linear intervals (along chromosomes), where each interval is assigned to a group:
df <- data.frame(chr = c(rep("1", 5), rep("2", 4), rep("3", 5)),
                 start = c(seq(1, 50, 10), seq(1, 40, 10), seq(1, 50, 10)),
                 end = c(seq(10, 50, 10), seq(10, 40, 10), seq(10, 50, 10)),
                 group = c("g1.1", "g1.1", "g1.2", "g1.3", "g1.1",
                           "g2.1", "g2.2", "g2.3", "g2.2",
                           "g3.1", "g3.2", "g3.2", "g3.2", "g3.3"),
                 stringsAsFactors = FALSE)
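I am looking for a fast way to collapse df by chr and group, so that consecutive intervals along a chr that are assigned to the same group are merged into one, with their start and end coordinates updated accordingly.
This is the expected result for this example: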
res.df <- data.frame(chr = c(rep("1", 4), rep("2", 4), rep("3", 3)),
                     start = c(1, 21, 31, 41, 1, 11, 21, 31, 1, 11, 41),
                     end = c(20, 30, 40, 50, 10, 20, 30, 40, 10, 40, 50),
                     group = c("g1.1", "g1.2", "g1.3", "g1.1",
                               "g2.1", "g2.2", "g2.3", "g2.2",
                               "g3.1", "g3.2", "g3.3"),
                     stringsAsFactors = FALSE)
[Discussion]:
Tags: r dataframe datatable dplyr aggregate
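One possible approach, offered as a minimal base R sketch rather than a definitive answer: mark every row where chr or group differs from the previous row, then take start from the first row of each run and end from the last. The names n, change, first, last and collapsed are made up for this illustration. It assumes the rows are already sorted by chr and start, and that neighbouring rows within a chr are truly adjacent intervals (it does not check that one interval's end is followed directly by the next start).

```r
df <- data.frame(chr = c(rep("1", 5), rep("2", 4), rep("3", 5)),
                 start = c(seq(1, 50, 10), seq(1, 40, 10), seq(1, 50, 10)),
                 end = c(seq(10, 50, 10), seq(10, 40, 10), seq(10, 50, 10)),
                 group = c("g1.1", "g1.1", "g1.2", "g1.3", "g1.1",
                           "g2.1", "g2.2", "g2.3", "g2.2",
                           "g3.1", "g3.2", "g3.2", "g3.2", "g3.3"),
                 stringsAsFactors = FALSE)

# Assumption: df is sorted by chr, then start.
n <- nrow(df)

# TRUE at each row that opens a new run, i.e. chr or group differs
# from the row directly above it (the first row always opens a run).
change <- c(TRUE,
            df$chr[-1] != df$chr[-n] | df$group[-1] != df$group[-n])

first <- which(change)                  # index of the first row of each run
last <- c(first[-1] - 1L, n)            # index of the last row of each run

# One output row per run: start from its first row, end from its last row.
collapsed <- data.frame(chr = df$chr[first],
                        start = df$start[first],
                        end = df$end[last],
                        group = df$group[first],
                        stringsAsFactors = FALSE)
```

Because everything is done by vector indexing, with no per-group function calls, this should scale reasonably to large tables. If gaps between intervals need to break a run as well, an extra condition such as df$start[-1] != df$end[-n] + 1 could be OR-ed into change.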