[Posted]: 2025-11-23 03:10:01
[Question]:
I have two data frames, shown below:
df1 <- data.frame(geneID=c("gene1","gene2","gene3","gene4",
                           "gene5","gene6","gene7","gene8","gene9","gene10"),
                  patient_ID=c(700,0,3,387,30724,1,609,4,0,1729))
head(df1)
  geneID patient_ID
1  gene1        700
2  gene2          0
3  gene3          3
4  gene4        387
5  gene5      30724
6  gene6          1
df2 <- data.frame(component1=c("gene1","gene2","gene3","gene4","gene5"),
                  component2=c("gene2","gene4","gene5","gene10","gene9"))
head(df2)
  component1 component2
1      gene1      gene2
2      gene2      gene4
3      gene3      gene5
4      gene4     gene10
5      gene5      gene9
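I want to build a data frame that uses the gene values from df1 and, for each pair of component1 and component2 in df2, computes the Euclidean distance between the two genes. For example, for the gene3 and gene5 pair, the output in df3 should be calculated with the following equation:
val = sqrt((gene3)^2 + (gene5)^2) = sqrt(3^2 + 30724^2)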
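To make the single-pair calculation concrete, here is a minimal sketch that looks the two values up in df1 by gene name (in df1, gene3 is 3 and gene5 is 30724). The name `vals` is a helper introduced here for illustration only; it is not part of the original data.

```r
df1 <- data.frame(geneID = paste0("gene", 1:10),
                  patient_ID = c(700, 0, 3, 387, 30724, 1, 609, 4, 0, 1729),
                  stringsAsFactors = FALSE)

# Named lookup vector: gene name -> value (hypothetical helper name)
vals <- setNames(df1$patient_ID, df1$geneID)

# Value for the gene3/gene5 pair, following the formula above
val <- sqrt(vals[["gene3"]]^2 + vals[["gene5"]]^2)
val
```

Indexing with `[[` on the named vector returns a bare number, so `val` carries no leftover name from the lookup.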
My final goal is a table like this:
          gene1 gene2 gene3 gene4 gene5 gene6 gene7 gene8 gene9 gene10
 1  gene1     0     0     0     0     0     0     0     0     0      0
 2  gene2   val     0     0     0     0     0     0     0     0      0
 3  gene3     0     0     0     0     0     0     0     0     0      0
 4  gene4     0   val     0     0     0     0     0     0     0    val
 5  gene5     0     0   val     0     0     0     0     0   val      0
 6  gene6     0     0     0     0     0     0     0     0     0      0
 7  gene7     0     0     0     0     0     0     0     0     0      0
 8  gene8     0     0     0     0     0     0     0     0     0      0
 9  gene9     0     0     0     0   val     0     0     0     0      0
10 gene10     0     0     0   val     0     0     0     0     0      0
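One possible way to get a table of this shape is sketched below. It is only an illustration under stated assumptions: the names `vals`, `genes`, `m` and `pair_val` are made up here, and the sketch assumes the table should be symmetric, so each pair from df2 is written in both (row, column) orientations. The table above shows some pairs in one orientation only; if that is intended, drop the second assignment.

```r
df1 <- data.frame(geneID = paste0("gene", 1:10),
                  patient_ID = c(700, 0, 3, 387, 30724, 1, 609, 4, 0, 1729),
                  stringsAsFactors = FALSE)
df2 <- data.frame(component1 = c("gene1", "gene2", "gene3", "gene4", "gene5"),
                  component2 = c("gene2", "gene4", "gene5", "gene10", "gene9"),
                  stringsAsFactors = FALSE)

# Named lookup vector: gene name -> value (hypothetical helper names)
vals  <- setNames(df1$patient_ID, df1$geneID)
genes <- df1$geneID

# Start from an all-zero gene x gene matrix with gene names on both axes
m <- matrix(0, nrow = length(genes), ncol = length(genes),
            dimnames = list(genes, genes))

# One value per pair listed in df2, using sqrt(a^2 + b^2) from the question
pair_val <- unname(sqrt(vals[df2$component1]^2 + vals[df2$component2]^2))

# A two-column character matrix indexes m by (row name, column name)
m[cbind(df2$component2, df2$component1)] <- pair_val
m[cbind(df2$component1, df2$component2)] <- pair_val  # assumption: symmetric

# Data frame with a geneID column followed by one column per gene
df3 <- data.frame(geneID = genes, m, row.names = NULL, check.names = FALSE)
df3
```

Note that a pair whose two values are both 0 would produce a 0 entry and be indistinguishable from "no pair"; with the data above this does not happen.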
I would greatly appreciate any help or suggestions.
Thanks!
欧哈
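[Comments]:
- What values do you want? Can you give an example of what gene1 gene2 should output?
- How do you define the Euclidean distance for a categorical variable?
- The output should be the Euclidean distance between a pair of genes from df2; for example, for the gene3 and gene5 pair, val = sqrt((gene3)^2 + (gene5)^2) = sqrt(3^2 + 30724^2).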
Tags: r dataframe match euclidean-distance