【Posted】: 2020-07-23 02:28:26
【Question】:
I have a data.frame with 100 columns that follow the naming convention word and word_answer:
df <- data.frame(apple = "57%", apple_answer = "22%", dog = "82%", dog_answer = "16%")
I set the levels of the two factor variables above like this:
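# factor(x, levels = ...) keeps the values; `levels(x) <-` on a factor relabels existing levels by position
df$apple <- factor(df$apple, levels = c("66%","57%","48%","39%","30%","22%","12%"))
df$dog <- factor(df$dog, levels = c("82%","71%","60%","49%","38%","27%","16%"))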
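I am trying to compute a distance score: the difference between the numeric level of a word factor and the numeric level of its corresponding word_answer.
So, for example, for "apple", the first row of apple is "57%", which is the second level of that factor: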
> which(levels(df$apple) == "57%")
[1] 2
The value in the corresponding apple_answer column, "22%", is level 6 of the same factor:
> which(levels(df$apple) == "22%")
[1] 6
So in this case the distance score would be 2 - 6 = -4.
How can I compute these distance scores for every variable in the dataset?
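One possible way to do this, as a minimal sketch and not a definitive answer: keep the ordered levels for each word in a named list, look up the position of both the word value and the word_answer value with match(), and subtract. The names level_sets and dist_scores and the "_dist" suffix are assumptions made for illustration; with 100 columns, the level list would have to be built from wherever the levels are actually defined.

```r
# Example data following the word / word_answer convention from the question.
df <- data.frame(apple = "57%", apple_answer = "22%",
                 dog = "82%", dog_answer = "16%",
                 stringsAsFactors = FALSE)

# Hypothetical lookup: the ordered levels for each word.
level_sets <- list(
  apple = c("66%", "57%", "48%", "39%", "30%", "22%", "12%"),
  dog   = c("82%", "71%", "60%", "49%", "38%", "27%", "16%")
)

# For each word, match() gives the position of each value in the level
# vector; the score is position(word) - position(word_answer).
# as.character() makes this work whether the columns are character or factor.
dist_scores <- as.data.frame(setNames(
  lapply(names(level_sets), function(w) {
    lv <- level_sets[[w]]
    match(as.character(df[[w]]), lv) -
      match(as.character(df[[paste0(w, "_answer")]]), lv)
  }),
  paste0(names(level_sets), "_dist")
))

# apple_dist is 2 - 6 = -4, matching the worked example above.
print(dist_scores)
```

Values that do not appear in a word's level vector come back as NA from match(), so unexpected entries show up as NA scores instead of silently producing a wrong number.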
【Comments】: