Independence between two quantitative variables: answers

【Question title】: Independence between two quantitative variables
【Posted】: 2021-05-11 13:17:19
【Question description】:

I want to test whether there is a dependence between two qualitative variables. Before running any test, I plot the data with geom_bar().

Bar Chart

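To me it seems obvious that the dependent variable equals 3 more often when the factor variable equals 1 than when it equals 0, and that the dependent variable equals 2 more often when the factor variable equals 0 than when it equals 1.

But if I run chisq.test or fisher.test, I get a p-value greater than 0.3, which would mean that the two qualitative variables are independent. I really don't understand why the test is not significant. To run the test, I used the following code: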

chisq.test(table(variable1,variable2))

where variable1 and variable2 are categorical variables.

Thanks in advance for your help,

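【Comments】:

We really need to see the data. Whether a difference is significant depends on the sample size, so looking at a bar chart of percentages does not help. Use dput(variable1) and dput(variable2) and paste the results into your question as a code sample.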
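As a rough illustration of the comment above, the sketch below shows raw counts next to row percentages and then prints the vectors with dput(). The two vectors are invented for this example and are not the OP's data; only the names variable1 and variable2 come from the question.

```r
# Hypothetical data: these vectors are invented for illustration only
variable1 <- factor(c(0, 0, 0, 0, 1, 1, 1, 1, 0, 1))
variable2 <- factor(c(2, 2, 3, 1, 3, 3, 2, 3, 2, 3))

# Raw counts: this is what chisq.test() actually works with
counts <- table(variable1, variable2)

# Row percentages: this is what a percentage bar chart displays
row_pct <- round(100 * prop.table(counts, margin = 1), 1)

print(counts)
print(row_pct)

# dput() prints an R expression that recreates each object,
# which can be pasted into the question
dput(variable1)
dput(variable2)
```

With only ten observations the percentages can look very different between the two groups even though the counts behind them are tiny, which is why the counts are needed to judge the test.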

Tags: r chi-squared

【Solution 1】:

Here is a detailed approach:

#function borrowed from https://*.com/a/32544987/4938484
#to maintain the right sum of entries when rounding
smart.round <- function(x) {
  y <- floor(x)
  indices <- tail(order(x-y), round(sum(x)) - sum(y))
  y[indices] <- y[indices] + 1
  y
}

N <- 100 #change to the appropriate sample size; N is the size of each group here, since each row of percentages sums to 100
tab <- matrix(c(8.1, 51.4, 40.5, 3.7, 37.0, 59.3), ncol=3, byrow=TRUE)
tab <- smart.round(tab/100 * N)
#values in tab were assigned from your bar chart
rownames(tab) <- c("0", "1")
colnames(tab) <- c("1", "2","3")
tab <- as.table(tab)
chisq.test(tab)
#which gives p-value = 0.03
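To make the role of the sample size concrete, here is a minimal sketch that reuses the percentages and the smart.round helper from the code above and recomputes the p-value for several group sizes. The values in sizes are hypothetical, and p_for_n is a helper name introduced only for this example.

```r
# Same rounding helper as in the answer above
smart.round <- function(x) {
  y <- floor(x)
  indices <- tail(order(x - y), round(sum(x)) - sum(y))
  y[indices] <- y[indices] + 1
  y
}

# Row percentages read off the bar chart (taken from the answer)
pct <- matrix(c(8.1, 51.4, 40.5, 3.7, 37.0, 59.3), ncol = 3, byrow = TRUE)

# Hypothetical helper: p-value of the chi-squared test for n observations per group
p_for_n <- function(n) {
  tab <- as.table(smart.round(pct / 100 * n))
  # small n can trigger an approximation warning; suppressed for this demo
  suppressWarnings(chisq.test(tab)$p.value)
}

sizes <- c(20, 50, 100, 200)  # hypothetical per-group sample sizes
p_values <- vapply(sizes, p_for_n, numeric(1))
print(data.frame(n_per_group = sizes, p_value = round(p_values, 3)))
```

The percentages are identical in every row of the output; only the number of observations behind them changes, and that alone moves the test from clearly non-significant at small sizes to significant at larger ones.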

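【Discussion】:

@user20650 Yes, working from the percentages can be inaccurate. Ideally, they would multiply all the entries in the table by the sample size.
Agreed; I think the code in the question looks correct. For the OP, perhaps the counts / n are small, which is why the result is not significant. Showing only the percentages can be misleading.
@user20650 Updated to reflect this.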
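Following up on the point about small counts, one possible check is to look at the expected counts that chisq.test() computes, and to compare against fisher.test() or a simulated p-value when they are small. The table below is invented for illustration (it is not the OP's data), and tab_small is a name introduced only for this sketch.

```r
# Hypothetical 2 x 3 table of counts, invented for illustration
tab_small <- as.table(matrix(c(2, 10,  8,
                               1,  7, 12),
                             nrow = 2, byrow = TRUE,
                             dimnames = list(group = c("0", "1"),
                                             outcome = c("1", "2", "3"))))

# chisq.test() may warn that the approximation is unreliable when
# expected counts are small; the warning is suppressed for this demo
res <- suppressWarnings(chisq.test(tab_small))
print(res$expected)           # expected counts under independence
print(any(res$expected < 5))  # TRUE suggests the approximation is questionable

# Alternatives that do not rely on the large-sample approximation
print(fisher.test(tab_small)$p.value)
set.seed(1)  # make the Monte Carlo p-value reproducible
print(chisq.test(tab_small, simulate.p.value = TRUE, B = 10000)$p.value)
```

If the OP's real table looks like this, a non-significant result is not surprising, whatever the percentage bar chart suggests.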