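【Posted】: 2020-11-11 12:34:49
【Question】:
How can I correlate each of 8 subsets with two different dependent variables? I keep getting the same correlation coefficients for two different subsets (example below). Here is the input: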
with(subset(mydata2, PARTYID_Strength = 1), cor.test(PARTYID_Strength,
mean.legit))
with(subset(mydata2, PARTYID_Strength = 1), cor.test(PARTYID_Strength,
mean.leegauthor))
with(subset(mydata2, PARTYID_Strength = 2), cor.test(PARTYID_Strength,
mean.legit))
with(subset(mydata2, PARTYID_Strength = 2), cor.test(PARTYID_Strength,
mean.leegauthor))
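Output (I get this for both PARTYID_Strength = 1 and 2):

    Pearson's product-moment correlation

data:  PARTYID_Strength and mean.legit
t = 3.1005, df = 607, p-value = 0.002022
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.0458644 0.2023031
sample estimates:
      cor
0.1248597

    Pearson's product-moment correlation

data:  PARTYID_Strength and mean.leegauthor
t = 2.8474, df = 607, p-value = 0.004557
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.03568431 0.19250344
sample estimates:
      cor
0.1148091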
Sample data:
> dput(head(mydata2, 10))
``structure(list(PARTYID = c(1, 3, 1, 1, 1, 4, 3, 1, 1, 1), PARTYID_Other =
c("NA",
"NA", "NA", "NA", "NA", "Green", "NA", "NA", "NA", "NA"), PARTYID_Strength =
c(1,
7, 1, 2, 1, 8, 1, 6, 1, 1), PARTYID_Strength_Other = c("NA",
"NA", "NA", "NA", "NA", "Green", "NA", "NA", "NA", "NA"), THERM_Dem = c(80,
65, 85, 30, 76, 15, 55, 62, 90, 95), THERM_Rep = c(1, 45, 10,
5, 14, 14, 0, 4, 10, 3), Gender = c("Female", "Male", "Male",
"Female", "Female", "Male", "Male", "Female", "Female", "Male"
), `MEAN Age` = c(29.5, 49.5, 29.5, 39.5, 29.5, 21, 39.5, 39.5,
29.5, 65), Age = c("25 - 34", "45 - 54", "25 - 34", "35 - 44",
"25 - 34", "18 - 24", "35 - 44", "35 - 44", "25 - 34", "65+"),
Ethnicity = c("White or Caucasian", "Asian or Asian American",
"White or Caucasian", "White or Caucasian", "Hispanic or Latino",
"White or Caucasian", "White or Caucasian", "White or Caucasian",
"White or Caucasian", "White or Caucasian"), Ethnicity_Other = c("NA",
"NA", "NA", "NA", "NA", "NA", "NA", "NA", "NA", "NA"), States = c("Texas",
"Texas", "Ohio", "Texas", "Puerto Rico", "New Hampshire",
"South Carolina", "Texas", "Texas", "Texas"), Education = c("Master's degree",
"Bachelor's degree in college (4-year)", "Bachelor's degree in college (4-year)",
"Master's degree", "Master's degree", "Less than high school degree",
"Some college but no degree", "Master's degree", "Master's degree",
"Some college but no degree"), `MEAN Income` = c(30000, 140000,
150000, 60000, 80000, 30000, 30000, 120000, 150000, 60000
), Income = c("Less than $30,000", "$130,001 to $150,000",
"More than $150,000", "$50,001 to $70,000", "$70,001 to $90,000",
"Less than $30,000", "Less than $30,000", "$110,001 to $130,000",
"More than $150,000", "$50,001 to $70,000"), mean.partystrength = c(3.875,
2.875, 2.375, 3.5, 2.625, 3.125, 3.375, 3.125, 3.25, 3.625
), mean.traitrep = c(2.5, 2.625, 2.25, 2.625, 2.75, 1.875,
2.75, 2.875, 2.75, 3), mean.traitdem = c(2.25, 2.625, 2.375,
2.75, 2.625, 2.125, 1.875, 3, 2, 2.5), mean.leegauthor = c(1,
2, 2, 4, 1, 4, 1, 1, 1, 1), mean.legit = c(1.71428571428571,
3.28571428571429, 2.42857142857143, 2.42857142857143, 2.14285714285714,
1.28571428571429, 1.42857142857143, 1.14285714285714, 2.14285714285714,
1.28571428571429)), row.names = c(NA, -10L), class = c("tbl_df",
"tbl", "data.frame"))``
Thanks!
【Comments】:
-
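A logical test needs == rather than =, so write PARTYID_Strength == 1.
-
@dcarlson Thanks! Though now I get this result: Pearson's product-moment correlation data: PARTYID_Strength and mean.legit t = NA, df = 67, p-value = NA alternative hypothesis: true correlation is not equal to 0 95 percent confidence interval: NA NA sample estimates: cor NA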
-
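You have selected only the rows with PARTYID_Strength == 1, so that variable is a constant. The correlation of a constant with any other variable is undefined, which is why R reports NA. If you want to subset the data, do not use the subsetting variable in the correlation.
-
@dcarlson Ah, I see. So maybe I shouldn't measure each party strength level separately, but pool them together instead? Also, if I use = instead of ==, what was my original code actually measuring?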
-
It did nothing. R did not complain; it simply returned the original data, so every call ran on the full data set.
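A minimal sketch of the two points made in the comments, using the first ten rows from the question's dput output (only three columns are copied over). THERM_Dem is a stand-in predictor chosen purely for illustration; the question does not say which variable should be correlated with mean.legit inside each subset, so treat that choice as an assumption.

```r
# First ten rows of the question's sample data (only the columns used here).
d <- data.frame(
  PARTYID_Strength = c(1, 7, 1, 2, 1, 8, 1, 6, 1, 1),
  THERM_Dem        = c(80, 65, 85, 30, 76, 15, 55, 62, 90, 95),
  mean.legit       = c(1.71428571428571, 3.28571428571429, 2.42857142857143,
                       2.42857142857143, 2.14285714285714, 1.28571428571429,
                       1.42857142857143, 1.14285714285714, 2.14285714285714,
                       1.28571428571429)
)

# With `=`, the expression is passed as a named argument that subset()
# does not use, so no filtering happens and every row is returned.
n_assign <- nrow(subset(d, PARTYID_Strength = 1))    # 10

# With `==`, the expression is a comparison, so only matching rows are kept.
n_compare <- nrow(subset(d, PARTYID_Strength == 1))  # 6

# Inside the subset the grouping variable is constant, so its correlation
# with any other variable is undefined: cor() returns NA (with a warning).
s1 <- subset(d, PARTYID_Strength == 1)
r_const <- suppressWarnings(cor(s1$PARTYID_Strength, s1$mean.legit))

# Correlate two *other* variables within the subset instead.
# THERM_Dem is a placeholder predictor (assumption, not from the question).
res <- with(s1, cor.test(THERM_Dem, mean.legit))
res$estimate
```

The same cor.test call can be repeated for each value of PARTYID_Strength and for mean.leegauthor. Note that cor.test needs at least three complete observations per subset, which most groups in this ten-row sample do not have.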
Tags: r regression correlation lm