【Posted】: 2019-05-01 21:25:02
【Question】:
I am trying to run a logistic regression with the following code:
model <- glm (Participation ~ Gender + Race + Ethnicity + Education + Comorbidities + WLProgram + LoseWeight + EverLoseWeight + PastYearLW + Age + BMI, data = LogisticData, family = binomial)
summary(model)
I keep getting this error:
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels
After looking through the forums, I checked which variables are factors:
str(LogisticData)
'data.frame': 994 obs. of 13 variables:
$ outcome : Factor w/ 2 levels "No","Yes": 1 1 2 2 1 2 2 1 2 2 ...
$ Gender : Factor w/ 3 levels "Male","Female",..: 1 2 2 1 2 1 1 1 1
$ Race : Factor w/ 3 levels "White","Black",..: 1 1 1 3 1 1 1 1 1 1
$ Ethnicity : Factor w/ 2 levels "Hispanic/Latino",..: 2 2 2 2 2 2 2 2 2
$ Education : Factor w/ 2 levels "Below Bachelors",..: 1 1 1 2 1 1 1 2 1
$ Comorbidities : Factor w/ 2 levels "No","Yes": 1 1 2 1 1 1 2 2 1 1 ...
$ WLProgram : Factor w/ 2 levels "No","Yes": NA 1 2 2 1 1 1 NA 1 1 ...
$ LoseWeight : Factor w/ 2 levels "Yes","No": 2 1 1 1 1 1 1 2 1 1 ...
$ PastYearLW : Factor w/ 2 levels "Yes","No": NA 2 1 1 1 2 1 NA 1 1 ...
$ EverLoseWeight: Factor w/ 2 levels "Yes","No": 2 1 1 1 1 1 1 2 1 1 ...
$ Age : int 29 35 69 32 21 45 40 62 59 58 ...
$ Participation : Factor w/ 2 levels "Yes","No": 2 2 1 1 1 1 1 2 1 2 ...
$ BMI : num 25.7 33.8 26.4 32.3 27.5 ...
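All of the factors appear to have 2 or more levels.
I also tried omitting the NAs, but I still get the same error.
I want every variable in the regression, and I can't work out why it won't run.
When I run: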
newdata <- droplevels(na.omit(LogisticData))
> str(newdata)
'data.frame': 840 obs. of 13 variables:
$ outcome : Factor w/ 2 levels "No","Yes": 1 2 2 1 2 2 2 2 2 2 ...
$ Gender : Factor w/ 3 levels "Male","Female",..: 2 2 1 2 1 1 1 2 1
$ Race : Factor w/ 3 levels "White","Black",..: 1 1 3 1 1 1 1 1 3
$ Ethnicity : Factor w/ 2 levels "Hispanic/Latino",..: 2 2 2 2 2 2 2 2
$ Education : Factor w/ 2 levels "Below Bachelors",..: 1 1 2 1 1 1 1 1
$ Comorbidities : Factor w/ 2 levels "No","Yes": 1 2 1 1 1 2 1 1 1 2 ...
$ WLProgram : Factor w/ 2 levels "No","Yes": 1 2 2 1 1 1 1 1 1 1 ...
$ LoseWeight : Factor w/ 1 level "Yes": 1 1 1 1 1 1 1 1 1 1 ...
$ PastYearLW : Factor w/ 2 levels "Yes","No": 2 1 1 1 2 1 1 1 1 2 ...
$ EverLoseWeight: Factor w/ 1 level "Yes": 1 1 1 1 1 1 1 1 1 1 ...
$ Age : int 35 69 32 21 45 40 59 58 23 32 ...
$ Participation : Factor w/ 2 levels "Yes","No": 2 1 1 1 1 1 1 2 2 1 ...
$ BMI : num 33.8 26.4 32.3 27.5 45.4 ...
- attr(*, "na.action")=Class 'omit' Named int [1:154] 1 8 13 14 21 24 25
46 55 58 ...
.. ..- attr(*, "names")= chr [1:154] "1" "8" "13" "14" ...
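This doesn't make sense to me: in the first str(LogisticData) output, EverLoseWeight clearly has 2 levels, since both "Yes" and "No" are listed and both codes 1 and 2 appear in the data. How can I resolve this error?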
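One way to narrow this down is to count, for each factor, how many levels are still observed once incomplete rows are removed, since glm() drops rows containing NA under the default na.action. The following is a minimal sketch on a made-up data frame: `toy` and its values are hypothetical and only mimic the pattern above, where PastYearLW is NA in the first row, which is also a LoseWeight = "No" row.

```r
# Hypothetical toy data (not LogisticData): PastYearLW is missing
# exactly in the rows where LoseWeight is "No".
toy <- data.frame(
  LoseWeight = factor(c("Yes", "Yes", "No", "Yes", "No", "Yes"),
                      levels = c("Yes", "No")),
  PastYearLW = factor(c("Yes", "No", NA, "Yes", NA, "No"),
                      levels = c("Yes", "No")),
  Age        = c(35, 69, 29, 32, 62, 21)
)

# Keep only rows without any NA, as na.omit() would.
complete <- toy[complete.cases(toy), ]

# For every factor column, count the levels that are actually observed.
is_fac <- vapply(complete, is.factor, logical(1))
observed_levels <- vapply(
  complete[is_fac],
  function(f) nlevels(droplevels(f)),
  integer(1)
)
print(observed_levels)

# Factors left with fewer than 2 observed levels cannot be given contrasts.
single_level <- names(observed_levels)[observed_levels < 2]
print(single_level)
```

Running the same check on the real data would list the columns that collapse to a single level after the NA rows are removed.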
【Comments】:
-
Check whether the levels are still the same after newdata <- droplevels(na.omit(LogisticData)).
-
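At first glance, Ethnicity looks suspicious. A factor can have two levels declared while only one of them actually occurs in the data. Consider x = as.factor(c(1,1,1)); levels(x) = c(1, 2).
-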
@akrun The levels are not the same, but that doesn't make sense to me. Please see the addition to the post.
-
There may be unused levels, i.e. levels that are declared on the factor but never occur in the data.
-
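I updated the post with some explanation. But I understand the logic now: once the observations with missing values are removed, that variable is left with only one level. My mistake. Thanks.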
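The logic worked out in the comments can be reproduced on a small example. The sketch below uses a hypothetical data frame (`toy`, `y`, and all values are invented for illustration) in which every LoseWeight = "No" row has a missing PastYearLW. glm() removes those rows before building the design matrix, so LoseWeight is left with a single level and the fit should fail with the contrasts error. Leaving the constant variable out of the formula is shown as one possible way forward, not necessarily the right modelling choice for the real data.

```r
# Hypothetical toy data (not LogisticData).
toy <- data.frame(
  y          = factor(c("Yes", "Yes", "No", "No", "Yes", "No",
                        "No", "Yes", "No", "Yes", "No", "Yes")),
  LoseWeight = factor(c(rep("Yes", 8), rep("No", 4)),
                      levels = c("Yes", "No")),
  PastYearLW = factor(c("Yes", "No", "Yes", "No", "Yes", "No",
                        "Yes", "No", NA, NA, NA, NA),
                      levels = c("Yes", "No")),
  Age        = c(35, 69, 32, 21, 45, 40, 59, 58, 29, 62, 23, 50)
)

# Both levels of LoseWeight are present in the raw data.
print(table(toy$LoseWeight))

# Fitting with every variable: the rows with NA are dropped first,
# which leaves LoseWeight constant. Capture the error instead of stopping.
err <- tryCatch(
  glm(y ~ LoseWeight + PastYearLW + Age, data = toy, family = binomial),
  error = function(e) e
)
print(conditionMessage(err))

# One possible fix: leave out the variable that is constant among the
# complete cases (it carries no information for this subset anyway).
fit <- glm(y ~ PastYearLW + Age, data = toy, family = binomial)
print(nobs(fit))  # number of rows actually used in the fit
```

If the constant variable has to stay in the model, the alternative is to deal with the missing values in the other columns (for example by dropping those columns or imputing), so that rows with both levels survive.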