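【Posted】: 2016-01-11 21:30:23
【Question】:
When I run multinom() on, say, Y ~ X1 + X2 + X3, and X1 is NA (i.e. missing) for one particular row while Y, X2 and X3 all have values, is the entire row thrown out (as it would be in SAS)? How are missing values handled in multinom()?
【Question discussion】:
Tags: r na missing-data logistic-regression multinomial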
Here is a simple example (taken from ?multinom in the nnet package) for exploring the different na.action settings:
> library(nnet)
> library(MASS)
> example(birthwt)
> (bwt.mu <- multinom(low ~ ., bwt))
Deliberately create an NA value:
> bwt[1,"age"]<-NA # Intentionally create NA value
> nrow(bwt)
[1] 189
Test the four different na.action settings:
> predict(multinom(low ~ ., bwt, na.action=na.exclude)) # Note length is 189
# weights: 12 (11 variable)
initial value 130.311670
iter 10 value 97.622035
final value 97.359978
converged
[1] <NA> 0 0 0 0 0 0 0 0 0 0 0 1 1 0
[16] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
....
> predict(multinom(low ~ ., bwt, na.action=na.omit)) # Note length is 188
# weights: 12 (11 variable)
initial value 130.311670
iter 10 value 97.622035
final value 97.359978
converged
[1] 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0
[38] 0 0 0 0 0 0 1 1 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 0 0
.....
> predict(multinom(low ~ ., bwt, na.action=na.fail)) # Generates error
Error in na.fail.default(list(low = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, :
missing values in object
> predict(multinom(low ~ ., bwt, na.action=na.pass)) # Generates error
Error in qr.default(X) : NA/NaN/Inf in foreign function call (arg 1)
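So na.exclude keeps a slot for the incomplete row and returns NA for it in the predictions (length 189), whereas na.omit drops the row entirely (length 188). With na.pass and na.fail the call stops with an error, so no model is fitted.
If na.action is not specified, the default is taken from the na.action option: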
> getOption("na.action")
[1] "na.omit"
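To see the row-dropping directly, without fitting anything, the sketch below builds a model frame for the formula from the question. It relies only on base R's model.frame() and complete.cases(); the data frame df and its values are made up for illustration, and it assumes multinom() prepares its data through the usual model-frame machinery, which is consistent with the output above.

```r
# Hypothetical toy data: row 2 is missing X1 only; Y, X2 and X3 are present.
df <- data.frame(
  Y  = factor(c("a", "b", "c", "a", "b")),
  X1 = c(1.0, NA, 3.0, 4.0, 5.0),
  X2 = c(10, 20, 30, 40, 50),
  X3 = c(0.5, 0.4, 0.3, 0.2, 0.1)
)

# No na.action argument is given, so model.frame() falls back to
# getOption("na.action"), which is "na.omit" in a default R session.
mf <- model.frame(Y ~ X1 + X2 + X3, data = df)

n_before <- nrow(df)   # rows in the raw data
n_after  <- nrow(mf)   # rows left after the NA handling

# Rows with at least one NA among the model variables.
dropped <- which(!complete.cases(df[, c("Y", "X1", "X2", "X3")]))

print(c(before = n_before, after = n_after))
print(dropped)
```

Under the default setting, the incomplete row is removed as a whole, even though only one of its predictors is missing.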
【Comments】:
You can specify the behavior through the na.action argument:
- na.omit and na.exclude: return the object with any observations that contain missing values removed; the difference between omitting and excluding NAs shows up in some prediction and residual functions
- na.pass: returns the object unchanged
- na.fail: returns the object only if it contains no missing values
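The four handlers listed above are ordinary base R functions, so their behavior can be checked on a data frame directly. The following is a minimal sketch with made-up data; lm() is used only as a convenient stand-in to show how na.omit and na.exclude differ in residuals, and is not part of the original question.

```r
# Hypothetical toy data: x is missing in row 2.
df <- data.frame(y = c(1, 3, 2, 5), x = c(1, NA, 3, 4))

omitted  <- na.omit(df)     # incomplete rows removed
excluded <- na.exclude(df)  # same rows removed, but tagged as "exclude"
passed   <- na.pass(df)     # returned unchanged, NA still present

# na.fail() raises an error when NAs are present; capture the message.
failed <- tryCatch(na.fail(df), error = function(e) conditionMessage(e))

# Stand-in model (lm, not multinom) to show the omit/exclude difference.
res_omit    <- residuals(lm(y ~ x, data = df, na.action = na.omit))
res_exclude <- residuals(lm(y ~ x, data = df, na.action = na.exclude))

print(c(omit = nrow(omitted), exclude = nrow(excluded), pass = nrow(passed)))
print(failed)
print(length(res_omit))     # one entry per row actually used in the fit
print(length(res_exclude))  # padded back to the original number of rows
print(res_exclude)
```

Both na.omit and na.exclude fit the model on the same rows; the only difference is that na.exclude remembers where the dropped rows were, so residuals and predictions line up with the original data.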
【Comments】: