R：使用新 lme4 包的 bootMer() 引导二进制混合模型逻辑回归答案

【问题标题】：R: Bootstrapped binary mixed-model logistic regression using bootMer() of the new lme4 packageR：使用新 lme4 包的 bootMer() 引导二进制混合模型逻辑回归
【发布时间】：2013-08-28 21:12:36
【问题描述】：

我想使用新 lme4 包（目前是开发者版本）的新 bootMer() 功能。我是 R 新手，不知道应该为其 FUN 参数编写哪个函数。它说它需要一个数值向量，但我不知道该函数将执行什么。所以我有一个混合模型公式，它被转换为 bootMer()，并且有许多复制。所以我不知道那个外部函数是做什么的？它应该是引导方法的模板吗？ bootMer中不是已经实现了引导方法吗？那么为什么他们需要一个外部的“感兴趣的统计数据”呢？我应该使用哪个感兴趣的统计数据？

以下语法是否适合使用？ R 不断产生错误，即 FUN 必须是数字向量。我不知道如何将估计与“适合”分开，甚至我应该首先这样做吗？我只能说我迷失了那个“有趣”的论点。另外我不知道我应该使用变量“Mixed5”传递混合模型 glmer() 公式还是应该传递一些指针和引用？我在示例中看到 X（bootMer() 的第一个参数是 *lmer() 对象。我想写 *Mixed5 但它呈现错误。

非常感谢。

我的代码是：

library(lme4)
library(boot)

(mixed5 <- glmer(DV ~ (Demo1 +Demo2 +Demo3 +Demo4 +Trt)^2 
                 + (1 | PatientID) + (0 + Trt | PatientID)
                 , family=binomial(logit), MixedModelData4))


FUN <- function(formula) {
  fit <- glmer(DV ~ (Demo1 +Demo2 +Demo3 +Demo4 +Trt)^2 
               + (1 | PatientID) + (0 + Trt | PatientID)
               , family=binomial(logit), MixedModelData4)
  return(coef(fit))
}

result <- bootMer(mixed5, FUN, nsim = 3, seed = NULL, use.u = FALSE,
        type = c("parametric"),
        verbose = T, .progress = "none", PBargs = list())

result
FUN
fit

还有错误：

Error in bootMer(mixed5, FUN, nsim = 3, seed = NULL, use.u = FALSE, type = c("parametric"),  : 
  bootMer currently only handles functions that return numeric vectors

----------------------------------- - - - - - 更新 - - - - - - - - - - - - - - - - - - - - -------------

我按照 Ben 的指示编辑了代码。代码运行得非常好，但 SE 和 Biase 都为零。你也知道如何从这个输出中提取 P 值（对我来说很奇怪）？我应该使用 afex 包的 mixed() 吗？

我修改后的代码：

library(lme4)
library(boot)

(mixed5 <- glmer(DV ~ (Demo1 +Demo2 +Demo3 +Demo4 +Trt)^2 
                + (0 + Trt | PatientID)
                 , family=binomial(logit), MixedModelData4))


FUN <- function(fit) {
  fit <- glmer(DV ~ (Demo1 +Demo2 +Demo3 +Demo4 +Trt)^2 
               + (1 | PatientID) + (0 + Trt | PatientID)
               , family=binomial(logit), MixedModelData4)
  return(fixef(fit))
}

result <- bootMer(mixed5, FUN, nsim = 3)

result

----------------------------------- --------- 更新 2 ---------------------------------------- --------------

我也尝试了以下方法，但代码生成了警告并且没有给出任何结果。

(mixed5 <- glmer(DV ~ Demo1 +Demo2 +Demo3 +Demo4 +Trt 
                 + (1 | PatientID) + (0 + Trt | PatientID)
                 , family=binomial(logit), MixedModelData4))

FUN <- function(mixed5) {
  return(fixef(mixed5))}

result <- bootMer(mixed5, FUN, nsim = 2)

警告信息：

In bootMer(mixed5, FUN, nsim = 2) : some bootstrap runs failed (2/2)
> result

Call:
bootMer(x = mixed5, FUN = FUN, nsim = 2)

Bootstrap Statistics :
WARNING: All values of t1* are NA
WARNING: All values of t2* are NA
WARNING: All values of t3* are NA
WARNING: All values of t4* are NA
WARNING: All values of t5* are NA
WARNING: All values of t6* are NA

----------------------------------- --------- 更新 3 ---------------------------------------- --------------

这段代码也产生了警告：

FUN <- function(fit) {
  return(fixef(fit))}

result <- bootMer(mixed5, FUN, nsim = 2)

警告和结果：

Warning message:
In bootMer(mixed5, FUN, nsim = 2) : some bootstrap runs failed (2/2)
> result

Call:
bootMer(x = mixed5, FUN = FUN, nsim = 2)

Bootstrap Statistics :
WARNING: All values of t1* are NA
WARNING: All values of t2* are NA
WARNING: All values of t3* are NA
WARNING: All values of t4* are NA
WARNING: All values of t5* are NA
WARNING: All values of t6* are NA

【问题讨论】：

标签： r regression statistics-bootstrap

【解决方案1】：

这里基本上有两个（简单的）混淆。

第一个在coef()（返回矩阵列表）和fixef()（返回固定效应向量之间）系数）：我假设 fixef() 是你想要的，尽管你可能想要像 c(fixef(mixed),unlist(VarCorr(mixed))) 这样的东西。
第二个是FUN 应该将拟合的模型对象作为输入...

例如：

library(lme4)
library(boot)

mixed <- glmer(incidence/size ~ period + (1|herd),
               weights=size, data=cbpp, family=binomial)

FUN <- function(fit) {
    return(fixef(fit))
}

result <- bootMer(mixed, FUN, nsim = 3)

result

## Call:
## bootMer(x = mixed, FUN = FUN, nsim = 3)
## Bootstrap Statistics :
##      original      bias    std. error
## t1* -1.398343 -0.20084060  0.09157886
## t2* -0.991925  0.02597136  0.18432336
## t3* -1.128216 -0.03456143  0.05967291
## t4* -1.579745 -0.08249495  0.38272580
##

【讨论】：

我相应地修改了代码（我的问题下的更新），它运行顺利且没有错误。但是我有两个问题：所有的偏差和 SE 都为零。 2：这些线是什么？（我的意思是t1，t2 ...）？原始自举重采样？有没有办法从这个输出中提取估计值和 P 值？
(1) 您应该不在FUN 内重新运行您的模型——只需单独使用fixef() 调用即可。 (2) t1, t2, ... 在输出中是固定效应估计值。 (3) 你一定要看?confint.merMod ...
非常感谢您提供所有这些宝石。我也使用了这个：“FUN
哦，我明白了，对不起！我给了它一个合适的对象（而不是“合适”的对象）
嗯，我将“fit”传递给它，但发生了其他警告，结果是工件警告。（在更新 3 中有详细说明）

【解决方案2】：

这可能是同一个问题，that I reported as an issue here。至少它会导致同样的、无用的错误消息，并且也花了我一段时间。

这意味着您的数据中有缺失，lmer 会忽略这些缺失，但会杀死 bootMer。

试试：

(mixed5 <- glmer(DV ~ (Demo1 +Demo2 +Demo3 +Demo4 +Trt)^2 
                 + (1 | PatientID) + (0 + Trt | PatientID)
                 , family=binomial(logit), na.omit(MixedModelData4[,c('DV','Demo1','Demo2','Demo3','Trt','PatientId')])))

【讨论】：

非常感谢。但是，我没有任何丢失的数据。但我应该说我的数据可能是病态的。
@Vic 你可能已经明白了。但现在至少这个简单的答案就在这里，对于那些在谷歌上搜索错误信息的人。我真的不明白你所说的病理数据是什么意思？对于具有相同数据的简单模型，您是否会遇到同样的问题？
我很欣赏鲁本的表演。另外，我建议您按照 Ben Bolker 对我的指示，将您的数据或类似数据上传到在线存储库，以便软件包开发人员可以处理该错误。（请参阅我们与 Ben Bolker 的对话，并阅读他的说明）......病理数据是指遭受完全/部分分离或严重多重共线性或两者兼而有之的数据。
@vic 好的。如果您点击 GitHub 问题的链接，您会看到我已经使用示例数据创建了一个可重复的示例 - 它确实只需要在其他情况良好的数据中丢失一些。