Partial residual plot based on model-averaged coefficients in R: answers

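【Question title】: Partial residual plot based on model average coefficients in R
【Posted】: 2015-04-22 09:37:27
【Question description】:

I am using the R package MuMIn for multimodel inference, and the function model.avg to average the coefficients estimated by a set of models. To compare the data visually with the relationships estimated from the averaged coefficients, I want to use partial residual plots, similar to those produced by the crPlots function of the car package. I have tried three approaches, but I am not sure whether any of them is appropriate. Here is a demonstration.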

library(MuMIn)
# Loading the data
data(Cement)
# Creating a full model with all the covariates we are interested in
fullModel <- lm(y ~ ., data = Cement, na.action=na.fail)
# Getting all possible models based on the covariates of the full model
muModel <- dredge(fullModel)
# Averaging across all models
avgModel <- model.avg(muModel)
# Getting the averaged coefficients
coefMod <- coef(avgModel)
coefMod
# (Intercept)          X1          X2          X4          X3 
# 65.71487660  1.45607957  0.61085531 -0.49776089 -0.07148454

Option 1: Using crPlots

library(car) # For crPlots
# Creating a duplicate of the fullModel
hackModel <- fullModel
# Changing the coefficients to the averaged coefficients
hackModel$coefficients <- coefMod[names(coef(fullModel))]
# Changing the residuals
hackModel$residuals <- Cement$y - predict(hackModel)
# Plot the hacked model vs the full model
layout(matrix(1:8, nrow=2, byrow=TRUE))
crPlots(hackModel, layout=NA)
crPlots(fullModel, layout=NA)

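Note that the crPlots of the full model and of the hacked model with the averaged coefficients differ.

The question here is: is this appropriate? The result relies on a hack I found in this answer. Do I need to change parts of the model other than the residuals and the coefficients?

Option 2: Homemade plots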

# Partial residuals: residuals(hacked model) + beta*x
# X1
# Get partial residuals
prX1 <- resid(hackModel) + coefMod["X1"]*Cement$X1
# Plot the partial residuals
plot(prX1 ~ Cement$X1)
# Add modeled relationship
abline(a=0,b=coefMod["X1"])
# X2 - X4
plot(resid(hackModel) + coefMod["X2"]*X2 ~ X2, data=Cement); abline(a=0,b=coefMod["X2"])
plot(resid(hackModel) + coefMod["X3"]*X3 ~ X3, data=Cement); abline(a=0,b=coefMod["X3"])
plot(resid(hackModel) + coefMod["X4"]*X4 ~ X4, data=Cement); abline(a=0,b=coefMod["X4"])

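The plots look different from those produced by crPlots above.

The partial residuals have a similar pattern, but their values and the modeled relationships differ. The difference in values appears to be due to crPlots using centered partial residuals (see this answer for a discussion of partial residuals in R). This led me to a third option.

Option 3: Homemade plots with centered partial residuals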
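
The two definitions being compared in Options 2 and 3 can be written out explicitly. Here $e_i$ is the residual of observation $i$, $\hat\beta_j$ the (here model-averaged) coefficient of predictor $j$, and $\bar{x}_j$ its sample mean; this notation is introduced for illustration only and is not taken from the car documentation.

```latex
\begin{aligned}
r_{ij} &= e_i + \hat\beta_j\, x_{ij} && \text{(Option 2, uncentered)}\\
r^{c}_{ij} &= e_i + \hat\beta_j\,(x_{ij} - \bar{x}_j) && \text{(Option 3, centered)}\\
r_{ij} - r^{c}_{ij} &= \hat\beta_j\, \bar{x}_j
\end{aligned}
```

The two versions differ only by the constant $\hat\beta_j \bar{x}_j$, which shifts the points vertically without changing their pattern.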

# Get the centered partial residuals
pRes <- resid(hackModel, type='partial')
# X1
# Plot the partial residuals
plot(pRes[,"X1"] ~ Cement$X1)
# Plot the component - modeled relationship
lines(coefMod["X1"]*(X1-mean(X1))~X1, data=Cement)
# X2 - X4
plot(pRes[,"X2"] ~ Cement$X2); lines(coefMod["X2"]*(X2-mean(X2))~X2, data=Cement) 
plot(pRes[,"X3"] ~ Cement$X3); lines(coefMod["X3"]*(X3-mean(X3))~X3, data=Cement)
plot(pRes[,"X4"] ~ Cement$X4); lines(coefMod["X4"]*(X4-mean(X4))~X4, data=Cement)

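Now the values are similar to those in the crPlots above, but the relationships still differ. The difference may be related to the intercept, but I am not sure what I should use instead of 0.

Any suggestions on which approach is more appropriate? Is there a more direct way to get partial residual plots based on model-averaged coefficients?

Thank you very much!

【Question discussion】: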

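This looks like it might be a CrossValidated question to me... I'm not sure why you neither added the intercept nor centered the predictors in option #2? Have you looked inside car:::crPlot to see what it is doing? (It looks like it is actually regressing the partial residuals on the relevant variable, not just using the simple multiple-regression slope estimate... I would need to review the theory more. Is this spelled out in Fox's Companion to Applied Regression?)
@BenBolker Thanks! I don't have access to Fox's book, but I have downloaded the source code of car, and yes, crPlot regresses the partial residuals: abline(lm(partial.res[,var]~.x), lty=2, lwd=lwd, col=col.lines[1]). I don't know why this makes sense, but you are right that understanding why would make this a CrossValidated question. I was hoping for either a ready-made function that draws component+residual plots for averaged model objects from the MuMIn package, or confirmation that changing only the residuals and coefficients in the model is fine for crPlot.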

Tags: r plot lm r-car mumin

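【Solution 1】:

Looking at the crPlot.lm source code, the only things used on the model object appear to be residuals(model, type="partial"), predict(model, type="terms", term=var), and functions related to finding the variable names. As @BenBolker suggested, the plotted relationship also appears to be obtained by regression. The code used in crPlot.lm is: abline(lm(partial.res[,var]~.x), lty=2, lwd=lwd, col=col.lines[1]). Therefore, I think changing the model's coefficients and residuals is sufficient to be able to use crPlots on it. I can now also reproduce the results with a homemade version.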
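
A minimal sketch of why the regressed line matters, using the built-in mtcars data in place of Cement so that it runs without MuMIn or car. The perturbed coefficients and the helper slope_of are hypothetical stand-ins, not part of either package. For an ordinary lm fit the residuals are uncorrelated with each predictor, so regressing the partial residuals on the predictor returns the fitted coefficient. Once the coefficients are replaced, that no longer holds, and the regressed line need not have the replaced coefficient as its slope.

```r
# Ordinary least-squares fit on a built-in data set
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Same hack as in the question, with arbitrary (hypothetical) perturbed
# coefficients standing in for model-averaged ones
hack <- fit
hack$coefficients <- coef(fit) * c(1, 0.8, 1.2)
hack$residuals <- mtcars$mpg - predict(hack)

# Slope of the line obtained by regressing partial residuals on the predictor
slope_of <- function(model, var) {
  pres <- residuals(model, type = "partial")[, var]
  unname(coef(lm(pres ~ mtcars[[var]]))[2])
}

slope_of(fit, "wt")   # matches coef(fit)["wt"]
slope_of(hack, "wt")  # generally differs from coef(hack)["wt"]
```

This would explain why the straight line drawn from the averaged coefficient in Option 3 and the line drawn by crPlots on the hacked model do not coincide.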

library(MuMIn)
# Loading the data
data(Cement)
# Creating a full model with all the covariates we are interested in
fullModel <- lm(y ~ ., data = Cement, na.action=na.fail)
# Getting all possible models based on the covariates of the full model
muModel <- dredge(fullModel)
# Averaging across all models
avgModel <- model.avg(muModel)
# Getting the averaged coefficients
coefMod <- coef(avgModel)

# Option 1 - crPlots
library(car) # For crPlots
# Creating a duplicate of the fullModel
hackModel <- fullModel
# Changing the coefficients to the averaged coefficients
hackModel$coefficients <- coefMod[names(coef(fullModel))]
# Changing the residuals
hackModel$residuals <- Cement$y - predict(hackModel)

# Plot the crPlots and the regressed homemade version 
layout(matrix(1:8, nrow=2, byrow=TRUE))
par(mar=c(3.5,3.5,0.5,0.5), mgp=c(2,1,0))
crPlots(hackModel, layout=NA, ylab="Partial Res", smooth=FALSE)

# Option 4 - Homemade centered and regressed
# Get the centered partial residuals
pRes <- resid(hackModel, type='partial')
# X1 - X4: plot the partial residuals and use lm for the relationship
plot(pRes[,"X1"] ~ Cement$X1); abline(lm(pRes[,"X1"]~Cement$X1))
plot(pRes[,"X2"] ~ Cement$X2); abline(lm(pRes[,"X2"]~Cement$X2))
plot(pRes[,"X3"] ~ Cement$X3); abline(lm(pRes[,"X3"]~Cement$X3))
plot(pRes[,"X4"] ~ Cement$X4); abline(lm(pRes[,"X4"]~Cement$X4))
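
The same centered partial residuals can also be computed from the data and a named coefficient vector alone, without hacking a model object. The following is a rough sketch under the assumption that every coefficient other than "(Intercept)" belongs to a numeric column of the data (no factors, interactions, or transformations). The function partial_resid is hypothetical and not part of MuMIn or car. The check at the end uses the built-in mtcars data.

```r
# Hypothetical helper: centered partial residuals from data + coefficients.
# Assumes numeric main effects only and an "(Intercept)" entry in coefs.
partial_resid <- function(data, response, coefs) {
  vars <- setdiff(names(coefs), "(Intercept)")
  X <- as.matrix(data[, vars, drop = FALSE])
  # Residuals implied by the supplied coefficients
  fitted <- coefs[["(Intercept)"]] + drop(X %*% coefs[vars])
  res <- data[[response]] - fitted
  # Component of each predictor: beta_j * (x_j - mean(x_j))
  centred <- sweep(X, 2, colMeans(X))
  comp <- sweep(centred, 2, coefs[vars], `*`)
  res + comp
}

# Check against residuals(type = "partial") for an ordinary lm fit
fit <- lm(mpg ~ wt + hp, data = mtcars)
pr <- partial_resid(mtcars, "mpg", coef(fit))
all.equal(unname(pr), unname(residuals(fit, type = "partial")),
          check.attributes = FALSE)
```

With the objects defined above, partial_resid(Cement, "y", coefMod) would be expected to give the same values as resid(hackModel, type='partial'), with columns in the order of names(coefMod).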

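【Discussion】:

I think "Changing the coefficients to the averaged coefficients" should be: hackModel$coefficients
Any experience running crPlots on objects of class glmmadmb?
@Marie Auger-Methe - any experience creating these plots without model objects? I basically have the coefficient estimates from many model objects stored in an Excel file, and no longer have the model objects. Any suggestions?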