【Question title】:Confidence interval for segmented glm
【Posted】:2014-03-27 10:38:47
【Question】:

I am trying to fit a segmented glm to some data:

x <- c(0.25,0.5,0.75,1,1.25,1.5,1.75,2,2.25,2.5,2.75,3,3.25)
y <- c(5.516,5.725,5.9781,6,6.453,6.88,7.3,11,11.89,15.6,21.3,27,32.8)
d <- data.frame(x = x,
                y = y)

if(!require("segmented")) {
  install.packages("segmented")
  require("segmented")
}

g1 <- glm(y ~ x,data = d)
g2 <- segmented(g1, seg.Z = ~ x,
                psi = list(x = c(1.5)))
pdat <- data.frame(x = d$x,
                   y = broken.line(g2, link = FALSE)[,1])
pdat <- pdat[with(pdat, order(x)), ]
plot(y ~ x, data = d, pch = 21, bg = "white")
lines(y ~ x, data = pdat, type = "l", col = "red")

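I now want to draw confidence intervals around the segmented line, but I don't know how to do this. I can draw confidence intervals for a non-segmented fit: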

## use quadratic function
g3 <- lm(y ~ poly(x, 2), data = d)
pdat <- with(d, data.frame(x = exp(seq(min(x),
                                         max(x), length = 100))))

tmp2 <- predict(g3, newdata = pdat, se.fit = TRUE)
critVal <- qt(0.975, df = g3$df.residual)
pdat <- transform(pdat, pred = tmp2$fit, se = tmp2$se.fit)
pdat <- transform(pdat, yhat = pred,
                   upr = pred + (critVal * se),
                   lwr = pred - (critVal * se))
plot(y ~ x, data = d)
lines(yhat ~ x, data = pdat, type = "l", col = "red") # fitted quadratic
lines(upr ~ x, data = pdat, type = "l", lty = "dashed", col = "red") # upper limit
lines(lwr ~ x, data = pdat, type = "l", lty = "dashed", col = "red") # lower limit

But when I repeat this for the segmented version, it doesn't look quite right:

# repeat same method for segmented
g1 <- glm(y ~ x,data = d)
g2 <- segmented(g1, seg.Z = ~ x,
                psi = list(x = c(1.5)))
pdat <- with(d, data.frame(x = exp(seq(min(x),
                                       max(x), length = 100))))

tmp2 <- predict(g2, newdata = pdat, se.fit = TRUE)
critVal <- qt(0.975, df = g2$df.residual)
pdat <- transform(pdat, pred = tmp2$fit, se = tmp2$se.fit)
pdat <- transform(pdat, yhat = pred,
                  upr = pred + (critVal * se),
                  lwr = pred - (critVal * se))
plot(y ~ x, data = d)
lines(yhat ~ x, data = pdat, type = "l", col = "red") # fitted segmented model
lines(upr ~ x, data = pdat, type = "l", lty = "dashed", col = "red") # upper limit
lines(lwr ~ x, data = pdat, type = "l", lty = "dashed", col = "red") # lower limit

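So, my first question is why the quadratic curve does not extend across the whole x-axis, i.e. why does it stop at 1.25? Second, is my approach to the confidence interval for the segmented line correct, or is there a better way?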

【Comments】:

  • Why don't you set interval = "confidence" in predict?
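
The suggestion in the comment above can be sketched as follows for the quadratic lm from the question. This is only an illustration: the name grid is made up here, and the prediction grid is built on the original x scale rather than wrapped in exp(). With the exp() wrapper the grid runs from exp(0.25) ≈ 1.28 to exp(3.25) ≈ 25.8, which would explain why the fitted curve in the question appears to begin near 1.25 instead of spanning the data. Note that interval = "confidence" is an argument of predict() for lm objects; predict() for glm objects does not offer it.

```r
# Data from the question
x <- c(0.25,0.5,0.75,1,1.25,1.5,1.75,2,2.25,2.5,2.75,3,3.25)
y <- c(5.516,5.725,5.9781,6,6.453,6.88,7.3,11,11.89,15.6,21.3,27,32.8)
d <- data.frame(x = x, y = y)

g3 <- lm(y ~ poly(x, 2), data = d)

# Prediction grid on the original x scale (no exp()), so it spans min(x)..max(x)
grid <- data.frame(x = seq(min(d$x), max(d$x), length.out = 100))

# For lm objects, predict() returns a matrix with columns fit, lwr, upr
ci <- predict(g3, newdata = grid, interval = "confidence", level = 0.95)
grid <- cbind(grid, ci)

plot(y ~ x, data = d)
lines(fit ~ x, data = grid, col = "red")                  # fitted quadratic
lines(upr ~ x, data = grid, lty = "dashed", col = "red")  # upper limit
lines(lwr ~ x, data = grid, lty = "dashed", col = "red")  # lower limit

# For comparison: the range covered by the grid used in the question
range(exp(seq(min(d$x), max(d$x), length.out = 100)))
```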

Tags: r glm


【Solution 1】:

How about this? The band represents the 95% CI.

x <- c(0.25,0.5,0.75,1,1.25,1.5,1.75,2,2.25,2.5,2.75,3,3.25)
y <- c(5.516,5.725,5.9781,6,6.453,6.88,7.3,11,11.89,15.6,21.3,27,32.8)
d <- data.frame(x = x,
                y = y)

mdl <- glm(y ~ x + I(x^2) + I(x^3), data = d)

prd <- predict(mdl, newdata = d[, "x", drop = FALSE], se.fit = TRUE)
d$fit <- prd$fit
d$lci <- d$fit - 1.96 * prd$se.fit
d$uci <- d$fit + 1.96 * prd$se.fit

library(ggplot2)
ggplot(d, aes(x = x, y = y, ymin = lci, ymax = uci)) +
  theme_bw() +
  geom_point(size = 3) +
  geom_smooth(aes(x = x, y = fit), stat = "identity")

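【Comments】:

  • This doesn't use the segmented glm function from the OP, does it?
  • Of course not, but it describes the data, @Kate.
  • Yes, that's fine. But I am looking for a way to do this with a segmented glm. The exponential function used in the question was only there to illustrate how I would work out the confidence interval for that approach.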
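
For readers who want a band around a broken-line fit specifically, here is a minimal base-R sketch. It is not the segmented package's own method but a swapped-in technique: a continuous piecewise-linear lm with a hinge term pmax(x - psi, 0), with the breakpoint chosen by a simple grid search on the residual sum of squares. The names fit_at, candidates, psi_hat and grid are hypothetical, and the resulting interval is conditional on the chosen breakpoint. It ignores the uncertainty in estimating that breakpoint, so the band is probably too narrow near the break.

```r
# Data from the question
x <- c(0.25,0.5,0.75,1,1.25,1.5,1.75,2,2.25,2.5,2.75,3,3.25)
y <- c(5.516,5.725,5.9781,6,6.453,6.88,7.3,11,11.89,15.6,21.3,27,32.8)
d <- data.frame(x = x, y = y)

# Continuous piecewise-linear fit for a given breakpoint psi:
# pmax(x - psi, 0) is zero left of psi and increases linearly right of it
fit_at <- function(psi) lm(y ~ x + pmax(x - psi, 0), data = d)

# Grid search: keep the candidate breakpoint with the smallest RSS
candidates <- seq(0.5, 3, by = 0.01)
rss <- sapply(candidates, function(p) deviance(fit_at(p)))
psi_hat <- candidates[which.min(rss)]
fit <- fit_at(psi_hat)

# Pointwise 95% confidence band, conditional on psi_hat (assumption:
# the breakpoint is treated as known, not estimated)
grid <- data.frame(x = seq(min(d$x), max(d$x), length.out = 200))
grid <- cbind(grid, predict(fit, newdata = grid, interval = "confidence"))

plot(y ~ x, data = d, pch = 21, bg = "white")
lines(fit ~ x, data = grid, col = "red")                  # broken line
lines(upr ~ x, data = grid, lty = "dashed", col = "red")  # upper limit
lines(lwr ~ x, data = grid, lty = "dashed", col = "red")  # lower limit
```

If the breakpoint uncertainty matters, one option is to bootstrap the whole procedure, grid search included, and take pointwise quantiles of the refitted lines.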
【Solution 2】:

Building on @Roman's answer, here is a similar approach that may be closer to what you are asking for:

x <- c(0.25,0.5,0.75,1,1.25,1.5,1.75,2,2.25,2.5,2.75,3,3.25)
y <- c(5.516,5.725,5.9781,6,6.453,6.88,7.3,11,11.89,15.6,21.3,27,32.8)
d <- data.frame(x = x,
                y = y)
d$thing <- c(rep("a",8), rep("b",5))

library(ggplot2)
ggplot(d, aes(x = x, y = y, group = thing)) +
  geom_point() +
  theme_bw() +
  stat_smooth(method = "lm", formula = y ~ I(x^2) + I(x^3),
              fill = NA, linetype = 3, geom = "ribbon", colour = "red") +
  stat_smooth(method = "lm", formula = y ~ I(x^2) + I(x^3),
              fill = "transparent", colour = "black")

