【Posted】: 2014-08-13 15:47:06
【Problem description】:
I have this data:
m1 <- structure(list(Run = c("A013", "A015", "A023", "A024", "A031",
"A032", "A035", "A040", "A045", "A046", "A049", "A013", "A015",
"A023", "A024", "A031", "A032", "A035", "A040", "A045", "A046",
"A013", "A015", "A023", "A024", "A031", "A032", "A035", "A040",
"A013", "A015", "A023", "A024", "A031", "A032", "A035", "A040",
"A013", "A015", "A023", "A024", "A031", "A032", "A013", "A015",
"A023", "A024", "A013", "A015", "A023", "A024"), Step = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L,
7L, 7L), .Label = c("1", "e", "k", "2", "q", "b", "m"), class = "factor"),
Weight = c(87.4064, 79.5822, 117.0674, 102.6384, 134.0752,
111.2398, 107.8464, 111.2576, 104.2428, 110.2848, 28.7292,
41.65656, 73.9356, 84.18504, 89.4845, 71.55106, 86.04072,
76.27296, 92.8749, 85.203, 91.92112, 39.5009258, 58.6035081,
75.13589946, 83.43157667, 88.8993795, 68.85183559, 64.77081269,
77.56733054, 32.5025, 51.45329, 66.29101, 73.79125, 79.95483,
60.9573, 58.34856, 68.83193, 29.65289, 40.74267, 56.97243,
61.48708, 70.24226, 54.79253, 22.8231064, 38.9966088, 55.2736576,
62.6077916, 20.7458048, 38.306526, 54.7937568, 61.1417148
)), .Names = c("Run", "Step", "Weight"), row.names = c(NA,
-51L), class = "data.frame")
I am trying to get a nice geom_smooth() with a 0.99 confidence level:
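library(ggplot2)
library(directlabels)
# One line per run, labelled at its first point
g1 <- ggplot(m1, aes(x = Step, y = Weight, label = Run, group = Run, color = Run)) +
  geom_point() +
  geom_line()
g2 <- g1 + geom_dl(method = "first.bumpup")
# A single smoother across all runs, with a 99% band
g2 + geom_smooth(aes(group = 1), level = 0.99)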
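Here are my problems:
The error band does not look like 99% confidence; many of the points in the plot fall outside it.
When I expand the dataset, the error ribbon shrinks to a very narrow band, and most of the points fall outside it.
Am I doing something wrong here? Thanks,
Edit: this is what I see when I run it. When I look at a larger dataset, the ribbon becomes even narrower and sits almost on top of the smoothed line.
【Discussion】: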
-
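What you are describing is the confidence interval for the mean. I think what you are after is a prediction interval. Shameless self-promotion: rpubs.com/RomanL/7024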
-
You are right. I was confused about what a confidence interval means in this case. I will explore your suggestion.
-
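It is a pity you did not post an example here as an answer; I would have marked it as correct. Also, it would be nice to derive an equation that could be applied to a specific point, e.g. A046 at step e, to predict its weight at the last step, step m.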
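
The distinction raised in the first comment can be sketched in base R. This is only an illustration under stated assumptions: it uses a hypothetical toy data frame (`toy`, not the poster's `m1`), treats the step as a number from 1 to 7, and swaps in a plain `lm()` fit instead of the loess smoother that geom_smooth() picks by default for small data. The confidence band describes uncertainty about the mean and keeps shrinking as more data are added, whereas the prediction band describes where a single new observation is expected to fall, so it is always wider.

```r
# Hypothetical toy data (an assumption, not the poster's m1):
# weights that decline over 7 numeric steps, with noise
set.seed(1)
toy <- data.frame(step = rep(1:7, each = 6))
toy$weight <- 110 - 9 * toy$step + rnorm(nrow(toy), sd = 12)

# Simple linear fit, used here in place of geom_smooth()'s default loess
fit <- lm(weight ~ step, data = toy)
new_steps <- data.frame(step = 1:7)

# 99% confidence interval: uncertainty about the MEAN weight at each step
conf_band <- predict(fit, new_steps, interval = "confidence", level = 0.99)
# 99% prediction interval: range expected to contain a NEW single observation
pred_band <- predict(fit, new_steps, interval = "prediction", level = 0.99)

ci_width <- conf_band[, "upr"] - conf_band[, "lwr"]
pi_width <- pred_band[, "upr"] - pred_band[, "lwr"]

# Share of the observed points that fall inside a given band
# (row i of each band corresponds to step i)
inside <- function(band) {
  lwr <- band[toy$step, "lwr"]
  upr <- band[toy$step, "upr"]
  mean(toy$weight >= lwr & toy$weight <= upr)
}
coverage_ci <- inside(conf_band)
coverage_pi <- inside(pred_band)

print(cbind(step = 1:7, ci_width, pi_width))
print(c(confidence = coverage_ci, prediction = coverage_pi))
```

To draw the wider band in ggplot2, one option is to bind `pred_band` to `new_steps` and pass the result to `geom_ribbon(aes(ymin = lwr, ymax = upr))`; the same `predict()` call with a one-row `newdata` gives an interval for a single step of interest.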