How can I overlay by-group plot elements to ggplot2 facets? Answers

[Question title]: How can I overlay by-group plot elements to ggplot2 facets?
[Posted]: 2026-02-14 13:20:04
[Question description]:

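My question has to do with faceting. In the example code below, I look at some faceted scatterplots, then try to overlay information (in this case, mean lines) on a per-facet basis.

The tl;dr version is that my attempts fail. Either the mean lines I add are computed across all the data (ignoring the facet variable), or I try to write a formula and R throws an error, followed by incisive and particularly disparaging comments about my mother.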

library(ggplot2)

# Let's pretend we're exploring the relationship between a car's weight and its
# horsepower, using some sample data
p <- ggplot()
p <- p + geom_point(aes(x = wt, y = hp), data = mtcars)
print(p)

# Hmm. A quick check of the data reveals that car weights can differ wildly, by almost
# a thousand pounds.
head(mtcars)

# Does the difference matter? It might, especially if most 8-cylinder cars are heavy,
# and most 4-cylinder cars are light. ColorBrewer to the rescue!
p <- p + aes(color = factor(cyl))
p <- p + scale_color_brewer(palette = "Set1")
print(p)

# At this point, what would be great is if we could more strongly visually separate
# the cars out by their engine blocks.
p <- p + facet_grid(~ cyl)
print(p)

# Ah! Now we can see (given the fixed scales) that the 4-cylinder cars flock to the
# left on weight measures, while the 8-cylinder cars flock right. But you know what
# would be REALLY awesome? If we could visually compare the means of the car groups.
p.with.means <- p + geom_hline(
                      aes(yintercept = mean(hp)),
                      data = mtcars
         )
print(p.with.means)

# Wait, that's not right. That's not right at all. The green (8-cylinder) cars are all above the
# average for their group. Are they somehow made in an auto plant in Lake Wobegon, MN? Obviously,
# I meant to draw mean lines factored by GROUP. Except also obviously, since the code below will
# print an error, I don't know how.
p.with.non.lake.wobegon.means <- p + geom_hline(
                                       aes(yintercept = mean(hp) ~ cyl),
                                       data = mtcars
                                     )
print(p.with.non.lake.wobegon.means)

There has to be some simple solution that I'm missing.

[Discussion]:

Tags: r ggplot2

[Solution 1]:

You mean something like this:

library(plyr)  # provides ddply() and .()
rs <- ddply(mtcars, .(cyl), summarise, mn = mean(hp))

p + geom_hline(data=rs,aes(yintercept=mn))

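It might be possible to do this within the ggplot call using stat_*, but I'd have to go back and tinker a bit. In general, though, if I'm adding summaries to a faceted plot, I calculate the summaries separately and then add them with their own geom.

EDIT

Just a few expanded notes on your original attempt. Generally it's a good idea to put the aes calls that will persist throughout the plot in ggplot, and then specify different data sets or aesthetics only in those geoms that differ from the "base" plot. Then you don't need to keep specifying data = ... in each geom.

Finally, I came up with a kind of clever use of geom_smooth to do something similar to what you're asking: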
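As a minimal sketch of the same "summarise first, then add a geom" workflow without plyr, the per-group means can be computed with base R's aggregate(). The names group_means, mean_hp and p_means are illustrative and not from the original answer.

```r
library(ggplot2)

# Per-cylinder mean horsepower; aggregate() is base R, used here in place of ddply().
group_means <- aggregate(hp ~ cyl, data = mtcars, FUN = mean)
names(group_means)[names(group_means) == "hp"] <- "mean_hp"

# The summary data frame keeps the cyl column, so each row can be matched to a facet.
p_means <- ggplot(mtcars, aes(x = wt, y = hp, colour = factor(cyl))) +
  geom_point() +
  facet_grid(~ cyl) +
  geom_hline(data = group_means, aes(yintercept = mean_hp))
print(p_means)
```

The key point is that the summary data carries the faceting variable, which is what lets each line land in its own panel.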

p <- ggplot(data = mtcars,aes(x = wt, y = hp, colour = factor(cyl))) + 
    facet_grid(~cyl) + 
    geom_point() + 
    geom_smooth(se=FALSE,method="lm",formula=y~1,colour="black")

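The horizontal line (i.e. the constant regression eqn) will only extend to the limits of the data in each facet, but it skips the separate data summary step.

[Comments]: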
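The reason a constant regression draws the group mean can be checked directly: the least-squares intercept of an intercept-only model is the sample mean. The sketch below fits hp ~ 1 separately within each cylinder group, on the assumption that this mirrors the per-panel fit; by_cyl, intercepts and group_means are illustrative names.

```r
# Fit an intercept-only model within each cylinder group.
by_cyl <- split(mtcars, mtcars$cyl)
intercepts <- sapply(by_cyl, function(d) unname(coef(lm(hp ~ 1, data = d))[1]))

# Plain per-group means, for comparison.
group_means <- sapply(by_cyl, function(d) mean(d$hp))

print(cbind(intercepts, group_means))
print(all.equal(intercepts, group_means))
```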

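So in your strategy, the by-group summary is computed separately by ddply and then handed to geom_hline(), whereas I had been trying to force geom_hline to accept nothing but a summary formula. Your workflow makes a lot of sense.
Thanks... see my edit for another way to accomplish something similar (at least in this case). In general, the workflow you describe is a good idea.
I like your "clever use of geom_smooth()", but I'm confused again. Why does geom_smooth() respect each facet? Does it have something to do with the formula = y ~ 1 you specified?
No; that's why I was trying to explain setting the data and aes values in the original ggplot call. Calling ggplot() with no arguments means you're starting from scratch with each geom. In your case, I think the confusion comes from the order in which ggplot does things: first it evaluates mean(hp), and then it passes that on to any faceting. That's why you can't get geom_hline to respect your faceting; ggplot doesn't know about the faceting when it computes the mean. (But I'm speculating a bit here...)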
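To illustrate the point about evaluation order outside of ggplot2: an expression such as mean(hp) applied to the whole data frame collapses to a single number, whereas a per-row group mean still varies with cyl. This is only a sketch of the arithmetic, not of ggplot2 internals; mtcars_with_means and per_group are illustrative names.

```r
# mean(hp) over the full data: one number, identical for every panel.
overall_mean <- mean(mtcars$hp)

# ave() returns the group mean repeated on each row, so the value varies with cyl.
mtcars_with_means <- transform(mtcars, mean_hp = ave(hp, cyl))
per_group <- unique(mtcars_with_means[, c("cyl", "mean_hp")])

print(overall_mean)
print(per_group)
```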