【Posted】: 2026-02-04 06:20:13
【Problem description】:
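I am trying to create a custom histogram that includes a rug plot showing the raw values along the x-axis.
I will use the mtcars dataset to illustrate. It is not the best dataset for this question, but hopefully readers will understand what I am trying to achieve.
The basic histogram, without any attempt at a rug plot, is shown below.
I want to build the histogram with geom_bar because that gives more flexibility for custom bins.
I also want a small gap between the histogram bars (i.e. width = 0.95), which adds to the complexity of the problem.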
library(dplyr)
library(ggplot2)
# create custom bins
vct_seq <- c(seq(from = 10, to = 25, by = 5), 34)
mtcars$bin <- cut(mtcars$mpg, breaks = vct_seq)
# create data.frame for the ggplot graph..using bins above
df_mtcars_count <- mtcars %>% group_by(bin) %>% summarise(count = n())
# indicative labels
vct_labels <- c("bin 1", "bin 2", "bin 3", "bin 4")
# attempt 1 - basic plot -- no rug plot
p <- ggplot(data = df_mtcars_count, aes(x = bin, y = count))
p <- p + geom_bar(stat = "identity", width = 0.95)
p <- p + geom_text(aes(label = count), vjust = -0.5)
p <- p + scale_x_discrete("x title to go here", labels = df_mtcars_count$bin, breaks = df_mtcars_count$bin)
p
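Before plotting, it can help to check what the custom breaks actually produce. The following is a minimal sketch using base R only; the variable names (breaks, bins, bin_levels, bin_counts, bin_widths) are illustrative and not part of the code above.

```r
# Sketch: inspect the bins produced by the custom breaks (base R only).
breaks <- c(seq(from = 10, to = 25, by = 5), 34)   # 10, 15, 20, 25, 34
bins <- cut(mtcars$mpg, breaks = breaks)

# cut() builds right-closed intervals by default, e.g. (10,15]
bin_levels <- levels(bins)
bin_counts <- table(bins)

# The bins do not all have the same width: the last one is wider
bin_widths <- diff(breaks)

print(bin_levels)
print(bin_counts)
print(bin_widths)
```

The unequal bin widths matter later on: any mapping from mpg to bar positions has to account for the last bin covering a wider range than the others.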
Next, an attempt to add a basic rug plot on the x-axis. This clearly does not work, because geom_bar and geom_rug are on completely different scales.
# attempt 2 with no scaling.... doesn't work as x scale for ordinal (bins) and
# x scale for continuous (mpg) do not match
p <- ggplot(data = df_mtcars_count, aes(x = bin, y = count))
p <- p + geom_bar(stat = "identity", width = 0.95)
p <- p + geom_text(aes(label = count), vjust = -0.5)
p <- p + scale_x_discrete("x title to go here", labels = df_mtcars_count$bin, breaks = df_mtcars_count$bin)
p <- p + geom_rug(data = mtcars, aes(x = mpg), inherit.aes = FALSE, alpha = 0.3)
p
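The mismatch can be seen without drawing anything. A discrete x scale in ggplot2 places its levels at consecutive integer positions, while the rug layer is given raw mpg values. The sketch below compares the two ranges in base R; bar_positions and rug_range are illustrative names.

```r
# Sketch: compare the positions used by the two layers (base R only).
breaks <- c(10, 15, 20, 25, 34)
bins <- cut(mtcars$mpg, breaks = breaks)

# The four bin levels sit at integer positions 1, 2, 3, 4 on a discrete scale
bar_positions <- seq_along(levels(bins))

# The rug layer receives raw mpg values, which cover a very different range
rug_range <- range(mtcars$mpg)

print(bar_positions)
print(rug_range)
```

Every mpg value is larger than the position of the last bar, so the rug marks cannot line up with the bars unless they are remapped.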
Now, an attempt to rescale the mpg column to match the ordinal scale.
First, define a linear mapping function:
fn_linear_map <- function(vct_existing_val, vct_new_range) {
  # example....converts 1:20 into the range 1 to 10 like this:
  # fn_linear_map(1:20, c(1, 10))
  fn_r_diff <- function(x) x %>% range() %>% diff()
  flt_ratio <- fn_r_diff(vct_new_range) / fn_r_diff(vct_existing_val)
  vct_old_min_offset <- vct_existing_val - min(vct_existing_val)
  vct_new_range_val <- (vct_old_min_offset * flt_ratio) + min(vct_new_range)
  return(vct_new_range_val)
}
Now apply the function. We try to map mpg onto the range 1 to 4 (an attempt to match the ordinal scale):
mtcars$mpg_remap <- fn_linear_map(mtcars$mpg, c(1, 4))
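To see what a single linear map does, one can push the bin edges through the same transformation and compare them with where the bars are drawn. This is a rough sketch in base R; linear_map() is a hypothetical stand-in for fn_linear_map, with an extra old_range argument added here so that the edges are mapped using the range of mpg.

```r
# Sketch: where do the bin edges land under one linear map? (base R only)
# linear_map() is a hypothetical stand-in for fn_linear_map.
linear_map <- function(x, new_range, old_range = range(x)) {
  ratio <- diff(range(new_range)) / diff(old_range)
  (x - old_range[1]) * ratio + min(new_range)
}

breaks <- c(10, 15, 20, 25, 34)
mpg_range <- range(mtcars$mpg)

# Map the bin edges with the same transformation that is applied to mpg
edges_mapped <- linear_map(breaks, c(1, 4), old_range = mpg_range)

# With width = 0.95, bar i is drawn from i - 0.475 to i + 0.475
bar_left  <- 1:4 - 0.95 / 2
bar_right <- 1:4 + 0.95 / 2

print(round(edges_mapped, 3))
print(rbind(bar_left, bar_right))
```

Under these assumptions the edge at 25 lands to the left of where the fourth bar starts, and the largest mpg value lands at 4, the centre of the fourth bar rather than its right-hand edge.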
Try the plot again. Getting closer, but not quite accurate:
# attempt 3: getting closer but doesn't really match the ordinal scale
p <- ggplot(data = df_mtcars_count, aes(x = bin, y = count))
p <- p + geom_bar(stat = "identity", width = 0.95)
p <- p + geom_text(aes(label = count), vjust = -0.5)
p <- p + scale_x_discrete("x title to go here", labels = df_mtcars_count$bin, breaks = df_mtcars_count$bin)
p <- p + geom_rug(data = mtcars, aes(x = mpg_remap), inherit.aes = FALSE, alpha = 0.3)
p
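The plot above is getting closer to what I want, but the rug does not line up with the actual data. For example, the maximum observation (33.9) should sit almost at the right-hand edge of its bar. See below: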
mtcars %>% filter(bin == "(25,34]") %>% arrange(mpg) %>% dplyr::select(mpg, mpg_remap)
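One direction that might give the desired alignment is to remap each value within its own bin rather than across the whole range. The sketch below is an assumption, not a confirmed solution: map_to_bar() is a hypothetical helper, and it assumes bar i is centred at position i with the given width.

```r
# Sketch of a possible per-bin remapping (an assumption, not a confirmed fix).
# map_to_bar() is a hypothetical helper name.
map_to_bar <- function(x, breaks, width = 0.95) {
  # Index of the bin each value falls in, using right-closed intervals like cut()
  idx <- as.integer(cut(x, breaks = breaks))
  lower <- breaks[idx]
  upper <- breaks[idx + 1]
  # Fraction of the way through the bin: near 0 at the left edge, 1 at the right
  frac <- (x - lower) / (upper - lower)
  # Bar idx is assumed to span idx - width / 2 to idx + width / 2
  idx + width * (frac - 0.5)
}

breaks <- c(10, 15, 20, 25, 34)
mpg_pos <- map_to_bar(mtcars$mpg, breaks)
bin_idx <- as.integer(cut(mtcars$mpg, breaks = breaks))

# Position of the largest observation (33.9) relative to bar 4
print(max(mpg_pos))
```

With this mapping every value stays inside the horizontal extent of its own bar, and 33.9 falls just inside the right-hand edge of the fourth bar. The resulting positions could be supplied to geom_rug in place of mpg_remap, though the rendered plot has not been checked here.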
【Discussion】: