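【Posted】: 2021-06-19 15:56:43
【Question】:
I have a data frame (df) with the following columns: horse name, age, and speed data (value). Initially, I plotted the data with ggplot's geom_boxplot to look at the mean speed value by age.
Now I want to make the same plot, but this time including only horses that ran more than 3 races as two-year-olds, and I am struggling to work out how to do that.
I tried grouping by (horse, age), then summarising the number of races for each horse at each age, and finally filtering out cases where, at age 2, n
Can anyone think of an elegant way to do this? It seems simple, but I am struggling.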
library(tidyverse)
#> Warning: package 'ggplot2' was built under R version 4.0.5
library(brew)
#> Warning: package 'brew' was built under R version 4.0.3
df <- tibble(horse = c("a","a","a","a","a","a","a","a","a","a","b","b","b","b","b","b","c","c","c","c","c","c","c","c","c","c","c","c","d","d","d","d","d","d"),
             age = c(2,2,2,2,2,3,3,3,4,4,2,2,3,3,3,4,2,2,2,2,2,3,3,3,3,3,4,4,2,3,3,3,3,4),
             value = c(20,21,19,23,20,17,16,23,24,14,23,24,18,19,16,19,17,24,19,18,17,15,18,12,12,14,15,11,23,24,14,23,24,18))
df
#> # A tibble: 34 x 3
#>    horse   age value
#>    <chr> <dbl> <dbl>
#>  1 a         2    20
#>  2 a         2    21
#>  3 a         2    19
#>  4 a         2    23
#>  5 a         2    20
#>  6 a         3    17
#>  7 a         3    16
#>  8 a         3    23
#>  9 a         4    24
#> 10 a         4    14
#> # ... with 24 more rows
df %>%
  ggplot(aes(x = as.factor(age), y = value, fill = as.factor(age))) +
  geom_boxplot(alpha = 0.7) +
  # Mark the mean of each age group with a red point
  # (`fun` replaces the deprecated `fun.y` argument)
  stat_summary(fun = mean, geom = "point", shape = 20, size = 8, color = "red", fill = "red") +
  # Print the mean above each point
  stat_summary(fun = mean, geom = "text", col = "black",
               vjust = -1.5, aes(label = paste("X:", round(after_stat(y), digits = 1)))) +
  theme(legend.position = "none") +
  scale_fill_brewer(palette = "Set1")
Created on 2021-06-19 by the reprex package (v0.3.0)
【Discussion】:
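One possible way to do the filtering described in the question is sketched below. It is only a sketch under stated assumptions: it takes "more than 3 races at age two" to mean a strict count greater than 3, it keeps every row (all ages) for the horses that qualify, and the names min_races and df_filtered are made up for illustration. Instead of summarising and joining back, it uses a grouped filter() so that the per-horse count is evaluated in place.

```r
library(dplyr)

# Same sample data as in the question
df <- tibble(
  horse = rep(c("a", "b", "c", "d"), times = c(10, 6, 12, 6)),
  age   = c(2,2,2,2,2,3,3,3,4,4,2,2,3,3,3,4,2,2,2,2,2,3,3,3,3,3,4,4,2,3,3,3,3,4),
  value = c(20,21,19,23,20,17,16,23,24,14,23,24,18,19,16,19,17,24,19,18,17,
            15,18,12,12,14,15,11,23,24,14,23,24,18)
)

# Hypothetical threshold: a horse must have MORE than this many rows at age 2
min_races <- 3

df_filtered <- df %>%
  group_by(horse) %>%
  # sum(age == 2) counts the age-2 races within each horse's group;
  # the condition is a single TRUE/FALSE per horse, so whole horses are kept or dropped
  filter(sum(age == 2) > min_races) %>%
  ungroup()

# Races per horse and age among the horses that were kept
count(df_filtered, horse, age)
```

The resulting df_filtered can then be piped into the same ggplot call as before in place of df. If only the age-2 rows are wanted, or if "3 or more" is the intended rule, the condition inside filter() would need to be adjusted accordingly.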