【Posted】: 2019-01-18 20:09:10
【Question】:
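I have a dataset from which I want to keep only the top n categories, ranked by how often each category occurs, and then plot them using the other variables in the dataset. In other words, I aggregate to find the top n levels, but I then need to get back to the full data to plot it in ggplot.
So in the example below, I want the two most common examNames, then plot their counts by year and facet_wrap them.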
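library(dplyr)    # count(), inner_join(), select(), %>%
library(tibble)   # tribble()
library(ggplot2)  # ggplot(), geom_line(), facet_wrap()

ap <-
tribble(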
~year, ~examName,
2014, "Statistics",
2015, "Statistics",
2016, "Statistics",
2016, "Statistics",
2016, "Statistics",
2016, "Statistics",
2017, "Statistics",
2017, "Statistics",
2017, "Statistics",
2017, "Statistics",
2017, "Statistics",
2013, "Macroeconomics",
2013, "Macroeconomics",
2014, "Macroeconomics",
2015, "Macroeconomics",
2016, "Macroeconomics",
2016, "Macroeconomics",
2016, "Macroeconomics",
2016, "Macroeconomics",
2016, "Macroeconomics",
2017, "Macroeconomics",
2017, "Macroeconomics",
2017, "Macroeconomics",
2017, "Macroeconomics",
2017, "Macroeconomics",
2017, "Macroeconomics",
2013, "Calculus",
2014, "Calculus",
2015, "Calculus",
2016, "Calculus",
2017, "Calculus",
2017, "Psychology",
2017, "Psychology",
2017, "Psychology",
2017, "Psychology",
2017, "Psychology",
2018, "Psychology",
2018, "Psychology")
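# Keep only the rows that belong to the two most frequent exam names
ap_top <- ap %>%
  count(examName, sort = TRUE) %>%      # frequency per exam, descending
  head(2) %>%                           # keep the top 2 exams
  inner_join(ap, by = "examName") %>%   # join back to recover every original row
  select(-n)                            # drop the helper count column

# Count rows per exam and year, then draw one panel per exam
ap_top %>%
  count(examName, year) %>%
  ggplot(aes(x = year, y = n, group = examName)) +
  geom_line() +
  facet_wrap(~ examName)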
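My idea is to get my top n, then inner_join back to the original dataset and use the result for plotting; essentially, the inner join acts as a filter.
I know there must be a better way to do this, and I'm hoping for a more elegant solution. I'm all ears! A sample dataset is included above (sorry it's so long).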
【Comments】: