在 ggplot 中绘制分位数回归敏感性答案

【问题标题】：Plotting quantile regression sensitivity in ggplot在 ggplot 中绘制分位数回归敏感性
【发布时间】：2020-04-08 10:00:39
【问题描述】：

我想请教如何绘制分位数标准错误，例如 R 中 quantreg 包中的基本功能

library(quantreg)
library(ggplot2)
library(dplyr)

QR.2 <- rq(hp ~  disp + mpg + I(mpg^2) + qsec + am, data = mtcars, tau = 1:9/10)
plot(summary(QR.2, se="boot"), ols=T)

上图显示了分位数标准误差和置信区间。有没有办法在 ggplot 中重现相同的情节？

尝试下面的代码是一个很好的开始，但是 geom_ribbon(aes(ymin=conf.low,ymax=conf.high),alpha=0.25, fill="#27408b") 确实返回“置信区间”，但显然这些与上面的情节不同。

有没有办法获得conf。如上图所示的间隔？

rq(data=mtcars,
   tau= 1:9/10,
   formula = hp ~  disp + mpg + I(mpg^2) + qsec + am) %>% 
  broom::tidy() %>% 
  filter(!grepl("factor", term)) %>%   
  filter(!grepl("Intercept", term)) %>%   
  ggplot(aes(x=tau,y=estimate))+
  geom_point(color="#27408b", size = 3)+ 
  geom_line(color="#27408b", size = 1)+ 
  geom_smooth(method=  "lm", colour = "red", se = T)+  
  facet_wrap(~term, scales="free", ncol=2) + 
  geom_ribbon(aes(ymin=conf.low,ymax=conf.high),alpha=0.25, fill="#27408b")

【问题讨论】：

标签： r ggplot2 regression

【解决方案1】：

要得到等值，我认为你需要使用broom::tidy(se.type = "boot") %>%，否则标准误差是使用不同的方法计算的。

基础 R 输出：

ggplot2 等效：

  rq(data=mtcars, 
     tau= 1:9/10,
     formula = hp ~  disp + mpg + I(mpg^2) + qsec + am) %>%
  broom::tidy(se.type = "boot") %>%
  filter(!grepl("factor", term)) %>%
  ggplot(aes(x=tau,y=estimate))+
  geom_point(color="#27408b", size = 3)+ 
  geom_line(color="#27408b", size = 1)+ 
  geom_smooth(method=  "lm", colour = "red", se = T)+  
  facet_wrap(~term, scales="free", ncol=2) + 
  geom_ribbon(aes(ymin=conf.low,ymax=conf.high),alpha=0.25, fill="#27408b")

【讨论】：

非常感谢我在这方面花了很多时间

【解决方案2】：

Petr 估计的九个分位数回归模型的图显示，对于每个常数边际效应，qr 点估计值（对于每个选定的九个十分位数）和相应的置信区间（逐点，使用默认 alpha）。 Petr 选择同时显示 OLS 点估计值（红色），再次显示 ci（红色虚线），这是一种经常用于比较条件中位数和条件均值的常规做法。后者在 quantreg 生成的图中正确显示 - 作为水平线。答案中使用 ggplot2 显示的红线显然是不同的。

【讨论】：

【解决方案3】：

接受的答案没有解决所提出的问题，因为它没有绘制 OLS 点估计（红色）和 ci（红色虚线），因为它是由 quantreg 包完成的（正如 Perl 突出显示的那样）。这个问题的正确答案如下：

# OLS
lm <- lm(data=mtcars, 
         formula =  hp ~  disp + mpg + I(mpg^2) + qsec + am)

ols <- as.data.frame(coef(lm))
ols.ci <- as.data.frame(confint(lm))
ols2 <- cbind(ols, ols.ci)
ols2 <- tibble::rownames_to_column(ols2, var="term")

# Quantile
rq(data=mtcars, 
   tau= 1:9/10,
   formula = hp ~  disp + mpg + I(mpg^2) + qsec + am) %>%
  broom::tidy(se.type = "boot") %>%
  filter(!grepl("factor", term)) %>%
  ggplot(aes(x=tau,y=estimate))+
  
  # quantilie results
  geom_point(color="#27408b", size = 3)+ 
  geom_line(color="#27408b", size = 1)+ 
  geom_ribbon(aes(ymin=conf.low,ymax=conf.high),alpha=0.25, fill="#27408b")+

  # OLS results
  geom_hline(data = ols2, aes(yintercept= `coef(lm)`), lty=1, color="red", size=1)+
  geom_hline(data = ols2, aes(yintercept= `2.5 %`), lty=2, color="red", size=1)+
  geom_hline(data = ols2, aes(yintercept= `97.5 %`), lty=2, color="red", size=1)+
  facet_wrap(~term, scales="free", ncol=2)

【讨论】：

您对 OLS 的看法是正确的。由于某种原因，“conf.high/low”不是用 broom::tidy 生成的，所以我在加载“broom”包后将其替换为tidy(se.type = "rank", conf.int = TRUE, conf.level = 0.95)