【问题标题】:R prediction package VS Stata marginsR 预测包 VS Stata 边距
【发布时间】:2018-01-07 23:22:12
【问题描述】:

我正在从 Stata 切换到 R,当我使用预测计算边际 pred 时发现不一致的结果,而 Stata 命令的结果 ma​​rgins 将变量的值固定为 x。示例如下:

library(dplyr)
library(prediction)

d <- data.frame(x1 = factor(c(1,1,1,2,2,2), levels = c(1, 2)),
            x2 = factor(c(1,2,3,1,2,3), levels = c(1, 2, 3)),
            x3 = factor(c(1,2,1,2,1,2), levels = c(1, 2)),
            y = c(3.1, 2.8, 2.5, 4.3, 4.0, 3.5))

m2 <- lm(y ~ x1 + x2 + x3, d)
summary(m2)

marg2a <- prediction(m2, at = list(x2 = "1"))
marg2b <- prediction(m2, at = list(x1 = "1"))

marg2a %>%
  select(x1, fitted) %>%
  group_by(x1) %>%
  summarise(error = mean(fitted))

marg2b %>%
  select(x2, fitted) %>%
  group_by(x2) %>%
  summarise(error = mean(fitted))

这是结果:

# A tibble: 2 x 2
      x1    error
   <fctr>    <dbl>
1      1 3.133333
2      2 4.266667


# A tibble: 3 x 2
      x2 error
  <fctr> <dbl>
1      1 3.125
2      2 2.825
3      3 2.425

如果我尝试使用 Stata 的边距进行复制,结果如下:

regress y i.x1 i.x2 i.x3
margins i.x1, at(x2 == 1)
margins i.x2, at(x1 == 1)


------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |
          1  |      3.125   .0829157    37.69   0.017     2.071456    4.178544
          2  |      4.275   .0829157    51.56   0.012     3.221456    5.328544
------------------------------------------------------------------------------

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x2 |
          1  |      3.125   .0829157    37.69   0.017     2.071456    4.178544
          2  |      2.825   .0829157    34.07   0.019     1.771456    3.878544
          3  |      2.425   .0829157    29.25   0.022     1.371456    3.478544
------------------------------------------------------------------------------

x2 的边距在 R 和 Stata 中是相同的,但在 x1 中存在差异,我不知道为什么。非常感谢任何帮助。谢谢,

P

【问题讨论】:

标签: r stata prediction lm


【解决方案1】:

您的 Stata 和 R 代码不等价。要复制该 Stata 代码,您需要:

> prediction(m2, at = list(x1 = c("1", "2"), x2 = "1"))
Average predictions for 6 observations:
 at(x1) at(x2) value
      1      1 3.125
      2      1 4.275
> prediction(m2, at = list(x2 = c("1", "2", "3"), x1 = "1"))
Average predictions for 6 observations:
 at(x2) at(x1) value
      1      1 3.125
      2      1 2.825
      3      1 2.425

那是因为当您说 margins i.x1 时,您要求对数据集的反事实版本进行预测,其中 x1 被替换为 1,然后被替换为 2,并且在两个反事实 x2 中都保持为 1 . 在您的第二个 Stata 示例中也发生了同样的事情。

这是因为 Stata 的 margins 命令有一个歧义,或者更确切地说是两个获得相同输出的句法表达式。一个是你的代码:

. margins i.x1, at(x2 == 1)

Predictive margins                              Number of obs     =          6
Model VCE    : OLS

Expression   : Linear prediction, predict()
at           : x2              =           1

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |
          1  |      3.125   .0829156    37.69   0.017     2.071457    4.178543
          2  |      4.275   .0829156    51.56   0.012     3.221457    5.328543
------------------------------------------------------------------------------

另一个更明确地说明了上面实际发生的情况:

. margins, at(x1 = (1 2) x2 == 1)

Predictive margins                              Number of obs     =          6
Model VCE    : OLS

Expression   : Linear prediction, predict()

1._at        : x1              =           1
               x2              =           1

2._at        : x1              =           2
               x2              =           1

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |      3.125   .0829156    37.69   0.017     2.071457    4.178543
          2  |      4.275   .0829156    51.56   0.012     3.221457    5.328543
------------------------------------------------------------------------------

【讨论】:

    猜你喜欢
    • 1970-01-01
    • 2018-01-07
    • 2016-08-04
    • 2023-03-30
    • 1970-01-01
    • 1970-01-01
    • 2012-06-09
    • 2014-10-23
    • 1970-01-01
    相关资源
    最近更新 更多