Cross-sectional regression of many companies in R: answers

【Question title】: Cross-sectional regression of many companies in R
【Posted】: 2017-11-12 11:44:22
【Question description】:

I have two data frames named lagcolmean and Dropcolmax, where the row names are companies and the column names are monthly dates.

                      00-02  00-03  00-04
 TENAGA NASIONAL      0.39    0.07  -0.08
 SIME DARBY          -0.09   -0.12  -0.53
 DIGI.COM             0.79    0.96  -1.14
 GENTING             -0.11   -0.27  -0.16
 PETRONAS GAS        -0.30   -0.09  -0.98
and
                   00-01    00-02   00-03
TENAGA NASIONAL     5.61    3.95    4.12
SIME DARBY         10.87    1.97    6.78
DIGI.COM           21.21    9.61    25.40
GENTING            11.55    2.87    4.34
PETRONAS GAS        1.79    1.27    4.75

When I try to run cross-sectional regressions to find the slope coefficient for each period, I use this code:

library(broom)
fit4 <- lapply(names(Dropcolmax), function(x){
  dd = tidy(lm(lagcolmean[[x]] ~ Dropcolmax[[x]]))
  data.frame(name = x, dd)})

But it produces the following error message: Error in model.frame.default(formula = lagcolmean[[x]] ~ Dropcolmax[[x]], : invalid type (NULL) for variable 'lagcolmean[[x]]'
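Judging from the sample tables alone, a likely cause is that the two data frames do not share all column names: Dropcolmax has a 00-01 column that lagcolmean lacks, and `[[` returns NULL for a column that does not exist. The sketch below rebuilds the sample data (assuming the column names are kept exactly as printed, via check.names = FALSE) to show this:

```r
# Sample data copied from the question; check.names = FALSE keeps names like "00-02".
companies <- c("TENAGA NASIONAL", "SIME DARBY", "DIGI.COM", "GENTING", "PETRONAS GAS")
lagcolmean <- data.frame(
  "00-02" = c(0.39, -0.09, 0.79, -0.11, -0.30),
  "00-03" = c(0.07, -0.12, 0.96, -0.27, -0.09),
  "00-04" = c(-0.08, -0.53, -1.14, -0.16, -0.98),
  row.names = companies, check.names = FALSE
)
Dropcolmax <- data.frame(
  "00-01" = c(5.61, 10.87, 21.21, 11.55, 1.79),
  "00-02" = c(3.95, 1.97, 9.61, 2.87, 1.27),
  "00-03" = c(4.12, 6.78, 25.40, 4.34, 4.75),
  row.names = companies, check.names = FALSE
)

# Columns of Dropcolmax that have no counterpart in lagcolmean.
missing_cols <- setdiff(names(Dropcolmax), names(lagcolmean))
print(missing_cols)

# Asking for a missing column with [[ gives NULL, which lm() cannot use.
print(is.null(lagcolmean[["00-01"]]))
```

Looping only over the shared column names avoids the NULL.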

【Question comments】:

Tags: r

【Solution 1】:

Change names to rownames and adjust your lm specification:

fit4 <- lapply(rownames(Dropcolmax), function(x){  
  dd = tidy(lm(as.numeric(lagcolmean[rownames(lagcolmean)==x,]) ~ as.numeric(Dropcolmax[rownames(Dropcolmax)==x,])))
  data.frame(name = x, dd)})

You will get (by row):

> fit4
[[1]]
             name                                                term   estimate std.error statistic   p.value
1 TENAGA NASIONAL                                         (Intercept) -0.9721944 0.4851794 -2.003783 0.2946863
2 TENAGA NASIONAL as.numeric(Dropcolmax[rownames(Dropcolmax) == x, ])  0.2409783 0.1050042  2.294939 0.2616084

[[2]]
        name                                                term      estimate  std.error   statistic   p.value
1 SIME DARBY                                         (Intercept) -0.2518569598 0.41291644 -0.60994656 0.6513227
2 SIME DARBY as.numeric(Dropcolmax[rownames(Dropcolmax) == x, ])  0.0007936228 0.05517726  0.01438315 0.9908440

Edit 2: if you want the results by column:

indc=names(Dropcolmax) %in% names(lagcolmean)
fit5 <- lapply(names(Dropcolmax)[indc], function(x){  
  df=data.frame(lagcolmean[x] , Dropcolmax[x])
  dd = tidy(lm(df[,1]~df[,2]))
  data.frame(name = x, dd)})

You will get:

> fit5
[[1]]
    name        term   estimate  std.error statistic    p.value
1 X00.02 (Intercept) -0.3597760 0.12858456 -2.797972 0.06796718
2 X00.02     df[, 2]  0.1260234 0.02606482  4.834999 0.01687099

[[2]]
    name        term   estimate   std.error statistic    p.value
1 X00.03 (Intercept) -0.3545013 0.107404102 -3.300631 0.04571181
2 X00.03     df[, 2]  0.0511678 0.008772428  5.832799 0.01003905
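For readers who want one table with a row per period and term, the following is a minimal sketch of the same per-column idea. It is not the answerer's code: it swaps broom::tidy for base R's coef(summary(...)) so it runs without extra packages, rebuilds the sample data from the question, and assumes companies should be matched by row name. The names periods, per_period and result are made up for illustration.

```r
# Sample data copied from the question; check.names = FALSE keeps names like "00-02".
companies <- c("TENAGA NASIONAL", "SIME DARBY", "DIGI.COM", "GENTING", "PETRONAS GAS")
lagcolmean <- data.frame(
  "00-02" = c(0.39, -0.09, 0.79, -0.11, -0.30),
  "00-03" = c(0.07, -0.12, 0.96, -0.27, -0.09),
  "00-04" = c(-0.08, -0.53, -1.14, -0.16, -0.98),
  row.names = companies, check.names = FALSE
)
Dropcolmax <- data.frame(
  "00-01" = c(5.61, 10.87, 21.21, 11.55, 1.79),
  "00-02" = c(3.95, 1.97, 9.61, 2.87, 1.27),
  "00-03" = c(4.12, 6.78, 25.40, 4.34, 4.75),
  row.names = companies, check.names = FALSE
)

# Only periods present in both data frames can be regressed.
periods <- intersect(names(Dropcolmax), names(lagcolmean))

per_period <- lapply(periods, function(p) {
  # One observation per company; align the regressor by company row name.
  d <- data.frame(y = lagcolmean[[p]],
                  x = Dropcolmax[rownames(lagcolmean), p])
  est <- coef(summary(lm(y ~ x, data = d)))  # matrix of coefficient statistics
  data.frame(period = p, term = rownames(est), est,
             row.names = NULL, check.names = FALSE)
})

# Stack the per-period results into a single data frame.
result <- do.call(rbind, per_period)
print(result)
```

On the sample data the slope for period 00-02 comes out at about 0.126 and the one for 00-03 at about 0.051, in line with the fit5 output above.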

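【Comments】:

Is this what you mean: library(broom) fit4
Thank you very much for your effort. But as you can see, the output shows the slope and intercept coefficients for each company, which is a time-series regression. I want the slope and intercept coefficients for each period instead, for example the slope for period 00-02, then the slope for 00-03. @Robert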