Which linear model summary row corresponds to which term in the formula? Answers

【Question title】: Which linear model summary row corresponds to which term in formula?
【Posted】: 2026-02-18 09:05:01
【Question description】:

The summary of a linear model uses certain strings to label the coefficients in its output. For example:

summary(lm(
 target ~ some.bool + some.factor + some.factor*some.value +
          some.factor:some.other,
 data.frame(target=rnorm(100), some.bool=sample(c(T, F), 100, T),
  some.factor=sample(c('Y', 'N', 'M'), 100, T), some.value=rnorm(100),
  some.other=rnorm(100))))

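produces a table with the row names: some.boolTRUE, some.factorN, some.factorY, some.value, some.factorN:some.value, some.factorY:some.value, some.factorM:some.other, some.factorN:some.other, some.factorY:some.other.

How can I find out programmatically which rows of the table correspond to which terms of the input formula? I would like to obtain a mapping such as: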

`some.boolTRUE`           → some.bool
`some.factorN`            → some.factor, some.factor*some.value
`some.factorY`            → some.factor, some.factor*some.value
`some.value`              → some.factor*some.value
`some.factorN:some.value` → some.factor*some.value
`some.factorN:some.other` → some.factor:some.other

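My goal is to prepare a specific presentation of the results in which the linear regression output is grouped by input term.

【Question discussion】: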

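Perhaps you need to consult a statistics text. FWIW, x*y is shorthand for x + y + x:y.
@RomanLuštrik: I know. I just want to match rows to the terms the user typed, whatever they are, programmatically.
Could you elaborate on what exactly the input and expected output are, and why you need this?
@Roland: Thanks for the suggestion; I added an example and my goal.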
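
To make the remark about `x*y` concrete, here is a small sketch that asks R to expand the example formula. It only uses `terms()` from the stats package; the variable names `f` and `term_labels` are chosen for this illustration and are not part of the original question.

```r
# Expand the formula from the question into its individual terms.
# No data is needed: terms() works on the formula alone.
f <- target ~ some.bool + some.factor + some.factor*some.value +
  some.factor:some.other

# "term.labels" holds one label per expanded term, so the `*` summand
# shows up as two main effects plus their interaction.
term_labels <- attr(terms(f), "term.labels")
print(term_labels)
```

The four typed summands become five expanded terms, because `some.factor*some.value` contributes `some.factor`, `some.value` and `some.factor:some.value`, and the duplicate `some.factor` is dropped. This is why a summary row cannot always be traced back to exactly one typed summand.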

Tags: r formula

【Solution 1】:

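So, I noticed that the code generating these names sits deep inside the model.matrix function, which calls out to an external C function. I can recover the names built for a term with the following hack (term is an expression/symbol object taken from the formula itself):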

names.for.term <- function(term, data, order.as.in=term) {
  # construct a simple formula that has only the requested term
  f <- formula(substitute(~ x, list(x=term)))

  # make a terms object for manipulation
  term.terms <- terms(f, data=data)

  # what order do we want to consider variables in?
  requested.order <- na.omit(match(
    row.names(attr(terms(order.as.in), 'factors')),
    row.names(attr(term.terms, 'factors'))))

  # force the order of variables (setting row.names is enough;
  # values in this array are not important for the process of building
  # strings if you have only a single summand. if not, good luck)
  row.names(attr(term.terms, 'factors')) <-
    row.names(attr(term.terms, 'factors'))[requested.order]

  # we need model frame object to have columns in the same order as
  # rows above; types of variables (e.g. factors) are inferred from here
  m <- model.frame(f, data)[requested.order]

  # call deep into C code
  dimnames(.External2(stats:::C_modelmatrix, term.terms, m))[[2]][-1]
}

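Ugly, but it works. Because the strings depend on the order in which that call encounters variables across the terms, you will probably want to pass the full formula as order.as.in. The only thing left is to invert the mapping, which is trivial at this point.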
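
For comparison, here is a sketch of a different route that avoids the internal C entry point. It is not the method used in this answer: it relies on the documented `assign` attribute of `model.matrix()`, which records for every design-matrix column the index of the expanded term it came from. The names `d`, `fit`, `coef_to_term`, `user_terms` and `coef_to_user` are made up for this illustration, and the typed summands are listed by hand as an assumption rather than parsed from the formula.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
d <- data.frame(
  target      = rnorm(100),
  some.bool   = sample(c(TRUE, FALSE), 100, TRUE),
  some.factor = sample(c("Y", "N", "M"), 100, TRUE),
  some.value  = rnorm(100),
  some.other  = rnorm(100)
)
fit <- lm(target ~ some.bool + some.factor + some.factor*some.value +
            some.factor:some.other, data = d)

# Column i of the design matrix belongs to expanded term number
# assign_idx[i]; the value 0 marks the intercept.
mm          <- model.matrix(fit)
assign_idx  <- attr(mm, "assign")
term_labels <- attr(terms(fit), "term.labels")

# Coefficient name -> expanded term label.
coef_to_term <- setNames(c("(Intercept)", term_labels)[assign_idx + 1L],
                         colnames(mm))

# Inverted mapping: expanded term label -> coefficient names.
term_to_coefs <- split(names(coef_to_term), coef_to_term)

# Assumption: the summands as the user typed them, written out by hand.
user_terms <- c("some.bool", "some.factor", "some.factor*some.value",
                "some.factor:some.other")

# Expand each typed summand on its own, e.g. "a*b" -> a, b, a:b.
expanded <- lapply(user_terms, function(s)
  attr(terms(reformulate(s)), "term.labels"))

# Coefficient name -> every typed summand whose expansion contains its term.
coef_to_user <- lapply(coef_to_term, function(lbl)
  user_terms[vapply(expanded, function(e) lbl %in% e, logical(1))])

print(coef_to_user[["some.factorN"]])
```

One caveat: the label of an interaction depends on the order in which its variables first appear, so a summand expanded on its own can be labelled differently from the same term inside the full formula (for example if the user writes `some.value:some.factor` somewhere). That is the same ordering issue the `order.as.in` argument addresses above, and this sketch does not handle it.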

【Discussion】: