【Question title】: How to pass a formula as an argument to a function in R?
【Posted】: 2018-09-30 08:24:16
【Question description】:

How do I pass a formula as an argument in R?

The code below works for the first two cases, but when I pass the formula into my own function I get an error: Error in model.frame.default(formula = formula, weights = weights, na.action = na.omit, : invalid type (closure) for variable '(weights)'

makeModel<-function(formula,weights) {
    m <- lm(formula, na.action = na.omit, weights = weights)
    return(m);
}
run<-function(t) {
    f<-formula(t$y~t$x+t$r)
    m <- lm(t$y~t$x+t$r, na.action = na.omit, weights = t$size)
    m <- lm(f, na.action = na.omit, weights = t$size)
    m <- makeModel(f,t$size)
}
l<-20
x<-seq(0,1,1/l)
y<-sqrt(x)
r=round(runif(n=length(x),min=0,max=.8))
n<-1:(l+1)
size=n/sum(n)
t<-data.frame(x,y,r,n,size)
run(t)

Edit 1: This code:

makeModel<-function(formula,weights,t) {
    print(class(weights))
    m <- lm(formula, na.action = na.omit, weights = weights,data=t)
    return(m);
}
run<-function(t) {
    f<-formula(y~x+r)
    f <- as.formula("t$y~t$x+t$r")
    m <- lm(y~x+r, na.action = na.omit, weights = t$size,data=t)
    m <- lm(f, na.action = na.omit, weights = t$size,data=t)
    m <- makeModel(f,t$size,t)    
}

produces:

Error in model.frame.default(formula = formula, data = t, weights = weights, : invalid type (closure) for variable '(weights)'

Edit 2: This works:

makeModel <- function(formula, data) {
    # size is looked up in data first, which is why this works
    m <- lm(formula, na.action = na.omit, weights = size, data =  data) # works
    #m <- lm(formula, na.action = na.omit, weights = data$size, data =  data) # fails!
    return(m)
}

R is weird!

Does anyone know why the weights = data$size line fails?

Edit 3: Got weights = data$size to work.

makeModel<-function(formula,w,data) {
    print(class(weights))
    m <- lm(formula, na.action = na.omit, weights = size, data =  data) # works
    m <- lm(formula, na.action = na.omit, weights = data$size, data =  data) #works
    m <- lm(formula, na.action = na.omit, weights = w,data=data) # fails
    return(m);
}
run<-function(data) {
    f<-formula(y~x+r)
    #f <- as.formula("t$y~t$x+t$r")
    m <- lm(y~x+r, na.action = na.omit, weights = data$size,data=data)
    m <- lm(f, na.action = na.omit, weights = data$size,data=data)
    m <- makeModel(f,data$size,data)    
}

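The last one fails with: Error in eval(extras, data, env) : object 'w' not found

【Comments】:

  • That does not seem to work. See the edit.
  • Regarding your new question, please look at my post, where I stress that the environment in which you create the formula makes a difference.
  • Yes, but it is hard to understand. I missed the part about using t.

Tags: r formula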


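【Solution 1】:

See the examples in ?as.formula. You should not reference the variables explicitly through the data frame's name (t$y and so on). The formula should be abstract: lm knows which variables to pull from data, and data is what you should specify.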

makeModels <- function(formula, data) {
  # size is looked up in data first, which is why this works
  m <- lm(formula, na.action = na.omit, weights = size, data =  data)
  return(m)
}

run <- function(t) {
  f <- formula(y ~ x + r)
  m1 <- lm(formula = f, na.action = na.omit, weights = size, data = t)
  m2 <- makeModels(formula = f, data = t)
  return(list(m1, m2))
}

l<-20
x<-seq(0,1,1/l)
y<-sqrt(x)
r=round(runif(n = length(x), min = 0, max = 0.8))
n<-1:(l+1)
size=n/sum(n)
t<-data.frame(x,y,r,n,size)
run(t)

[[1]]

Call:
lm(formula = f, data = t, weights = size, na.action = na.omit)

Coefficients:
(Intercept)            x            r  
   0.327154     0.706553    -0.008167  


[[2]]

Call:
lm(formula = formula, data = data, weights = size, na.action = na.omit)

Coefficients:
(Intercept)            x            r  
   0.327154     0.706553    -0.008167  
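
When the variable names arrive as strings, the same abstract formula can be assembled rather than typed. The following is a minimal sketch, not part of the original answer: the names response, predictors, and d are made up for illustration, and the data mirrors the question's example with a seed added for reproducibility.

```r
# Hypothetical column names supplied as strings (an assumption for this sketch)
response   <- "y"
predictors <- c("x", "r")

# reformulate() from the stats package builds the formula y ~ x + r
f <- reformulate(predictors, response = response)

# Same data as in the question, with a seed so the random column is reproducible
set.seed(1)
l    <- 20
x    <- seq(0, 1, 1 / l)
y    <- sqrt(x)
r    <- round(runif(n = length(x), min = 0, max = 0.8))
n    <- 1:(l + 1)
size <- n / sum(n)
d    <- data.frame(x, y, r, n, size)

# The formula carries only names; lm() finds y, x, r and size in `d`
m_built   <- lm(f, data = d, weights = size, na.action = na.omit)
m_literal <- lm(y ~ x + r, data = d, weights = size, na.action = na.omit)
```

Both fits should give the same coefficients, since the built formula names the same columns as the literal one.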

【Comments】:

    【Solution 2】:

    Avoid naming an object t, since that clashes with the transpose function. Looking at the traceback yields

    makeModel<-function(formula,weights) {
      m <- lm(formula, na.action = na.omit, weights = weights)
      return(m)
    }
    run<-function(x) {
      f<-formula(x$y~x$x+x$r)
      m <- lm(x$y~x$x+x$r, na.action = na.omit, weights = x$size)
      m <- lm(f, na.action = na.omit, weights = x$size)
      m <- makeModel(f,x$size)    
    }
    l<-20
    x<-seq(0,1,1/l)
    y<-sqrt(x)
    r=round(runif(n=length(x),min=0,max=.8))
    n<-1:(l+1)
    size=n/sum(n)
    x<-data.frame(x,y,r,n,size)
    run(x)
    #R Error in model.frame.default(formula = formula, weights = weights, na.action = na.omit,  : 
    #R    invalid type (closure) for variable '(weights)'
    traceback()
    #R 7: model.frame.default(formula = formula, weights = weights, na.action = na.omit, 
    #R                        drop.unused.levels = TRUE)
    #R 6: stats::model.frame(formula = formula, weights = weights, na.action = na.omit, 
    #R                       drop.unused.levels = TRUE)
    #R 5: eval(mf, parent.frame())
    #R 4: eval(mf, parent.frame())
    #R 3: lm(formula, na.action = na.omit, weights = weights) at #3
    #R 2: makeModel(f, x$size) at #5
    #R 1: run(x)
    

    Stepping through with debug(model.frame.default) now shows the line where things go wrong. The reason is that it calls

    eval(list(weights = weights), environment(formula), environment(formula))
    

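    and no weights object is assigned in the run environment (the environment in which the formula was created), so the name resolves to stats::weights, which is a function. Three solutions are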

    makeModel <- function(formula, weights) {
      environment(formula) <- environment()
      lm(formula, na.action = na.omit, weights = weights)
    }
    run<-function(x) {
      f <- x$y ~ x$x + x$r
      makeModel(f, x$size)  
    }
    x1 <- run(x)
    
    makeModel <- function(formula, weights) {
      cl <- match.call()
      cl[[1L]] <- quote(lm)
      cl$na.action <- quote(na.omit)
      eval(cl, parent.frame())
    }
    run<-function(x) {
      f <- x$y ~ x$x + x$r
      makeModel(f, x$size)  
    }
    x2 <- run(x)
    
    makeModel <- function(formula, weights, x) {
      cl <- match.call()
      cl[[1]] <- quote(lm)
      cl$x <- NULL
      cl[c("data", "formula", "na.action")] <- 
        list(quote(x), formula, quote(na.omit))
      eval(cl)
    }
    run<-function(x) {
      f <- y ~ x + r
      makeModel(f, size, x)  
    }
    x3 <- run(x)
    
    stopifnot(all.equal(coef(x1), coef(x2)))
    stopifnot(all.equal(coef(x1), coef(x3), check.attributes = FALSE))
    

    For example, the first solution above means that

    eval(list(weights = weights), environment(formula), environment(formula))
    

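    succeeds, because a weights object is assigned in the environment of formula. The second solution makes the call in the run environment with weights = x$size, and so it succeeds. The third is similar to Roman Luštrik's answer, although his solution is cleaner than my third suggestion if you know that the weights argument is always the size column. Here the call is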

    eval(list(weights = size), data, environment(formula))
    

    This works because size is a column in data.
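
The lookup behind the original error can be reproduced without lm at all. This is a small sketch under assumptions: make_formula, env, found, and found_again are illustrative names that do not appear in the thread, and the code is meant to be run at the top level of a fresh session.

```r
# A formula remembers the environment in which it was created
make_formula <- function() {
  y ~ x
}
f   <- make_formula()
env <- environment(f)

# No variable called `weights` exists in that environment, so the lookup
# walks up the enclosing environments until it reaches the stats package
# and finds the weights() function instead of a numeric vector
found <- eval(quote(weights), env)
is.function(found)

# Once a `weights` object is assigned in the formula's environment,
# the same lookup returns it
assign("weights", c(0.2, 0.3, 0.5), envir = env)
found_again <- eval(quote(weights), env)
is.numeric(found_again)
```

The first lookup is what model.frame.default ran into: it received a function (a closure) where it expected a vector of weights.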

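    【Comments】:

    • All of this goes away if the user specifies the formula the right way. Even when the formula has to be constructed from variables, concatenating it and wrapping it in as.formula would be cleaner.
    • I agree, your solution is better if you know that weights is always called size. But what if the OP wants a function with the arguments function(formula, weights) (as he posted) rather than function(formula, data)? That said, I would rather avoid formulas with $ calls.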
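
One further way to keep a weights argument while leaving the formula abstract is to copy the weights into the data frame before fitting. This is only a sketch and not one of the solutions posted above: the column name .w is hypothetical, and the approach assumes data has no column of that name already.

```r
makeModel <- function(formula, weights, data) {
  # One weight per row is assumed
  stopifnot(length(weights) == nrow(data))
  # Store the weights as a (hypothetical) column so lm() finds them in `data`
  data[[".w"]] <- weights
  lm(formula, data = data, weights = .w, na.action = na.omit)
}

# Same data as in the question, with a seed for reproducibility
set.seed(1)
l    <- 20
x    <- seq(0, 1, 1 / l)
y    <- sqrt(x)
r    <- round(runif(n = length(x), min = 0, max = 0.8))
n    <- 1:(l + 1)
size <- n / sum(n)
d    <- data.frame(x, y, r, n, size)

m_wrapped <- makeModel(y ~ x + r, d$size, d)
m_direct  <- lm(y ~ x + r, data = d, weights = size, na.action = na.omit)
```

Because .w is resolved inside data, the environment of the formula no longer matters for the weights lookup.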