"Error in ... %*% ... : non-conformable arguments" using own function in regression: answers

【Question title】: "Error in ... %*% ... : non-conformable arguments" using own function in regression
【Posted】: 2026-01-09 16:50:01
【Question】:

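I have a function Q(x|delta): R^n --> R for which I want to fit a nonlinear quantile regression. Q(.) uses some matrix operations, and writing it without them would be very complicated. The problem is that nlrq (nonlinear quantile regression) and nls (nonlinear regression) do not seem to work when the function used in the formula argument contains matrix operations.

To illustrate, consider the simpler function F(x1,x2|a,b,c). Written without matrix operations, it works in the formula argument of nlrq and nls; written with matrix operations, it does not.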

    library('quantreg')

    ## Generating the data
    x1<- rnorm(200)
    x2<- rnorm(200)
    y<- 1+3*sin(x1)+2*cos(x2) +rnorm(200)
    Dat<- data.frame(y,x1,x2)

    ## The function F1 without matrix operation
    F1<- function(x_1, x_2, a, b,c){a+b*sin(x_1)+c*cos(x_2)}

    ## The function F2 with matrix operation
    F2<- function(x_1, x_2, a, b,c){t(c(1,sin(x_1),cos(x_2)))%*%c(a,b,c)}

    ## Both functions work perfectly
    F1(x_1=3, x_2=2, a=1, b=3,c=2)
    F2(x_1=3, x_2=2, a=1, b=3,c=2)

    ## But only F1 can be estimated by nls and nlrq
    nls_1<-nls(y ~ F1(x_1 = x1, x_2 = x2, a = 1, b, c),
               data = Dat, start = list(b = 3, c = 2))

    nlrq_1<-nlrq(y ~ F1(x_1 = x1, x_2 = x2, a = 1, b, c),
                 data = Dat, start = list(b = 3, c = 2), tau = 0.9)

    ## When F2 is used in the formula argument an error happens
    nls_2<-nls(y ~ F2(x_1 = x1, x_2 = x2, a = 1, b, c),
               data = Dat, start = list(b = 3, c = 2))

    nlrq_2<-nlrq(y ~ F2(x_1 = x1, x_2 = x2, a = 1, b, c),
                 data = Dat, start = list(b = 3, c = 2), tau = 0.9)

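The error is: Error in t(c(1, sin(x_1), cos(x_2))) %*% c(a, b, c) : non-conformable arguments. I believe that if someone manages to estimate F2 with nls and nlrq using matrix operations, I will be able to apply the same solution to my other functions.

Dat has dimensions 200x3.

Thank you very much.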

【Question comments】:

Sorry, I forgot to mention that nlrq requires the quantreg package, so add install.packages('quantreg') and library('quantreg') to the code.

Tags: r matrix-multiplication non-linear-regression

【Solution 1】:

Your function F2() does not work for vector arguments x_1, x_2, ... because c(...) just builds one long vector (not a matrix).
See:

    F1(x_1=c(3,5), x_2=c(2,4), a=1, b=3,c=2)
    F2(x_1=c(3,5), x_2=c(2,4), a=1, b=3,c=2)

Result:

    #> F1(x_1=c(3,5), x_2=c(2,4), a=1, b=3,c=2)
    #[1]  0.5910664 -3.1840601
    #> F2(x_1=c(3,5), x_2=c(2,4), a=1, b=3,c=2)
    #error in t(c(1, sin(x_1), cos(x_2))) %*% c(a, b, c) :  ...

The functions nls() and nlrq() pass whole vectors (the columns of the data frame Dat) to F2(), and likewise to F1().
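The shape mismatch can be inspected directly. The following is a minimal sketch in base R; the names lhs, X and beta are illustrative and do not appear in the original code.

```r
# Illustrative inputs of length n = 2, matching the call shown above
x_1 <- c(3, 5)
x_2 <- c(2, 4)
beta <- c(1, 3, 2)  # stands in for c(a, b, c)

# c() flattens everything into one vector of length 1 + 2 * n = 5,
# so t() yields a 1 x 5 row matrix
lhs <- t(c(1, sin(x_1), cos(x_2)))
dim(lhs)

# A 1 x 5 matrix cannot be multiplied by a length-3 vector
failed <- inherits(try(lhs %*% beta, silent = TRUE), "try-error")
failed  # TRUE

# cbind() recycles the constant 1 and keeps one row per observation,
# giving an n x 3 matrix that conforms with the 3 coefficients
X <- cbind(1, sin(x_1), cos(x_2))
dim(X)
X %*% beta
```

With scalar inputs the flattened vector happens to have length 3, which is why the original F2() appeared to work when called with single numbers.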

Here are some vectorized definitions of F2():

    # other definitions for F2()
    F2 <- function(x_1, x_2, a, b,c) cbind(1,sin(x_1),cos(x_2)) %*% c(a,b,c)
    F2(x_1=c(3,5), x_2=c(2,4), a=1, b=3,c=2)

    F2 <- function(x_1, x_2, a, b,c) t(rbind(1,sin(x_1),cos(x_2))) %*% c(a,b,c)
    F2(x_1=c(3,5), x_2=c(2,4), a=1, b=3,c=2)

    F2 <- function(x_1, x_2, a, b,c) colSums(rbind(1,sin(x_1),cos(x_2)) * c(a,b,c))
    F2(x_1=c(3,5), x_2=c(2,4), a=1, b=3,c=2)

    F2 <- function(x_1, x_2, a, b,c) crossprod(rbind(1,sin(x_1),cos(x_2)), c(a,b,c))
    F2(x_1=c(3,5), x_2=c(2,4), a=1, b=3,c=2)

    F2 <- function(x_1, x_2, a, b,c) tcrossprod(c(a,b,c), cbind(1,sin(x_1),cos(x_2)))
    F2(x_1=c(3,5), x_2=c(2,4), a=1, b=3,c=2)
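As a quick sanity check, two of these variants can be compared against F1() on the same inputs. This is only a sketch: the names F2_cbind, F2_colsums, ref, out_cbind and out_colsums are introduced here for illustration.

```r
F1 <- function(x_1, x_2, a, b, c) a + b * sin(x_1) + c * cos(x_2)

# Two of the vectorized variants, renamed so they can coexist
F2_cbind   <- function(x_1, x_2, a, b, c) cbind(1, sin(x_1), cos(x_2)) %*% c(a, b, c)
F2_colsums <- function(x_1, x_2, a, b, c) colSums(rbind(1, sin(x_1), cos(x_2)) * c(a, b, c))

x_1 <- c(3, 5)
x_2 <- c(2, 4)

ref         <- F1(x_1, x_2, a = 1, b = 3, c = 2)
out_cbind   <- F2_cbind(x_1, x_2, a = 1, b = 3, c = 2)    # n x 1 matrix
out_colsums <- F2_colsums(x_1, x_2, a = 1, b = 3, c = 2)  # plain numeric vector

# The values agree; only the shape of the result differs
all.equal(as.vector(out_cbind), ref)
all.equal(out_colsums, ref)
```

The variants return different shapes (a column matrix versus a plain vector), so wrapping the result in as.vector() is one way to keep downstream code independent of which definition is used.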

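【Comments】:

It had not occurred to me that the function would be called with vector arguments, but you gave me the idea of using rbind instead of c to build the vectors inside F2, and now the estimation works! Thank you very much.
Going back to my original problem, I see why you pointed out that the function not accepting vectors as arguments is the issue: nls and nlrq pass the entire explanatory-variable vectors as arguments in the formula.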

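【Solution 2】:

You can use a general-purpose optimization function for this. The usual default in R is optim, but there are many others.

Here is the least-squares regression case. The loss function is the residual sum of squares. I have rewritten your F2 function so that it works with vector arguments.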

    sumsq <- function(beta)
    {
        F2 <- function(x1, x2, a, b, c)
        {
            cbind(1, sin(x1), cos(x2)) %*% c(a, b, c)
        }
        yhat <- F2(Dat$x1, Dat$x2, beta[1], beta[2], beta[3])
        sum((Dat$y - yhat)^2)
    }

    beta0 <- c(mean(Dat$y), 1, 1)

    optim(beta0, sumsq, method="BFGS")

    #initial  value 731.387431 
    #final  value 220.265745 
    #converged
    #$par
    #[1] 0.8879371 3.0211286 2.1639280
    # 
    #$value
    #[1] 220.2657
    #
    #$counts
    #function gradient 
    #      25        7 
    #
    #$convergence
    #[1] 0
    #
    #$message
    #NULL

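Here, optim returns a list with several components. The component par holds the regression coefficient values that minimize the residual sum of squares, and the minimized sum itself is in the component value.

If you compare this with the nls result, you can see that the estimated coefficients are roughly equal (nls holds a fixed at 1 and estimates only b and c).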

    nls(y ~ F1(x_1=x1, x_2=x2, a=1, b, c),
               data=Dat, start=list(b=3, c=2))

    Nonlinear regression model
      model: y ~ F1(x_1 = x1, x_2 = x2, a = 1, b, c)
       data: Dat
        b     c 
    3.026 2.041 
     residual sum-of-squares: 221

    Number of iterations to convergence: 1 
    Achieved convergence tolerance: 7.823e-10

You can do something similar for quantile regression, but it is more complicated.
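One possible shape for the quantile case is sketched below, under several assumptions that are not in the original answer: the loss is the standard check (pinball) loss, the derivative-free Nelder-Mead method is chosen because that loss is not smooth, and set.seed(1) is added only so the simulated data are reproducible. The names check_loss, fit and share_below are illustrative. The result is a rough approximation and is not expected to match nlrq exactly.

```r
set.seed(1)  # assumption: fixed seed for reproducibility
x1 <- rnorm(200)
x2 <- rnorm(200)
y  <- 1 + 3 * sin(x1) + 2 * cos(x2) + rnorm(200)

tau <- 0.9

# Check loss: rho_tau(u) = u * (tau - 1{u < 0}), summed over residuals
check_loss <- function(beta) {
    yhat <- as.vector(cbind(1, sin(x1), cos(x2)) %*% beta)
    u <- y - yhat
    sum(u * (tau - (u < 0)))
}

beta0 <- c(mean(y), 1, 1)

# Nelder-Mead needs no gradient, which suits the kinked loss
fit <- optim(beta0, check_loss, method = "Nelder-Mead")
fit$par

# Share of observations at or below the fitted curve; should be near tau
share_below <- mean(y <= as.vector(cbind(1, sin(x1), cos(x2)) %*% fit$par))
share_below
```

Part of the extra complication is that the check loss has kinks, so gradient-based methods such as BFGS are a poor fit and a dedicated routine like nlrq is usually preferable when it applies.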

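【Comments】:

【Solution 3】:

Based on the other answers, I found that the problem was using c() to build the vectors inside F2. When I used rbind() instead, the estimation worked well with both nls() and nlrq().

Below is the corrected version of F2.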

    ## Changing c() for rbind()
    F2<- function(x_1, x_2, a, b,c){t(rbind(1,sin(x_1),cos(x_2)))%*%rbind(a,b,c)}

    ## Now nls() and nlrq() work properly
    nls_2<-nls(y ~ F2(x_1 = x1, x_2 = x2, a = 1, b, c),
       data = Dat, start = list(b = 3, c = 2))

    nlrq_2<-nlrq(y ~ F2(x_1 = x1, x_2 = x2, a = 1, b, c),
         data = Dat, start = list(b = 3, c = 2), tau = 0.9)

Note that the estimates in nls_2 and nlrq_2 agree with those in nls_1 and nlrq_1.

Thank you very much for your help.

【Comments】:
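The agreement can be checked programmatically. The sketch below covers only the nls fits so that it runs in base R; set.seed(1) and the tolerance of 1e-4 are assumptions added here. The same coef() comparison should apply to nlrq_1 and nlrq_2 when quantreg is loaded.

```r
set.seed(1)  # assumption: fixed seed for reproducibility
x1 <- rnorm(200)
x2 <- rnorm(200)
y  <- 1 + 3 * sin(x1) + 2 * cos(x2) + rnorm(200)
Dat <- data.frame(y, x1, x2)

# F1 without matrix operations, F2 with the rbind() correction
F1 <- function(x_1, x_2, a, b, c) a + b * sin(x_1) + c * cos(x_2)
F2 <- function(x_1, x_2, a, b, c) t(rbind(1, sin(x_1), cos(x_2))) %*% rbind(a, b, c)

nls_1 <- nls(y ~ F1(x_1 = x1, x_2 = x2, a = 1, b, c),
             data = Dat, start = list(b = 3, c = 2))
nls_2 <- nls(y ~ F2(x_1 = x1, x_2 = x2, a = 1, b, c),
             data = Dat, start = list(b = 3, c = 2))

# Both fits should report the same b and c up to numerical tolerance
all.equal(coef(nls_1), coef(nls_2), tolerance = 1e-4)
```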