R nls() Initial Parameter Problem, Nonlinear Regression

【Question title】: R nls() Initial Parameter Problem, Nonlinear Regression
【Posted】: 2020-09-19 02:03:18
【Question description】:

I am getting this error message:

Error in nlsModel(formula, mf, start, wts) : 
  singular gradient matrix at initial parameter estimates

when using the nls() function:

form_Q10_parabolic_SM <- as.formula(Lin_Flux..mymol.m.2.s.1. ~ (rRef<- 5.5354)*a*exp(b*Mean_Soil_Temp_V2..C.)*((-c*Soil_Moist_V3**2)+(d*Soil_Moist_V3)+e))
Q10_parabolic_SM <- nls(form_Q10_parabolic_SM, data = conB1_2015, start = list(a = 1, b = 0.11, c = 0.0001, d = 0.01, e = 0.1))

I obtained my initial parameters with the preview() function from the nlstools package, like this (using the same formula definition as above):

preview(form_Q10_parabolic_SM, data = conB1_2015, start = c(a = 1, b = 0.11, c = 0.0001, d = 0.01, e = 0.1), variable = 1)

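This gave me the following output for the parameters a-e above:

That looks quite good to me, but I really don't know what to do now, since preview() works fine.

Is my model too complex or over-parameterized? Or am I just doing something wrong with the nls function?

Any hints would be greatly appreciated!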

> dput(head(conB1_2015, 30))
structure(list(X = c(13L, 68L, 69L, 70L, 71L, 72L, 73L, 74L, 
75L, 76L, 77L, 78L, 79L, 80L, 81L, 82L, 83L, 84L, 85L, 86L, 87L, 
88L, 89L, 90L, 91L, 92L, 93L, 94L, 95L, 96L), IV_Date = c("2015-01-14", 
"2015-03-11", "2015-03-12", "2015-03-13", "2015-03-14", "2015-03-15", 
"2015-03-16", "2015-03-17", "2015-03-18", "2015-03-19", "2015-03-20", 
"2015-03-21", "2015-03-22", "2015-03-23", "2015-03-24", "2015-03-25", 
"2015-03-26", "2015-03-27", "2015-03-28", "2015-03-29", "2015-03-30", 
"2015-03-31", "2015-04-01", "2015-04-02", "2015-04-03", "2015-04-04", 
"2015-04-05", "2015-04-06", "2015-04-07", "2015-04-08"), SMmean010.... = c(24.5341666666667, 
23.4754166666667, 23.0585416666667, 22.830625, 22.7447916666667, 
22.7729166666666, 22.7929166666667, 22.7354166666667, 22.6579166666667, 
22.5935416666667, 22.5233333333333, 22.7641666666667, 23.6010416666667, 
23.445625, 23.404375, 23.2845833333333, 23.0672916666667, 22.9347916666667, 
22.8272916666667, 23.0316666666667, 23.988125, 25.5647916666667, 
27.055, 27.7995833333333, 26.23125, 25.4658333333333, 25.0845833333333, 
24.8175, 24.605, 24.4216666666667), Lin_Flux..mymol.m.2.s.1. = c(1.13, 
2.146, 1.98708333333333, 1.88416666666667, 1.57083333333333, 
1.93041666666667, 2.69875, 2.8075, 3.23272727272727, 2.35818181818182, 
2.23833333333333, 1.84958333333333, 2.18695652173913, 2.16958333333333, 
2.69791666666667, 3.025, 1.985, 1.88083333333333, 2.30416666666667, 
2.775, 1.44458333333333, 1.78791666666667, 1.04863636363636, 
1.03458333333333, 1.4725, 1.86833333333333, 1.71125, 1.79, 1.53166666666667, 
1.97666666666667), Mean_Soil_Temp_V2..C. = c(4.739, 5.1864, 4.08408333333333, 
3.61625, 3.68508333333333, 4.09925, 4.87079166666667, 5.64720833333333, 
6.58433333333333, 5.05075, 4.93708333333333, 4.109, 3.2295, 3.537, 
5.1395, 5.65270833333333, 5.931875, 5.61775, 5.88695833333333, 
6.86308333333333, 5.61833333333333, 4.24566666666667, 3.05952173913043, 
2.45716666666667, 3.6365, 3.68820833333333, 3.83766666666667, 
4.3435, 4.8745, 6.29133333333333), Soil_Moist_V3 = c(25.603137, 
21.98744709, 21.8053864833333, 21.6770563291667, 20.1319423708333, 
19.9826592666667, 19.8279438958333, 20.1589541791667, 21.5796382, 
21.5971315083333, 21.3742824541667, 21.8992939333333, 23.9737254583333, 
23.4506886041667, 23.0956395708333, 22.574581225, 22.3561680833333, 
21.3806269916667, 21.4045219791667, 21.5611478916667, 25.5090813166667, 
28.6440265, 31.4434210347826, 31.9276734541667, 27.5706909333333, 
25.1139413583333, 24.2945348333333, 24.0232171416667, 23.705631425, 
22.8323341625), precip50..mm. = c(0.6, 0, 0, 0, 0.9, 1.3, 0, 
0, 0, 0, 0, 6.6, 0, 0, 0, 0, 0.1, 0.2, 0.1, 6.1, 5, 17.6, 10.4, 
6.6, 0, 0, 0, 0, 0, 0), RWI = c(0.6, 0.4, 0.2, 0.133333333333333, 
0.9, 1.3, 1.3, 0.65, 0.433333333333333, 0.325, 0.26, 6.6, 6.6, 
3.3, 2.2, 1.65, 0.1, 0.2, 0.1, 6.1, 5, 17.6, 10.4, 6.6, 6.6, 
3.3, 2.2, 1.65, 1.32, 1.1)), na.action = structure(c(`1` = 1L, 
`2` = 2L, `3` = 3L, `4` = 4L, `5` = 5L, `6` = 6L, `7` = 7L, `8` = 8L, 
`9` = 9L, `10` = 10L, `11` = 11L, `12` = 12L, `13` = 13L, `15` = 15L, 
`16` = 16L, `17` = 17L, `18` = 18L, `19` = 19L, `20` = 20L, `21` = 21L, 
`22` = 22L, `23` = 23L, `24` = 24L, `25` = 25L, `26` = 26L, `27` = 27L, 
`28` = 28L, `29` = 29L, `30` = 30L, `31` = 31L, `32` = 32L, `33` = 33L, 
`34` = 34L, `35` = 35L, `36` = 36L, `37` = 37L, `38` = 38L, `39` = 39L, 
`40` = 40L, `41` = 41L, `42` = 42L, `43` = 43L, `44` = 44L, `45` = 45L, 
`46` = 46L, `47` = 47L, `48` = 48L, `49` = 49L, `50` = 50L, `51` = 51L, 
`52` = 52L, `53` = 53L, `54` = 54L, `55` = 55L, `56` = 56L, `57` = 57L, 
`58` = 58L, `59` = 59L, `60` = 60L, `61` = 61L, `62` = 62L, `63` = 63L, 
`64` = 64L, `65` = 65L, `66` = 66L, `67` = 67L, `68` = 68L, `199` = 199L, 
`218` = 218L, `219` = 219L, `220` = 220L, `221` = 221L, `222` = 222L, 
`223` = 223L, `224` = 224L, `225` = 225L, `226` = 226L, `227` = 227L, 
`228` = 228L, `229` = 229L, `230` = 230L, `231` = 231L, `232` = 232L, 
`264` = 264L, `265` = 265L, `266` = 266L, `267` = 267L, `352` = 352L, 
`353` = 353L, `354` = 354L, `355` = 355L, `356` = 356L, `357` = 357L, 
`358` = 358L, `359` = 359L, `360` = 360L, `361` = 361L, `362` = 362L, 
`363` = 363L, `364` = 364L, `365` = 365L, `366` = 366L), class = "omit"), row.names = c(14L, 
69L, 70L, 71L, 72L, 73L, 74L, 75L, 76L, 77L, 78L, 79L, 80L, 81L, 
82L, 83L, 84L, 85L, 86L, 87L, 88L, 89L, 90L, 91L, 92L, 93L, 94L, 
95L, 96L, 97L), class = "data.frame")

【Comments】:

Can you post sample data? Please edit the question with the output of dput(conB1_2015). Or, if that is too large, with the output of dput(head(conB1_2015, 30)).

Tags: r regression nls

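【Solution 1】:

The main problem is that the parameters are not uniquely identifiable. We can multiply a by any nonzero number and divide c, d, and e by the same number and get exactly the same model. Omit a.
Although using as.formula does no harm, it is unnecessary: the expression is already a formula.
Doing an assignment inside an nls formula is very unusual. nls will think that rRef is a parameter and fail on that account. Remove the assignment.

If we make these changes, then it does give an answer using the data in the updated version of the question.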

form_Q10_parabolic_SM <- Lin_Flux..mymol.m.2.s.1. ~ 
 exp(b*Mean_Soil_Temp_V2..C.) * ( (-c*Soil_Moist_V3**2) + (d*Soil_Moist_V3) + e)

Q10_parabolic_SM <- nls(form_Q10_parabolic_SM, data = conB1_2015, 
  start = list(b = 0.11, c = 0.0001, d = 0.01, e = 0.1))

giving:

> Q10_parabolic_SM
Nonlinear regression model
  model: Lin_Flux..mymol.m.2.s.1. ~ exp(b * Mean_Soil_Temp_V2..C.) * ((-c *     Soil_Moist_V3^2) + (d * Soil_Moist_V3) + e)
   data: conB1_2015
        b         c         d         e 
 0.103062 -0.001564 -0.135531  3.528621 
 residual sum-of-squares: 3.979

Number of iterations to convergence: 6 
Achieved convergence tolerance: 4.401e-06
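
The identifiability point can be checked numerically. The following is only a minimal sketch: the predictor values temp and moist, the helper model(), and the scaling factor k are made up for illustration and are not taken from the question's data.

```r
# Hypothetical predictor values, chosen only for illustration
temp  <- c(3, 4, 5, 6, 7)
moist <- c(20, 22, 24, 26, 28)

# The five-parameter model from the question (without the rRef assignment)
model <- function(a, b, c, d, e) {
  a * exp(b * temp) * (-c * moist^2 + d * moist + e)
}

k <- 10  # any nonzero scaling factor works

# Original starting values
pred1 <- model(a = 1,     b = 0.11, c = 0.0001,     d = 0.01,     e = 0.1)
# a multiplied by k; c, d and e divided by k
pred2 <- model(a = 1 * k, b = 0.11, c = 0.0001 / k, d = 0.01 / k, e = 0.1 / k)

# Both parameter sets give the same predictions, so least squares
# has no way to prefer one over the other
all.equal(pred1, pred2)
```

Because infinitely many parameter sets fit equally well, the gradient matrix has linearly dependent columns, which is consistent with the "singular gradient matrix" error.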

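plinear

Note that nls also has a plinear algorithm, which has the advantage that only the nonlinear parameters (here only b) need starting values. In that case the RHS of the formula should be a matrix whose columns each multiply one of the linear parameters. It gives the same answer as above (up to the convergence tolerance), except that the names of the linear parameters start with .lin. Note that the plinear version converges in fewer iterations than the version using the default algorithm above. (It also appears that the plinear version is not very sensitive to the starting value: it converges even if we use b=1 as the starting value.)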

fo <- Lin_Flux..mymol.m.2.s.1. ~ 
  cbind(-Soil_Moist_V3**2, Soil_Moist_V3, 1) * exp(b*Mean_Soil_Temp_V2..C.)
fm <- nls(fo, data = conB1_2015, start = list(b = 0.11), algorithm = "plinear")

giving:

> fm
Nonlinear regression model
  model: Lin_Flux..mymol.m.2.s.1. ~ cbind(-Soil_Moist_V3^2, Soil_Moist_V3,     1) * exp(b * Mean_Soil_Temp_V2..C.)
   data: conB1_2015
                 b              .lin1 .lin.Soil_Moist_V3              .lin3 
          0.103062          -0.001564          -0.135528           3.528593 
 residual sum-of-squares: 3.979

Number of iterations to convergence: 3 
Achieved convergence tolerance: 2.189e-06
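
To see why only b needs a starting value, note that once b is fixed the model is linear in the remaining coefficients, so they can be found by ordinary linear least squares. The sketch below illustrates that idea on synthetic, noise-free data. It is not how nls implements plinear internally, and temp, moist, b_true, lin_true and rss() are all invented for the example.

```r
# Synthetic, noise-free data; every value here is made up for illustration
temp  <- seq(2.5, 7, length.out = 30)
moist <- 26 + 6 * sin(1:30)

b_true   <- 0.1
lin_true <- c(0.002, 0.15, 1)  # play the roles of c, d and e

# Same design matrix as the RHS of the plinear formula, evaluated at a given b.
# Each row i of cbind(...) is scaled by exp(b * temp[i]).
design <- function(b) cbind(-moist^2, moist, 1) * exp(b * temp)

y <- drop(design(b_true) %*% lin_true)

# With b fixed, the linear coefficients come from plain linear least squares
lin_hat <- unname(qr.solve(design(b_true), y))
lin_hat

# Residual sum of squares as a function of b alone:
# the linear coefficients are re-solved for every candidate b
rss <- function(b) sum(qr.resid(qr(design(b)), y)^2)
sapply(c(0.05, 0.1, 0.2), rss)
```

The search is therefore one-dimensional (over b only), which is a plausible reason the plinear fit is less sensitive to the starting value.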

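【Discussion】:

I omitted a, but I still get the same error message. I need the assignment because I want to run the regression automatically on several CSV files with similar data, and the value changes slightly from table to table. But the nls function works with such assignments in the simpler regressions I did, so I guess that is not the problem.
I cannot reproduce the error. Now that you have provided the data, I ran it with that data and the 3 suggested changes, and it gives the answer shown.
Thanks for your answer. I removed the rRef variable and it does work that way. I still wonder why this variable became a problem, since the regression with Lin_Flux..mymol.m.2.s.1. ~ (rRef)*a*exp(b*Mean_Soil_Temp_V2..C.) works without problems. I can try the plinear approach, thanks for the info!
It probably extracts the variable names from the formula and expects each of them to be either data or a parameter, but rRef is neither.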
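
The guess in the last comment can be partly checked with all.vars(), which lists the symbols that appear in a formula. This is only a sketch: y and x are hypothetical stand-ins for the real column names, and whether nls treats rRef exactly this way internally is an assumption. The final lines show one possible way (not from the answer) to put a per-file constant into the formula as a plain number using bquote(); rRef_value is a hypothetical name.

```r
# Formula with an assignment inside, as in the question (y, x are stand-ins)
fo_assign <- y ~ (rRef <- 5.5354) * a * exp(b * x)
vars_assign <- all.vars(fo_assign)
vars_assign  # rRef is listed as a variable alongside y, a, b and x

# Without the assignment, only the data columns and parameters remain
fo_plain <- y ~ a * exp(b * x)
vars_plain <- all.vars(fo_plain)
vars_plain

# One possible alternative: substitute the numeric value into the formula,
# so that no extra symbol appears in it at all
rRef_value <- 5.5354
fo_const <- eval(bquote(y ~ .(rRef_value) * a * exp(b * x)))
vars_const <- all.vars(fo_const)
vars_const
```

Building the formula this way inside a loop over files would let the constant change per file without an assignment in the formula itself.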