Fitting multiple logistic growth curves in a dataset

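【Question title】: Fitting multiple logistic growth curves in a dataset
【Posted】: 2017-09-18 08:29:30
【Question description】:

I have population data for multiple counties and want to fit a logistic growth curve to each county with as little repeated code as possible.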

county      year    pop
lake        1970    69305
lake        1980    104870
lake        1990    152104
lake        2000    210528
lake        2010    297052
marion      1970    69030
marion      1980    122488
marion      1990    194833
marion      2000    258916
marion      2010    331298
seminole    1970    83692
seminole    1980    179752
seminole    1990    287529
seminole    2000    365196
seminole    2010    422718

Currently I am subsetting the data for each county:

lake <- countypop[1:5, 2:3]
colnames(lake) <- c("year", "pop")
marion <- countypop[6:10, 2:3]
colnames(marion) <- c("year", "pop")
seminole <- countypop[11:15, 2:3]
colnames(seminole) <- c("year", "pop")

Then I use SSlogis to fit and plot the curve for each county, for example:

lake.model <- nls(pop ~ SSlogis(year, phi1, phi2, phi3), data = lake)
alpha <- coef(lake.model)
plot(pop ~ year, data = lake, main = "Logistic Growth Model of Lake County",
     xlab = "Year", ylab = "Population", xlim = c(1920, 2030), ylim = c(0, 1000000))
curve(alpha[1]/(1 + exp(-(x - alpha[2])/alpha[3])), add = TRUE, col = "blue")

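I have about 60 counties, and I know there must be a cleaner way to do this. How can I use the apply functions, a loop, or something else to eliminate the repetition in my code?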

【Comments】:

Maybe try ?nlsList, e.g. lake.model <- nlsList(pop ~ SSlogis(year, phi1, phi2, phi3)|county, data = dat), where dat is the full dataset
When I try that, I get: Error in nlsList.formula(pop ~ SSlogis(year, phi1, phi2, phi3), data = countypop) : 'data' must be a "groupedData" object if 'formula' does not include groups
Using the data from Gabor's answer: nlme::nlsList(pop ~ SSlogis(year, phi1, phi2, phi3)|county, data = countypop) seems to work fine
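
The nlsList suggestion from the comments above could look roughly like the following sketch. It assumes the nlme package (normally shipped with R) and rebuilds the question's table as countypop; the object name fits is only illustrative. The grouping term "| county" is the part that was missing in the call that produced the "groupedData" error.

```r
library(nlme)  # provides nlsList()

# Rebuild the question's data (three counties, five census years each)
countypop <- data.frame(
  county = rep(c("lake", "marion", "seminole"), each = 5),
  year   = rep(seq(1970, 2010, by = 10), times = 3),
  pop    = c(69305, 104870, 152104, 210528, 297052,
             69030, 122488, 194833, 258916, 331298,
             83692, 179752, 287529, 365196, 422718)
)

# "| county" asks nlsList() to fit one SSlogis curve per county
fits <- nlsList(pop ~ SSlogis(year, phi1, phi2, phi3) | county,
                data = countypop)

# One row of (phi1, phi2, phi3) per county
print(coef(fits))
```

With only five points per county, some fits may be poorly determined, so the estimates are worth checking before they are used for projections.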

Tags: r nls

【Solution 1】:

Try this:

pdf("countypop.pdf")
models <- by(countypop, countypop$county, function(x) {
  fm <- nls(pop ~ SSlogis(year, phi1, phi2, phi3), data = x)
  plot(pop ~ year, x, main = county[1])
  lines(fitted(fm) ~ year, x)
  fm
})
dev.off()

Note: we used this as input:

countypop <- 
structure(list(county = c("lake", "lake", "lake", "lake", "lake", 
"marion", "marion", "marion", "marion", "marion", "seminole", 
"seminole", "seminole", "seminole", "seminole"), year = c(1970L, 
1980L, 1990L, 2000L, 2010L, 1970L, 1980L, 1990L, 2000L, 2010L, 
1970L, 1980L, 1990L, 2000L, 2010L), pop = c(69305L, 104870L, 
152104L, 210528L, 297052L, 69030L, 122488L, 194833L, 258916L, 
331298L, 83692L, 179752L, 287529L, 365196L, 422718L)), .Names = c("county", 
"year", "pop"), class = "data.frame", row.names = c(NA, -15L))

【Discussion】:
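
With about 60 counties, a single county whose nls() fit fails to converge would stop the whole by() call in the solution above. Here is a minimal sketch of one way around that, assuming the same SSlogis model; the helper fit_county and the names converged, coefs and pred_2020 are hypothetical and not part of the original answer.

```r
# Rebuild the question's data (three counties, five census years each)
countypop <- data.frame(
  county = rep(c("lake", "marion", "seminole"), each = 5),
  year   = rep(seq(1970, 2010, by = 10), times = 3),
  pop    = c(69305, 104870, 152104, 210528, 297052,
             69030, 122488, 194833, 258916, 331298,
             83692, 179752, 287529, 365196, 422718)
)

# Fit one county; return NULL instead of stopping if nls() fails
fit_county <- function(d) {
  tryCatch(
    nls(pop ~ SSlogis(year, phi1, phi2, phi3), data = d),
    error = function(e) NULL
  )
}

# One model (or NULL) per county, named by county
models <- lapply(split(countypop, countypop$county), fit_county)
converged <- !vapply(models, is.null, logical(1))

# Coefficient table: one row per county that could be fitted
coefs <- do.call(rbind, lapply(models[converged], coef))
print(coefs)

# Projected 2020 population for each fitted county
pred_2020 <- vapply(
  models[converged],
  function(fm) as.numeric(predict(fm, newdata = data.frame(year = 2020))),
  numeric(1)
)
print(pred_2020)
```

Counties listed in names(models)[!converged] would need a closer look, for example with hand-picked starting values instead of the self-starting SSlogis.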