How to plot the output from an nls model fit in ggplot2

【Question title】: How to plot the output from an nls model fit in ggplot2
【Posted】: 2018-03-29 18:10:39
【Question description】:

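I have some data, and I would like to use nls to fit a nonlinear model to each subset of the data, then use ggplot2 to superimpose the fitted models on a plot of the data points. Specifically, the model has the form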

y~V*x/(K+x)

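which you may recognize as Michaelis-Menten. One way to do this is with geom_smooth, but if I use geom_smooth I have no way of retrieving the coefficients of the model fit. Alternatively, I could fit the data with nls and then plot the lines fitted by geom_smooth, but how do I know that the curves drawn by geom_smooth are the same as those given by my nls fit? I can't pass the coefficients from my nls fit to geom_smooth and tell it to use them, unless I can get geom_smooth to use only a subset of the data; then I could specify starting parameters and that would work. But every time I have tried that, I get an error that reads: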

Aesthetics must be either length 1 or the same as the data (8): x, y, colour

Here is some sample data I have been working with:

Concentration <- c(500.0,250.0,100.0,62.5,50.0,25.0,12.5,5.0,
                   500.0,250.0,100.0,62.5,50.0,25.0,12.5,5.0)

drug <- c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2)

rate <- c(1.889220,1.426500,0.864720,0.662210,0.564340,0.343140,0.181120,0.077170,
          3.995055,3.011800,1.824505,1.397237,1.190078,0.723637,0.381865,0.162771)

file<-data.frame(Concentration,drug,rate)

In my plot, Concentration is x and rate is y; drug will be the color variable. If I write the following, I get that error:

plot <- ggplot(file,aes(x=file[,1],y=file[,3],color=Compound))+geom_point()

plot<-plot+geom_smooth(data=subset(file,file[,2]==drugNames[i]),method.args=list(formula=y~Vmax*x/(Km+x),start=list(Vmax=coef(models[[i]])[1],Km=coef(models[[i]])[2])),se=FALSE,size=0.5)

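where models[[]] is a list of the model parameters returned by nls.

Any ideas on how I can subset the data frame inside geom_smooth, so that I can plot the curves individually using the starting parameters from my nls fits?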

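【Comments on the question】:

Possible duplicate of "ggplot2 plot function with several arguments".
Unrelated, but using plot and file as variable names is not a good idea (functions with those names already exist).
Also: it would help to see the code that generates models.
I also noticed that the variables in your sample data do not match the code: e.g. drug, Compound and drugNames are all used.
I think these posts will help you: "Enzyme kinetics with R"; "Plotting two enzyme plots with ggplot".

Tags: r ggplot2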

【Solution 1】:

The ideal solution would be to plot the results of nls() using ggplot, but here is a "quick and dirty" solution based on a couple of observations.

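First, you can be sure that if you use the same formula for both nls() and geom_smooth(method = "nls") on the same data, you will get the same coefficients. That is because the latter calls the former.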
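One way to check this claim rather than take it on trust is to pull the smoothed curve back out of the plot and compare it with predict() from a separate nls() fit. The sketch below does that for drug 1 only. It assumes ggplot2 is installed; layer_data() is ggplot2's helper for extracting the data a layer computed, the object names (d1, fit, p, smooth_xy, nls_y, same) are illustrative, and the model formula is passed through geom_smooth()'s own formula argument.

```r
library(ggplot2)

# Example data from the question, rebuilt here so the block is self-contained
df1 <- data.frame(
  Concentration = rep(c(500, 250, 100, 62.5, 50, 25, 12.5, 5), times = 2),
  drug = rep(c(1, 2), each = 8),
  rate = c(1.88922, 1.4265, 0.86472, 0.66221, 0.56434, 0.34314, 0.18112, 0.07717,
           3.995055, 3.0118, 1.824505, 1.397237, 1.190078, 0.723637, 0.381865, 0.162771)
)
d1 <- subset(df1, drug == 1)

# Stand-alone fit with the same formula and starting values as the plot
fit <- nls(rate ~ Vmax * Concentration / (Km + Concentration),
           data = d1, start = list(Km = 50, Vmax = 2))

p <- ggplot(d1, aes(Concentration, rate)) +
  geom_point() +
  geom_smooth(method = "nls",
              formula = y ~ Vmax * x / (Km + x),
              method.args = list(start = list(Km = 50, Vmax = 2)),
              se = FALSE)

# Layer 2 is the smooth; its x/y columns are the points of the drawn curve
smooth_xy <- layer_data(p, 2)

# Evaluate the stand-alone fit at the same x positions and compare
nls_y <- as.numeric(predict(fit, newdata = data.frame(Concentration = smooth_xy$x)))
same <- isTRUE(all.equal(smooth_xy$y, nls_y))
print(same)
```

If the comparison holds, coef(fit) can be reported as the coefficients of the curve shown in the plot.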

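Second, with your example data, nls() converges to the same Vmax and Km values (which differ between the drugs) regardless of the starting values. In other words, there is no need to build the models with starting values tailored to each drug. Any of the following gives the same result for drug 1 (and likewise for drug 2), where df1 is the example data frame defined further down: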

library(dplyr)
# use maximum as start
df1 %>% 
  filter(drug == 1) %>% 
  nls(rate ~ Vm * Concentration/(K + Concentration), 
      data = ., 
      start = list(K = max(.$Concentration), Vm = max(.$rate)))

# use minimum as start
df1 %>% 
  filter(drug == 1) %>% 
  nls(rate ~ Vm * Concentration/(K + Concentration), 
      data = ., 
      start = list(K = min(.$Concentration), Vm = min(.$rate)))

# use arbitrary values as start
df1 %>% 
  filter(drug == 1) %>% 
  nls(rate ~ Vm * Concentration/(K + Concentration), 
      data = ., 
      start = list(K = 50, Vm = 2))

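So the quickest way to plot the curves is simply to map drug to a ggplot aesthetic, such as color. This constructs a separate nls curve for each drug from the same starting values; then, if you need the coefficients, you can go back to nls(), knowing that the models should be the same as those in the plot.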
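Going back to nls() for the coefficients can be sketched in base R as follows. This is a minimal illustration rather than part of the original answer: fit_mm, fits and coef_table are hypothetical names, and the starting values are the same arbitrary ones used for the plot.

```r
# Example data from the question, rebuilt here so the block is self-contained
df1 <- data.frame(
  Concentration = rep(c(500, 250, 100, 62.5, 50, 25, 12.5, 5), times = 2),
  drug = rep(c(1, 2), each = 8),
  rate = c(1.88922, 1.4265, 0.86472, 0.66221, 0.56434, 0.34314, 0.18112, 0.07717,
           3.995055, 3.0118, 1.824505, 1.397237, 1.190078, 0.723637, 0.381865, 0.162771)
)

# Fit the Michaelis-Menten model to one subset (hypothetical helper)
fit_mm <- function(d) {
  nls(rate ~ Vmax * Concentration / (Km + Concentration),
      data = d, start = list(Km = 50, Vmax = 2))
}

# One fit per drug; the list is named by drug ("1", "2")
fits <- lapply(split(df1, df1$drug), fit_mm)

# One row per drug, one column per parameter
coef_table <- t(sapply(fits, coef))
print(coef_table)
```

Keeping the fitted objects in a named list also makes summary(fits[["1"]]) or confint-style follow-ups available per drug.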

Using your example data file (but don't call it file; I am using df1):

library(ggplot2)
df1 <- structure(list(Concentration = c(500, 250, 100, 62.5, 50, 25, 12.5, 5, 
                                        500, 250, 100, 62.5, 50, 25, 12.5, 5), 
                      drug = c(1, 1, 1, 1, 1, 1, 1, 1, 
                               2, 2, 2, 2, 2, 2, 2, 2), 
                      rate = c(1.88922, 1.4265, 0.86472, 0.66221, 0.56434, 0.34314, 
                               0.18112, 0.07717, 3.995055, 3.0118, 1.824505, 1.397237, 
                               1.190078, 0.723637, 0.381865, 0.162771)),
                      .Names = c("Concentration", "drug", "rate"), 
                      row.names = c(NA, -16L), 
                      class = "data.frame")

# could use e.g. Km = min(df1$Concentration) for start
# but here we use arbitrary values
ggplot(df1, aes(Concentration, rate)) + 
  geom_point() + 
  geom_smooth(method = "nls", 
              # the model formula is an argument of geom_smooth() itself
              formula = y ~ Vmax * x / (Km + x),
              # only the starting values are forwarded to nls()
              method.args = list(start = list(Km = 50, Vmax = 2)), 
              se = FALSE,
              aes(color = factor(drug)))
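For comparison, the "ideal" route mentioned at the start of this answer, plotting the nls() results themselves, might look like the sketch below. It is an illustration under assumptions, not the answerer's code: the models are fitted first, predictions are computed on a grid, and the curves are drawn with geom_line(). The names fits, grid, curves and p, and the grid size of 100 points, are arbitrary choices.

```r
library(ggplot2)

# Example data from the question, rebuilt here so the block is self-contained
df1 <- data.frame(
  Concentration = rep(c(500, 250, 100, 62.5, 50, 25, 12.5, 5), times = 2),
  drug = rep(c(1, 2), each = 8),
  rate = c(1.88922, 1.4265, 0.86472, 0.66221, 0.56434, 0.34314, 0.18112, 0.07717,
           3.995055, 3.0118, 1.824505, 1.397237, 1.190078, 0.723637, 0.381865, 0.162771)
)

# One nls() fit per drug, kept so the coefficients stay available
fits <- lapply(split(df1, df1$drug), function(d) {
  nls(rate ~ Vmax * Concentration / (Km + Concentration),
      data = d, start = list(Km = 50, Vmax = 2))
})

# Predict each fitted curve on a common grid of concentrations
grid <- seq(min(df1$Concentration), max(df1$Concentration), length.out = 100)
curves <- do.call(rbind, lapply(names(fits), function(id) {
  data.frame(
    drug = id,
    Concentration = grid,
    rate = as.numeric(predict(fits[[id]], newdata = data.frame(Concentration = grid)))
  )
}))

# Points from the raw data, lines from the fitted models
p <- ggplot(df1, aes(Concentration, rate, color = factor(drug))) +
  geom_point() +
  geom_line(data = curves)
print(p)
```

Because the lines come straight from the stored fits, the plotted curves and lapply(fits, coef) describe the same models by construction.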

【Comments】:

Thank you! That's perfect. Sorry for the mistakes in my code above.