【问题标题】:Python - Scipy and Numpy don't get along - problem using numpy arrays in scipy.optimize.curve_fitPython - Scipy 和 Numpy 不相处 - 在 scipy.optimize.curve_fit 中使用 numpy 数组的问题
【发布时间】:2022-01-17 21:39:29
【问题描述】:

我正在尝试将一些 experimental data (x and y)custom function (Srt) 匹配并使用 scipy.optimize.curve_fit()

读取数据并定义函数,使用 Km 和 Vmax 的虚拟值 (10,10)(将使用曲线拟合确定)工作正常,只要我使用 np.asarray()

from scipy.special import lambertw
from scipy.optimize import curve_fit
import numpy as np
import scipy

def Srt(t,s,Km,Vmax):
    print("t",type(t))
    print("t",t)
    print("last element of t:",t[-1])
    print("s",type(s))
    print("s",s)
    print("last element of s:",s[-1])
    Smax = s[-1] # Substrate concentration at end of reaction 
    t0 = t[0]    # time=0 (beginning of reaction)
    s0 = s[0]    # Substrate concentration at time = 0 (beginning of reaction)
    E = np.exp(((Smax - s0) - Vmax*(t+t0))/Km)
    L = lambertw(((Smax - s0)/Km)*E)
    y = Smax - Km*L
    return y

x=[2.780000e-03,2.778000e-02,5.278000e-02,7.778000e-02,1.027800e-01
,1.277800e-01,1.527800e-01,1.777800e-01,2.027800e-01,2.277800e-01
,2.527800e-01,2.777800e-01,3.027800e-01,3.277800e-01,3.527800e-01]

y=[0.44236,0.4308,0.42299,0.41427,0.40548,0.39908,0.39039,0.3845,0.37882
,0.37411,0.36759,0.36434,0.35864,0.35508,0.35138]

xdata = np.asarray(x)
ydata = np.asarray(y)
Srt(xdata, ydata,10,10)

如果我不使用np.asarray,我会收到“类型错误”:

Srt(x, y,10,10)

当我继续使用 curve_fit 来拟合 Vmax 和 Km 时:

parameters, covariance = scipy.optimize.curve_fit(Srt, xdata, ydata)

我遇到了麻烦:

如果我正确理解错误消息,由于某种原因,数组ydata 在读入s 时不再是数组?!?

我必须在我的代码中进行哪些更改才能使用我的函数 Srtcurve_fit

编辑:代码的完整输出:

t <class 'numpy.ndarray'>
t [0.00278 0.02778 0.05278 0.07778 0.10278 0.12778 0.15278 0.17778 0.20278
 0.22778 0.25278 0.27778 0.30278 0.32778 0.35278]
last element of t: 0.35278
s <class 'numpy.ndarray'>
s [0.44236 0.4308  0.42299 0.41427 0.40548 0.39908 0.39039 0.3845  0.37882
 0.37411 0.36759 0.36434 0.35864 0.35508 0.35138]
last element of s: 0.35138
t <class 'numpy.ndarray'>
t [0.00278 0.02778 0.05278 0.07778 0.10278 0.12778 0.15278 0.17778 0.20278
 0.22778 0.25278 0.27778 0.30278 0.32778 0.35278]
last element of t: 0.35278
s <class 'numpy.float64'>
s 1.0

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-23-5ce34d06b849> in <module>
     33 #then the problems start
     34 
---> 35 parameters, covariance = scipy.optimize.curve_fit(Srt, xdata, ydata)

~\anaconda3\lib\site-packages\scipy\optimize\minpack.py in curve_fit(f, xdata, ydata, p0, sigma, absolute_sigma, check_finite, bounds, method, jac, **kwargs)
    761         # Remove full_output from kwargs, otherwise we're passing it in twice.
    762         return_full = kwargs.pop('full_output', False)
--> 763         res = leastsq(func, p0, Dfun=jac, full_output=1, **kwargs)
    764         popt, pcov, infodict, errmsg, ier = res
    765         ysize = len(infodict['fvec'])

~\anaconda3\lib\site-packages\scipy\optimize\minpack.py in leastsq(func, x0, args, Dfun, full_output, col_deriv, ftol, xtol, gtol, maxfev, epsfcn, factor, diag)
    386     if not isinstance(args, tuple):
    387         args = (args,)
--> 388     shape, dtype = _check_func('leastsq', 'func', func, x0, args, n)
    389     m = shape[0]
    390 

~\anaconda3\lib\site-packages\scipy\optimize\minpack.py in _check_func(checker, argname, thefunc, x0, args, numinputs, output_shape)
     24 def _check_func(checker, argname, thefunc, x0, args, numinputs,
     25                 output_shape=None):
---> 26     res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
     27     if (output_shape is not None) and (shape(res) != output_shape):
     28         if (output_shape[0] != 1):

~\anaconda3\lib\site-packages\scipy\optimize\minpack.py in func_wrapped(params)
    461     if transform is None:
    462         def func_wrapped(params):
--> 463             return func(xdata, *params) - ydata
    464     elif transform.ndim == 1:
    465         def func_wrapped(params):

<ipython-input-23-5ce34d06b849> in Srt(t, s, Km, Vmax)
     10     print("s",type(s))
     11     print("s",s)
---> 12     print("last element of s:",s[-1])
     13     Smax = s[-1] # Substrate concentration at end of reaction
     14     t0 = t[0]    # time=0 (beginning of reaction)

IndexError: invalid index to scalar variable.

编辑 2 功能齐全的代码,感谢 Jonathan Weine。 由于“糟糕”的实验数据,拟合不是最理想的,我现在正在玩我的完整数据集:D

from scipy.special import lambertw
from scipy.optimize import curve_fit
import numpy as np
import scipy

def Srt(t, Smax: float, s0: float, Km: float, Vmax: float):
    t0 = t[0]    # time=0 (beginning of reaction)
    E = np.exp(((Smax - s0) - Vmax*(t+t0))/Km)
    # L = lambertw(((Smax - s0)/Km)*E)  # this apparently can be complex which causes another Error
    L = np.abs(lambertw(((Smax - s0)/Km)*E))
    y = Smax + Km*L
    return y

y = [0.44236,0.4308,0.42299,0.41427,0.40548,0.39908,0.39039,0.3845,0.37882
,0.37411,0.36759,0.36434,0.35864,0.35508,0.35138,0.34748,0.34437,0.34143
,0.3391,0.3348,0.33345,0.31404,0.30212,0.29043,0.28026,0.27331,0.26672
,0.26187,0.25645,0.25208,0.24736,0.244,0.24056,0.23798,0.23359,0.23138
,0.22845,0.22566,0.22384,0.22112,0.21894,0.21672,0.21466,0.21316,0.21209
,0.20941,0.20823,0.20687,0.2056,0.20324,0.20266,0.20095,0.19935,0.19895
,0.19746,0.19616,0.19486,0.19419,0.19382,0.19301,0.19085,0.19108,0.19024
,0.18933,0.18839,0.18706,0.18643,0.18623,0.18569,0.18469,0.18381,0.18341
,0.18331,0.18324,0.18222,0.18106,0.18039,0.18022,0.17906,0.17935,0.17842
,0.17834,0.1781,0.17731,0.17704,0.1766,0.17654,0.1761,0.17568,0.1744
,0.17453,0.17393,0.17325,0.17329,0.17302,0.17347,0.17344,0.17233,0.17228
,0.17208,0.17177,0.1712,0.17076,0.171,0.17043,0.17057,0.17003,0.16965
,0.16923,0.16944,0.16898,0.16879,0.16809,0.16821,0.16794,0.16831,0.16779
,0.16805,0.16765,0.16762,0.16695,0.16694,0.1669,0.16642,0.16583,0.166
,0.16625,0.16575,0.1658,0.16553,0.16565,0.1654,0.16419,0.16487,0.16467
,0.16452,0.16433,0.16468,0.16423,0.16427,0.16372,0.16388,0.16388,0.16394
,0.16382,0.1631,0.16353,0.1638,0.16304,0.163,0.16296,0.16295,0.16284
,0.16275,0.16214,0.16243,0.16211,0.16207,0.16185,0.16187,0.16176,0.16168
,0.16195,0.16138,0.16177,0.16121,0.16163,0.16121,0.161,0.16114,0.16122
,0.16096,0.16105,0.16102,0.16068,0.16031,0.16028,0.16051,0.16045,0.16017
,0.15977,0.15927,0.16007,0.15953,0.15933,0.1596,0.15911,0.15903,0.15884
,0.15856,0.15889,0.15888,0.15861,0.15849,0.158,0.15822,0.15776,0.15759
,0.15734,0.15757,0.15718,0.15699,0.15747,0.15692,0.15701,0.15715,0.15675
,0.15732,0.15687,0.15659,0.15664,0.15635,0.15633,0.15591] 
#csvFile.iloc[0:500,9] 

x = [2.780000e-03,2.778000e-02,5.278000e-02,7.778000e-02,1.027800e-01
,1.277800e-01,1.527800e-01,1.777800e-01,2.027800e-01,2.277800e-01
,2.527800e-01,2.777800e-01,3.027800e-01,3.277800e-01,3.527800e-01
,3.777800e-01,4.027800e-01,4.277800e-01,4.527800e-01,4.777800e-01
,5.027800e-01,7.538900e-01,1.003890e+00,1.253890e+00,1.503890e+00
,1.753890e+00,2.003890e+00,2.253890e+00,2.503890e+00,2.753890e+00
,3.003890e+00,3.253890e+00,3.503890e+00,3.753890e+00,4.003890e+00
,4.253890e+00,4.503890e+00,4.753890e+00,5.003890e+00,5.253890e+00
,5.503890e+00,5.753890e+00,6.003890e+00,6.253890e+00,6.503890e+00
,6.753890e+00,7.003890e+00,7.253890e+00,7.503890e+00,7.753890e+00
,8.003890e+00,8.253890e+00,8.503890e+00,8.753890e+00,9.003890e+00
,9.253890e+00,9.503890e+00,9.753890e+00,1.000389e+01,1.025389e+01
,1.050389e+01,1.075389e+01,1.100389e+01,1.125389e+01,1.150389e+01
,1.175389e+01,1.200389e+01,1.225389e+01,1.250389e+01,1.275389e+01
,1.300389e+01,1.325389e+01,1.350389e+01,1.375389e+01,1.400389e+01
,1.425389e+01,1.450389e+01,1.475389e+01,1.500389e+01,1.525389e+01
,1.550389e+01,1.575389e+01,1.600389e+01,1.625389e+01,1.650389e+01
,1.675389e+01,1.700389e+01,1.725389e+01,1.750389e+01,1.775389e+01
,1.800389e+01,1.825389e+01,1.850389e+01,1.875389e+01,1.900389e+01
,1.925389e+01,1.950389e+01,1.975389e+01,2.000389e+01,2.025389e+01
,2.050389e+01,2.075389e+01,2.100389e+01,2.125389e+01,2.150389e+01
,2.175389e+01,2.200389e+01,2.225389e+01,2.250389e+01,2.275389e+01
,2.300389e+01,2.325389e+01,2.350389e+01,2.375389e+01,2.400389e+01
,2.425389e+01,2.450389e+01,2.475389e+01,2.500389e+01,2.525389e+01
,2.550389e+01,2.575389e+01,2.600389e+01,2.625389e+01,2.650389e+01
,2.675389e+01,2.700389e+01,2.725389e+01,2.750389e+01,2.775389e+01
,2.800389e+01,2.825389e+01,2.850389e+01,2.875389e+01,2.900389e+01
,2.925389e+01,2.950389e+01,2.975389e+01,3.000389e+01,3.025389e+01
,3.050389e+01,3.075389e+01,3.100389e+01,3.125389e+01,3.150389e+01
,3.175389e+01,3.200389e+01,3.225389e+01,3.250389e+01,3.275389e+01
,3.300389e+01,3.325389e+01,3.350389e+01,3.375389e+01,3.400389e+01
,3.425389e+01,3.450389e+01,3.475389e+01,3.500389e+01,3.525389e+01
,3.550389e+01,3.575389e+01,3.600389e+01,3.625389e+01,3.650389e+01
,3.675389e+01,3.700389e+01,3.725389e+01,3.750389e+01,3.775389e+01
,3.800389e+01,3.825389e+01,3.850389e+01,3.875389e+01,3.900389e+01
,3.925389e+01,3.950389e+01,3.975389e+01,4.000389e+01,4.025389e+01
,4.050389e+01,4.075389e+01,4.100389e+01,4.125389e+01,4.150389e+01
,4.175389e+01,4.200389e+01,4.225389e+01,4.250389e+01,4.275389e+01
,4.300389e+01,4.325389e+01,4.350389e+01,4.375389e+01,4.400389e+01
,4.425389e+01,4.450389e+01,4.475389e+01,4.500389e+01,4.525389e+01
,4.550389e+01,4.575389e+01,4.600389e+01,4.625389e+01,4.650389e+01
,4.675389e+01,4.700389e+01,4.725389e+01,4.750389e+01,4.775389e+01
,4.800389e+01,4.825389e+01,4.850389e+01,4.875389e+01] 
# csvFile.iloc[0:500,0] 

xdata = np.array(x)
ydata = np.array(y)

parameters, covariance = scipy.optimize.curve_fit(Srt, xdata, ydata)
Km = parameters[1]
Vmax = parameters[0]
fit_y = Srt(xdata, ydata[-1],ydata[0], Km, Vmax)

print("Km: ", parameters[1], "Vmax: ", parameters[0])
plt.plot(xdata, fit_y, '-', color="green",linewidth=1)
plt.plot(xdata, ydata, 'o', color="red")

【问题讨论】:

  • 使用numpy,你使用的是广播numpy.org/doc/stable/user/basics.broadcasting.html,没有numpy,你的计算是float + list,会抛出连接失败的错误。
  • 将完整的回溯发布为文本,而不是图像。您的函数应该可以使用标量调用。删除不适当的打印输出,或更改它们以正确接受标量。

标签: python numpy scipy


【解决方案1】:

请仔细查看curve_fit 函数的documentation。它声明ydata 必须名义上是func(xdata... ) 的结果。因此,您传递给 curve_fit 的 ydata 永远不会像您在手动调用中指出的那样作为 Srt 调用的参数传递。

此外,要估计的参数必须具有相同的形状,这意味着您必须将Smaxs0 定义为浮点输入。我修改了您的示例,使其实际运行:

from scipy.special import lambertw
from scipy.optimize import curve_fit
import numpy as np
import scipy

def Srt(t, Smax: float, s0: float, Km: float, Vmax: float):
    t0 = t[0]    # time=0 (beginning of reaction)
    E = np.exp(((Smax - s0) - Vmax*(t+t0))/Km)
    # L = lambertw(((Smax - s0)/Km)*E)  # this apparently can be complex which causes another Error
    L = np.abs(lambertw(((Smax - s0)/Km)*E))
    y = Smax - Km*L
    return y

x=[2.780000e-03,2.778000e-02,5.278000e-02,7.778000e-02,1.027800e-01
,1.277800e-01,1.527800e-01,1.777800e-01,2.027800e-01,2.277800e-01
,2.527800e-01,2.777800e-01,3.027800e-01,3.277800e-01,3.527800e-01]

y=[0.44236,0.4308,0.42299,0.41427,0.40548,0.39908,0.39039,0.3845,0.37882
,0.37411,0.36759,0.36434,0.35864,0.35508,0.35138]

xdata = np.array(x)
ydata = np.array(y)

parameters, covariance = scipy.optimize.curve_fit(Srt, xdata, ydata)

注意
函数内部的np.abs 没有意义,但lambertw 的复数结果显然可以复数。在这种情况下,由于没有安全的强制转换规则,会引发错误,从而导致曲线拟合中止。

【讨论】:

  • Smax, s0, Km, Vmax 的合理初始条件作为 curve_fit 的关键字参数提供,例如p0=(1., 0., 10. ,10.) 可以解决 lambertw 结果复杂的问题
  • 谢谢,你解决了我的问题。我将使用我的“真实”数据尝试不同的初始条件 - 那里可能存在一些挑战,我必须解决问题......
【解决方案2】:

您的第一个错误是由t+t0 表达式产生的。它t 是一个列表x,这是一个列表“连接”表达式,适用于[1,2,3]+[4,5],但不适用于[1,2,3]+5。这就是为什么 xy 必须使用数组。

在第二个错误中,发生了什么

print("s",type(s))
print("s",s)

显示?显然s 不是一个数组,甚至不是一个列表。

【讨论】:

    猜你喜欢
    • 1970-01-01
    • 2014-08-16
    • 1970-01-01
    • 1970-01-01
    • 1970-01-01
    • 2011-09-04
    • 1970-01-01
    • 2023-03-24
    • 1970-01-01
    相关资源
    最近更新 更多