Regression coefficient in scipy curve fitting: answers

【Question title】: Regression coefficient in scipy curve fitting
【Posted】: 2021-10-04 12:31:21
【Question】:

I am facing the following problem.

I have this data:

x = np.array([ 1.00E-03, 1.00E-04, 1.00E-05, 1.00E-06 ])

y = np.array([ 0.01, 0.002469136, 0.000771605, 0.000257202 ])

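I want to fit a power law to this data and obtain the regression coefficient (R²).

However, I get different results from WPS Office and from scipy.

My code is as follows: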

import numpy as np
from scipy.optimize import curve_fit
from sklearn.metrics import r2_score

xdata = x
ydata = y

# Power Law function
def f(x,a,b):
    return (a*(x**b))

popt, pcov = curve_fit(f,  xdata,  ydata)

r_squared = r2_score(ydata, f(xdata, popt[0], popt[1]))

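In WPS Office I get R² = 0.9968.

Google Sheets gives the same value.

In scipy, I get R² = 0.9995.

Is there any explanation for why this happens? Even if the algorithms differ, they should converge to similar solutions, shouldn't they?

Best regards!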

【Comments】:

Tags: python scipy

【Answer 1】:

OK, I found the answer.

R² is calculated as described here: https://www.got-it.ai/solutions/excel-chat/excel-tutorial/r-squared/r-squared-in-excel

So:

x_log = np.log(x)
y_log = np.log(y)

tmp1 = len(x)*(np.sum(x_log*y_log))-np.sum(x_log)*np.sum(y_log)
tmp2 = len(x)*np.sum(x_log**2)-np.sum(x_log)**2
tmp3 = len(x)*np.sum(y_log**2)-np.sum(y_log)**2    

r2 = (tmp1/np.sqrt(tmp2*tmp3))**2

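This yields the expected value, R² = 0.9968. The expression is the squared Pearson correlation between log(x) and log(y), so the spreadsheet value is the R² of a straight-line fit in log-log space rather than of the power law on the untransformed data.

【Comments】:

【Answer 2】:

If I instead do a linear fit in log-log space: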
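
As a cross-check, the same quantity can be obtained without writing out the sums by hand. The following is a minimal sketch that assumes `scipy.stats.linregress` is an acceptable substitute for the manual formula; the names `fit`, `a`, `b` and `r2_log` are illustrative and do not appear in the original post.

```python
import numpy as np
from scipy.stats import linregress

x = np.array([1.00e-03, 1.00e-04, 1.00e-05, 1.00e-06])
y = np.array([0.01, 0.002469136, 0.000771605, 0.000257202])

# Fit a straight line to the log-transformed data:
#   log(y) = log(a) + b * log(x)
fit = linregress(np.log(x), np.log(y))

a = np.exp(fit.intercept)  # prefactor of the power law y = a * x**b
b = fit.slope              # exponent of the power law
r2_log = fit.rvalue ** 2   # squared Pearson r, evaluated in log-log space

print(f"a = {a:.4f}, b = {b:.4f}, R^2 (log-log) = {r2_log:.4f}")
```

The printed R² should agree with the 0.9968 reported by WPS Office and Google Sheets, since it is evaluated on the logarithms of the data rather than on the data themselves.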

parameters =  np.polyfit(np.log(xdata), np.log(ydata), 1)

a = np.exp(parameters[1])
b = parameters[0]
r2_score(y, a*x**b)

I get the same a and b values as in Excel, but now R² = 0.9883...

【Comments】:
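
The three numbers in this thread (0.9968, 0.9883 and 0.9995) seem to come from combining two sets of parameters with two spaces in which the residuals are measured. The sketch below puts them side by side. It is not from the original answers: the helper `r2`, the name `power_law`, and the choice to seed `curve_fit` with the log-log solution via `p0` are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

x = np.array([1.00e-03, 1.00e-04, 1.00e-05, 1.00e-06])
y = np.array([0.01, 0.002469136, 0.000771605, 0.000257202])

def power_law(x, a, b):
    return a * x**b

def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - np.mean(observed)) ** 2)
    return 1.0 - ss_res / ss_tot

# Linear least squares on the log-transformed data
# (assumed here to be what the spreadsheet trendline does).
b_log, log_a = np.polyfit(np.log(x), np.log(y), 1)
a_log = np.exp(log_a)

# Nonlinear least squares on the untransformed data,
# started from the log-log solution (an assumption, not in the original code).
(a_nl, b_nl), _ = curve_fit(power_law, x, y, p0=[a_log, b_log])

# Same log-log parameters, residuals measured in log space (spreadsheet value).
r2_loglog_in_log = r2(np.log(y), np.log(power_law(x, a_log, b_log)))
# Same log-log parameters, residuals measured on the raw data (Answer 2).
r2_loglog_in_lin = r2(y, power_law(x, a_log, b_log))
# curve_fit parameters, residuals measured on the raw data (the question).
r2_nl_in_lin = r2(y, power_law(x, a_nl, b_nl))

print(f"log-log fit, log space:    {r2_loglog_in_log:.4f}")
print(f"log-log fit, linear space: {r2_loglog_in_lin:.4f}")
print(f"curve_fit,   linear space: {r2_nl_in_lin:.4f}")
```

The two fits minimize different residuals, so they return different a and b. `curve_fit` minimizes squared errors on the raw y values, which are dominated by the largest point, while the log-log fit weights all four points by relative error. An R² computed in one space for parameters optimized in the other is therefore expected to be lower than either self-consistent value.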