[Posted]: 2021-01-29 15:29:55
[Question]:
I am trying to jointly estimate the parameters of a model using nonlinear least squares, minimizing the sum of squared differences between the actual values and the model-based estimates. However, the optimized result has a higher SSE than my initial guess: the guess gives an SSE of 2,951,687, while the optimized parameters give 4,281,096.
Versions: Python 3.7.6, NumPy 1.19.2, SciPy 1.5.2
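As a minimal, self-contained illustration of the same approach (toy data and a made-up two-parameter model, not the customer model below): fit parameters by minimizing an SSE objective with `scipy.optimize.minimize`, method SLSQP, under an inequality constraint.

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic "measured" data generated from known parameters a=3.0, c=1.5
x = np.linspace(1, 10, 50)
y = 3.0 * x ** 1.5

def sse(g):
    """Sum of squared errors between model a*x**c and the data."""
    a, c = g
    return np.sum((a * x ** c - y) ** 2)

# 'ineq' constraints must return a value >= 0 at feasible points: here c > 1
cons = [{'type': 'ineq', 'fun': lambda g: g[1] - 1}]

res = minimize(sse, [1.0, 1.2], method='SLSQP', constraints=cons)
print(res.x)    # should recover roughly [3.0, 1.5]
print(res.fun)  # SSE at the optimum, near zero
```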
import numpy as np
import pandas as pd
from scipy.optimize import minimize

###################### importing the csv file ######################
df = pd.read_csv('data2.csv')

###################### Setting up variables and arrays ######################
a = df.loc[:, 'C(ADD)'].values               # measured added customers
l = df.loc[:, 'C(Loss)'].values              # measured lost customers
m = df.loc[:, 'm'].values                    # number of months
mkt = df.loc[:, 'Marketing Expense'].values  # marketing dollars in each month
e = 5596     # end measured value, calculated from the cac/total marketing spend over the time period
n = len(df)  # length of the dataframe

###################### Defining equations ######################
g0 = np.zeros(n)  # guess values
g0[0] = 0.0001
g0[1] = 0.006
g0[2] = 96755.00
g0[3] = 1.7
g0[4] = 0.6
g0[5] = 0.1
g0[6] = 0.006
g0[7] = 1.7
g0[8] = 0.6

def addhat(g):  # Add predicted values
    pNT = g[0]
    r = g[1]
    alpha = g[2]
    c = g[3]
    Bm = g[4]
    ah = np.empty(n)  # an empty array for the add hat values
    b = np.empty(n)   # an empty array for the B(m,m') values
    b[0] = np.exp(np.log(mkt[0]) * Bm)
    ah[0] = 400000 * ((1 - pNT) * (1 - (alpha / (alpha + b[0])) ** r))
    for i in range(1, n):
        b[i] = b[i-1] + (m[i] ** c - m[i-1] ** c) * np.exp(np.log(mkt[i]) * Bm)
        ah[i] = 400000 * ((1 - pNT) * (1 - (alpha / (alpha + b[i])) ** r))
    return ah

print('add pred values: ' + str(addhat(g0)))

def rethat(g):  # Retention percentage
    rr = g[5]
    alphar = g[6]
    cr = g[7]
    Bmr = g[8]
    k = np.empty(n)  # an empty array for the exponent section of the formula
    w = np.empty(n)  # an empty array for the retention values
    # the value of b(t)r when i = 0
    k[0] = np.exp(np.log(mkt[0]) * Bmr)
    w[0] = 1 - (alphar / (alphar + k[0])) ** rr
    # the value of B(t) for all other values of q
    for i in range(1, n):
        k[i] = k[i-1] + (m[i] ** cr - m[i-1] ** cr) * np.exp(np.log(mkt[i]) * Bmr)
        w[i] = 1 - (alphar / (alphar + k[i])) ** rr
    return w

def endpred(g):  # predicting the end hat values
    eh = np.empty(n)  # an empty array for the end hat values
    eh[0] = 213
    for i in range(1, n):
        eh[i] = (eh[i-1] * rethat(g)[i]) + addhat(g)[i]
    return eh

endhat = sum(endpred(g0))

def losshat(g):
    lh = np.empty(n)  # an empty array for the loss hat values
    lh[0] = 0
    for i in range(1, n):
        lh[i] = endpred(g)[i-1] - (endpred(g)[i] - addhat(g)[i])
    return lh

###################### Sum of square errors ######################
def objective(g):
    sse = sum((addhat(g) - a) ** 2 + (losshat(g) - l) ** 2) + (endhat - e) ** 2
    return sse

print("SSE Initial: " + str(objective(g0)))

###################### Constraints ######################
def constraint1(g):  # c is greater than 1
    return g[3] - 1

def constraint2(g):  # cr is greater than 1
    return g[7] - 1

def constraint3(g):  # pNT is greater than 0
    return g[0]

con1 = {'type': 'ineq', 'fun': constraint1}
con2 = {'type': 'ineq', 'fun': constraint2}
con3 = {'type': 'ineq', 'fun': constraint3}
cons = [con1, con2, con3]

###################### Optimize ######################
s = minimize(objective, g0, method='SLSQP', constraints=cons)
g = s.x
print(g)
print("SSE Final: " + str(objective(g)))
The resulting SSE is 4,281,096.9, with parameter values:
3.48133574e+02, 6.84452015e+02, 9.67550032e+04, 2.22008198e+00, -3.28153006e+03, -1.91454144e+02, 2.20947909e+02, 1.70207912e+00, -1.24649708e+01
The initial guesses I used are very close to the actual result values (I am checking my code against a problem whose answer I already know). The result should be 0.0001001361, 0.006035783, 96,755.64542, 1.78204741, 0.636357403, 0.152, 0.0065432195, 1.73490796, 0.62625507, which has an SSE of 912,278.
Link to data2.csv.
Thanks again for any help.
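When an optimizer returns a point worse than the starting guess, a first sanity check (shown here on a toy quadratic objective, not the model above) is to inspect the `OptimizeResult` that `minimize` returns rather than trusting `s.x` directly:

```python
import numpy as np
from scipy.optimize import minimize

def objective(g):
    # Toy stand-in objective with a known minimum at (2, -1)
    return (g[0] - 2.0) ** 2 + (g[1] + 1.0) ** 2

s = minimize(objective, np.zeros(2), method='SLSQP')
print(s.success)  # False here would explain a worse-than-guess SSE
print(s.message)  # human-readable termination reason
print(s.fun)      # objective value at s.x
print(s.nit)      # number of iterations taken
```

If `s.success` is `False`, the message (e.g. a line-search or constraint failure) usually points to the cause.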
[Discussion]:
-
When I run it I get overflow messages: 'RuntimeWarning: overflow encountered in exp', etc. Do you get the same problem?
-
Thank you, Mr. T, for looking into my problem. I don't get any errors when running my code; it terminates successfully. I am running it in a Jupyter notebook with Python 3.7.6 through Visual Studio. But I will try running it in Google Colab to see if I can reproduce your error.
-
I ran it in Google Colab and again got no errors. I'm not sure what is happening. Any suggestions would be appreciated.
-
I tried it in Eclipse/PyDev on both Linux and Win10, and both produce the same error message. Different Python 3.x versions, but I recently upgraded scipy/numpy on both, so perhaps you are using other versions of these packages that don't explicitly report the overflow? Error message
-
Thanks for the continued help, Mr. T. I updated scipy (1.5.2) and numpy (1.19.2) and ran it in a Jupyter notebook rather than through Visual Studio, and I now get the same error. Obviously not the best result, but it gives me something to work with, which is great.
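For reference, the overflow is easy to reproduce in isolation: the model's term np.exp(np.log(mkt)*Bm) equals mkt**Bm, which exceeds the float64 range (exp of anything above roughly 709) once the optimizer tries a large Bm. The sketch below (illustrative values, not from data2.csv) captures the warning explicitly; note that Python shows a given warning only once per location by default, which can make it appear in one environment and not another.

```python
import warnings
import numpy as np

mkt = np.array([50000.0])  # hypothetical marketing-spend value
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')  # do not deduplicate warnings
    # log(50000)*100 ~ 1082, far above exp's float64 limit of ~709
    big = np.exp(np.log(mkt) * 100.0)

print(np.isinf(big[0]))                                       # True
print(any('overflow' in str(w.message) for w in caught))      # True
```

Once a term goes to inf, the SSE and its gradients become meaningless, which by itself can send SLSQP to a worse point than the initial guess.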