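[Posted]: 2020-07-31 14:09:57
[Question]:
I am trying to optimize forecasting parameters with scipy.optimize. I followed the tutorials and found some good examples on Stack Overflow, but I have run into a problem I cannot solve. I am starting to wonder whether using pandas with scipy was a bad choice.
My code is set up as follows: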
import simpy
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
import pandas as pd
import statistics as stat
import math as m
#from sklearn.grid_search import ParameterGrid
from scipy.optimize import minimize

###dataframe for the simulation
df = pd.read_csv('simulation_df_data_2018_2.csv')
with pd.option_context("max_rows", None, "max_columns", None):
    print(df.head())

for i in df.index:
    alpha = 0.2
    beta = 0.3
    x = np.array([alpha, beta])

    def holts(x):
        LO = np.int(df['average_demand'].loc[i])
        print(type(LO))
        TO = ((df['m2'].loc[i] - df['m3'].loc[i]) + (df['m1'].loc[i] - df['m2'].loc[i])) / 2
        L1 = round(x[0] * df['m3'].loc[i] + (1 - x[0]) * (
            LO + TO))
        T1 = x[1] * (L1 - LO) + (1 - x[1]) * TO
        L2 = round(x[0] * df['m2'].loc[i] + (1 - x[0]) * (
            L1 + T1))
        T2 = x[1] * (L2 - L1) + (1 - x[1]) * T1
        L3 = round(x[0] * df['m1'].loc[i] + (1 - x[0]) * (
            L2 + T2))
        T3 = beta * (L3 - L2) + (1 - beta) * T2
        LT1 = round(L3 + 1 * T3)
        MSE = ((df['m3'].loc[i] - L1) + (df['m2'].loc[i] - L2) + (
            df['m2'].loc[i] - L3)) ** 2 / 3
        return MSE

    #print(holts(x))
    x0 = [0.1,0.1]
    result = minimize(holts, x0, bounds=[(0,1),(0,1)], method="SLSQP")
    print(result)
    print(x)
The df looks like this:
m1 m2 m3 m4 m5 m6 m7 m8 m9 m10 m11 \
0 0.0 8.0 2.0 0.0 14.0 0.0 5.0 2.0 4.0 4.0 10.0
1 4.0 55.0 2.0 72.0 38.0 87.0 113.0 2.0 0.0 165.0 2.0
2 18.0 34.0 6.0 63.0 14.0 18.0 33.0 35.0 51.0 0.0 24.0
3 0.0 21.0 3.0 10.0 15.0 0.0 32.0 1.0 3.0 17.0 0.0
4 96.0 106.0 237.0 136.0 138.0 116.0 167.0 158.0 110.0 115.0 161.0
m12 m13 m14 m15 m16 m17 m18 m19 m20 m21 m22 \
0 0.0 6.0 10.0 0.0 2.0 2.0 17.0 0.0 0.0 0.0 0.0
1 35.0 7.0 88.0 6.0 3.0 103.0 18.0 59.0 6.0 20.0 152.0
2 6.0 5.0 99.0 7.0 17.0 15.0 8.0 3.0 21.0 6.0 4.0
3 30.0 5.0 88.0 1.0 6.0 10.0 9.0 17.0 9.0 0.0 1.0
4 116.0 77.0 48.0 96.0 69.0 77.0 96.0 74.0 94.0 101.0 115.0
m23 m24 average_demand low_demand high_demand
0 0.0 0.0 3.583333 0.0 17.0
1 6.0 0.0 43.458333 0.0 165.0
2 14.0 12.0 21.375000 0.0 99.0
3 0.0 0.0 11.583333 0.0 88.0
4 158.0 167.0 117.833333 48.0 237.0
I am very confused by the error I keep getting. Here is the traceback:
Traceback (most recent call last):
File "/Users/pierre/Desktop/simul/forecast_holts_alpha.py", line 121, in <module>
result = minimize(holts, x0, args= coef_list,bounds=[(0,1),(0,1)], method="SLSQP")
File "/Users/pierre/Desktop/Django-app/lib/python3.7/site-packages/scipy/optimize/_minimize.py", line 618, in minimize
constraints, callback=callback, **options)
File "/Users/pierre/Desktop/Django-app/lib/python3.7/site-packages/scipy/optimize/slsqp.py", line 399, in _minimize_slsqp
fx = func(x)
File "/Users/pierre/Desktop/Django-app/lib/python3.7/site-packages/scipy/optimize/optimize.py", line 327, in function_wrapper
return function(*(wrapper_args + args))
File "/Users/pierre/Desktop/simul/forecast_holts_alpha.py", line 63, in holts
LO = np.int(df['average_demand'].loc[i])
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
I do not understand why this error occurs, especially because when I check the type of LO I get:
print(type(LO))
<class 'int'>
I am not an experienced programmer, so I am struggling to work out what is going on. Any help would be greatly appreciated!
Update:
fun: 56.333333333333336
jac: array([0., 0.])
message: 'Optimization terminated successfully.'
nfev: 4
nit: 1
njev: 1
status: 0
success: True
x: array([0.1, 0.1])
[0.2,0.3]
The output looks like this, but nothing seems to have been optimized: x is still the starting point [0.1, 0.1].
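One possible reading of this result, offered as a sketch and not as a confirmed diagnosis: jac: array([0., 0.]) together with nit: 1 is what a gradient-based method reports when the objective looks flat around the starting point. The round(...) calls in holts could cause that, because the tiny steps SLSQP takes to estimate the gradient numerically do not change a rounded value. The two toy objectives below are hypothetical stand-ins, not the holts function itself; they only contrast a rounded objective with a smooth one.

```python
import numpy as np
from scipy.optimize import minimize


def rounded_objective(x):
    # Rounding makes the function piecewise constant in x, so a tiny
    # finite-difference step around x0 sees no change at all.
    return float((np.round(10 * x[0]) - 7) ** 2)


def smooth_objective(x):
    # Same shape without rounding: the numerical gradient is informative.
    return float((10 * x[0] - 7) ** 2)


x0 = [0.1]
bounds = [(0, 1)]

flat = minimize(rounded_objective, x0, bounds=bounds, method="SLSQP")
smooth = minimize(smooth_objective, x0, bounds=bounds, method="SLSQP")

print("rounded objective stops at:", flat.x)
print("smooth objective stops at: ", smooth.x)
```

If this is what is happening, removing the rounding inside the objective (and rounding only the final forecast, if integers are needed) would be one thing to try.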
[Comments]:
-
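I think the docs for scipy.optimize.minimize will help. The signature of your holts function does not match the documentation, which expects fun(x, *args) -> float. Since your code has x0 = [0.1, 0.1], I think x should be a 1-D array of shape (2,).
-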
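To make the fun(x, *args) -> float signature concrete, here is a minimal sketch. The objective and the fixed parameters a and b are made up for illustration; the point is only that the first argument is the 1-D array being optimized and everything else arrives through args.

```python
import numpy as np
from scipy.optimize import minimize


def objective(x, a, b):
    # x: 1-D array of shape (2,) that minimize is allowed to change.
    # a, b: fixed parameters (hypothetical), supplied through args.
    return float((x[0] - a) ** 2 + (x[1] - b) ** 2)


x0 = np.array([0.1, 0.1])
result = minimize(
    objective,
    x0,
    args=(0.3, 0.7),          # forwarded as objective(x, 0.3, 0.7)
    bounds=[(0, 1), (0, 1)],
    method="SLSQP",
)
print(result.x)
```

The optimizer only ever varies x; the values in args stay fixed for the whole run.
-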
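Thanks for the link, it helped, but I could not work out what x should be from "where x is a 1-D array with shape (n,) and args is a tuple of the fixed parameters needed to completely specify the function." Apart from x needing to be an array and not a DataFrame, can you tell me what x needs to be?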
-
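In the function signature fun(x, *args) -> float, x should be the variable of the function being minimized. In your code you want to minimize the holts function, but holts(df, coef_list) has df in the position of the variable x, which looks wrong. With respect to which variables do you want to minimize holts? As written, your holts function does not seem to have any.
-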
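This mismatch would also explain the IndexError in the traceback, assuming the failing version of the script defined holts(df, coef_list) as described here. minimize passes its current guess, a NumPy array, as the first argument, so the name df inside the function refers to that array, and indexing an array with a column name raises this kind of error. A minimal reproduction, with holts_wrong as a hypothetical stand-in:

```python
import numpy as np
from scipy.optimize import minimize


def holts_wrong(df, coef_list):
    # minimize calls this as holts_wrong(x, coef_list), so "df" is really
    # the ndarray of current parameter guesses, not a DataFrame.
    return float(df["average_demand"])


try:
    minimize(
        holts_wrong,
        [0.1, 0.1],
        args=([0.2, 0.3],),
        bounds=[(0, 1), (0, 1)],
        method="SLSQP",
    )
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)
```

That would also be why print(type(LO)) looks fine when holts is called by hand with a real DataFrame: the failure only appears when the optimizer supplies the first argument.
-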
I am trying to minimize the MSE by optimizing the variables alpha and beta. If I understand you correctly, x needs to be an array containing alpha and beta?
-
Yes. If alpha and beta are the variables, then x needs to be an array containing alpha and beta.
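Putting that together, a rough sketch of an objective with x = [alpha, beta] might look like the following. Several things here are assumptions and not the asker's code: the function name holts_mse and its parameters are hypothetical, rounding is left out so the objective stays smooth, and the error is the mean of squared one-step-ahead forecast errors (previous level plus previous trend), which is the usual way to score Holt's method. The numbers are row 2 of the DataFrame shown in the question (m3 = 6, m2 = 34, m1 = 18, average_demand = 21.375), with m3 treated as the oldest month as in the original code.

```python
import numpy as np
from scipy.optimize import minimize


def holts_mse(x, observations, initial_level, initial_trend):
    """Mean squared one-step-ahead error of Holt's linear trend method."""
    alpha, beta = x
    level, trend = initial_level, initial_trend
    squared_errors = []
    for y in observations:
        forecast = level + trend                  # forecast made before seeing y
        squared_errors.append((y - forecast) ** 2)
        new_level = alpha * y + (1 - alpha) * forecast
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return float(np.mean(squared_errors))


# Row 2 of the question's DataFrame, oldest month first (m3, m2, m1).
observations = [6.0, 34.0, 18.0]
initial_level = 21.375                            # average_demand
initial_trend = ((34.0 - 6.0) + (18.0 - 34.0)) / 2

x0 = [0.1, 0.1]
row_args = (observations, initial_level, initial_trend)
start_mse = holts_mse(x0, *row_args)

result = minimize(
    holts_mse,
    x0,
    args=row_args,
    bounds=[(0, 1), (0, 1)],
    method="SLSQP",
)
print("start MSE:", start_mse)
print("final MSE:", result.fun, "at alpha, beta =", result.x)
```

For a DataFrame like the one in the question, the same function could be called once per row, passing that row's values through args, so the objective never needs to reach for the loop variable or the DataFrame itself.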