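【Posted】: 2020-05-25 21:30:29
【Question】:
I am trying to predict stock prices with Python. While reshaping the dataset into a 2D NumPy array for the "fit" function, I used this question as a reference: sklearn Logistic Regression "ValueError: Found array with dim 3. Estimator expected <= 2."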
next_day_open_values, nx, ny = next_day_open_values.shape
next_day_open_values = next_day_open_values.reshape((next_day_open_values,nx*ny))
y_normaliser = preprocessing.MinMaxScaler()
y_normaliser.fit((np.expand_dims( next_day_open_values, -1 )))
I got this error:
<ipython-input-42-6ea43c55dc18> in csv_to_dataset(csv_path)
20
21 next_day_open_values, nx, ny = next_day_open_values.shape
---> 22 next_day_open_values = next_day_open_values.reshape((next_day_open_values,nx*ny))
23 y_normaliser = preprocessing.MinMaxScaler()
24 y_normaliser.fit((np.expand_dims( next_day_open_values, -1 )))
AttributeError: 'int' object has no attribute 'reshape'
What went wrong, and how can I fix it? A detailed answer would be appreciated.
Here is the code so far (I am using a Jupyter notebook):
import keras
from keras.models import Model
from keras.layers import Dense, Dropout, LSTM, Input, Activation
from keras import optimizers
import numpy as np
np.random.seed(4)
import tensorflow
tensorflow.random.set_seed(4)
import pandas as pd
from sklearn import preprocessing
import numpy as np
history_points = 50
def csv_to_dataset(csv_path):
    data = pd.read_csv(csv_path)
    data = data.drop('Date', axis=1)
    data = data.drop(0, axis=0)
    data_normaliser = preprocessing.MinMaxScaler()
    data_normalised = data_normaliser.fit_transform(data)
    # using the last {history_points} open high low close volume data points, predict the next open value
    ohlcv_histories_normalised = np.array([data_normalised[i : i + history_points].copy() for i in range(len(data_normalised) - history_points)])
    next_day_open_values_normalised = np.array([data_normalised[:,0][i + history_points].copy() for i in range(len(data_normalised) - history_points)])
    next_day_open_values_normalised = np.expand_dims(next_day_open_values_normalised, -1)
    next_day_open_values = np.array([data.iloc[:,0][i + history_points].copy() for i in range(len(data) - history_points)])
    next_day_open_values = np.expand_dims(next_day_open_values_normalised, axis=-1)
    next_day_open_values, nx, ny = next_day_open_values.shape
    next_day_open_values = next_day_open_values.reshape((next_day_open_values,nx*ny))
    y_normaliser = preprocessing.MinMaxScaler()
    y_normaliser.fit((np.expand_dims( next_day_open_values, -1 )))
    assert ohlcv_histories_normalised.shape[0] == next_day_open_values_normalised.shape[0]
    return ohlcv_histories_normalised, next_day_open_values_normalised, next_day_open_values, y_normaliser
#dataset
ohlcv_histories, next_day_open_values, unscaled_y, y_normaliser = csv_to_dataset('AMZN1.csv')
test_split = 0.9 # fraction of the data used for training; the remainder is the test set
n = int(ohlcv_histories.shape[0] * test_split)
# splitting the dataset up into train and test sets
ohlcv_train = ohlcv_histories[:n]
y_train = next_day_open_values[:n]
ohlcv_test = ohlcv_histories[n:]
y_test = next_day_open_values[n:]
unscaled_y_test = unscaled_y[n:]
Please feel free to correct/edit.
Thanks
【Comments】:
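- Here, next_day_open_values, nx, ny = next_day_open_values.shape rebinds next_day_open_values to an int (I assume it was an ndarray before). You then go on to use next_day_open_values as if it were still an ndarray.
- Why are you reassigning next_day_open_values to one of the elements of its own shape?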
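The point raised in the comments can be illustrated with a small sketch. It is not the asker's real data: the array shape (5, 1, 1) and the names overwritten, n_samples and flat are assumptions chosen for illustration. The first part reproduces the error by unpacking the shape back into the array's own name; the second part unpacks into a separate name, so the array survives and can be reshaped to 2D.

```python
import numpy as np

# Hypothetical stand-in for the asker's array: 5 samples, shape (5, 1, 1).
next_day_open_values = np.arange(5, dtype=float).reshape(5, 1, 1)

# Bug pattern: the first element of .shape (an int) takes over the name
# that used to refer to the array.
overwritten, nx, ny = next_day_open_values.shape
try:
    overwritten.reshape((overwritten, nx * ny))
except AttributeError as exc:
    message = str(exc)  # same AttributeError as in the traceback

# Fix: unpack the shape into a NEW name so the array itself is kept.
n_samples, nx, ny = next_day_open_values.shape

# Merge the two trailing axes to obtain a 2D (n_samples, nx * ny) array.
flat = next_day_open_values.reshape((n_samples, nx * ny))
```

Because flat is already 2D, it could be passed to y_normaliser.fit directly; wrapping it in np.expand_dims(..., -1) again would turn it back into a 3D array, which scikit-learn estimators reject.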
Tags: python numpy tensorflow keras