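[Posted]: 2021-07-01 22:13:30
[Question]:
I am working on a deep learning project and am trying to evaluate my model with cross-validation by following a tutorial.
The tutorial I am following: https://machinelearningmastery.com/use-keras-deep-learning-models-scikit-learn-python/
I first split my dataset into features and labels: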
labels = dataset['Label']
features = dataset.loc[:, dataset.columns != 'Label'].astype('float64')
They have the following shapes:
features.shape, labels.shape
((2425727, 78), (2425727,))
I used RobustScaler to scale my data, and this is what I have now:
features
array([[ 1.40474359e+02, -1.08800488e-02,  0.00000000e+00, ...,
         0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       [ 1.40958974e+02, -1.08609909e-02, -2.50000000e-01, ...,
         0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       [ 1.40961538e+02, -1.08712390e-02, -2.50000000e-01, ...,
         0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       ...,
       [ 1.48589744e+02, -1.08658453e-02,  0.00000000e+00, ...,
         0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       [-6.92307692e-02,  1.77654485e-01,  1.00000000e+00, ...,
         0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       [-6.92307692e-02,  6.18858398e-03,  5.00000000e-01, ...,
         0.00000000e+00,  0.00000000e+00,  0.00000000e+00]])
labels
array([0, 0, 0, ..., 0, 0, 0])
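As a side note on the scaling step above, here is a minimal sketch of what RobustScaler does with its default settings: it subtracts the per-column median and divides by the interquartile range. The toy matrix `X` is made up for illustration and is not the poster's data. Because the scaler does not clip values, outliers can still end up far from zero, which would be consistent with values such as 1.40e+02 in the scaled features above.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# Toy stand-in for the real feature matrix (values are invented).
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0],
              [4.0, 40.0],
              [100.0, 50.0]])  # 100.0 is a deliberate outlier

# Default settings: centre on the median, scale by the 25th-75th percentile range.
scaler = RobustScaler()
X_scaled = scaler.fit_transform(X)

# The same computation written out by hand with NumPy.
median = np.median(X, axis=0)
q25, q75 = np.percentile(X, [25, 75], axis=0)
X_manual = (X - median) / (q75 - q25)

print(X_scaled)
print(np.allclose(X_scaled, X_manual))
```

The outlier row stays large after scaling because only the centre and spread are robust to it; the value itself is not bounded.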
The data is now ready for cross-validation.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn.model_selection import cross_val_score

# Function to create model, required for KerasClassifier
def create_model():
    # create model
    model = Sequential()
    model.add(Dense(128, activation='relu', input_shape=(78, 1)))
    model.add(Dropout(0.01))
    model.add(Dense(15, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# fix random seed for reproducibility
seed = 7
np.random.seed(seed)
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
# create model
model = KerasClassifier(build_fn=create_model(), epochs=30, batch_size=64, verbose=0)
# evaluate using 5-fold cross-validation
results = cross_val_score(model, features, labels, cv=kfold, scoring='accuracy', error_score="raise")
print(results.mean())
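Running this gives me the following error: "ValueError: The first argument to `Layer.call` must always be passed."
I also checked the scikit-learn documentation to see whether I was doing something wrong: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.cross_val_score.html
I also looked for other people who might have hit this problem, for example: https://github.com/scikit-learn/scikit-learn/issues/18944
But I could not solve it. Can anyone help me with this?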
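One likely contributor to the error quoted above is that the code passes `build_fn=create_model()` (an already built model) rather than `build_fn=create_model` (the function itself). The wrapper calls `build_fn` later with no positional arguments to obtain a fresh model, so a model instance ends up being called without an input. The sketch below reproduces that pattern in plain Python. `ToyLayer` and `ToyWrapper` are hypothetical stand-ins, not Keras classes, and the toy raises `TypeError` where Keras reports a `ValueError`.

```python
class ToyLayer:
    """Hypothetical stand-in for a Keras model: calling it requires an input."""

    def __call__(self, inputs):
        return inputs


def create_model():
    # Returns a new "model" each time it is called.
    return ToyLayer()


class ToyWrapper:
    """Hypothetical stand-in for a scikit-learn style wrapper that builds lazily."""

    def __init__(self, build_fn):
        self.build_fn = build_fn

    def fit(self):
        # The wrapper calls build_fn with no positional arguments to get a model.
        self.model = self.build_fn()
        return self


# Passing the function: the wrapper can build a model whenever it needs one.
ok = ToyWrapper(build_fn=create_model).fit()

# Passing the result of the function: the wrapper ends up calling the model
# itself with no input, which fails.
try:
    ToyWrapper(build_fn=create_model()).fit()
    failed = False
except TypeError as exc:
    failed = True
    print("error:", exc)
```

Under this reading, the call in the question would become `KerasClassifier(build_fn=create_model, epochs=30, batch_size=64, verbose=0)`. This is a suggestion based on how the wrapper is normally used, not something confirmed in the thread.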
[Comments]:
- For anyone who runs into this problem: I managed to solve it by using input_dim=78 instead of input_shape = (78,1). It works now.
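To illustrate why the input shape matters here: a Dense layer applies its weights along the last axis of its input. With `input_shape=(78, 1)` the network expects batches of shape (batch, 78, 1) and produces one 15-way output per feature, whereas `input_dim=78` (equivalent to `input_shape=(78,)`) expects (batch, 78) and produces one 15-way output per sample. The sketch below mimics this with NumPy only. The `dense` helper and its random weights are assumptions for illustration, not Keras code.

```python
import numpy as np

rng = np.random.default_rng(0)
batch = 4


def dense(inputs, units):
    """Hypothetical NumPy stand-in for a Dense layer: inputs @ kernel + bias."""
    kernel = rng.normal(size=(inputs.shape[-1], units))  # acts on the last axis
    bias = np.zeros(units)
    return inputs @ kernel + bias


x_flat = rng.normal(size=(batch, 78))  # layout matching input_dim=78
x_col = x_flat[..., np.newaxis]        # layout matching input_shape=(78, 1)

out_flat = dense(dense(x_flat, 128), 15)
out_col = dense(dense(x_col, 128), 15)

print(out_flat.shape)  # one row of 15 scores per sample
print(out_col.shape)   # 15 scores for every one of the 78 features
```

Since the feature matrix in the question has shape (2425727, 78), only the flat layout matches the data being passed to `cross_val_score`.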
Tags: python tensorflow machine-learning keras scikit-learn