【Question title】: How can I evaluate a StratifiedKFold model?
【Posted】: 2019-10-01 08:12:50
【Question】:
    import numpy as np
    from keras.models import Sequential
    from keras.layers import Dense
    from keras.wrappers.scikit_learn import KerasClassifier 
    from sklearn.model_selection import StratifiedKFold 
    from sklearn.model_selection import cross_val_score
    from sklearn.model_selection import cross_val_predict   

    x_train = dataset[0:700,:-1]
    y_train = dataset[0:700,-1]
    x_test = dataset[700:,:-1]
    y_test = dataset[700:,-1]

    def create_model():
        model = Sequential()
        model.add(Dense(12, input_dim=8, activation='relu'))
        model.add(Dense(8, activation='relu'))
        model.add(Dense(1, activation='sigmoid'))
        model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
        return model

    model = KerasClassifier(build_fn=create_model, epochs=100, batch_size=64)
    skf = StratifiedKFold(n_splits=3, shuffle=True, random_state=seed) 

    scores = cross_val_score(model, x_train, y_train, cv=skf)
    predictions = cross_val_predict(model, x_test, y_test, cv=skf)

I want to train on [x_train], [y_train] using StratifiedKFold and then evaluate on [x_test], [y_test]. How can I do that? I tried cross_val_predict, but I don't think it is the right tool here.

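【Question comments】:

  • Do you mean that you want a stratified split into training and test sets?
  • Yes, exactly. I want to split into training and test sets in a stratified way: training (x_train, y_train) and test (x_test, y_test).

Tags: python numpy tensorflow keras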


【Solution 1】:

To split the data into training and test sets in a stratified way, you can use:

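from sklearn.model_selection import train_test_split
# Stratify on the label column (the last one) so both parts keep the class ratio
dataset_train, dataset_test = train_test_split(dataset,
                                                stratify=dataset[:, -1],
                                                test_size=0.2)

# Split both datasets into X, y (features are all columns but the last)
x_train, y_train = dataset_train[:, :-1], dataset_train[:, -1]
x_test, y_test = dataset_test[:, :-1], dataset_test[:, -1]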

See:

https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html

Stratified Train/Test-split in scikit-learn
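
As a minimal sketch of the stratified split described in this answer, the example below checks that both parts keep the same class ratio. The real `dataset` is not shown in the question, so a synthetic stand-in is assumed here: 1000 rows, 8 feature columns, and a binary label in the last column with a 70/30 class balance. The `random_state=0` argument is also an assumption, added only to make the run repeatable.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for `dataset` (assumption): 8 features + 1 binary label column.
rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 8))
labels = np.array([0] * 700 + [1] * 300)
dataset = np.column_stack([features, labels])

# Stratify on the label column so both parts keep the 70/30 class ratio.
dataset_train, dataset_test = train_test_split(
    dataset, stratify=dataset[:, -1], test_size=0.2, random_state=0
)

# Split both parts into X and y.
x_train, y_train = dataset_train[:, :-1], dataset_train[:, -1]
x_test, y_test = dataset_test[:, :-1], dataset_test[:, -1]

# Fraction of positive labels in each part; they should match closely.
train_pos_rate = y_train.mean()
test_pos_rate = y_test.mean()
print(x_train.shape, x_test.shape)
print(train_pos_rate, test_pos_rate)
```

Compared with slicing `dataset[0:700]` and `dataset[700:]` as in the question, this does not depend on the row order of the data, which matters when the rows happen to be sorted by class.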

【Comments】:

  • Please check my answer.
【Solution 2】:
skf = StratifiedKFold(n_splits=3, shuffle=True, random_state=seed)
accuracy=[]
for train in skf.split(x_train, y_train):
    model = Sequential()
    model.add(Dense(12, input_dim=8, activation='relu'))
    model.add(Dense(8, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

How about this? It runs, but I don't know whether it is correct.

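【Comments】:

  • Why do you loop 3 times, creating a new model each time (overwriting the previous one), without ever using the loop variable train inside the loop?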
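
Picking up on that comment, here is one way the loop might be completed: unpack both index arrays that `skf.split` yields, fit a fresh model on each training fold, score it on the matching validation fold, then refit on all of the training data and evaluate once on the held-out test set. This is a sketch under stated assumptions, not the asker's final code. The data is synthetic, `seed = 7` is an assumed value, and scikit-learn's `LogisticRegression` stands in for the Keras network to keep the example light. The helper `build_model()` is hypothetical; the question's `create_model()` could take its place, with `fit(..., epochs=100, batch_size=64)` and `evaluate` instead of `fit` and `score`.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, train_test_split

seed = 7  # assumed value; the question never shows how `seed` is set

# Synthetic stand-in for the asker's data (assumption): 8 features, binary label.
rng = np.random.default_rng(seed)
X = rng.normal(size=(900, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Hold out a stratified test set that cross-validation never touches.
x_train, x_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.2, random_state=seed
)

def build_model():
    # Hypothetical helper; LogisticRegression replaces the Keras model here.
    return LogisticRegression(max_iter=1000)

skf = StratifiedKFold(n_splits=3, shuffle=True, random_state=seed)
fold_accuracy = []
for train_idx, val_idx in skf.split(x_train, y_train):
    model = build_model()  # fresh, untrained model for every fold
    model.fit(x_train[train_idx], y_train[train_idx])
    fold_accuracy.append(model.score(x_train[val_idx], y_train[val_idx]))

cv_accuracy = float(np.mean(fold_accuracy))

# Final model: train on all training data, evaluate once on the test set.
final_model = build_model()
final_model.fit(x_train, y_train)
test_accuracy = final_model.score(x_test, y_test)

print(fold_accuracy, cv_accuracy, test_accuracy)
```

The cross-validation mean estimates generalization using only the training data, while the test score comes from a model that never saw the test rows. By contrast, `cross_val_predict(model, x_test, y_test, cv=skf)` refits clones of the model on folds of the test set itself, so it does not evaluate a model trained on `x_train`.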