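【Posted】: 2019-09-06 21:17:56
【Question】:
I am trying to solve a multi-class classification problem (heart disease dataset) with categorical_crossentropy in Keras (TensorFlow backend) and get good accuracy. My model reaches a very high training accuracy, but the validation accuracy is low and the validation loss is high. I have already tried the usual remedies for overfitting (e.g. normalization, dropout, regularization), but the problem persists. So far I have experimented with the optimizer, the loss, the number of epochs, and the batch size, without success. Here is the code I am using: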
import pandas as pd
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.optimizers import SGD,Adam
from keras.layers import Dense, Dropout
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.impute import SimpleImputer
from keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from keras.models import load_model
from keras.regularizers import l1,l2
# fix random seed for reproducibility
np.random.seed(5)
data = pd.read_csv('ProcessedClevelandData.csv',delimiter=',',header=None)
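# Missing values: replace NaNs in each feature column with that column's mean.
# transform() returns a new array, so its result must be assigned
# (the bare call Imp.transform(data) discarded the imputed values).
X = data.iloc[:, :-1].values
y = data.iloc[:, -1].values
Imp = SimpleImputer(missing_values=np.nan, strategy='mean', copy=True)
X = Imp.fit_transform(X)
# One-hot encode the integer class labels
y = to_categorical(y)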
X_train, X_test, y_train, y_test = train_test_split(X,y, test_size=0.1)
scaler = StandardScaler()
X_train_norm = scaler.fit_transform(X_train)
X_test_norm=scaler.transform(X_test)
# create model
model = Sequential()
model.add(Dense(13, input_dim=13, activation='relu',use_bias=True,kernel_regularizer=l2(0.0001)))
#model.add(Dropout(0.05))
model.add(Dense(9, activation='relu',use_bias=True,kernel_regularizer=l2(0.0001)))
#model.add(Dropout(0.05))
model.add(Dense(5,activation='softmax'))
# Note: this SGD optimizer is defined but not used; compile() below is called with 'adam'
sgd = SGD(lr=0.01, decay=0.01/32, nesterov=False)
# Compile model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])#adam,adadelta,
print(model.summary())
history=model.fit(X_train_norm, y_train,validation_data=(X_test_norm,y_test), epochs=1200, batch_size=32,shuffle=True)
# list all data in history
print(history.history.keys())
# summarize history for accuracy
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
# summarize history for loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
Here is part of the output, where you can see the behavior described above:
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 13)                182
_________________________________________________________________
dense_2 (Dense)              (None, 9)                 126
_________________________________________________________________
dense_3 (Dense)              (None, 5)                 50
=================================================================
Total params: 358
Trainable params: 358
Non-trainable params: 0
_________________________________________________________________
Train on 272 samples, validate on 31 samples
Epoch 1/1200
32/272 [==>...........................] - ETA: 21s - loss: 1.9390 - acc: 0.1562
272/272 [==============================] - 3s 11ms/step - loss: 2.0505 - acc: 0.1434 - val_loss: 2.0875 - val_acc: 0.1613
Epoch 2/1200
32/272 [==>...........................] - ETA: 0s - loss: 1.6747 - acc: 0.2188
272/272 [==============================] - 0s 33us/step - loss: 1.9416 - acc: 0.1544 - val_loss: 1.9749 - val_acc: 0.1290
Epoch 3/1200
32/272 [==>...........................] - ETA: 0s - loss: 1.7708 - acc: 0.2812
272/272 [==============================] - 0s 37us/step - loss: 1.8493 - acc: 0.1801 - val_loss: 1.8823 - val_acc: 0.1290
Epoch 4/1200
32/272 [==>...........................] - ETA: 0s - loss: 1.9051 - acc: 0.2188
272/272 [==============================] - 0s 33us/step - loss: 1.7763 - acc: 0.1949 - val_loss: 1.8002 - val_acc: 0.1613
Epoch 5/1200
32/272 [==>...........................] - ETA: 0s - loss: 1.6337 - acc: 0.2812
272/272 [==============================] - 0s 33us/step - loss: 1.7099 - acc: 0.2426 - val_loss: 1.7284 - val_acc: 0.1935
Epoch 6/1200
....
32/272 [==>...........................] - ETA: 0s - loss: 0.0494 - acc: 1.0000
272/272 [==============================] - 0s 37us/step - loss: 0.0532 - acc: 1.0000 - val_loss: 4.1031 - val_acc: 0.5806
Epoch 1197/1200
32/272 [==>...........................] - ETA: 0s - loss: 0.0462 - acc: 1.0000
272/272 [==============================] - 0s 33us/step - loss: 0.0529 - acc: 1.0000 - val_loss: 4.1174 - val_acc: 0.5806
Epoch 1198/1200
32/272 [==>...........................] - ETA: 0s - loss: 0.0648 - acc: 1.0000
272/272 [==============================] - 0s 37us/step - loss: 0.0533 - acc: 1.0000 - val_loss: 4.1247 - val_acc: 0.5806
Epoch 1199/1200
32/272 [==>...........................] - ETA: 0s - loss: 0.0610 - acc: 1.0000
272/272 [==============================] - 0s 29us/step - loss: 0.0532 - acc: 1.0000 - val_loss: 4.1113 - val_acc: 0.5484
Epoch 1200/1200
32/272 [==>...........................] - ETA: 0s - loss: 0.0511 - acc: 1.0000
272/272 [==============================] - 0s 29us/step - loss: 0.0529 - acc: 1.0000 - val_loss: 4.1209 - val_acc: 0.5484
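【Comments】:
-
Do you know whether the class distribution is similar in your training and test sets? In other words, does each of your classes occur in roughly the same proportion in both sets?
-
I hope so, but I am not sure. How can I check that?
-
You have the labels for the data, right? Count how many times each label occurs in each dataset, then divide by the total number of points in that set. The proportions do not have to match exactly, but if either dataset has no examples of some label, that could be bad, and a heavily imbalanced sample will also cause difficulties.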
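A minimal sketch of that check, assuming the labels are one-hot encoded as in the question (the output of to_categorical). The helper name class_proportions and the randomly generated demo arrays are hypothetical stand-ins; in the question's code you would pass y_train and y_test instead.

```python
import numpy as np

def class_proportions(y_onehot):
    """Return the fraction of samples that belong to each class.

    y_onehot: array of shape (n_samples, n_classes), e.g. the output
    of keras.utils.to_categorical.
    """
    y_onehot = np.asarray(y_onehot)
    labels = y_onehot.argmax(axis=1)  # recover integer class labels
    counts = np.bincount(labels, minlength=y_onehot.shape[1])
    return counts / counts.sum()

# Hypothetical stand-ins for y_train / y_test (272 and 31 samples, 5 classes)
rng = np.random.RandomState(5)
y_train_demo = np.eye(5)[rng.randint(0, 5, size=272)]
y_test_demo = np.eye(5)[rng.randint(0, 5, size=31)]

train_props = class_proportions(y_train_demo)
test_props = class_proportions(y_test_demo)

# Print the two distributions side by side for comparison
for cls, (p_train, p_test) in enumerate(zip(train_props, test_props)):
    print(f"class {cls}: train {p_train:.2f}  test {p_test:.2f}")
```

If the two columns differ a lot, or a class is missing from one set, one option is to pass the integer labels to the stratify argument of train_test_split so that both splits keep roughly the same proportions. With only 31 validation samples spread over 5 classes, some difference is to be expected even then.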
Tags: python tensorflow keras neural-network