【Posted】: 2020-11-15 13:46:20
【Problem description】:
I am building a neural network that takes 480 data points as input and outputs 18 data points. The inputs are magnetic field strengths and the outputs are the coordinates of detected objects (zero if no object is detected), so none of the data points are really categorical. For some reason, after training the model, every input I try produces the same output, for example:
>>> output2 = loaded_model.predict(X_)
>>> output2[0]
array([0.32035217, 0.3027814 , 0.2977892 , 0.30922157, 0.3294088 ,
0.40853357, 0.09848618, 0.15266985, 0.29188123, 0.31177315,
0.4652696 , 0.6406114 , 0.204305 , 0.23156416, 0.19870688,
0.21269864, 0.28510743, 0.29115945], dtype=float32)
>>> output2[100]
array([0.32035217, 0.3027814 , 0.2977892 , 0.30922157, 0.3294088 ,
0.40853357, 0.09848618, 0.15266985, 0.29188123, 0.31177315,
0.4652696 , 0.6406114 , 0.204305 , 0.23156416, 0.19870688,
0.21269864, 0.28510743, 0.29115945], dtype=float32)
The code I used to build this model is:
import pandas as pd
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from numpy import savetxt
from keras.optimizers import Adam
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras import regularizers
from keras.layers.advanced_activations import LeakyReLU
from keras.models import model_from_json
#from keras.layers import LeakyReLU
import matplotlib.pyplot as plt
df = pd.read_csv("C:/Users/an/Desktop/python processing/try_2/Hx_output.csv")
dataset = df.values
X = dataset[:,0:480]
Y = dataset[:,480:498]  # 18 target columns, matching the 18-unit output layer
min_max_scaler = preprocessing.MinMaxScaler()
X_scale = min_max_scaler.fit_transform(X)
Y_scale = min_max_scaler.fit_transform(Y)
X_train, X_val_and_test, Y_train, Y_val_and_test = train_test_split(X_scale, Y_scale, test_size=0.3)
X_val, X_test, Y_val, Y_test = train_test_split(X_val_and_test, Y_val_and_test, test_size=0.5)
pd.DataFrame(X_val).to_csv("C:/Users/an/Desktop/python processing/try_2/X_val.csv", header=None, index=None)
pd.DataFrame(Y_val).to_csv("C:/Users/an/Desktop/python processing/try_2/Y_val.csv", header=None, index=None)
# activation = LeakyReLU(alpha=0.05)
model = Sequential([
    Dense(480, activation='sigmoid', kernel_regularizer=regularizers.l2(0.01), input_shape=(480,)),
    Dropout(0.3),
    Dense(5000, activation='softplus', kernel_regularizer=regularizers.l2(0.01)),
    Dropout(0.3),
    Dense(5000, activation='softplus', kernel_regularizer=regularizers.l2(0.01)),
    Dropout(0.3),
    Dense(5000, activation='softplus', kernel_regularizer=regularizers.l2(0.01)),
    Dropout(0.3),
    Dense(5000, activation='softplus', kernel_regularizer=regularizers.l2(0.01)),
    Dropout(0.3),
    Dense(5000, activation='softplus', kernel_regularizer=regularizers.l2(0.01)),
    Dropout(0.3),
    Dense(18, activation='sigmoid', kernel_regularizer=regularizers.l2(0.01)),
])
#opt = keras.optimizers.Adam(lr=0.001)
model.compile(optimizer=Adam(learning_rate=0.0001), loss='mean_squared_error', metrics=['mean_squared_error'])
##callbacks=[early_stopping_monitor]
hist = model.fit(X_train, Y_train, batch_size=32, epochs=150, validation_data=(X_val, Y_val))
print("Done training !!!")
# serialize model to JSON
model_json = model.to_json()
with open("C:/Users/an/Desktop/python processing/try_2/model.json", "w") as json_file:
json_file.write(model_json)
# serialize weights to HDF5
model.save_weights("C:/Users/an/Desktop/python processing/try_2/model.h5")
print("Saved model to disk")
plt.plot(hist.history['mean_squared_error'])
#plt.plot(hist.history['val_acc'])
plt.title('Model Mean Squared Error')
plt.ylabel('MeanSquaredError')
plt.xlabel('Epoch')
#plt.legend(['Train', 'Val'], loc='lower right')
plt.show()
I have read that some causes of this are: a learning rate that is too high, layers with 'trainable' disabled, and a small batch size. I tried lowering my learning rate to 0.0001 but I still get the same result; as far as I can tell, all of my layers are trainable. That leaves batch size, which I have not tried yet. I have a few thousand training samples, so maybe that is the problem; I am increasing the batch size from 32 to 400 and will start a new round of training soon. But maybe the problem is somewhere else that I am not seeing?
I also read that using callbacks=[early_stopping_monitor] is a good idea in this situation. Is it?
Edit: could the kernel_regularizer=regularizers.l2(0.01) terms also be contributing to this?
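On the early-stopping question: in Keras this is done by passing an `EarlyStopping` callback object to `model.fit`, not a string. A hedged sketch (the `monitor` name must match a metric your compiled model actually reports):

```python
from tensorflow.keras.callbacks import EarlyStopping

# Stop when validation loss has not improved for 10 epochs,
# and roll back to the best weights seen so far.
early_stopping_monitor = EarlyStopping(
    monitor='val_loss',
    patience=10,
    restore_best_weights=True,
)

# Then pass it to fit:
# hist = model.fit(X_train, Y_train, batch_size=32, epochs=150,
#                  validation_data=(X_val, Y_val),
#                  callbacks=[early_stopping_monitor])
```

Early stopping helps against overfitting, but it will not by itself fix a model that predicts a constant; it only stops training once validation loss plateaus.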
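On the Edit: yes, `l2(0.01)` on layers this wide is a plausible culprit. The penalty is summed over every weight, and with five 5000-unit hidden layers that is roughly a hundred million weights; the regularization term can dwarf the MSE on MinMax-scaled targets (which is at most about 1) and drive all weights toward zero, which yields a near-constant output. A rough back-of-envelope in plain numpy, assuming a typical small weight magnitude of 0.01:

```python
import numpy as np

# Weight-matrix shapes for the stack in the question:
# 480->480, 480->5000, four 5000->5000 layers, then 5000->18.
layer_shapes = [(480, 480), (480, 5000), (5000, 5000), (5000, 5000),
                (5000, 5000), (5000, 5000), (5000, 18)]
n_weights = sum(a * b for a, b in layer_shapes)
print(n_weights)  # 102720400, i.e. ~103 million weights

# l2(0.01) adds 0.01 * sum(w**2) to the loss. Even with tiny
# weights of magnitude ~0.01 each, the total penalty is:
penalty = 0.01 * n_weights * 0.01**2
print(penalty)  # ~103, versus an MSE of at most ~1 on scaled targets
```

The optimizer can reduce the loss far more by shrinking the weights than by fitting the data, so it collapses to a constant prediction. Dropping the regularizer (or using a much smaller coefficient) is worth trying first.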
【Comments】:
-
Given your problem description, using
sigmoid as the last-layer activation makes absolutely no sense; it is also unclear why you use softplus rather than relu in the intermediate layers. -
I originally wanted to use leaky relu, since plain relu has zero gradient for negative inputs, but I ran into some problems, so I used softplus as a compromise... I am not sure that was a good idea. I thought sigmoid would be fine because of the min-max scaler... but I am not well versed in activation functions; what would you recommend for this problem?
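For reference on the "zero gradient" concern: ReLU only has zero gradient for negative inputs (the "dying ReLU" problem), and LeakyReLU keeps a small slope there, which is the usual fix. A plain-numpy sketch of the two activations and their gradients (alpha=0.05 as in the commented-out line in the question):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.05):
    # Identity for positive inputs, small slope alpha for negative ones.
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.5, 2.0])

# Gradients: ReLU is 0 wherever x < 0; LeakyReLU keeps slope alpha there.
relu_grad = np.where(x > 0, 1.0, 0.0)
leaky_grad = np.where(x > 0, 1.0, 0.05)

print(relu(x))        # [0.  0.  0.5 2. ]
print(leaky_relu(x))  # [-0.1   -0.025  0.5    2.   ]
print(relu_grad)      # [0. 0. 1. 1.]
```

For a regression output like scaled coordinates, a linear (no) activation on the final layer is the common choice; sigmoid only works if the targets are guaranteed to stay strictly inside (0, 1).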
Tags: python tensorflow machine-learning keras scikit-learn