Why is my logistic regression classifier in TensorFlow not learning?

[Question title]: Why is my logistic regression classifier in Tensorflow not learning?
[Posted]: 2020-07-20 18:08:17
[Question description]:

I am learning TensorFlow by implementing a logistic regression classifier for the binary MNIST digits dataset. I am using TensorFlow 1.13, as shown in the code below:

import tensorflow as tf
gpu_options = tf.GPUOptions(allow_growth=True, per_process_gpu_memory_fraction=0.1)
s = tf.InteractiveSession(config=tf.ConfigProto(gpu_options=gpu_options))
print("We're using TF", tf.__version__)

The dataset is loaded as follows:

from sklearn.datasets import load_digits
mnist = load_digits(n_class=2)  # keep only digits 0 and 1

X,y = mnist.data, mnist.target

The data has the following shapes:

>> print("y [shape - %s]:" % (str(y.shape)), y[:10])
y [shape - (360,)]: [0 1 0 1 0 1 0 0 1 1]

>> print("X [shape - %s]:" % (str(X.shape)))
X [shape - (360, 64)]:

Based on these shapes, I defined placeholders for the inputs and a variable for the weights (I hope they are correct):

weights = tf.Variable(tf.zeros([X.shape[1],1]), name="weights")
input_x = tf.placeholder('float32', shape=[None, X.shape[1]], name="input_x")
input_y = tf.placeholder('float32', shape=[None, 1], name="input_y")

Now I define the loss and the optimizer, and compute the class probabilities, as follows:

#predicted_y = <predicted probabilities for input_X>
logits = tf.matmul(input_x, weights)
predicted_y = tf.nn.softmax(logits)
probas=tf.argmax(predicted_y, axis=1)

#loss = <logistic loss (scalar, mean over sample)>
loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=logits, labels=input_y))

#optimizer = <optimizer that minimizes loss>
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.0001).minimize(loss)

Then I create a function that computes the predicted classes from the probabilities:

predict_function=lambda vector1: probas.eval({input_x:vector1})

Next, I split the data into training and test sets:

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y)

Finally, I train and evaluate at each iteration:

import numpy as np
from sklearn.metrics import roc_auc_score

y_train_reshaped=np.reshape(y_train, (y_train.shape[0], 1))
s.run(tf.global_variables_initializer())

for i in range(5):

    #<run optimizer operation>
    s.run(optimizer, feed_dict={input_x:X_train,input_y:y_train_reshaped})

    #loss_i = <compute loss at iteration i>
    loss_i = loss.eval({input_x:X_train, input_y:y_train_reshaped})

    print("loss at iter %i:%.4f" % (i, loss_i))

    #My problem starts here
    print("train auc:",roc_auc_score(y_train, predict_function(X_train)))
    print("test auc:",roc_auc_score(y_test, predict_function(X_test)))

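My problem with the code above is that, although I can see the loss decreasing at every iteration, the ROC metric stays the same. The output of the loop is: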

loss at iter 0:0.6820
train auc: 0.5
test auc: 0.5
loss at iter 1:0.6712
train auc: 0.5
test auc: 0.5
loss at iter 2:0.6606
train auc: 0.5
test auc: 0.5
loss at iter 3:0.6503
train auc: 0.5
test auc: 0.5
loss at iter 4:0.6403
train auc: 0.5
test auc: 0.5

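Printing the output of predict_function(X_train) or predict_function(X_test) shows that the prediction is always 0. So there must be something I do not understand or am not doing correctly. What am I missing here?

Edit: As suggested, I also tried raising the learning rate to 0.1 and the number of iterations to 50000. The loss quickly goes to zero, but both the train and test AUC stay at 0.5, which means the classifier predicts only one class. I am sure something is wrong with my code; what exactly is it?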

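[Comments]:

The loss is decreasing, albeit slowly. You could try increasing the learning rate. More importantly, 5 iterations is nothing; you need many more to see results. Try 500 or so to start.
@xdurch0 I increased the learning rate to 0.1 and the number of iterations to 5000, but the problem persists. Strangely, the loss quickly goes to zero, yet the classifier keeps predicting the same class for both the training and the test data. Thanks for your help.
The input to the AUC score function may be wrong. As I understand it, you should supply probabilities (i.e. predicted_y, the softmax output), but you are supplying "hard" classes (the argmax of the softmax).
@xdurch0 Problem solved ;)

Tags: python tensorflow machine-learning logistic-regression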

[Solution 1]:

There are two separate mistakes here:

predicted_y = tf.nn.softmax(logits)
probas=tf.argmax(predicted_y, axis=1)

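First, since your y is not one-hot encoded, you should not use softmax here but sigmoid (which is what you correctly do in your loss definition); so the first line should be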

predicted_y = tf.nn.sigmoid(logits)
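
To illustrate why softmax is the wrong activation for a single logit per sample, here is a minimal pure-Python sketch. The helpers `softmax` and `sigmoid` and the logit values are illustrative assumptions, not TensorFlow's API or values taken from the question's model; the nested list stands in for a logits tensor of shape [N, 1].

```python
import math

def softmax(row):
    """Softmax over one row of logits (shifted by the max for numerical stability)."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Logistic function: maps a single logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical logits, one per sample, mimicking a [N, 1] logits tensor.
logits = [[-3.0], [0.0], [2.5]]

softmax_out = [softmax(row) for row in logits]
sigmoid_out = [sigmoid(row[0]) for row in logits]

print(softmax_out)  # every row normalises to [1.0], whatever the logit is
print(sigmoid_out)  # changes with the logit; sigmoid(0.0) is exactly 0.5
```

Softmax normalises across the entries of each row, so a row with a single entry always yields 1.0 and carries no information, while sigmoid produces a probability that actually depends on the logit.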

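The second line, again because your y is not one-hot encoded, does not do what you think it does: since each prediction is a single-element array, its argmax is 0 by definition, so you do not get a correct conversion from probabilities to hard predictions (and in any case, hard predictions are not what you use to compute the ROC; you need the probabilities).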
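
The effect on the metric can be reproduced with a small pure-Python sketch. The labels and probabilities below are made up, and `roc_auc` is an illustrative pairwise implementation (ties counted as half), not scikit-learn's `roc_auc_score`, although for binary labels it computes the same quantity.

```python
def argmax(row):
    """Index of the largest entry in a row."""
    return max(range(len(row)), key=row.__getitem__)

def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [0, 1, 0, 1, 1]                 # hypothetical ground truth

# Softmax over a single logit always yields [1.0], so argmax is always 0.
softmax_rows = [[1.0] for _ in labels]
hard_preds = [argmax(row) for row in softmax_rows]

# Hypothetical sigmoid probabilities from a reasonably trained model.
probs = [0.1, 0.8, 0.3, 0.9, 0.6]

print(hard_preds)                   # [0, 0, 0, 0, 0]
print(roc_auc(labels, hard_preds))  # 0.5
print(roc_auc(labels, probs))       # 1.0
```

A constant score ties every positive with every negative, which is exactly the 0.5 AUC seen in the question, regardless of how low the loss gets.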

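You should drop probas altogether and replace your predict_function with the following (named prediction_function here, so the calls in the training loop need the same name):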

prediction_function=lambda vector1: predicted_y.eval({input_x:vector1})

With these changes and learning_rate=0.1, the AUC is essentially 1.0 from the very first iteration:

loss at iter 0:0.0085
train auc: 0.9998902365402557
test auc: 1.0
loss at iter 1:0.0066
train auc: 1.0
test auc: 1.0
loss at iter 2:0.0052
train auc: 1.0
test auc: 1.0
loss at iter 3:0.0042
train auc: 1.0
test auc: 1.0
loss at iter 4:0.0035
train auc: 1.0
test auc: 1.0

And you get correct predictions for X_train:

np.round(prediction_function(X_train)).reshape(1,-1)
# result:
array([[0., 1., 1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1., 1., 1.,
        1., 1., 0., 0., 1., 0., 0., 0., 0., 0., 0., 1., 0., 1., 0., 0.,
        1., 1., 0., 1., 1., 0., 0., 0., 0., 1., 1., 0., 0., 1., 0., 0.,
        1., 1., 0., 0., 1., 1., 1., 0., 0., 1., 0., 1., 0., 0., 0., 1.,
        0., 1., 1., 1., 0., 1., 0., 1., 0., 0., 1., 0., 1., 1., 1., 1.,
        0., 0., 1., 1., 0., 1., 1., 0., 1., 0., 0., 0., 1., 0., 1., 1.,
        0., 1., 1., 0., 1., 1., 1., 1., 0., 1., 0., 1., 0., 1., 1., 1.,
        1., 0., 0., 1., 0., 0., 1., 0., 1., 0., 0., 0., 1., 1., 0., 0.,
        0., 0., 0., 1., 0., 1., 1., 1., 1., 1., 0., 0., 0., 1., 1., 1.,
        0., 0., 0., 1., 1., 1., 1., 0., 0., 1., 1., 0., 1., 1., 1., 0.,
        1., 1., 0., 1., 1., 1., 0., 1., 0., 1., 1., 0., 0., 1., 1., 0.,
        1., 1., 1., 1., 0., 0., 1., 1., 0., 0., 0., 0., 1., 1., 0., 0.,
        0., 0., 1., 0., 0., 1., 1., 0., 1., 0., 0., 1., 1., 0., 0., 1.,
        1., 0., 0., 1., 0., 1., 0., 1., 0., 0., 0., 0., 0., 0., 0., 1.,
        1., 0., 1., 1., 1., 0., 0., 0., 0., 1., 0., 0., 1., 0., 0., 0.,
        1., 1., 1., 1., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 1., 1.,
        0., 1., 1., 0., 1., 0., 1., 0., 0., 0., 1., 0., 0., 1.]],
      dtype=float32)
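
The rounding step above turns sigmoid probabilities into hard 0/1 labels. The same step can be sketched in plain Python with an explicit threshold; `to_labels` is a hypothetical helper, and the probabilities and labels are made-up values rather than outputs of the model above.

```python
def to_labels(probs, threshold=0.5):
    """Map probabilities to hard 0/1 labels; values at or above the threshold become 1."""
    return [1 if p >= threshold else 0 for p in probs]

probs = [0.02, 0.97, 0.50, 0.41, 0.88]   # hypothetical sigmoid outputs
y_true = [0, 1, 1, 0, 1]                 # hypothetical ground truth

y_pred = to_labels(probs)
accuracy = sum(p == t for p, t in zip(y_pred, y_true)) / len(y_true)

print(y_pred)     # [0, 1, 1, 0, 1]
print(accuracy)   # 1.0
```

An explicit threshold makes the boundary case visible: np.round rounds exact halves to the nearest even value, so a probability of exactly 0.5 would become 0 there. Hard labels like these are suitable for accuracy, while the ROC AUC should still be computed from the probabilities.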
