TensorFlow max-margin loss training?

【Question title】: TensorFlow max-margin loss training?
【Posted】: 2016-07-08 15:17:09
【Question】:

I want to train a neural network in TensorFlow with a max-margin loss function, using one negative sample per positive sample:

max(0, 1 - pos_score + neg_score)
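To make the formula concrete, here is a minimal pure-Python sketch of the loss averaged over a batch of score pairs. It is not TensorFlow code; the function name `max_margin_loss` and the `margin` parameter are assumptions introduced only for illustration (the question fixes the margin at 1).

```python
def max_margin_loss(pos_scores, neg_scores, margin=1.0):
    """Mean of max(0, margin - pos + neg) over paired scores.

    Hypothetical helper for illustration; `margin` defaults to the
    value 1 used in the question.
    """
    if len(pos_scores) != len(neg_scores):
        raise ValueError("each positive score needs exactly one negative score")
    if not pos_scores:
        raise ValueError("need at least one score pair")
    # One hinge term per (positive, negative) pair.
    losses = [max(0.0, margin - p + n) for p, n in zip(pos_scores, neg_scores)]
    return sum(losses) / len(losses)
```

The loss is zero only once the positive score beats the negative score by at least the margin. For a pair scored 0.9965983 and 0.00341663, the gap is slightly below 1, so the loss is small but not exactly zero (about 0.0068).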

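What I'm currently doing: the network takes three inputs, input1, then a positive example input2_pos and a negative example input2_neg. (These are indices into a word embedding layer.) The network should compute a score expressing how related two examples are. Here is a simplified version of my code: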

input1 = tf.placeholder(dtype=tf.int32, shape=[batch_size])
input2_pos = tf.placeholder(dtype=tf.int32, shape=[batch_size])
input2_neg = tf.placeholder(dtype=tf.int32, shape=[batch_size])

# f is a neural network outputting a score
pos_score = f(input1,input2_pos)
neg_score = f(input1,input2_neg)

cost = tf.maximum(0., 1. - pos_score + neg_score)
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

When I run it, what I see is that the network just learns which input holds the positive example; it always predicts scores similar to these:

pos_score = 0.9965983
neg_score = 0.00341663

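How can I structure the variables/training so that the network learns the task?

I just want one network that takes two inputs and computes a score expressing the correlation between them, and then train it with a max-margin loss.

Calculating the positive and negative scores separately does not seem to be an option to me, because then it won't backpropagate properly. Another option seems to be randomizing the inputs, but for the loss function I need to know which example is the positive one. Wouldn't feeding that in as another parameter give away the solution again?

Any ideas?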

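【Comments】:

Your code looks fine. If the network predicts 1. for positive pairs and 0. for negative pairs, it seems to have learned your task perfectly! Does the loss converge towards 0?

Tags: neural-network tensorflow

【Solution 1】:

Given your results (1 for every positive and 0 for every negative), it seems you have two different networks learning:

to predict 1 for the first
to predict 0 for the second

When using a max-margin loss, you need to use the same network to compute both pos_score and neg_score. The way to do that is to share variables. Here is a small example using tf.get_variable():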

with tf.variable_scope("network"):
    w = tf.get_variable("weights", shape=..., initializer=...)

def f(x, y):
    with tf.variable_scope("network", reuse=True):
        w = tf.get_variable("weights")
        res = w * (x - y)  # some computation
    return res

Using this function f as the model, training will optimize the shared variable named "network/weights".
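The effect of sharing can be sketched without TensorFlow. The following is a pure-Python illustration, not the answerer's code: it reuses the toy computation w * (x - y) from the example above with a single scalar weight, and applies hand-derived subgradient descent to the max-margin loss. The names `score` and `train`, the learning rate, the epoch count, and the sample triples in the usage note are all assumptions made for illustration.

```python
def score(w, x, y):
    # Same toy computation as in the answer: w * (x - y).
    # The single weight w is used for positive and negative pairs alike.
    return w * (x - y)


def train(triples, w=0.0, lr=0.1, epochs=50, margin=1.0):
    """Subgradient descent on max(0, margin - pos_score + neg_score).

    `triples` holds (x, y_pos, y_neg) floats. Hypothetical helper,
    written only to illustrate training with one shared parameter.
    """
    for _ in range(epochs):
        for x, y_pos, y_neg in triples:
            loss = max(0.0, margin - score(w, x, y_pos) + score(w, x, y_neg))
            if loss > 0.0:
                # d(loss)/dw = -(x - y_pos) + (x - y_neg) = y_pos - y_neg
                grad = y_pos - y_neg
                w -= lr * grad
    return w
```

For example, `train([(1.0, 0.5, 2.0), (0.0, -1.0, 1.0)])` moves the one shared weight until every positive pair outscores its negative pair by at least the margin. Because both scores come from the same parameter, the model cannot satisfy the loss by memorizing which input slot holds the positive example.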

【Comments】:

Thanks a lot!! That seems to have solved the problem :)