Using training weights on non-training data to design a new loss function: answers

【Question title】: Using training weights on non-training data to design a new loss function
【Posted】: 2021-11-01 07:33:42
【Question description】:

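I want to access the training points during a training iteration and incorporate a soft constraint into my loss function by using data points that are not part of the training set. I will use this post as a reference.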

import numpy as np
import keras.backend as K
from keras.layers import Dense, Input
from keras.models import Model

# Some random training data and labels
features = np.random.rand(100, 20)  # 20 columns, matching Input((20,)) below
labels = np.random.rand(100, 3)     # 3 columns, matching the three outputs

# Simple neural net with three outputs
input_layer = Input((20,))
hidden_layer = Dense(16)(input_layer)
output_layer = Dense(3)(hidden_layer)


# Model
model = Model(inputs=input_layer, outputs=output_layer)


# Each training point has another data pair. In the real example, I will have multiple
# supporters. That is why I am using a dict.

holder = np.random.rand(100, 20)  # same width as the network input
indices = np.arange(features.shape[0])  # start at 0 so every training point gets a supporter
supporters = {}

for i, j in zip(indices, holder):  # i is the index of the ith training point
    supporters[i] = j


# Write a custom loss function
def custom_loss(y_true, y_pred):
    # Normal MSE loss
    mse = K.mean(K.square(y_true-y_pred), axis=-1)
    new_constraint = .... 

       

    return(mse+new_constraint)


model.compile(loss=custom_loss, optimizer='sgd')
model.fit(features, labels, epochs=1, batch_size=1)

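For simplicity, let's assume I want to minimize the minimum absolute difference between the predicted value and the prediction for the paired data stored in supporters, computed with the network weights held fixed. Also assume that I pass one training point per batch. However, I cannot figure out how to do this. I tried something like the following, but it is obviously incorrect.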

new_constraint = K.sum(y_pred - model.fit(supporters))
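
To make the target quantity concrete, here is a minimal NumPy sketch of the loss described above for a single training point. It is one reading of the description, not a confirmed specification: the network is replaced by a hypothetical single linear layer (`forward`), and the names `loss_with_soft_constraint`, `W`, `b` and `x_supporters` are made up for illustration.

```python
import numpy as np

def forward(x, W, b):
    """Stand-in for the network: one linear layer (an assumption)."""
    return x @ W + b

def loss_with_soft_constraint(x, y_true, x_supporters, W, b):
    y_pred = forward(x, W, b)             # shape (n_outputs,)
    y_sup = forward(x_supporters, W, b)   # shape (k, n_outputs), same weights
    mse = np.mean((y_true - y_pred) ** 2)
    # Absolute difference to each supporter prediction, then the smallest one.
    abs_diff = np.sum(np.abs(y_pred - y_sup), axis=-1)  # shape (k,)
    return mse + abs_diff.min()

W = np.eye(2)
b = np.zeros(2)
x = np.array([1.0, 2.0])
y_true = np.array([1.0, 2.0])
x_supporters = np.array([[1.0, 3.0], [4.0, 2.0]])

value = loss_with_soft_constraint(x, y_true, x_supporters, W, b)
print(value)  # 1.0: the MSE is 0 and the closest supporter differs by 1
```

With only one supporter per training point, the minimum reduces to a single absolute difference.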

【Question comments】:

Tags: tensorflow keras neural-network tensorflow2.0

【Solution 1】:

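`fit` is the process that trains the model, so it cannot be used to obtain predictions inside the loss. I think the better approach for your problem is to load a new instance of the model with the current weights and evaluate it on the batch in order to compute the loss of the main model.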

import tensorflow as tf

main_model = Model()  # Placeholder: this stands for your main training model

def custom_loss_1(y_true, y_pred):  # Avoid recursive calls
    mse = K.mean(K.square(y_true-y_pred), axis=-1)
    return mse

def custom_loss(y_true, y_pred):
    support_model =  tf.keras.models.clone_model(main_model)  # You copy the main model but the weights are uninitialized
    support_model.build((20,)) # You build with inputs same as your support data
    support_model.compile(loss=custom_loss_1, optimizer='sgd') 
    support_model.set_weights(main_model.get_weights())  # You  load the weight of the main model

    mse = custom_loss_1(y_true, y_pred)
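    # You just want to evaluate the model, not to train it. If you have more
    # metrics than just the loss, then use support_model.evaluate(supporters)[0]
    # supporters is a dict, so stack its values into one array before predicting.
    support_inputs = np.stack(list(supporters.values()))
    new_constraint = K.sum(y_pred - support_model.predict(support_inputs))  # predict to get the output, evaluate to get the metrics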

    return(mse+new_constraint)
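
As an alternative to cloning the model inside the loss, here is a rough sketch that uses a custom training loop instead. This is not the approach shown above: it calls the same model on the supporter and wraps that output in `tf.stop_gradient`, which is my reading of "using the fixed network weights". It also sidesteps calling `predict` inside a loss, which may fail when Keras traces the loss into a graph. The data, the layer sizes and the `train_step` helper are assumptions for illustration, with one supporter per training point.

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
features = rng.random((100, 20)).astype("float32")    # training points
labels = rng.random((100, 3)).astype("float32")       # targets
supporters = rng.random((100, 20)).astype("float32")  # row i supports training point i

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(16),
    tf.keras.layers.Dense(3),
])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

def train_step(x, y, x_sup):
    """One SGD step on MSE plus the absolute-difference soft constraint."""
    with tf.GradientTape() as tape:
        y_pred = model(x, training=True)
        # Same weights, but no gradient flows through the supporter prediction.
        y_sup = tf.stop_gradient(model(x_sup, training=False))
        mse = tf.reduce_mean(tf.square(y - y_pred), axis=-1)
        constraint = tf.reduce_sum(tf.abs(y_pred - y_sup), axis=-1)
        loss = tf.reduce_mean(mse + constraint)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

# Slicing all three arrays together keeps each supporter with its training point.
dataset = tf.data.Dataset.from_tensor_slices((features, labels, supporters)).batch(1)
losses = [float(train_step(x, y, s)) for x, y, s in dataset.take(10)]
```

Because the supporter output is computed from the live weights on every step, no weight copying is needed. Whether the constraint term should also receive gradients through the supporter branch depends on the intended constraint, so treat the `stop_gradient` call as a design choice rather than a requirement.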

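【Comments】:

Thanks for the great MWE. Why can't I directly use the evaluate method provided by the main model? Or can't I use the original model's weights for a quick evaluation? I ask because cloning the model and all these copy operations could cause time and memory problems. Don't you think?
Yes, this is definitely a computationally demanding task. I think another solution is to instantiate the same model twice and set second_model.trainable=False. Then set its weights to match the first model's after each batch, and in the loss function either use the outputs of both the trainable and the non-trainable model, or concatenate the final outputs (possibly as logits) and process them in your loss function.
One last question: how does TensorFlow figure out which support vector corresponds to which training point?
You should supply them as a single sample and then process the input.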
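
One way to read "supply them as a single sample" is to pack each training point and its supporter into one row, so that shuffling and batching can never separate them. The sketch below uses plain NumPy and assumes one supporter per training point. The array names and the `split_sample` helper are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.random((100, 20))
supporters = rng.random((100, 20))  # assumption: row i belongs to features[i]

# Pack each training point together with its supporter into one sample.
packed = np.concatenate([features, supporters], axis=1)  # shape (100, 40)

# Shuffling the packed rows keeps every pair together.
perm = rng.permutation(len(packed))
shuffled = packed[perm]

def split_sample(batch, n_features=20):
    """Undo the packing: return (training points, supporters)."""
    return batch[:, :n_features], batch[:, n_features:]

x_batch, sup_batch = split_sample(shuffled[:4])
```

Inside a model or training step, the same split can be applied to the incoming batch before the two halves are passed through the network.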