Network values go to 0 through linear layers

【Question title】: Network values go to 0 through linear layers
【Posted】: 2020-09-03 10:43:01
【Question description】:

I designed a graph attention network.
However, during the operations inside the layer, the feature values all become equal.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionLayer(nn.Module):
    ## in_features = out_features = 1024
    def __init__(self, in_features, out_features, dropout):
        super(GraphAttentionLayer, self).__init__()
        self.dropout = dropout
        self.in_features = in_features
        self.out_features = out_features
   
        self.W = nn.Parameter(torch.zeros(size=(in_features, out_features)))
        self.a1 = nn.Parameter(torch.zeros(size=(out_features, 1)))
        self.a2 = nn.Parameter(torch.zeros(size=(out_features, 1)))
        nn.init.xavier_normal_(self.W.data, gain=1.414)
        nn.init.xavier_normal_(self.a1.data, gain=1.414)
        nn.init.xavier_normal_(self.a2.data, gain=1.414)
        self.leakyrelu = nn.LeakyReLU()

    def forward(self, input, adj):
        h = torch.mm(input, self.W)
        a_input1 = torch.mm(h, self.a1)
        a_input2 = torch.mm(h, self.a2)
        a_input = torch.mm(a_input1, a_input2.transpose(1, 0))
        e = self.leakyrelu(a_input)

        zero_vec = torch.zeros_like(e)
        attention = torch.where(adj > 0, e, zero_vec)  # most values are close to 0
        attention = F.softmax(attention, dim=1)  # every value is 0.0014, i.e. 1/707 (attention is 707 x 707)
        attention = F.dropout(attention, self.dropout, training=self.training)  # apply dropout only in training mode
        return attention

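`attention` has shape (707 x 707), and I observed that its values are close to 0 before the softmax.
After the softmax, every value is 0.0014, which is 1/707.
I would like to know how to keep the values normalized and prevent this from happening.

Thanks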

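【Comments】:

When does this happen: with a final trained model that you want to run, or during training?
@Nopileos It happens during training. I suspect the softmax function becomes ineffective when the feature dimension is large. That is, when we use softmax for classification with dimension 2, the output looks like [0.001, 0.999]. But for features with more than 1k dimensions, the outputs come out equal because of the exponential in the function, especially for small inputs (e^0.0001 ≈ 1).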

Tags: deep-learning pytorch attention-model

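【Solution 1】:

Since you say this happens during training, I will assume it happens at the start. With random initialization, you usually get nearly identical values at the end of the network at the beginning of training.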

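When all input values are more or less equal, the softmax output for each element is 1/num_elements, so that the outputs sum to 1 along the dimension you chose. In your case every value is 1/707, which suggests to me that your weights are freshly initialized and the outputs are mostly random at this stage.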
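To illustrate the point above, here is a minimal sketch in plain Python (standard library only, so it runs without PyTorch). The `softmax` helper and the sample logits are assumptions made for illustration, not values from the asker's network; `F.softmax(..., dim=1)` applies the same formula to each row.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats (illustrative helper)."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [v / total for v in exps]

n = 707            # row length, matching the 707 x 707 attention matrix
uniform = 1.0 / n  # about 0.0014

# Assumed logits that all lie within 1e-4 of zero, as at the start of training.
near_zero = [1e-4 * math.sin(i) for i in range(n)]
probs = softmax(near_zero)
max_dev = max(abs(p - uniform) for p in probs)
print(f"1/n = {uniform:.4f}, largest deviation from 1/n = {max_dev:.2e}")

# The same pattern with a much larger spread gives a visibly non-uniform row.
spread = softmax([5.0 * math.sin(i) for i in range(n)])
print(f"largest probability with spread-out logits = {max(spread):.4f}")
```

The row length alone does not make the softmax uniform in this sketch; what matters is how far apart the logits are.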

I would let it train for a while and then check whether this changes.
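One possible way to check this is to log how far the attention rows are from uniform every few training steps. The helper below is a hypothetical sketch, not part of the original code; it assumes you pass it the softmax output taken before dropout, converted to nested lists (for example with `attention.tolist()`).

```python
def max_deviation_from_uniform(attention_rows):
    """Return the largest |a_ij - 1/n| over all rows.

    A result of 0.0 means every row is exactly uniform; larger values mean
    the attention has started to favour some entries over others.
    """
    worst = 0.0
    for row in attention_rows:
        uniform = 1.0 / len(row)
        worst = max(worst, max(abs(a - uniform) for a in row))
    return worst

# Toy rows standing in for softmax outputs (each row sums to 1).
early = [[0.25, 0.25, 0.25, 0.25], [0.25, 0.25, 0.25, 0.25]]
later = [[0.70, 0.10, 0.10, 0.10], [0.05, 0.05, 0.85, 0.05]]

print(max_deviation_from_uniform(early))
print(max_deviation_from_uniform(later))
```

If this number stays at essentially zero after many updates, the logits are not spreading out and the problem is probably not just the initialization.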

【Discussion】:

Hey, if this answered your question, please accept it, or tell me what is missing. Thanks.
Oops, sorry :) I forgot.