Producing a softmax on two channels in TensorFlow and Keras (answers)

【Question title】: Producing a softmax on two channels in Tensorflow and Keras
【Posted】: 2025-12-16 13:20:09
【Question】:

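The penultimate layer of my network has shape (U, C), where C is the number of channels. I'd like to apply the softmax function to each channel separately.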

For example, if U=2 and C=3, and the layer produces [ [1 2 3], [10 20 30] ], I'd like the output to compute softmax(1, 2, 3) for channel 0 and softmax(10, 20, 30) for channel 1.

Is there a way to do this with Keras? I'm using TensorFlow as the backend.

UPDATE

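Please also explain how to ensure that the loss is the sum of the two cross-entropies, and how I can verify that. (That is, I don't want the optimizer to train on the loss of only one of the softmaxes, but on the sum of the cross-entropy losses of each.) The model uses Keras's built-in categorical_crossentropy as its loss.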

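【Question comments】:

"U=2 and C=3" is inconsistent with [ [1 10] [2 20] [3 30] ], because that is an array of shape (3, 2), not (2, 3). Please edit your post and make it consistent.
@today I've never gotten the hang of how numpy displays arrays as strings. U=2 and C=3 is correct; if you could edit it into the correct numpy string, I'd appreciate it; if not, tell me what it should be and I'll edit it myself.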

Tags: python tensorflow keras softmax

【Solution 1】:

Use the functional API with multiple outputs. https://keras.io/getting-started/functional-api-guide/

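from keras.layers import Input, Lambda, Softmax
from keras.models import Model

inp = Input(...)  # renamed from `input` to avoid shadowing the Python builtin
...
t = some_tensor  # penultimate tensor; the slicing below assumes three axes (batch, U, C)
# Slice out each channel inside a Lambda layer so the result stays a Keras tensor.
t0 = Lambda(lambda x: x[:, :, 0])(t)
t1 = Lambda(lambda x: x[:, :, 1])(t)
# Softmax takes an `axis` argument (default -1), not an output shape.
soft0 = Softmax()(t0)
soft1 = Softmax()(t1)
outputs = [soft0, soft1]
model = Model(inputs=inp, outputs=outputs)
model.compile(...)
# One target array per output; note the keyword is `epochs`, not `epoch`.
model.fit(x_train, [y_train0, y_train1], epochs=10, batch_size=32)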
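
On the loss question raised in the update: to my understanding, when a Keras model has several outputs, the value being minimized is the weighted sum of the per-output losses, and the weights default to 1 unless `loss_weights` is passed to `compile`. The NumPy sketch below only mimics that arithmetic. The arrays and names (`p0`, `p1`, `y0`, `y1`) are invented for illustration, and no Keras code is run.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_pred)) along the last axis."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # guard against log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=-1)))

# Hypothetical predictions of the two softmax heads (batch of 2, 3 classes each).
p0 = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
p1 = np.array([[0.3, 0.3, 0.4], [0.6, 0.3, 0.1]])
# Hypothetical one-hot targets for each head.
y0 = np.array([[1, 0, 0], [0, 1, 0]])
y1 = np.array([[0, 0, 1], [1, 0, 0]])

loss0 = categorical_crossentropy(y0, p0)
loss1 = categorical_crossentropy(y1, p1)
total = 1.0 * loss0 + 1.0 * loss1  # assumes the default loss weights of 1

print(loss0, loss1, total)
```

With a real two-output model, `model.evaluate` typically reports the total loss alongside the per-output losses (the exact layout may vary by Keras version), so the total can be compared against the sum of the parts in the same way.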

【Comments】:

【Solution 2】:

Define a Lambda layer and use the softmax function from the backend with the desired axis to compute the softmax over that axis:

from keras import backend as K
from keras.layers import Lambda

soft_out = Lambda(lambda x: K.softmax(x, axis=my_desired_axis))(input_tensor)

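Update: An N-dimensional numpy array has a shape of (d1, d2, d3, ..., dn). Each of these dimensions is called an axis. So the first axis (i.e. axis=0) has size d1, the second axis (i.e. axis=1) has size d2, and so on. The most common case is a two-dimensional array, or matrix, with shape (m, n), i.e. m rows (axis=0) and n columns (axis=1). When we specify the axis for an operation, it means the operation is computed along that axis. Let me make this clearer with an example: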

>>> import numpy as np
>>> a = np.arange(12).reshape(3,4)
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

>>> a.shape
(3, 4)   # three rows and four columns

>>> np.sum(a, axis=0)  # compute the sum over the rows (i.e. for each column)
array([12, 15, 18, 21])

>>> np.sum(a, axis=1)  # compute the sum over the columns (i.e. for each row)
array([ 6, 22, 38])

>>> np.sum(a, axis=-1) # axis=-1 is equivalent to the last axis (i.e. columns)
array([ 6, 22, 38])

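Now, the same goes for computing the softmax function in your example. You must first determine which axis you want to compute the softmax over, and then specify it with the axis argument. Also note that softmax is applied on the last axis by default (i.e. axis=-1), so if you want to compute it over the last axis you don't need the Lambda layer above. Just use the Activation layer instead: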

from keras.layers import Activation

soft_out = Activation('softmax')(input_tensor)
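
To see what the choice of axis does to the question's example, here is a minimal NumPy sketch. The `softmax` helper is written here for illustration and is not the Keras implementation. It applies the same formula along a chosen axis of the (U=2, C=3) array from the question.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = np.asarray(x, dtype=np.float64)
    shifted = x - np.max(x, axis=axis, keepdims=True)  # avoid overflow in exp
    e = np.exp(shifted)
    return e / np.sum(e, axis=axis, keepdims=True)

# The example from the question, shape (2, 3).
layer_out = np.array([[1.0, 2.0, 3.0],
                      [10.0, 20.0, 30.0]])

row_wise = softmax(layer_out, axis=-1)  # softmax(1, 2, 3) and softmax(10, 20, 30)
col_wise = softmax(layer_out, axis=0)   # softmax(1, 10), softmax(2, 20), softmax(3, 30)

print(row_wise.sum(axis=-1))  # each row sums to 1
print(col_wise.sum(axis=0))   # each column sums to 1
```

For the array as written in the question, the requested softmax(1, 2, 3) and softmax(10, 20, 30) correspond to axis=-1, which is the default.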

Update 2: There is also another way of doing this, using the Softmax layer:

from keras.layers import Softmax

soft_out = Softmax(axis=desired_axis)(input_tensor)

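【Comments】:

@SRobertJames I have updated my answer. Please take a look.
Very helpful; Q updated accordingly. Could you address the last remaining point: how do we make sure the cross-entropy loss is computed independently for each entry along axis -1, and that the loss being optimized is the sum of the two cross-entropies?
@SRobertJames It depends on what the loss function is. Are you using a custom loss function or the built-in categorical_crossentropy function?
The built-in categorical_crossentropy.
@SRobertJames categorical_crossentropy is applied on the last axis by default (i.e. axis=-1). As for the error, please edit your question and include the code and the error you get.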
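
As a rough illustration of the last comment, the NumPy sketch below computes a cross-entropy along the last axis of a single (batch, U, C) output, giving one loss value per row. The data is made up. As far as I know, Keras then reduces these per-row values with a mean rather than a sum, which is treated as an assumption here.

```python
import numpy as np

# Hypothetical batch of 1 sample with U=2 rows and C=3 classes per row.
y_pred = np.array([[[0.7, 0.2, 0.1],
                    [0.1, 0.1, 0.8]]])
y_true = np.array([[[1, 0, 0],
                    [0, 0, 1]]])

# Cross-entropy along the last axis: one value per (sample, row).
per_row = -np.sum(y_true * np.log(y_pred), axis=-1)  # shape (1, 2)

summed = per_row.sum(axis=-1)     # sum of the two cross-entropies per sample
averaged = per_row.mean(axis=-1)  # what a mean reduction would give instead

print(per_row.shape, summed, averaged)
```

The mean differs from the sum only by the constant factor U, so both are minimized by the same weights; only the scale of the gradient changes.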