Is the TensorFlow tf.matmul example incorrect? Answers

【Question title】: Is the TensorFlow tf.matmul example incorrect?
【Posted】: 2017-02-23 15:26:32
【Question】:

I read the official documentation for tf.matmul, and I understand the first example. It is a simple [2,3] x [3,2] operation:

import tensorflow as tf

a = tf.constant([1, 2, 3, 4, 5, 6], shape=[2, 3])

b = tf.constant([7, 8, 9, 10, 11, 12], shape=[3, 2])

c = tf.matmul(a, b)  # => [[ 58  64]
                     #     [139 154]]

However, the second example seems strange:

import numpy as np

a = tf.constant(np.arange(1, 13, dtype=np.int32),
                shape=[2, 2, 3])

b = tf.constant(np.arange(13, 25, dtype=np.int32),
                shape=[2, 3, 2])

c = tf.matmul(a, b)  # => [[[ 94 100]
                     #      [229 244]],
                     #     [[508 532]
                     #      [697 730]]]

Why can a tensor of shape [2,2,3] be multiplied by one of shape [2,3,2]?

【Comments】:

Tags: python matrix tensorflow

【Answer 1】:

From the same page (https://web.archive.org/web/20170223153510/https://www.tensorflow.org/api_docs/python/tf/matmul):

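Returns: a Tensor of the same type as a and b, where each innermost matrix is the product of the corresponding matrices in a and b, e.g. if all transpose or adjoint attributes are False:
output[..., i, j] = sum_k (a[..., i, k] * b[..., k, j]), for all indices i, j.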

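So a tensor of shape [2,2,3] can be multiplied by one of shape [2,3,2]: the leading dimension of size 2 is a batch dimension, and each of the two [2,3] matrices in a is multiplied by the corresponding [3,2] matrix in b, giving a result of shape [2,2,2].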
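
To make the formula concrete, here is a minimal pure-Python sketch of the same rule applied to nested lists. The helper name batched_matmul is hypothetical and is not part of TensorFlow; it assumes both inputs have identical leading (batch) dimensions and does no broadcasting. It only illustrates the quoted formula and is not how tf.matmul is implemented.

```python
def batched_matmul(a, b):
    """Multiply matching innermost matrices of two nested lists.

    Implements output[..., i, j] = sum_k a[..., i, k] * b[..., k, j]
    for inputs whose leading (batch) dimensions are identical.
    This is a hypothetical illustration, not a TensorFlow function.
    """
    # Base case: a and b are plain 2-D matrices (lists of rows of numbers).
    if not isinstance(a[0][0], list):
        inner = len(b)  # shared dimension k
        return [
            [sum(row[k] * b[k][j] for k in range(inner)) for j in range(len(b[0]))]
            for row in a
        ]
    # Recursive case: peel off one leading batch dimension and pair entries up.
    return [batched_matmul(sub_a, sub_b) for sub_a, sub_b in zip(a, b)]


# Same values as the documentation example: shapes [2, 2, 3] and [2, 3, 2].
a = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]
b = [[[13, 14], [15, 16], [17, 18]], [[19, 20], [21, 22], [23, 24]]]

c = batched_matmul(a, b)
print(c)  # [[[94, 100], [229, 244]], [[508, 532], [697, 730]]]
```

The result matches the output quoted in the question: a[0] is multiplied by b[0] and a[1] by b[1], and the two 2x2 products are stacked.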

【Discussion】:

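Thanks for pointing that out. I'm not good at matrix math, and I didn't understand what "innermost matrix" meant in the documentation above...
@rainfer If a tensor has shape [a,b,c], its innermost matrices have shape [b,c].
I see. I just tested with tensors of shape [2,3,4,5] and [2,3,5,4], and it works. But does that mean tf.matmul differs from NumPy's matrix multiplication?
@rainfer It looks similar: web.archive.org/web/20170223160336/https://docs.scipy.org/doc/… "If either argument is N-D, N > 2, it is treated as a stack of matrices residing in the last two indexes and broadcast accordingly." (I'm not sure whether TensorFlow broadcasts in this case.)
Sorry, I was confusing it with NumPy's ordinary a * b operation. Thanks a lot.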
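
The distinction raised in the comments can be checked directly in NumPy. The sketch below assumes the 4-D shapes mentioned above and fills the arrays with arbitrary np.arange values chosen only for illustration; it covers NumPy only and makes no claim about TensorFlow's broadcasting behavior.

```python
import numpy as np

# Shapes taken from the comment above: [2, 3, 4, 5] and [2, 3, 5, 4].
a = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)
b = np.arange(2 * 3 * 5 * 4).reshape(2, 3, 5, 4)

# np.matmul treats the last two axes as matrices and the rest as a stack.
stacked = np.matmul(a, b)
print(stacked.shape)  # (2, 3, 4, 4)

# Each output matrix is the ordinary 2-D product of the matching input pair.
same_as_2d = np.array_equal(stacked[1, 2], a[1, 2].dot(b[1, 2]))
print(same_as_2d)  # True

# a * b is element-wise, so these shapes are rejected: the trailing axes
# (4, 5) and (5, 4) cannot be broadcast against each other.
try:
    a * b
    elementwise_failed = False
except ValueError:
    elementwise_failed = True
print(elementwise_failed)  # True
```

In other words, np.matmul behaves like the stacked matrix product described in the answer, while a * b is a different, element-wise operation.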