Theano inner product of a 3D matrix

【Question title】: Theano inner product 3d matrix
【Posted】: 2015-01-05 09:52:51
【Question】:

Thanks for reading this.

I am trying to implement multi-label logistic regression with Theano:

import numpy
import theano
import theano.tensor as T
rng = numpy.random

examples = 5
features = 10
labels = 2
D = (rng.randn(examples, labels, features), rng.randint(size=(labels, examples), low=0, high=2))
training_steps = 10000

# Declare Theano symbolic variables
x = T.matrix("x")
y = T.vector("y")
w = theano.shared(rng.randn(1 , labels ,features), name="w")
b = theano.shared(0., name="b")
print "Initial model:"
print w.get_value(), b.get_value()

# Construct Theano expression graph
p_1 = 1 / (1 + T.exp(-T.dot(x, w) - b))   # Probability that target = 1
prediction = p_1 > 0.5                    # The prediction thresholded
xent = -y * T.log(p_1) - (1-y) * T.log(1-p_1) # Cross-entropy loss function
cost = xent.mean() + 0.01 * (w ** 2).sum()# The cost to minimize
gw, gb = T.grad(cost, [w, b])             # Compute the gradient of the cost
                                          # (we shall return to this in a
                                          # following section of this tutorial)

# Compile
train = theano.function(
          inputs=[x,y],
          outputs=[prediction, xent],
          updates=((w, w - 0.1 * gw), (b, b - 0.1 * gb)),
          name='train')
predict = theano.function(inputs=[x], outputs=prediction , name='predict')

# Train
for i in range(training_steps):
    pred, err = train(D[0], D[1])

print "Final model:"
print w.get_value(), b.get_value()
print "target values for D:", D[1]
print "prediction on D:", predict(D[0])

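However, the -T.dot(x, w) product fails with the following error:

TypeError: ('Bad input argument to theano function with name "train" at index 0(0-based)', 'Wrong number of dimensions: expected 2, got 3 with shape (5, 10, 2).')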

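x has shape (5, 2, 10) and w has shape (1, 2, 10). I want the dot product to have shape (5, 2).

My questions are: is there any way to compute this inner product? And do you think there is a better way to implement multi-label logistic regression?

Thanks!

---- Edit -----

So here is an implementation in numpy of what I want to do.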

x = rng.randn(examples,labels,features)
w = rng.randn (labels,features)
dot = numpy.zeros((examples,labels))
for example in range(examples):
    for label in range(labels):
        dot[example,label] = x[example,label,:].dot(w[label,:])
print dot

Output:

[[-1.70321498  2.51088139]
 [-5.73608956  0.1066286 ]
 [ 2.31334531  3.31892284]
 [ 1.56301872 -0.56150922]
 [-1.98815855 -2.98866706]]

But I don't know how to do this symbolically with Theano.
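The double loop in the edit computes, for every (example, label) pair, an inner product over the feature axis. As a sketch in plain NumPy (not Theano), the same array can be produced without loops, either with `numpy.einsum` or with an elementwise product summed over the last axis. The helper name `per_label_dot` is made up for illustration.

```python
import numpy as np

def per_label_dot(x, w):
    """x: (examples, labels, features), w: (labels, features) -> (examples, labels)."""
    # Sum over the shared feature axis f, keeping example e and label l.
    return np.einsum("elf,lf->el", x, w)

rng = np.random.RandomState(0)
examples, features, labels = 5, 10, 2
x = rng.randn(examples, labels, features)
w = rng.randn(labels, features)

# Reference: the explicit double loop from the question.
loop = np.zeros((examples, labels))
for e in range(examples):
    for l in range(labels):
        loop[e, l] = x[e, l, :].dot(w[l, :])

vectorized = per_label_dot(x, w)
broadcast = (x * w).sum(axis=2)  # w broadcasts across the examples axis
print(vectorized.shape)
```

The broadcasting form uses only elementwise multiplication and a sum, so it may carry over to a symbolic expression on a 3D Theano tensor, but that is an assumption and is not tested here.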

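【Comments】:

Shouldn't a dot product return a single number? en.wikipedia.org/wiki/Dot_product Maybe you need T.tensordot?
I guess it is a generalization of the dot product. In Theano's single-label logistic regression example, they multiply train_x (examples, features) by the weights (features) and get a dot product of shape (examples): (5, 10) dot (10,) = (5,). After the remaining elementwise operations that make up the sigmoid function, you end up with a prediction vector of shape (examples,), which is one prediction per example. But since I want to do multi-label, I want a matrix of size (examples, number of labels) as the prediction.
That looks like a standard matrix x vector product to me.
Right, but instead of matrix x vector it is matrix3d x matrix2d.
Last time I checked, multi-label prediction was done by training each label separately and then seeing which predictor gives the best confidence. Predicting multiple labels at once will be less reliable (or you will need more data).

Tags: python linear-algebra logistic-regression theano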

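【Solution 1】:

After a few hours of struggling, this seems to produce the correct results:

I had a bug: the input was rng.randn(examples,features,labels) instead of rng.randn(examples,features). That is, the input should have the same size as in the single-label case; only the number of labels grows.

The right way to compute the dot product is with theano.scan, e.g.: results, updates = theano.scan(lambda label: T.dot(x, w[label,:]) - b[label], sequences=T.arange(labels))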
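Each scan step is a matrix-vector product against one row of w, so the per-label results stack into a single matrix product. The NumPy sketch below (not Theano; `scan_like` is a hypothetical helper that imitates the loop) checks that equivalence and shows that the result has shape (labels, examples), the same layout as the targets.

```python
import numpy as np

rng = np.random.RandomState(0)
examples, features, labels = 5, 10, 2
x = rng.randn(examples, features)
w = rng.randn(labels, features)
b = rng.randn(labels)

def scan_like(x, w, b):
    # One step per label, mirroring the lambda passed to theano.scan.
    return np.stack([x.dot(w[label, :]) - b[label] for label in range(w.shape[0])])

looped = scan_like(x, w, b)            # shape (labels, examples)
stacked = w.dot(x.T) - b[:, None]      # one matrix product, bias broadcast per label
print(looped.shape)
```

If the same identity is used symbolically, a single dot product of w with the transposed input, minus a bias broadcast along the examples axis, might replace the scan. That is an assumption; the answer itself only reports results for the scan version.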

Thanks, everyone, for your help!

import numpy as np
import theano
import theano.tensor as T
rng = np.random

examples = 5
features = 10
labels = 2
D = (rng.randn(examples,features), rng.randint(size=(labels, examples), low=0, high=2))
training_steps = 10000

# Declare Theano symbolic variables
x = T.matrix("x")
y = T.matrix("y")
w = theano.shared(rng.randn(labels ,features), name="w")
b = theano.shared(np.zeros(labels), name="b")
print "Initial model:"
print w.get_value(), b.get_value()

results, updates = theano.scan(lambda label: T.dot(x, w[label,:]) - b[label], sequences=T.arange(labels))

# Construct Theano expression graph
p_1 = 1 / (1 + T.exp(- results))   # Probability that target = 1
prediction = p_1 > .5                         # The prediction thresholded
xent = -y * T.log(p_1) - (1-y) * T.log(1-p_1) # Cross-entropy loss function
cost = xent.mean() + 0.01 * (w ** 2).sum()# The cost to minimize
gw, gb = T.grad(cost, [w, b])             # Compute the gradient of the cost
                                          # (we shall return to this in a
                                          # following section of this tutorial)

# Compile
train = theano.function(
          inputs=[x,y],
          outputs=[prediction, xent],
          updates=((w, w - 0.1 * gw), (b, b - 0.1 * gb)),
          name='train')
predict = theano.function(inputs=[x], outputs=prediction , name='predict')

# Train
for i in range(training_steps):
    pred, err = train(D[0], D[1])

print "Final model:"
print w.get_value(), b.get_value()
print "target values for D:", D[1]
print "prediction on D:", predict(D[0])
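For readers without a working Theano install, here is a minimal NumPy-only sketch of the same model, with the gradients written out by hand instead of obtained from T.grad. It keeps the answer's sign convention (logit = w.x - b), the 0.01 L2 penalty, and the 0.1 learning rate; the function names `sigmoid` and `cost_and_grads`, the fixed seed, and the shorter training loop are choices made for this illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_and_grads(x, y, w, b, l2=0.01):
    """x: (examples, features), y: (labels, examples),
    w: (labels, features), b: (labels,)."""
    z = w.dot(x.T) - b[:, None]           # same sign convention as the scan step
    p = sigmoid(z)                        # probability that target = 1
    xent = -y * np.log(p) - (1 - y) * np.log(1 - p)
    cost = xent.mean() + l2 * (w ** 2).sum()
    gz = (p - y) / y.size                 # d(mean cross-entropy) / dz
    gw = gz.dot(x) + 2 * l2 * w           # chain rule through z, plus L2 term
    gb = -gz.sum(axis=1)                  # minus sign because z = w.x - b
    return cost, gw, gb

rng = np.random.RandomState(0)
examples, features, labels = 5, 10, 2
x = rng.randn(examples, features)
y = rng.randint(0, 2, size=(labels, examples)).astype(float)
w = rng.randn(labels, features)
b = np.zeros(labels)

initial_cost, _, _ = cost_and_grads(x, y, w, b)
for _ in range(1000):
    cost, gw, gb = cost_and_grads(x, y, w, b)
    w -= 0.1 * gw
    b -= 0.1 * gb
final_cost, _, _ = cost_and_grads(x, y, w, b)

prediction = sigmoid(w.dot(x.T) - b[:, None]) > 0.5   # shape (labels, examples)
print(initial_cost, final_cost)
```

Because every label has its own weight row and bias and the loss is a plain mean, this amounts to training one independent logistic regression per label, coupled only through the shared learning rate and penalty.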
