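【Posted】: 2020-02-10 10:01:11
【Question】:
I am trying to build an XOR neural network with one hidden layer in Python, but I am running into a dimension problem. I can't figure out why I get the wrong dimensions in the first place, since the math looks correct to me.
The dimension problem starts in the backpropagation section and is marked with a comment. Specifically, the error is: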
  File "nn.py", line 52, in <module>
    d_a1_d_W1 = inp * deriv_sigmoid(z1)
  File "/usr/local/lib/python3.7/site-packages/numpy/matrixlib/defmatrix.py", line 220, in __mul__
    return N.dot(self, asmatrix(other))
ValueError: shapes (1,2) and (3,1) not aligned: 2 (dim 1) != 3 (dim 0)
Also, why does the sigmoid_derivative function here only work when I cast to a numpy array?
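A likely explanation for the cast, sketched under the assumption that the value passed in is an `np.matrix` of shape (3, 1), as `z1` is in the question's code: for `np.matrix`, `*` means matrix multiplication, so `fx * (1 - fx)` attempts a (3, 1) by (3, 1) product and fails, whereas on an `ndarray` the same expression is elementwise. The variable names below are illustrative and not from the original post.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

z_matrix = np.matrix([[0.1], [0.2], [0.3]])  # shape (3, 1), np.matrix
fx = sigmoid(z_matrix)                        # the result is still an np.matrix

# For np.matrix, * is a matrix product: (3, 1) times (3, 1) is not defined.
try:
    fx * (1 - fx)
    matrix_product_failed = False
except ValueError:
    matrix_product_failed = True

# np.multiply is elementwise regardless of the operand type.
deriv_elementwise = np.multiply(fx, 1 - fx)

# Casting to ndarray makes * elementwise as well.
fx_arr = np.asarray(fx)
deriv_from_array = fx_arr * (1 - fx_arr)

print(matrix_product_failed)
print(deriv_elementwise.shape, deriv_from_array.shape)
```

Either `np.multiply` or the cast to `ndarray` gives the elementwise derivative; the cast is only needed because the inputs are `np.matrix` objects.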
Code:
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def deriv_sigmoid(x):
    fx = np.array(sigmoid(x))  # gives dimensions issues unless I cast to array
    return fx * (1 - fx)

hiddenNeurons = 3
outputNeurons = 1
inputNeurons = 2
X = np.array( [ [0, 1] ])
elem = np.matrix(X[0])
elem_row, elem_col = elem.shape
y = np.matrix([1])
W1 = np.random.rand(hiddenNeurons, elem_col)
b1 = np.random.rand(hiddenNeurons, 1)
W2 = np.random.rand(outputNeurons, hiddenNeurons)
b2 = np.random.rand(outputNeurons, 1)
lr = .01
for inp, ytrue in zip(X, y):
    inp = np.matrix(inp)
    # feedforward
    z1 = W1 * inp.T + b1  # get weight matrix1 * inputs + bias1
    a1 = sigmoid(z1)      # get activation of hidden layer
    z2 = W2 * a1 + b2     # get weight matrix2 * activated hidden + bias 2
    a2 = sigmoid(z2)      # get activated output
    ypred = a2            # and call it ypred (y prediction)
    # backprop
    d_L_d_ypred = -2 * (ytrue - ypred)     # derivative of mean squared error loss
    d_ypred_d_W2 = a1 * deriv_sigmoid(z2)  # derivative of y prediction with respect to weight matrix 2
    d_ypred_d_b2 = deriv_sigmoid(z2)       # derivative of y prediction with respect to bias 2
    d_ypred_d_a1 = W2 * deriv_sigmoid(z2)  # derivative of y prediction with respect to hidden activation
    d_a1_d_W1 = inp * deriv_sigmoid(z1)    # dimensions issue starts here ----------------
    d_a1_d_b1 = deriv_sigmoid(b1)
    W1 -= lr * d_L_d_ypred * d_ypred_d_a1 * d_a1_d_W1
    b1 -= lr * d_L_d_ypred * d_ypred_d_a1 * d_a1_d_b1
    W2 -= lr * d_L_d_ypred * d_ypred_d_W2
    b2 -= lr * d_L_d_ypred * d_ypred_d_b2
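For comparison, here is one way the backward pass could be made shape-consistent, sketched with plain ndarrays and a column-vector input. This is not the asker's code: the names `delta1`, `delta2`, `grad_*` and the `forward` helper are illustrative, and the fixed random seed is an assumption made only for reproducibility. The idea is to carry one "delta" (the loss gradient with respect to the pre-activation) per layer, so that every weight gradient is an outer product with the same shape as the weight matrix it updates.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def deriv_sigmoid(x):
    fx = sigmoid(x)
    return fx * (1 - fx)          # elementwise, since everything is an ndarray

def forward(x, W1, b1, W2, b2):
    """Return pre-activations and activations of both layers."""
    z1 = W1 @ x + b1              # (3, 2) @ (2, 1) + (3, 1) -> (3, 1)
    a1 = sigmoid(z1)
    z2 = W2 @ a1 + b2             # (1, 3) @ (3, 1) + (1, 1) -> (1, 1)
    a2 = sigmoid(z2)
    return z1, a1, z2, a2

rng = np.random.default_rng(0)    # fixed seed: an assumption, for reproducibility
W1, b1 = rng.random((3, 2)), rng.random((3, 1))
W2, b2 = rng.random((1, 3)), rng.random((1, 1))
lr = 0.01

x = np.array([[0.0], [1.0]])      # one XOR input as a column vector, shape (2, 1)
ytrue = np.array([[1.0]])

z1, a1, z2, ypred = forward(x, W1, b1, W2, b2)
loss_before = ((ytrue - ypred) ** 2).item()

# backprop via the chain rule, one delta per layer
delta2 = -2 * (ytrue - ypred) * deriv_sigmoid(z2)   # dL/dz2, shape (1, 1)
grad_W2 = delta2 @ a1.T                              # (1, 1) @ (1, 3) -> (1, 3)
grad_b2 = delta2
delta1 = (W2.T @ delta2) * deriv_sigmoid(z1)        # dL/dz1, shape (3, 1)
grad_W1 = delta1 @ x.T                               # (3, 1) @ (1, 2) -> (3, 2)
grad_b1 = delta1

W1 -= lr * grad_W1
b1 -= lr * grad_b1
W2 -= lr * grad_W2
b2 -= lr * grad_b2

loss_after = ((ytrue - forward(x, W1, b1, W2, b2)[3]) ** 2).item()
print(loss_before, loss_after)
```

In this sketch the bias gradients are the deltas themselves rather than `deriv_sigmoid(b1)`, which appears to be a slip in the question's code, and the input enters the `W1` gradient as `x.T` on the right so that the product has shape (3, 2).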
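【Comments】:
-
Is it strictly necessary to use numpy matrices? That may not be the only cause of the problem, but the general consensus seems to be that ndarray is the better choice. The docs state: "It is no longer recommended to use this class, even for linear algebra. Instead use regular arrays. The class may be removed in the future."
-
Thanks. I have actually already tried replacing everything with np.array, but I still get the same error.
-
Okay, I'll try to take a look at the code :) I don't know much about neural networks though, so no guarantees!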
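To illustrate the first comment's suggestion, here is a small sketch with plain ndarrays. The shapes are taken from the question; the variable names and the all-ones weights are illustrative. With ndarrays, `@` is the matrix product and `*` is elementwise with broadcasting, so swapping `np.matrix` for `np.array` without also changing the operators changes what each `*` computes.

```python
import numpy as np

W1 = np.ones((3, 2))           # hidden-layer weights, shape from the question
inp = np.array([[0.0, 1.0]])   # one input row, shape (1, 2)

z1 = W1 @ inp.T                # matrix product: (3, 2) @ (2, 1) -> (3, 1)

# Elementwise * broadcasts instead of raising: (1, 2) * (3, 1) -> (3, 2)
broadcast = inp * z1

# An existing np.matrix can be converted to a plain ndarray with np.asarray
as_array = np.asarray(np.matrix([[1, 2]]))

print(z1.shape, broadcast.shape, type(as_array).__name__)
```

Because broadcasting succeeds silently where a matrix product would raise, checking `.shape` after each step is a cheap way to catch a misplaced `*`.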
Tags: python numpy neural-network xor sigmoid