Neural Networks Using Python and NumPy: Answers

【Question title】: Neural Networks Using Python and NumPy
【Posted】: 2020-01-18 19:29:25
【Question】:

I am new to neural networks and am trying to implement one in Python/NumPy, starting from code I found in the article "Creating a Simple Neural Network from Scratch in Python".

My input array is:

array([[5.71, 5.77, 5.94],
       [5.77, 5.94, 5.51],
       [5.94, 5.51, 5.88],
       [5.51, 5.88, 5.73]])

The output array is:

array([[5.51],
       [5.88],
       [5.73],
       [6.41]])

After running the code, I get the following incorrect results:

synaptic_weights after training
[[1.90625275]
[2.54867698]
[1.07698312]]
outputs after training
[[1.]
[1.]
[1.]
[1.]]

Here is the core code:

for iteration in range(1000):
    input_layer = tr_input
    outputs = sigmoid(np.dot(input_layer, synapic_weights))

    error = tr_output - outputs

    adjustmnets = error * sigmoid_derivative(outputs)

    synapic_weights +=np.dot(input_layer.T, adjustmnets )

print('synaptic_weights after training')  
print(synapic_weights)

print('outputs after training')  
print(outputs)

What should I change in this code to make it work with my data? Or should I take a different approach? Any help is greatly appreciated.

【Comments】:

Take a look at pandas and Keras.

Tags: python numpy neural-network

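【Answer 1】:

These are the steps involved in my neural network implementation.

Randomly initialize the weights (θ, theta)
Implement forward propagation
Compute the cost function
Implement backpropagation to compute the partial derivatives
Use gradient descent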
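The code in this answer calls sigmoid and sigmoid_gradient and expects a flat params vector, but none of these are shown. The following is a minimal sketch of the first step (random initialization) together with assumed helper definitions. The name init_params, the range epsilon=0.12, and the fixed seed are illustrative choices, not part of the answer. Note also that back_prop passes activations to sigmoid_gradient, so the author's own helper may be defined differently from the one sketched here.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_gradient(z):
    """Derivative of the logistic function with respect to z (assumed definition)."""
    s = sigmoid(z)
    return s * (1.0 - s)

def init_params(input_size, hidden_layers, num_labels, epsilon=0.12, seed=0):
    """Hypothetical helper: one flat weight vector drawn uniformly from [-epsilon, epsilon]."""
    rng = np.random.default_rng(seed)
    sizes = [input_size] + list(hidden_layers) + [num_labels]
    # Each layer needs units_out x (units_in + 1) weights; the +1 is the bias column.
    n_params = sum(n_out * (n_in + 1) for n_in, n_out in zip(sizes[:-1], sizes[1:]))
    return rng.uniform(-epsilon, epsilon, size=n_params)

# 3 inputs, one hidden layer of 4 units, 1 output: 4*(3+1) + 1*(4+1) = 21 weights.
params = init_params(input_size=3, hidden_layers=[4], num_labels=1)
print(params.shape)  # (21,)
```

The flat layout matches the way back_prop slices params into one matrix per layer.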

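import numpy as np

def forward_prop(X, theta_list):
    # X and every theta are expected to be np.matrix objects (back_prop converts them),
    # so `*` below is a matrix product, not an element-wise one.
    m = X.shape[0]
    a_list = []  # activations per layer; all but the output carry a leading bias column
    z_list = []  # pre-activation values per layer

    # Input layer: prepend a column of ones for the bias term.
    a_list.append(np.insert(X, 0, values=np.ones(m), axis=1))

    for idx, theta in enumerate(theta_list):
        z_list.append(a_list[idx] * theta.T)
        if idx != (len(theta_list) - 1):
            # Hidden layer: apply sigmoid, then prepend the bias column.
            a_list.append(np.insert(sigmoid(z_list[idx]), 0, values=np.ones(m), axis=1))
        else:
            # Output layer: no bias column.
            a_list.append(sigmoid(z_list[idx]))

    return a_list, z_list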

def back_prop(params, input_size, hidden_layers, num_labels, X, y, regularization, regularize):

    m = X.shape[0]
    X = np.matrix(X)
    y = np.matrix(y)
    
    theta_list = []
    startCount = 0
    idx = 0
    for idx, val in enumerate(hidden_layers):
        if idx == 0:
            startCount = val * (input_size + 1)
            theta_list.append(np.matrix(np.reshape(params[:startCount], (val, (input_size + 1)))))
        if idx != 0:
            tempCount = startCount
            startCount += (val * (hidden_layers[idx-1] + 1))
            theta_list.append(np.matrix(np.reshape(params[tempCount:startCount], (val, (hidden_layers[idx-1] + 1)))))
        if idx == (len(hidden_layers)-1):
            theta_list.append(np.matrix(np.reshape(params[startCount:], (num_labels, (val + 1)))))


    a_list, z_list= forward_prop(X, theta_list)
    J = cost(X, y, a_list[len(a_list)-1], theta_list, regularization, regularize)
    
    d_list = []
    d_list.append(a_list[len(a_list)-1] - y)
    
    idx = 0
    while idx < (len(theta_list)-1):
        d_temp = np.multiply(d_list[idx] * theta_list[len(a_list) - 2 - idx], sigmoid_gradient(a_list[len(a_list) - 2 - idx]))
        d_list.append(d_temp[:,1:])
        idx += 1    
    
    delta_list = []
    for theta in theta_list:
        delta_list.append(np.zeros(theta.shape))

    for idx, delta in enumerate(delta_list):
        delta_list[idx] = delta_list[idx] + ((d_list[len(d_list) - 1 -idx].T) * a_list[idx])
        delta_list[idx] = delta_list[idx] / m
   
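    if regularize:
        # The penalty in cost() is (regularization / (2 * m)) * sum(theta ** 2),
        # so its gradient is (regularization / m) * theta; bias columns are excluded.
        for idx, delta in enumerate(delta_list):
            delta_list[idx][:, 1:] = delta_list[idx][:, 1:] + (theta_list[idx][:, 1:] * regularization) / m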

    grad_list = np.ravel(delta_list[0])
    idx = 1
    while idx < (len(delta_list)):
        grad_list = np.concatenate((grad_list, np.ravel(delta_list[idx])), axis=None)
        idx += 1

    return J, grad_list

def cost(X, y, h, theta_list, regularization, regularize):

    m = X.shape[0]
    X = np.matrix(X)
    y = np.matrix(y)

    J = (np.multiply(-y, np.log(h)) - np.multiply((1 - y), np.log(1 - h))).sum() / m
        
    if regularize:
        regularization_value = 0.0
        for theta in theta_list:
            regularization_value += np.sum(np.power(theta[:, 1:], 2))
        J += (float(regularization) / (2 * m)) * regularization_value
        

    return J
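The last step in the list, gradient descent, is not shown in the answer. Here is a minimal sketch of a plain gradient descent loop that accepts any function returning (cost, gradient) for a flat parameter vector, which is the shape of back_prop's return value. The names gradient_descent and toy_cost_and_grad, the learning rate, and the iteration count are assumptions for illustration. A simple quadratic stands in for the network so the block runs on its own.

```python
import numpy as np

def gradient_descent(cost_and_grad, params, learning_rate=0.1, n_iters=500):
    """Minimize a function that returns (cost, gradient) for a flat parameter vector."""
    params = np.asarray(params, dtype=float).copy()
    history = []
    for _ in range(n_iters):
        J, grad = cost_and_grad(params)
        history.append(J)
        # Step against the gradient.
        params -= learning_rate * np.asarray(grad, dtype=float)
    return params, history

# Stand-in objective with the same (J, grad) return shape as back_prop:
# J(p) = sum((p - target)^2), whose gradient is 2 * (p - target).
target = np.array([1.0, -2.0, 0.5])

def toy_cost_and_grad(p):
    diff = p - target
    return float(np.sum(diff ** 2)), 2.0 * diff

fitted, history = gradient_descent(toy_cost_and_grad, np.zeros(3))
print(fitted)                   # close to target
print(history[0], history[-1])  # the cost shrinks over the iterations
```

With the answer's functions, the callable would presumably be something like lambda p: back_prop(p, input_size, hidden_layers, num_labels, X, y, regularization, regularize).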


【Comments】:

【Answer 2】:

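That is because you are using the wrong activation function (sigmoid). The main reason to use sigmoid is that its output lies in the interval (0, 1), which makes it well suited to models that must predict a probability, since a probability always lies between 0 and 1. Your targets are around 5 to 6, which a sigmoid output can never reach, so training only pushes every output toward 1, as your results show.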
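To make the point concrete, the short check below feeds the question's data and its reported post-training weights through a sigmoid. The sigmoid definition is the standard logistic function and is assumed to match the one used in the question.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

tr_input = np.array([[5.71, 5.77, 5.94],
                     [5.77, 5.94, 5.51],
                     [5.94, 5.51, 5.88],
                     [5.51, 5.88, 5.73]])
tr_output = np.array([[5.51], [5.88], [5.73], [6.41]])

# Weights reported in the question after training.
weights = np.array([[1.90625275], [2.54867698], [1.07698312]])

outputs = sigmoid(tr_input @ weights)
print(outputs.ravel())              # all values are saturated at (almost exactly) 1
print((tr_output - outputs).min())  # the smallest error is still larger than 4.5
```

Because the sigmoid is bounded above by 1, no choice of weights can bring the error below about 4.5 on these targets.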

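If you want to train a model to predict the values in the array, you should use a regression model. Alternatively, you can convert the outputs into labels (for example, map 5.x to 0 and 6.x to 1) and retrain your model.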
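One possible reading of "use a regression model" is to keep the question's training loop but drop the sigmoid, so the single neuron has a linear (identity) output. The sketch below does that, assuming mean-centered inputs, an added bias term, and a learning rate of 0.5. None of these choices come from the answer.

```python
import numpy as np

tr_input = np.array([[5.71, 5.77, 5.94],
                     [5.77, 5.94, 5.51],
                     [5.94, 5.51, 5.88],
                     [5.51, 5.88, 5.73]])
tr_output = np.array([[5.51], [5.88], [5.73], [6.41]])

# Center the inputs so the bias and the weights can share one step size.
x_mean = tr_input.mean(axis=0)
X = tr_input - x_mean
m = X.shape[0]

weights = np.zeros((3, 1))
bias = 0.0
learning_rate = 0.5  # assumed value; small enough to be stable on this data

for iteration in range(20000):
    outputs = X @ weights + bias  # identity activation: no sigmoid
    error = tr_output - outputs
    # Gradient descent step on the mean squared error.
    weights += learning_rate * (X.T @ error) / m
    bias += learning_rate * error.mean()

outputs = X @ weights + bias
mse = float(np.mean((tr_output - outputs) ** 2))
print(outputs.ravel())  # predictions now lie in the range of the targets
print(mse)
```

With only four samples and four free parameters, a close fit on the training data says little about how well the model would predict new values.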

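【Comments】:

@arminrd Thank you for your time and your reply. You are right, I had overlooked that. I will try different regression models, but if you have a specific model in mind, please leave a comment. Thanks.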