Simple neural network - how to store weights? Answers

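【Question title】: Simple neural network - how to store weights?
【Posted】: 2022-04-04 16:54:50
【Question description】:

I recently started learning Python and am trying to implement my first neural network. My goal is to write a function that generates a neural network with a variable number of layers and nodes. All the necessary information is stored in layerStructure (for example, the first layer has four nodes and the third layer has three nodes).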

import numpy as np

#Vector of input layer
input = np.array([1,2,3,4])

#Amount of nodes in each layer
layerStructure = np.array([len(input),2,3])

#Generating empty weight matrix container
weightMatrix_arr = np.array([])

#Initialising random weight matrices
for ii in range(len(layerStructure[0:-1])):
    randmatrix = np.random.rand(layerStructure[ii+1],layerStructure[ii])
    print(randmatrix)

The code above produces the following output:

[[0.6067148  0.66445212 0.54061231 0.19334004]
 [0.22385007 0.8391435  0.73625366 0.86343394]]
[[0.61794333 0.9114799 ]
 [0.10626486 0.95307027]
 [0.50567023 0.57246852]]

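My first attempt was to store each random weight matrix in a container array called weightMatrix_arr. However, because the matrices have different shapes, I cannot use np.append() to store them all in that container. How can I save these matrices so that I can access them during backpropagation?

【Question discussion】:

Tags: python neural-network

【Solution 1】:

You can use a list instead of an np.array: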

#Generating empty weight LIST container
weightMatrixes = []

#Initialising random weight matrices
for ii in range(len(layerStructure[0:-1])):
    randmatrix = np.random.rand(layerStructure[ii+1],layerStructure[ii])
    weightMatrixes.append(randmatrix)
    print(randmatrix)

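Alternatively, you can create the container as a NumPy array with its dtype set to object:

#Generating empty weight ARRAY container (one slot per weight matrix)
weightMatrixes = np.empty(len(layerStructure) - 1, dtype=object)

#Initialising random weight matrices
for ii in range(len(layerStructure) - 1):
    randmatrix = np.random.rand(layerStructure[ii+1],layerStructure[ii])
    #Assign by index; np.append without an axis would flatten the matrices into single values
    weightMatrixes[ii] = randmatrix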

Note that you cannot index into a layer's matrix without first selecting the layer:

weightMatrixes[layer, 0, 3] # ERROR
weightMatrixes[layer][0, 3] # OK
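
As a rough sketch of how such a list might be used later, the forward pass below walks the list layer by layer and keeps every activation, so a backpropagation step could look up both `weights[k]` and `activations[k]` by layer index. The `sigmoid` activation, the `forward` helper, and the variable names are assumptions made for illustration; they are not part of the question.

```python
import numpy as np

def sigmoid(z):
    # Logistic activation, chosen here only for illustration
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights):
    # Keep every layer's activation so backpropagation can reuse them
    activations = [x]
    for W in weights:
        activations.append(sigmoid(W @ activations[-1]))
    return activations

rng = np.random.default_rng(0)
layer_structure = [4, 2, 3]

# One (n_out, n_in) matrix per pair of consecutive layers, stored in a plain list
weights = [rng.random((n_out, n_in))
           for n_in, n_out in zip(layer_structure[:-1], layer_structure[1:])]

activations = forward(np.array([1.0, 2.0, 3.0, 4.0]), weights)
print([a.shape for a in activations])  # [(4,), (2,), (3,)]
```

Because the list preserves order, the weights can be walked in reverse with `reversed(range(len(weights)))` when propagating errors backwards.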

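【Discussion】:

【Solution 2】:

If memory consumption is not a concern, you can size every layer's matrix to match the largest layer and ignore the surplus cells according to the values in layerStructure.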
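
One possible reading of this idea, sketched under assumptions: allocate a single 3-D array whose last two dimensions equal the largest layer size, fill only the top-left block of each slice, and slice that block back out when it is needed. The `padded` array and the `layer_weights` helper are hypothetical names used for this example.

```python
import numpy as np

layer_structure = [4, 2, 3]
n_layers = len(layer_structure) - 1
max_nodes = max(layer_structure)

rng = np.random.default_rng(0)

# One square slot per weight matrix; unused cells stay at zero
padded = np.zeros((n_layers, max_nodes, max_nodes))
for i in range(n_layers):
    rows, cols = layer_structure[i + 1], layer_structure[i]
    padded[i, :rows, :cols] = rng.random((rows, cols))

def layer_weights(i):
    # Return only the meaningful block of layer i as a view into padded
    rows, cols = layer_structure[i + 1], layer_structure[i]
    return padded[i, :rows, :cols]

print(padded.shape)            # (2, 4, 4)
print(layer_weights(1).shape)  # (3, 2)
```

The slice returned by `layer_weights` is a view, so in-place updates to it (for example `layer_weights(1)[...] -= step`) also change `padded`.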

【Discussion】:

【Solution 3】:

I used a Python dictionary to store the weights of each hidden layer, with the layer number as the key.
This makes the weights easy to retrieve and keeps the code simple and clean,
and it does not matter that the weight matrices have different shapes. Below is a snippet of code.


import numpy as np

def generate_weights(layers):
    #Map layer index -> weight matrix with values drawn uniformly from [-1, 1)
    weights = {}
    for i in range(1, len(layers)):
        weights[i-1] = 2*np.random.random((layers[i-1], layers[i])) - 1
    return weights

generate_weights([3,4,2])
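
A short usage sketch, under assumptions: the matrices in this answer have shape (inputs, outputs), the transpose of the convention in the question, so the input vector is multiplied from the left. The loop below is purely linear (no activation function) to keep the example small, and the all-ones input `x` is made up for illustration.

```python
import numpy as np

def generate_weights(layers):
    # Same idea as the answer: layer index -> (n_in, n_out) matrix in [-1, 1)
    return {i - 1: 2 * np.random.random((layers[i - 1], layers[i])) - 1
            for i in range(1, len(layers))}

weights = generate_weights([3, 4, 2])

x = np.ones(3)  # hypothetical input vector with 3 features
for key in sorted(weights):
    # (n_in,) @ (n_in, n_out) -> (n_out,)
    x = x @ weights[key]

print(x.shape)  # (2,)
```

Sorting the keys makes the layer order explicit rather than relying on insertion order.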

【Discussion】: