How to manually initialize the values for the weights? (Answers)

【Question title】: How to manually initialize the values for the weights?
【Posted】: 2016-04-14 11:29:36
【Question】:

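I would like to experiment with the weight initialization that Karpathy recommends in his lecture notes:

The recommended heuristic is to initialize each neuron's weight vector as: w = np.random.randn(n) / sqrt(n), where n is the number of its inputs

Source: http://cs231n.github.io/neural-networks-2/#init

I'm a beginner in Python, and I don't know how to implement this :/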

weights = tf.Variable(??)

Please help? ...

【Comments on the question】:

Tags: python-2.7 neural-network tensorflow

【Solution 1】:

For a single value, use:

weights = tf.Variable(10)

For a tensor of random values (here a 784 x 625 weight matrix):

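import tensorflow as tf
shape = [784, 625]
n = shape[0]  # number of inputs per neuron; the original snippet left n undefined
weights = tf.Variable(tf.random_normal(shape, stddev=0.01) / tf.sqrt(float(n)))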

Note that you need sess.run to evaluate the variable.

Also, take a look at the other random tensors: https://www.tensorflow.org/versions/r0.8/api_docs/python/constant_op.html#random-tensors
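
For comparison with the heuristic quoted in the question, here is a minimal NumPy-only sketch of w = np.random.randn(n) / sqrt(n) applied to a whole weight matrix. The helper name `fan_in_scaled_normal` and the seed are my own choices for illustration and are not part of NumPy or TensorFlow. Note that it draws from a unit normal (no stddev=0.01), as the heuristic describes.

```python
import numpy as np

def fan_in_scaled_normal(n_in, n_out, seed=None):
    """Draw an (n_in, n_out) matrix from N(0, 1) and scale it by 1/sqrt(n_in)."""
    rng = np.random.RandomState(seed)
    # Each column holds one neuron's weight vector over its n_in inputs.
    return rng.randn(n_in, n_out) / np.sqrt(n_in)

w = fan_in_scaled_normal(784, 625, seed=0)
print(w.shape)  # (784, 625)
print(w.std())  # close to 1 / sqrt(784), i.e. about 0.0357
```

The resulting array could then be passed to tf.Variable as the initial value, in the same way as any other NumPy array.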

【Discussion】:

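Thank you very much for your reply. I don't see where np.random.randn(n) is in the line of code you showed. I think I don't want to use tf.random_normal with a standard deviation parameter, but rather set each weight of the weight matrix manually using np.random.randn(n). Is that possible?
I would use tf.random.X. You can replace np.random.randn(n) with tf.random and then do the same thing. Please see tensorflow.org/versions/r0.8/api_docs/python/….
@Kalanit Just curious, how do you think randn differs from random_normal?
@AlexI In np.random.randn(n), n appears. In tf.random_normal(shape, stddev=0.01), n does not appear. Also, I have to decide on a value for the standard deviation. Am I wrong somewhere? (Again, I'm a beginner, so any explanation that helps me understand is welcome.)
@Kalanit: np.random.randn(*shape)*0.01 is the same as tf.random_normal(shape, stddev=0.01). randn produces numbers with stddev=1, but you can multiply to get any stddev you want. Other than that, they draw from exactly the same distribution.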
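
To make the last comment concrete, here is a small NumPy check that multiplying standard normal samples by a constant gives samples with that constant as their standard deviation. This is only a sketch; the shape and seed are arbitrary choices of mine.

```python
import numpy as np

rng = np.random.RandomState(0)
shape = (784, 625)

unit = rng.randn(*shape)  # standard normal: mean 0, stddev 1
scaled = unit * 0.01      # the same samples rescaled to stddev 0.01

print(unit.std())    # close to 1.0
print(scaled.std())  # close to 0.01
```

Dividing by sqrt(n) works the same way: it is just a multiplication by the constant 1/sqrt(n), so n ends up determining the standard deviation instead of a hand-picked value.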

【Solution 2】:

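import numpy as np
import tensorflow as tf

n = 10
init_x = np.random.randn(n)  # NumPy array used as the variable's initial value
x = tf.Variable(init_x)
sess = tf.InteractiveSession()
# TF 0.x API; later 1.x releases renamed this to tf.global_variables_initializer()
sess.run(tf.initialize_all_variables())
print(sess.run(x))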
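
One detail worth checking with this approach is the dtype: np.random.randn returns float64, and tf.Variable takes its dtype from the initial value. The sketch below (NumPy only, with the TensorFlow line left as a comment) applies the heuristic from the question and casts to float32, assuming the rest of the model uses float32.

```python
import numpy as np

n = 10
init_x = np.random.randn(n) / np.sqrt(n)  # heuristic from the question
print(init_x.dtype)                       # float64

init_x32 = init_x.astype(np.float32)      # cast to match float32 layers
print(init_x32.dtype)                     # float32
# x = tf.Variable(init_x32)  # would then create a float32 variable
```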

【Discussion】:

【Solution 3】:

This is how I did it:

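    # Assumption: `vs` is TensorFlow's variable_scope module and `structure`
    # is a list of fully connected layer sizes; neither is shown in this answer.
    self.w_full, self.b_full = [], []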

    n_fc_layers = len(structure)
    structure.insert(0, self.n_inputs)

    with vs.variable_scope(self.scope):
        for lr_idx in range(n_fc_layers):
            n_in, n_out = structure[lr_idx], structure[lr_idx+1]
            self.w_full.append(
                vs.get_variable(
                    "FullWeights{}".format(lr_idx),
                    [n_in, n_out],
                    dtype=tf.float32,
                    initializer=tf.random_uniform_initializer(
                        minval=-tf.sqrt(tf.constant(6.0)/(n_in + n_out)),
                        maxval=tf.sqrt(tf.constant(6.0)/(n_in + n_out))
                    )
                )
            )

            self.b_full.append(
                vs.get_variable(
                    "FullBiases{}".format(lr_idx),
                    [n_out],
                    dtype=tf.float32,
                    initializer=tf.constant_initializer(0.0)
                )
            )

After

structure.insert(0, self.n_inputs)

structure will be [n_inputs, 1st FC layer size, 2nd FC layer size ... output layer size]
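
To show what the loop above iterates over, here is a plain-Python sketch that builds the same list and computes the uniform bound sqrt(6 / (n_in + n_out)) for each layer. This bound is what is commonly called Glorot (Xavier) uniform initialization. The sizes 784 and [625, 300, 10] are hypothetical values picked for illustration, and `layer_specs` is a name I made up.

```python
import math

n_inputs = 784              # hypothetical input size
structure = [625, 300, 10]  # hypothetical fully connected layer sizes

n_fc_layers = len(structure)
structure.insert(0, n_inputs)  # -> [784, 625, 300, 10]

layer_specs = []
for lr_idx in range(n_fc_layers):
    n_in, n_out = structure[lr_idx], structure[lr_idx + 1]
    # Same bound the answer passes as -minval / +maxval to the initializer.
    limit = math.sqrt(6.0 / (n_in + n_out))
    layer_specs.append(((n_in, n_out), limit))

for shape, limit in layer_specs:
    print(shape, round(limit, 4))
```

Note that n_fc_layers is taken before the insert, so the loop runs once per layer and pairs each size with the one that follows it.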

【Discussion】: