Not able to reproduce results with Tensorflow even with random seed: answers

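【Question title】: Not able to reproduce results with Tensorflow even with random seed
【Posted】: 2020-08-02 11:17:48
【Question】:

I am training a simple autoencoder in Keras on data that I generated. I am currently running the code in a Google Colab notebook (in the unlikely case that this is relevant). To get reproducible results, I am setting the random seeds as shown below, but this does not seem to be fully effective: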

import numpy as np
import tensorflow as tf

# Choose random seed value
seed_value = 0

# Set numpy pseudo-random generator at a fixed value
np.random.seed(seed_value)

# Set tensorflow pseudo-random generator at a fixed value
tf.random.set_seed(seed_value)

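The random seed code does seem to give me the same initial weights every time I initialize the model. I can see this with model.get_weights() after creating the model (even when I restart the notebook and re-run the code). However, I cannot get reproducible results in terms of model performance, because the model weights are different after each training run. I assumed the random seed code above would ensure that the data is split and shuffled the same way every time during training, even though I do not split the train/validation data beforehand (I use validation_split=0.2 instead) or specify @ 987654325@ when fitting the model, but perhaps that assumption is incorrect? Also, are there any other random seeds I need to set to ensure reproducible results? Here is the code I use to build and train the model: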

from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model


def construct_autoencoder(input_dim, encoded_dim):
    # Add input (named `inputs` to avoid shadowing the built-in `input`)
    inputs = Input(shape=(input_dim,))

    # Add encoder layer
    encoder = Dense(encoded_dim, activation='relu')(inputs)

    # Add decoder layer
    # Input contains binary values, hence the sigmoid activation
    decoder = Dense(input_dim, activation='sigmoid')(encoder)
    model = Model(inputs=inputs, outputs=decoder)

    return model

autoencoder = construct_autoencoder(10, 6)
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
# print(autoencoder.get_weights()) -> This is the same every time, even with restarting the notebook

autoencoder.fit(data,
                data,
                epochs=20,
                validation_split=0.2,
                batch_size=16,
                verbose=0)

# print(autoencoder.get_weights()) -> This is different every time, but not sure why?

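If you have any ideas about why I am not getting reproducible results during model training, please let me know. I found this on the Keras website, https://keras.io/getting-started/faq/#how-can-i-obtain-reproducible-results-using-keras-during-development, but I am not sure whether it is relevant here (and if so, why?). I know there are other questions about reproducibility in model training, but I did not find one that addresses this specific problem. Thank you very much!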

【Question comments】:

Tags: python tensorflow keras deep-learning random-seed

【Solution 1】:

In TensorFlow 2.x:

If you run TF on the CPU, this code is enough to reproduce results.

seed_value = 42
import tensorflow as tf
tf.random.set_seed(seed_value)

But if you run TF on the GPU (the default), an NVIDIA issue makes your results non-reproducible even if you call tf.random.set_seed(seed_value).

So the solution is:

pip install tensorflow-determinism

Then use the following code:

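import os
import random

import numpy as np
import tensorflow as tf


def setup_seed(seed):
    random.seed(seed)         # Python's built-in RNG
    np.random.seed(seed)      # NumPy RNG
    tf.random.set_seed(seed)  # TensorFlow RNG (enough on CPU)
    # Request deterministic GPU ops; `pip install tensorflow-determinism` first
    os.environ['TF_DETERMINISTIC_OPS'] = '1'


setup_seed(42)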
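
As a variation on the same idea, here is a minimal sketch that seeds whatever is installed and also turns on TensorFlow's built-in determinism switch when the installed release has one. The name seed_everything is hypothetical, and the sketch assumes that tf.config.experimental.enable_op_determinism is only present in newer TensorFlow releases, which is why it is looked up with getattr instead of being called directly. The optional imports are there only so the snippet also runs on a machine without NumPy or TensorFlow.

```python
import os
import random


def seed_everything(seed):
    """Seed every RNG that is available in this environment (hypothetical helper)."""
    # Ask TensorFlow for deterministic GPU kernels; set this before TF runs any op.
    os.environ["TF_DETERMINISTIC_OPS"] = "1"
    random.seed(seed)  # Python's built-in RNG

    try:
        import numpy as np
        np.random.seed(seed)  # NumPy's global RNG
    except ImportError:
        pass  # NumPy is not installed; nothing to seed

    try:
        import tensorflow as tf
    except ImportError:
        return  # TensorFlow is not installed; nothing more to do

    tf.random.set_seed(seed)  # TensorFlow's global RNG
    # Assumption: only newer TensorFlow releases expose this function.
    enable = getattr(tf.config.experimental, "enable_op_determinism", None)
    if enable is not None:
        enable()


seed_everything(42)
```

Call it once at the top of the notebook, before building the model. With op determinism enabled, TensorFlow may raise an error for an op that has no deterministic implementation and may train more slowly, so treat this as a debugging aid to try, not a guaranteed fix.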

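【Comments】:

【Solution 2】:

Besides setting the seeds and following the suggestions in the Keras article (they are indeed relevant), you need to make sure that all your Python module versions are the same as the ones in the notebook.

Locally, you can easily check the versions of all modules with the pip freeze command (in a command-line interface). In the notebook, you can check them module by module like this: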

import tensorflow as tf
print(tf.__version__)
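
To check several modules at once without importing each one, a small sketch using the standard-library importlib.metadata module (Python 3.8+) could look like the following. The helper name installed_versions and the package list are illustrative assumptions, not part of the original answer.

```python
from importlib import metadata


def installed_versions(package_names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# The package list is illustrative; adjust it to your own project.
for name, version in installed_versions(["tensorflow", "numpy", "keras"]).items():
    print(f"{name}=={version}" if version else f"{name} is not installed")
```

Running this both locally and in the notebook and comparing the two outputs shows quickly whether the environments differ.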

【Comments】:

OK, I will make sure to follow the article. Could you expand on the comment about Python modules? Thanks!