Why does NumPy disallow string seeds while Python's random library allows them?

【Question title】: Why does NumPy disallow a string seed but the random library allows it in Python?
【Posted】: 2021-05-17 21:11:19
【Question description】:

When using the random library in Python, I can do the following:

>>> import random
>>> random.seed("twenty five")

But if I use NumPy to generate random numbers, I cannot set the seed with a string:

>>> import numpy as np
>>> np.random.seed("twenty five")
Traceback (most recent call last):
  File "_mt19937.pyx", line 178, in numpy.random._mt19937.MT19937._legacy_seeding
TypeError: 'str' object cannot be interpreted as an integer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "mtrand.pyx", line 244, in numpy.random.mtrand.RandomState.seed
  File "_mt19937.pyx", line 166, in numpy.random._mt19937.MT19937._legacy_seeding
  File "_mt19937.pyx", line 186, in numpy.random._mt19937.MT19937._legacy_seeding
TypeError: Cannot cast scalar from dtype('<U11') to dtype('int64') according to the rule 'safe'

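Can NumPy accept a string as a seed at all?

If not, is there a proper way to convert a string to an int in order to seed NumPy?

What mechanism behind the random library's seed function allows it to accept a string seed?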

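【Question discussion】:

Why do you care? There is no guarantee that numpy.random and Python's random use the same algorithm.
Because I'm curious, and I would like to use NumPy with a string as the seed.
It's pointless. If you are just using a basic seed, any integer is as good as any other. Remember, you have the source code; you can go look at what random.seed does.
@TimRoberts I still find it interesting to know how the random library best converts a string to an integer so that the input space stays relatively similar. Also, I don't think I should have to justify every question I post on Stack Overflow with my reasons for caring, but I may be wrong.
I'm not overstepping by pointing out when someone is focused on the wrong thing. The exact value of a random number seed simply doesn't matter. No seed is better or "more random" than another. seed(1) produces a completely different (and equally valid) sequence from seed(2). So there is no "best" algorithm for translating a string. Using the ASCII value of the first character would work fine.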
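
As the comments above suggest, any deterministic mapping from a string to an integer can serve as a seed. A minimal sketch of one such mapping follows. The helper names string_to_seed and seeded_generator are hypothetical, and SHA-256 is an arbitrary choice rather than anything NumPy prescribes. It assumes that the legacy np.random.seed wants an integer in the range [0, 2**32 - 1], while np.random.default_rng accepts larger non-negative integers.

```python
import hashlib


def string_to_seed(text: str, bits: int = 32) -> int:
    """Map a string to a non-negative integer seed of at most `bits` bits.

    Hypothetical helper for illustration; SHA-256 is an arbitrary choice.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # Keep only the low `bits` bits of the digest. The default of 32 fits
    # the range accepted by the legacy np.random.seed.
    return int.from_bytes(digest, "big") % (1 << bits)


def seeded_generator(text: str):
    """Return a NumPy Generator seeded from a string (requires NumPy)."""
    import numpy as np  # imported lazily so string_to_seed works without NumPy

    # default_rng accepts large non-negative integers, so the full
    # 256-bit digest is passed through here.
    return np.random.default_rng(string_to_seed(text, bits=256))


seed32 = string_to_seed("twenty five")
print(seed32)  # the same string gives the same value on every run
```

With NumPy installed, np.random.seed(string_to_seed("twenty five")) or seeded_generator("twenty five") would then give a reproducible stream. hashlib is used instead of the built-in hash() because hash() of a str can change between interpreter runs.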

Tags: python numpy random

【Solution 1】:

https://docs.python.org/3/library/random.html

From the Python documentation:

random.seed(a=None, version=2)

With version 2 (the default), a str, bytes, or bytearray object gets converted to an int and all of its bits are used.

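This means "twenty five" is not converted to 25. Instead, the seed integer is built from the string's encoded bytes (plain ASCII in this case) followed by their SHA512 digest, as Tim Roberts points out in the comments.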

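Note that string seeds themselves are not deprecated. The deprecation note added to the documentation in Python 3.9 applies to seeds of other types; None, int, float, str, bytes, and bytearray seeds remain supported.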

Edit:

Source code:

    elif version == 2 and isinstance(a, (str, bytes, bytearray)):
        if isinstance(a, str):
            a = a.encode()
        a = int.from_bytes(a + _sha512(a).digest(), 'big')
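
To make the quoted branch concrete, here is a small sketch that repeats the same conversion outside the library and checks that seeding with the resulting integer gives the same numbers as seeding with the string. The helper name str_seed_to_int is hypothetical, and the check relies on CPython's random.seed behaving as in the snippet quoted above, which is an implementation detail rather than a documented guarantee.

```python
import random
from hashlib import sha512


def str_seed_to_int(a):
    """Mimic the quoted version-2 branch: seed bytes followed by their SHA-512 digest."""
    if isinstance(a, str):
        a = a.encode()  # str.encode() defaults to UTF-8
    return int.from_bytes(a + sha512(a).digest(), "big")


n = str_seed_to_int("twenty five")

# Seed with the string and draw a few numbers.
random.seed("twenty five")
from_string = [random.random() for _ in range(3)]

# Seed with the equivalent integer and draw again.
random.seed(n)
from_int = [random.random() for _ in range(3)]

print(n.bit_length())           # size of the integer actually used as the seed
print(from_string == from_int)  # whether both seeds give the same sequence
```

The integer is far larger than 64 bits because the 64-byte digest is appended to the encoded string, which answers the "what if the number is too large" worry for Python's random: integer seeds of arbitrary size are accepted.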

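【Discussion】:

This is almost right. Remember, we have the source code. The ASCII values of the characters plus the SHA512 digest of the string are used as one big integer to seed the sequence.
So it looks at the bit representation of the string twenty five and uses that as the integer seed? So twenty five would be converted to '1110100 1110111 1100101 1101110 1110100 1111001 100000 1100110 1101001 1110110 1100101' and hence 690267747476572330 1568? What happens if the number is too large? (I guess there is a limit on the size of the seed.)
It is the bit representation of "twenty five", with the SHA512 digest of the string appended after it.
+Nathan Marotte I'll edit in the relevant source code.
The underlying C code that random is based on returns a double-precision float, so any seed beyond 53 bits is meaningless.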