Python - What value should we use for random_state in train_test_split() and in which scenario? [closed]

【Question title】: Python - What value should we use for random_state in train_test_split() and in which scenario? [closed]
【Posted】: 2019-06-13 08:05:13
【Question description】:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=0)

In the code above, random_state is set to 0. Why don't we use 1?

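【Question discussion】:

Possible duplicate of stackoverflow.com/questions/42191717/… and stackoverflow.com/questions/28064634/…
The value of random_state does not significantly affect the predictions (the difference is negligible). It is provided only so that results can be reproduced later, or on a different system/environment. It is just a seed. So if you use random_state=50 today and use the same random_state=50 seven days from now, you will get exactly the same split (even on a different environment/system).
Possible duplicate of Python random state in splitting dataset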

Tags: python machine-learning data-science

【Solution 1】:

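Neither 0 nor 1 has any special meaning for random_state. The parameter controls the seed used by the random number generator, so setting it to any fixed value means the split is still random, but the result is exactly the same on every call.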

This is typically used for reproducibility, but in general you should not rely on random_state being any particular value.

If you set random_state to None, every call to train_test_split produces a different random split.
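The unseeded case can be sketched the same way, again assuming scikit-learn and NumPy are available. The data and the helper name `unseeded_test_labels` are hypothetical. With 100 samples, a repeated split across five calls is possible in principle but extremely unlikely.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 100 samples, with labels equal to the row index.
X = np.arange(200).reshape(100, 2)
y = np.arange(100)


def unseeded_test_labels():
    """Split with random_state=None and return the test labels as a sorted tuple."""
    _, _, _, y_test = train_test_split(X, y, test_size=0.2, random_state=None)
    return tuple(sorted(y_test.tolist()))


# Collect the test sets from several unseeded calls; a set keeps only distinct ones.
distinct_splits = {unseeded_test_labels() for _ in range(5)}

print("distinct test sets over 5 calls:", len(distinct_splits))
```

Leaving the seed as None is fine for exploration; passing an integer is what makes a run repeatable.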

【Discussion】: