How to avoid Collection Error Python Numpy: answers

【Question title】: How to avoid Collection Error Python Numpy
【Posted】: 2019-10-17 01:55:37
【Question】:

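I'm trying to train a linear regression estimator as a learning exercise. My csv file has a few thousand rows of data, which I import into numpy arrays. Here is my code: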

import pandas as pd 
import numpy as np 
from matplotlib import pyplot as plt 
import csv
import math
from sklearn import preprocessing, svm
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

def predict():
    sample_data = pd.read_csv("includes\\csv.csv")
    x = np.array(sample_data["day"])
    y = np.array(sample_data["balance"])

    for x in x:
        x = x.reshape(1, -1)
        #lol

    for y in y:
        y.reshape(1, -1)
        #lol

    X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

    clf = LinearRegression()
    clf.fit(x_train, y_train)
    clf.score(x_test, y_test)

When I run it, I get this error:

TypeError: Singleton array 6014651 cannot be considered a valid collection.

Any ideas?

【Comments】:

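Could you provide a fuller stack trace, in particular the line where the error occurs? In the meantime I would point at the loops that call reshape. First, giving the loop variable the same name as the collection being iterated over is not a good idea. Second, why not reshape the whole array? Something like x = x.reshape(-1, 1), with no for loop.
On which line does the error occur? Is this really the shortest minimal reproducible example? You could try to localize the problem by discarding everything that isn't needed, including in the data being processed.
The error is on the line "X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2)"
Could you try changing the loop to for xi in x: xi = xi.reshape(1, -1), and similarly for y? But again, I would recommend reshaping the whole array rather than each row individually.
Any sample input?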
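To make the point from the comments concrete, here is a minimal sketch of what the shadowing loop does. The helper names `shadowing_loop` and `reshape_whole` and the sample values are invented for illustration and are not taken from the question's data. After a loop written as `for y in y`, the name no longer refers to the array but to its last element, a zero-dimensional scalar, which appears to be what `train_test_split` rejects as a "singleton array".

```python
import numpy as np


def shadowing_loop(values):
    """Mimic the question's loop: the loop variable reuses the array's name."""
    y = np.array(values)
    for y in y:           # rebinds y to each element in turn
        y.reshape(1, -1)  # result is discarded, as in the question
    return y              # the last element, not the array


def reshape_whole(values):
    """Reshape the entire array into one column, as the comments suggest."""
    return np.array(values).reshape(-1, 1)


last = shadowing_loop([10, 20, 30])
print(np.ndim(last), np.shape(last))  # 0 ()

column = reshape_whole([10, 20, 30])
print(column.shape)  # (3, 1)
```

Renaming the loop variable (for example `for yi in y`) would keep the array intact, but the loop still would not change the array's shape, so reshaping the whole array in one call is the simpler fix.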

Tags: python pandas scikit-learn sklearn-pandas

【Solution 1】:

Following the discussion in the comments:

import pandas as pd 
import numpy as np 
from matplotlib import pyplot as plt 
import csv
import math
from sklearn import preprocessing, svm
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

def predict():
    sample_data = pd.read_csv("includes\\csv.csv")
    x = np.array(sample_data["day"])
    y = np.array(sample_data["balance"])

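    # Reshape each whole array into a single column of shape (n_samples, 1);
    # scikit-learn expects a 2-D feature matrix.
    x = x.reshape(-1, 1)
    y = y.reshape(-1, 1)

    X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

    clf = LinearRegression()
    # Use the same capitalized names that train_test_split assigned above.
    clf.fit(X_train, y_train)
    # Return R^2 on the held-out data so the caller can inspect it.
    return clf.score(X_test, y_test)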
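Since the question's csv file is not available, the following is a hedged, self-contained sketch of the same steps on synthetic data. The `day` and `balance` values, the noise level, and the fixed `random_state` are assumptions made up so the example can be run anywhere; they do not reproduce the asker's data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the csv columns (assumed values, not the real data).
rng = np.random.default_rng(0)
day = np.arange(200)
balance = 50.0 * day + 1000.0 + rng.normal(0.0, 5.0, size=day.size)

# Features must be 2-D: one row per sample, one column per feature.
X = day.reshape(-1, 1)
y = balance  # a 1-D target is accepted; reshaping it to (-1, 1) also works

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression()
model.fit(X_train, y_train)
r2 = model.score(X_test, y_test)  # R^2 on the held-out 20%

print(X.shape, X_train.shape, X_test.shape)  # (200, 1) (160, 1) (40, 1)
print(r2)
```

Fixing `random_state` only makes the split repeatable for the demonstration; the original call without it is equally valid.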

【Comments】:

【Solution 2】:

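X_train and X_test must be written with a capital X wherever they are used. The question assigns X_train and X_test but then calls clf.fit(x_train, y_train) and clf.score(x_test, y_test). Python variable names are case-sensitive, so x_train and X_train are different names.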
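A small sketch of the case-sensitivity issue, using a made-up list in place of real training data: only the capitalized name is bound, so looking up the lowercase spelling raises a NameError.

```python
# Only the capitalized name is assigned (illustrative values).
X_train = [[1], [2], [3]]

try:
    x_train  # lowercase spelling was never assigned
except NameError as exc:
    message = str(exc)

print(message)
```

In the question's code this mistake would only surface after the train_test_split call succeeds, which is why the reported error is a different one.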

【Comments】: