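【Posted】: 2022-01-17 16:42:04
【Question】:
I am trying to visualize my linear regression model. However, when I run the code, I get a ValueError. I have tried different solutions and looked at other threads about the same problem.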
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import linear_model
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

df = pd.read_csv('housingmonthly.csv', sep=',')
X = df[['date', 'area', 'code', 'houses_sold', 'no_of_crimes']]
y = df['average_price']
# One-hot encode the non-numeric columns
X = pd.get_dummies(df[['date', 'area', 'code', 'houses_sold', 'no_of_crimes']])
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
print("Xtrain", X_train.shape, "y_train", y_train.shape,
      "Xtest", X_test.shape, "y_test", y_test.shape)
regr = linear_model.LinearRegression()
lr = LinearRegression()
lr.fit(X_train, y_train)
print("Score on training set: {:.3f}".format(lr.score(X_train, y_train)))
print("Score on test set: {:.3f}".format(lr.score(X_test, y_test)))
regr.fit(X_train, y_train)
y_pred = regr.predict(X_test)
# The ValueError in the stack trace is raised by this call
plt.scatter(X_test, y_test, color="black")
plt.plot(X_test, y_pred, color="blue", linewidth=3)
plt.xticks(())
plt.yticks(())
plt.show()
Stack trace:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/var/folders/tl/80zdv_rx5sv1t7d5dgz86bzc0000gn/T/ipykernel_29101/3394670003.py in <module>
15 print("Coefficient of determination: %.2f" % r2_score(y_test, y_pred))
16
---> 17 plt.scatter(X_test, y_test, color="black")
18 plt.plot(X_test, y_pred, color="blue", linewidth=3)
19 plt.Xticks(())
/opt/anaconda3/lib/python3.8/site-packages/matplotlib/pyplot.py in scatter(x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, verts, edgecolors, plotnonfinite, data, **kwargs)
2888 verts=cbook.deprecation._deprecated_parameter,
2889 edgecolors=None, *, plotnonfinite=False, data=None, **kwargs):
-> 2890 __ret = gca().scatter(
2891 x, y, s=s, c=c, marker=marker, cmap=cmap, norm=norm,
2892 vmin=vmin, vmax=vmax, alpha=alpha, linewidths=linewidths,
/opt/anaconda3/lib/python3.8/site-packages/matplotlib/__init__.py in inner(ax, data, *args, **kwargs)
1445 def inner(ax, *args, data=None, **kwargs):
1446 if data is None:
-> 1447 return func(ax, *map(sanitize_sequence, args), **kwargs)
1448
1449 bound = new_sig.bind(ax, *args, **kwargs)
/opt/anaconda3/lib/python3.8/site-packages/matplotlib/cbook/deprecation.py in wrapper(*inner_args, **inner_kwargs)
409 else deprecation_addendum,
410 **kwargs)
--> 411 return func(*inner_args, **inner_kwargs)
412
413 return wrapper
/opt/anaconda3/lib/python3.8/site-packages/matplotlib/axes/_axes.py in scatter(self, x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, verts, edgecolors, plotnonfinite, **kwargs)
4439 y = np.ma.ravel(y)
4440 if x.size != y.size:
-> 4441 raise ValueError("x and y must be the same size")
4442
4443 if s is None:
That is the error. At this point I do not know how I should fix it.
Thank you very much.
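【Comments】:
-
Can you show the full traceback of the error? Also, it seems clear that there is a dimension mismatch between X and y. Print them and their shapes. Finally, are you using x or X? That may be the cause, since the error you copied mentions x.
-
I just added the traceback. I am using X, not x. As you said, I am fairly sure the problem is the dimensions of X and y, but I do not know what code I need to reshape them. Thank you very much.
-
Well, this has nothing to do with scikit-learn. It is a matplotlib problem. plt.scatter is for plotting two-dimensional data, so it expects one set of values for the x axis and one set of the same size for the y axis.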
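To make the last comment concrete, here is a minimal sketch of why the call fails and two ways to plot instead. It does not use the asker's CSV: the data is synthetic, the array shapes are made up for illustration, and a NumPy least-squares fit stands in for LinearRegression so that the snippet has few dependencies. The Agg backend is also an assumption, chosen so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for the one-hot encoded test set (assumption):
# 50 rows and 4 feature columns, with one target value per row.
rng = np.random.default_rng(42)
X_test = rng.normal(size=(50, 4))
true_coef = np.array([3.0, -2.0, 0.5, 1.0])
y_test = X_test @ true_coef + rng.normal(scale=0.1, size=50)

# Least-squares fit with an intercept column, standing in for LinearRegression.
A = np.column_stack([X_test, np.ones(len(X_test))])
coef, *_ = np.linalg.lstsq(A, y_test, rcond=None)
y_pred = A @ coef

# scatter flattens x and y and compares their sizes. Here x would hold
# 50 * 4 = 200 values and y only 50, hence "x and y must be the same size".
sizes_match = X_test.size == y_test.size

fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(10, 4))

# Option 1: predicted vs. actual. Both arrays are one-dimensional and equal in size.
ax_left.scatter(y_test, y_pred, color="black")
lims = [y_test.min(), y_test.max()]
ax_left.plot(lims, lims, color="blue", linewidth=3)  # line where prediction == actual
ax_left.set_xlabel("actual")
ax_left.set_ylabel("predicted")

# Option 2: a single feature column against the target.
ax_right.scatter(X_test[:, 0], y_test, color="black")
ax_right.set_xlabel("feature 0")
ax_right.set_ylabel("target")

fig.tight_layout()
```

With the DataFrame from the question, the equivalent of option 2 would be selecting one column, for example X_test['houses_sold'], rather than passing the whole X_test. In an interactive session, call plt.show() at the end to display the figure.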
Tags: python matplotlib