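【Posted】:2018-05-05 17:34:04
【Question】:
I can't tell whether this error comes from my code or from the framework. I'm working on a personal project, for my own use, to get better at Python. This is my first project with more than 100 lines of code, so I'm bound to make mistakes, but I keep getting this error. When I compare my code against a reference, in case I made a big syntax error, I really can't see any difference. The traceback points both inside the library and at my code, so I'm trying to figure out whether there is a fix. Since the project is more than 100 lines long, I'll do my best to post a simplified version. I'd appreciate any help understanding what I'm doing wrong.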
from sklearn import tree
import pandas as pd
#to read the csv file
df = pd.read_csv('aapl.csv', parse_dates=True, index_col=0)
#sets up the Decision tree
clf = tree.DecisionTreeClassifier()
#input data for training ... there is a lot of data so this is
#the smaller version to get to the point
X = [[7, 1, 17], [7, 3, 17], [7, 5, 17], [7, 7, 17], [7, 10, 17],
[7, 11, 17], [7, 13, 17], [7, 15, 17], [7, 17, 17], [7, 19, 17]]
#Output data... This is only a fraction, but it is simplified like X
Y = ['144.88, 145.30, 143.10, 143.50, 14277848',
'144.88, 145.30, 143.10, 143.50, 14277848',
'143.69, 144.79, 142.72, 144.09, 21569557',
'142.90, 144.75, 142.90, 144.18, 19201712',
'144.11, 145.95, 143.37, 145.06, 21090636',
'144.73, 145.85, 144.38, 145.53, 19781836',
'145.50, 148.49, 145.44, 147.77, 25199373',
'147.97, 149.33, 147.33, 149.04, 20132061',
'148.82, 150.90, 148.57, 149.56, 23793456',
'150.48, 151.42, 149.95, 151.02, 20922969']
#fitting the data in. This is where it said there was an error, but it
#is still consistent with the variables above
clf = clf.fit(X, Y)
#tells it to predict
test = clf.predict([[9, 12, 17]])
#prints the prediction
print(test)
Then this is the error I get when I try to run it:
Traceback (most recent call last):
  File "/Users/kodecreer/Documents/PersonalDataProj.py", line 117, in <module>
    clf = clf.fit(X, Y)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/sklearn/tree/tree.py", line 790, in fit
    X_idx_sorted=X_idx_sorted)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/sklearn/tree/tree.py", line 236, in fit
    "number of samples=%d" % (len(y), n_samples))
ValueError: Number of labels=44 does not match number of samples=45
I tried uninstalling scikit, reinstalling it, and refreshing the Python compiler. I also searched Stack Overflow, but couldn't find...
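Answer: the number of inputs did not match the number of outputs (45 rows in X but only 44 labels in Y in the full script), which is why it failed. Thanks to 江川智宏 for the answer.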
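The mismatch described in the answer can be caught before calling `fit` with a simple length comparison. The following is only a minimal sketch: `check_same_length` is a hypothetical helper (not part of scikit-learn), and the three-row `X` and `Y` are a shortened stand-in for the data in the question.

```python
def check_same_length(X, Y):
    """Hypothetical helper: raise ValueError if X and Y differ in length."""
    n_samples, n_labels = len(X), len(Y)
    if n_samples != n_labels:
        # Wording mirrors the message in the traceback above.
        raise ValueError(
            "Number of labels=%d does not match number of samples=%d"
            % (n_labels, n_samples)
        )
    return n_samples


# Shortened stand-in for the question's data: one label per sample row.
X = [[7, 1, 17], [7, 3, 17], [7, 5, 17]]
Y = ['144.88, 145.30, 143.10, 143.50, 14277848',
     '143.69, 144.79, 142.72, 144.09, 21569557',
     '142.90, 144.75, 142.90, 144.18, 19201712']

check_same_length(X, Y)  # passes: 3 samples, 3 labels

try:
    # Dropping one label reproduces the kind of mismatch from the traceback.
    check_same_length(X, Y[:-1])
except ValueError as err:
    print(err)
```

Running a check like this just before `clf.fit(X, Y)` in the full script would show which of the two lists is one entry short, so the missing row can be found in the data rather than in the library.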
【Comments】:
Tags: python pandas scikit-learn