【Posted】:2016-09-06 23:37:26
【Question】:
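I want to use a decision tree to predict a float value from 6 features that are also float values. I realize a decision tree may not be the best approach, but I am comparing several methods to understand them better.
The error I get is "Unknown label type", raised on my training data list. I have read that "DecisionTreeClassifier" accepts float values and that the values are usually converted to float32 anyway. I explicitly set the values in the list to float32, but the problem persists. Can anyone help?
A sample of my x training data (features_x_train):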
[[ 2.49496743e-01 6.07936502e-01 -4.20752168e-01 -3.88045199e-02
-7.59323120e-01 -7.59323120e-01]
[ 4.07418489e-01 5.36915325e-02 2.95270741e-01 1.87122121e-01
9.89770174e-01 9.89770174e-01]]
A sample of my y training data (predict_y_train): [ -7.59323120e-01 9.89770174e-01]
Code:
df_train = wellbeing_df[feature_cols].sample(frac=0.9)
# Split columns into predictor and result
features_x_train = np.array(df_train[list(top_features_cols)].values).astype(np.float32)
predict_y_train = np.asarray(df_train['Happiness score'], dtype=np.float32)
# Set up decision tree
decision_tree = tree.DecisionTreeClassifier()
decision_tree = decision_tree.fit(features_x_train, predict_y_train)  # Train tree on 90% of available data
Error:
ValueError Traceback (most recent call last)
<ipython-input-103-a44a03982bdb> in <module>()
19 #Setup decision tree
20 decision_tree = tree.DecisionTreeClassifier()
---> 21 decision_tree = decision_tree.fit(features_x_train, predict_y_train) #Train tree on 90% of available data
22
23 #Test on remaining 10%
C:\Users\User\Anaconda2\lib\site-packages\sklearn\tree\tree.pyc in fit(self, X, y, sample_weight, check_input, X_idx_sorted)
175
176 if is_classification:
--> 177 check_classification_targets(y)
178 y = np.copy(y)
179
C:\Users\User\Anaconda2\lib\site-packages\sklearn\utils\multiclass.pyc in check_classification_targets(y)
171 if y_type not in ['binary', 'multiclass', 'multiclass-multioutput',
172 'multilabel-indicator', 'multilabel-sequences']:
--> 173 raise ValueError("Unknown label type: %r" % y)
174
175
ValueError: Unknown label type: array([[ -7.59323120e-01],
[ 9.89770174e-01],
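The traceback shows the check failing inside check_classification_targets, which only accepts targets whose inferred type is one of the classification types listed there. A minimal sketch of how scikit-learn classifies the two sample targets quoted above, using the `type_of_target` helper from `sklearn.utils.multiclass`. The variable names `X`, `y`, `target_kind` and `raised` are illustrative, and the exact wording of the error message may differ between scikit-learn versions.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils.multiclass import type_of_target

# The two sample rows and targets quoted in the question.
X = np.array(
    [[2.49496743e-01, 6.07936502e-01, -4.20752168e-01,
      -3.88045199e-02, -7.59323120e-01, -7.59323120e-01],
     [4.07418489e-01, 5.36915325e-02, 2.95270741e-01,
      1.87122121e-01, 9.89770174e-01, 9.89770174e-01]],
    dtype=np.float32,
)
y = np.array([-7.59323120e-01, 9.89770174e-01], dtype=np.float32)

# Non-integer floats are inferred as a 'continuous' target, which is
# not among the types a classifier accepts.
target_kind = type_of_target(y)
print(target_kind)

# Fitting a classifier on such a target raises ValueError.
raised = False
try:
    DecisionTreeClassifier().fit(X, y)
except ValueError as err:
    raised = True
    print(err)
```

Casting the values to float32 does not change this outcome, because the check looks at whether the target values are discrete, not at their dtype.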
Also, if I change the list to string values, the code runs.
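A likely reason the string version runs is that each distinct string is treated as its own class label, so the classifier sees a valid (if not very useful) classification problem. The sketch below illustrates that, and contrasts it with `DecisionTreeRegressor`, the scikit-learn tree estimator intended for continuous targets. The data here is synthetic and purely an assumption for illustration; it is not the wellbeing data from the question.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Synthetic stand-in data (assumption): 20 samples, 6 float features,
# and a continuous target derived from the features.
rng = np.random.RandomState(0)
X = rng.rand(20, 6).astype(np.float32)
y = X.sum(axis=1)

# String targets: the classifier fits, but every distinct string
# becomes a separate class.
y_as_str = y.astype(str)
clf = DecisionTreeClassifier(random_state=0).fit(X, y_as_str)
n_classes = len(clf.classes_)
print(n_classes)

# Float targets: a regression tree accepts them directly and
# predicts float values.
reg = DecisionTreeRegressor(random_state=0).fit(X, y)
pred = reg.predict(X[:2])
print(pred)
```

With string labels the tree can only return one of the strings it saw during training, whereas the regressor returns numeric predictions that can be compared against other regression methods.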
【Comments】:
Tags: python-2.7 machine-learning data-mining decision-tree