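【Posted】: 2020-03-23 06:07:44
【Question】:
I am working on a model to predict house prices. To build the model I use sklearn's DecisionTreeRegressor. I split the data into training and validation sets with train_test_split. But when I try to fit the model to the data, I get the following error: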
KeyError                                  Traceback (most recent call last)
<ipython-input-25-f4acd876feae> in <module>
      1 for max_leaf_nodes in [5, 50, 500, 5000]:
----> 2     my_mae = get_mae(max_leaf_nodes, train_X, val_X, train_y, val_y)
      3     print("Max leaf nodes: %d \t\t Mean Absolute Error: %d" %(max_leaf_nodes, my_mae))

<ipython-input-21-1a489238552f> in get_mae(max_leaf_nodes, train_inp, val_inp, train_oup, val_oup)
      2
      3     model = DecisionTreeRegressor(max_leaf_nodes, random_state=0)
----> 4     model.fit(train_inp, train_oup)
      5     predictions = model.predict(val_inp)
      6     mae = mean_absolute_error(val_oup, predictions)

~/anaconda3/lib/python3.7/site-packages/sklearn/tree/tree.py in fit(self, X, y, sample_weight, check_input, X_idx_sorted)
   1140             sample_weight=sample_weight,
   1141             check_input=check_input,
-> 1142             X_idx_sorted=X_idx_sorted)
   1143         return self
   1144

~/anaconda3/lib/python3.7/site-packages/sklearn/tree/tree.py in fit(self, X, y, sample_weight, check_input, X_idx_sorted)
    331                                                          self.n_classes_)
    332             else:
--> 333                 criterion = CRITERIA_REG[self.criterion](self.n_outputs_,
    334                                                          n_samples)
    335

KeyError: 5
Here is my code.
The get_mae function:
def get_mae(max_leaf_nodes, train_inp, val_inp, train_oup, val_oup):
    model = DecisionTreeRegressor(max_leaf_nodes, random_state=0)
    model.fit(train_inp, train_oup)
    predictions = model.predict(val_inp)
    mae = mean_absolute_error(val_oup, predictions)
    return mae
Reading the dataset:
df = pd.read_csv('../DATASETS/melb_data.csv')
y = df.Price
features = ['Rooms', 'Bathroom', 'Landsize', 'Lattitude', 'Longtitude']
X = df[features]
train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=0)
Loop to find the best number of leaf nodes:
for max_leaf_nodes in [5, 50, 500, 5000]:
    my_mae = get_mae(max_leaf_nodes, train_X, val_X, train_y, val_y)
    print("Max leaf nodes: %d \t\t Mean Absolute Error: %d" %(max_leaf_nodes, my_mae))
【Comments】:
Tags: python scikit-learn decision-tree
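The traceback fails at CRITERIA_REG[self.criterion] with KeyError: 5, which suggests that the value 5 ended up in criterion rather than in max_leaf_nodes. That is consistent with DecisionTreeRegressor(max_leaf_nodes, random_state=0) passing max_leaf_nodes positionally: in the scikit-learn version shown, the first positional parameter of DecisionTreeRegressor is criterion. A minimal sketch of the likely fix is to pass the value by keyword. The synthetic X and y below are an assumption that stands in for melb_data.csv so the example runs on its own, and the results dict is added only for illustration.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor


def get_mae(max_leaf_nodes, train_inp, val_inp, train_oup, val_oup):
    # Pass max_leaf_nodes by keyword. Passed positionally, it would be
    # bound to the first parameter (criterion) in older scikit-learn.
    model = DecisionTreeRegressor(max_leaf_nodes=max_leaf_nodes, random_state=0)
    model.fit(train_inp, train_oup)
    predictions = model.predict(val_inp)
    return mean_absolute_error(val_oup, predictions)


# Synthetic stand-in for the housing data (assumption, not the real dataset).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(400, 5))
y = 3.0 * X[:, 0] + X[:, 1] ** 2 + rng.normal(0, 1, size=400)
train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=0)

# Collect the validation error for each candidate tree size.
results = {}
for max_leaf_nodes in [5, 50, 500, 5000]:
    results[max_leaf_nodes] = get_mae(max_leaf_nodes, train_X, val_X, train_y, val_y)
    print("Max leaf nodes: %d \t\t Mean Absolute Error: %.3f"
          % (max_leaf_nodes, results[max_leaf_nodes]))
```

With the real data, only the keyword argument in get_mae needs to change; the rest of the original code can stay as it is. Recent scikit-learn releases make these constructor parameters keyword-only, so the positional call would fail there with a TypeError instead of a KeyError.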