feature_importances_ showing up as NoneType in ExtraTreesClassifier: TypeError: 'NoneType' object is not iterable

【Question title】: feature_importances_ showing up as NoneType in ExtraTreesClassifier: TypeError: 'NoneType' object is not iterable
【Posted】: 2014-02-17 02:48:24
【Question】:

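I am trying to select the important features for a given dataset (or at least understand which features explain more of the variability). To do that, I use both ExtraTreesClassifier and GradientBoostingRegressor, and then: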

clf = ExtraTreesClassifier(n_estimators=10,max_features='auto',random_state=0) # stops after 10 estimation passes, right ?
clf.fit(x_train, y_train)
feature_importance=clf.feature_importances_  # does NOT work - returns NoneType for feature_importance

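After that, I would like to plot the importances for a visual representation, or, as a first step, just look at their relative order and the corresponding indices: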

# Both of these do not work as the feature_importance is of NoneType
feature_importance = 100.0 * (feature_importance / feature_importance.max())
indices = numpy.argsort(feature_importance)[::-1]

What puzzles me is that if I use GradientBoostingRegressor, as shown below, I do get feature_importance and its indices. What am I doing wrong?

#Works with GradientBoostingRegressor
params = {'n_estimators': 100, 'max_depth': 3, 'learning_rate': 0.1, 'loss': 'lad'}
clf = GradientBoostingRegressor(**params)
clf.fit(x_train, y_train)  # fit once; the original fitted the same model twice
feature_importance=clf.feature_importances_

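Other information: I have 12 independent variables (x_train) and one label variable (y_train) that takes several values (say 4, 5, 7). type(x_train) is [missing in source] and type(feature_importance) is NoneType.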

Acknowledgement: parts of this code are borrowed from this post: http://www.tonicebrian.com/2012/11/05/training-gradient-boosting-trees-with-python/

【Question comments】:

Tags: python-2.7 machine-learning scikit-learn nonetype

【Solution 1】:

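In the scikit-learn release current at the time of this question, ExtraTreesClassifier takes a compute_importances option at initialization, and it defaults to None. As a result, feature_importances_ stays None unless the classifier is initialized as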

clf = ExtraTreesClassifier(n_estimators=10,max_features='auto',random_state=0,compute_importances=True)

so that the feature importances are computed during fitting.

GradientBoostingRegressor has no such option; it always computes feature importances, which is why the second snippet works.
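For readers on a recent scikit-learn with Python 3, here is a minimal sketch of the same ranking. It assumes a release in which compute_importances no longer exists and feature_importances_ is available after any fit. The synthetic data from make_classification is only a stand-in for the asker's 12-feature dataset, and max_features is left at its default because the 'auto' value is not accepted by every release.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier

# Hypothetical stand-in for the asker's data: 12 features, 3 label values.
x_train, y_train = make_classification(
    n_samples=300, n_features=12, n_informative=5, n_classes=3, random_state=0
)

# No compute_importances argument: recent releases expose the
# importances after fit() without any extra option.
clf = ExtraTreesClassifier(n_estimators=10, random_state=0)
clf.fit(x_train, y_train)

feature_importance = clf.feature_importances_

# Scale so that the most important feature scores 100, as in the question.
relative_importance = 100.0 * (feature_importance / feature_importance.max())

# Feature indices ordered from most to least important.
indices = np.argsort(relative_importance)[::-1]

for rank, idx in enumerate(indices, start=1):
    print(f"{rank:2d}. feature {idx:2d}: {relative_importance[idx]:6.2f}")
```

The indices array can be passed directly to a bar plot if a visual ranking is wanted.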

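【Comments】:

With compute_importances set I can now get the feature importances array. Thank you, and +1. Could you also say how to choose n_estimators? The scikit-learn documentation says very little about it. Could I vary it and see which value is suitable (by goodness of fit)?
When judging "goodness of fit", monitor the prediction error on a validation set rather than on the training set, to avoid overfitting. For example, you can keep increasing the number of iterations: the training error should keep decreasing, but you stop once the validation error starts to increase.
In newer versions of scikit-learn, compute_importances is no longer needed for ExtraTreesClassifier.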
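The validation-set advice in the comment about n_estimators can be sketched roughly as follows. This is an illustration, not the commenter's code. It uses GradientBoostingRegressor because its staged_predict method yields predictions after each boosting stage, and it picks the stage with the lowest validation error instead of stopping at the first increase. The synthetic data, the 75/25 split, and the cap of 200 stages are all assumptions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Hypothetical data: 12 features, split into training and validation parts.
x, y = make_regression(n_samples=400, n_features=12, noise=10.0, random_state=0)
x_train, x_val, y_train, y_val = train_test_split(
    x, y, test_size=0.25, random_state=0
)

max_estimators = 200  # upper bound on boosting stages (assumed value)
model = GradientBoostingRegressor(
    n_estimators=max_estimators, max_depth=3, learning_rate=0.1, random_state=0
)
model.fit(x_train, y_train)

# staged_predict yields the validation predictions after 1, 2, ... stages,
# so one fit is enough to trace the whole validation-error curve.
val_errors = np.array(
    [mean_absolute_error(y_val, pred) for pred in model.staged_predict(x_val)]
)

# Stage count with the lowest validation error (stages are 1-based).
best_n_estimators = int(np.argmin(val_errors)) + 1
print("best n_estimators:", best_n_estimators)
print("validation MAE at that point:", val_errors[best_n_estimators - 1])
```

Refitting with n_estimators=best_n_estimators then gives the model that the validation curve favours.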