Printing the names of the top 10 features and their chi-square values: answers

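【Question title】: Printing the names of top 10 features and their chi square value
【Posted】: 2017-05-28 20:18:49
【Question】:

I have a classification problem. I built a set of features for my data, and I use an SVM for classification. I want to evaluate these features.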

ch2=SelectKBest(score_func=chi2, k='all')
top_ranked_features = sorted(enumerate(ch2.scores_),key=lambda x:x[1], reverse=True)[:1000]
top_ranked_features_indices = map(list,zip(*top_ranked_features))[0]
for feature_pvalue in zip(np.asarray(featurenames)[top_ranked_features_indices],ch2.pvalues_[top_ranked_features_indices]):
       print feature_pvalue

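But when I run it, I get the following error:

AttributeError: 'SelectKBest' object has no attribute 'scores_'

Note: I am not using a vectorizer. I have the feature names in a list called featurenames, and I want to print the names and chi-square values of all features, or of the top K features.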

【Comments】:

Tags: python machine-learning scikit-learn svm

【Solution 1】:

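You have only declared the scoring function you want to use and the number of features to select. Feature selection, however, needs data: the selector runs a statistical test on that data to find the best features, and the scores_ attribute only exists after the selector has been fitted. Here is an example in which X contains the features and Y contains the target values.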

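from sklearn.feature_selection import SelectKBest, chi2
# Use fit(), which returns the fitted selector; fit_transform() returns the transformed array, which has no scores_
ch2 = SelectKBest(score_func=chi2, k='all').fit(X, Y)
print(ch2.scores_)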
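
To tie this back to the question, the following is a minimal sketch of printing the top-ranked feature names together with their chi-square scores and p-values. The toy matrix X, the labels Y, and the names in featurenames are made up for illustration. It assumes all feature values are non-negative, which chi2 requires, and that no column is entirely zero.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

# Hypothetical toy data: 6 samples, 4 non-negative features.
X = np.array([
    [1, 0, 3, 1],
    [2, 0, 2, 1],
    [1, 1, 3, 1],
    [8, 1, 0, 1],
    [9, 0, 1, 1],
    [7, 1, 0, 1],
])
Y = np.array([0, 0, 0, 1, 1, 1])
featurenames = ["f_a", "f_b", "f_c", "f_d"]  # hypothetical names

# Fit the selector so that scores_ and pvalues_ become available.
selector = SelectKBest(score_func=chi2, k="all").fit(X, Y)

# Indices of the k highest scores, best first (fewer if there are fewer features).
k = 10
order = np.argsort(selector.scores_)[::-1][:k]

top_features = [
    (featurenames[i], selector.scores_[i], selector.pvalues_[i]) for i in order
]
for name, score, pvalue in top_features:
    print(name, score, pvalue)
```

Sorting indices with np.argsort keeps names, scores, and p-values aligned without the zip/map juggling from the question, and it behaves the same on Python 2 and Python 3.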

【Comments】: