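Feature Importance Chart in neural network using Keras in Python: answers

【Question title】: Feature Importance Chart in neural network using Keras in Python
【Posted】: 2018-01-03 19:39:19
【Question description】:

I am using Python (3.6) with Anaconda (64-bit) and Spyder (3.1.2). I have set up a neural network model with Keras (2.0.6) for a regression problem (one response, 10 variables). I would like to know how to generate a feature importance chart. My model is defined as follows: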

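from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor

def base_model():
    # One hidden layer of 200 ReLU units on 10 input variables, one linear output
    model = Sequential()
    model.add(Dense(200, input_dim=10, kernel_initializer='normal', activation='relu'))
    model.add(Dense(1, kernel_initializer='normal'))
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model

# X_train and Y_train are the training data (not shown in the question)
clf = KerasRegressor(build_fn=base_model, epochs=100, batch_size=5, verbose=0)
clf.fit(X_train, Y_train)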

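【Question comments】:

Tags: python neural-network keras

【Solution 1】:

Keras currently does not provide any functionality for extracting feature importance.

You can check this earlier question: Keras: Any way to get variable importance?

Or the related Google Group thread: Feature importance

Spoiler: in the Google Group, someone announced an open-source project to address this issue.

【Comments】:

【Solution 2】:

I was recently looking for an answer to this question and found something useful for what I was doing, so I thought it would be helpful to share. I ended up using the permutation importance module from the eli5 package. It works most easily with scikit-learn models. Fortunately, Keras provides a wrapper for sequential models. As the code below shows, it is very simple to use.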

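from keras.models import Sequential
from keras.wrappers.scikit_learn import KerasClassifier, KerasRegressor
import eli5
from eli5.sklearn import PermutationImportance

def base_model():
    model = Sequential()
    ...
    return model

# X is expected to be a pandas DataFrame (X.columns is used below)
X = ...
y = ...

# sk_params: keyword arguments for the wrapper (not defined in this answer)
my_model = KerasRegressor(build_fn=base_model, **sk_params)
my_model.fit(X, y)

# Shuffle each feature in turn and measure how much the score drops
perm = PermutationImportance(my_model, random_state=1).fit(X, y)
eli5.show_weights(perm, feature_names=X.columns.tolist())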
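
To make clear what the permutation importance above measures, here is a minimal NumPy-only sketch of the idea. It is not eli5's actual implementation. Each column is shuffled in turn and the resulting drop in score is recorded. The names `r2_score`, `permutation_importance`, `toy_predict`, `X_demo`, and `y_demo` are hypothetical, and R² is an assumed scoring choice for illustration (as far as I know, eli5 uses the estimator's own `score` method unless a `scoring` argument is given).

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination (assumed score for this sketch)."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def permutation_importance(predict, X, y, n_iter=5, seed=0):
    """Mean drop in score when each column of X is shuffled."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    base = r2_score(y, predict(X))
    drops = np.zeros((n_iter, X.shape[1]))
    for it in range(n_iter):
        for j in range(X.shape[1]):
            X_perm = X.copy()
            # Break the link between feature j and the target
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops[it, j] = base - r2_score(y, predict(X_perm))
    return drops.mean(axis=0)

# Toy data: feature 0 matters most, feature 1 a little, feature 2 not at all
data_rng = np.random.default_rng(0)
X_demo = data_rng.normal(size=(200, 3))
y_demo = 3.0 * X_demo[:, 0] + 0.5 * X_demo[:, 1]

def toy_predict(X):
    """Stand-in for a fitted model's predict method."""
    return 3.0 * X[:, 0] + 0.5 * X[:, 1]

importances = permutation_importance(toy_predict, X_demo, y_demo)
print(importances)
```

Because the values are raw score drops rather than normalized shares, they are not expected to sum to one, and a feature the model ignores gets an importance of zero. With a fitted Keras wrapper, `my_model.predict` could be passed in place of `toy_predict`.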

【Comments】:

This line, eli5.show_weights(perm, feature_names = X.columns.tolist()), returns an error: AttributeError: module 'eli5' has no attribute 'show_weights'
Traceback (most recent call last): ... eli5.show_weights(perm, feature_names = col) AttributeError: module 'eli5' has no attribute 'show_weights'
Not sure what the problem is. It works on my machine and is listed in the documentation here: eli5.readthedocs.io/en/latest/overview.html. Do you have the latest version?
I talked with the eli5 developers; it turns out the error AttributeError: module 'eli5' has no attribute 'show_weights' only appears when I am not using an IPython Notebook, which I was not when I posted. Strange behavior, but I will test with IPython installed.
Why doesn't the sum of all the permutation importances (perm.feature_importances_) equal one?
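
For anyone who hits the show_weights error outside a notebook, one workaround is to draw the bar chart directly with matplotlib. This is a minimal sketch, assuming `importances` is an array such as `perm.feature_importances_` and `feature_names` a list such as `X.columns.tolist()`. The helper names `rank_features` and `plot_importances` and the output file name are hypothetical.

```python
import numpy as np

def rank_features(importances, feature_names):
    """Return (name, importance) pairs sorted from most to least important."""
    importances = np.asarray(importances, dtype=float)
    order = np.argsort(importances)[::-1]
    return [(feature_names[i], float(importances[i])) for i in order]

def plot_importances(importances, feature_names, path="feature_importance.png"):
    """Save a horizontal bar chart of the ranked importances to `path`."""
    # Imported here so rank_features stays usable without matplotlib installed
    import matplotlib.pyplot as plt

    ranked = rank_features(importances, feature_names)
    # Reverse so the most important feature ends up at the top of the chart
    names = [name for name, _ in ranked][::-1]
    values = [value for _, value in ranked][::-1]

    fig, ax = plt.subplots()
    ax.barh(names, values)
    ax.set_xlabel("Permutation importance (drop in score)")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
    return path

# Example with made-up numbers
ranked_demo = rank_features([0.1, 0.5, -0.2], ["a", "b", "c"])
print(ranked_demo)
```

Calling `plot_importances(perm.feature_importances_, X.columns.tolist())` would then write the chart to a PNG file instead of relying on notebook rendering.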

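【Solution 3】:

This is a relatively old post with relatively old answers, so I would like to offer another suggestion: using SHAP to determine feature importance for your Keras models. SHAP supports both 2D and 3D arrays, whereas eli5 currently supports only 2D arrays (so if your model uses layers that require 3D input, such as LSTM or GRU, eli5 will not work).

Here is a link to an example of how SHAP can plot feature importance for your Keras models, but in case the link ever breaks, some sample code and plots are provided below as well (taken from said link):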


import shap

# load your data here, e.g. X and y
# create and fit your model here

# load JS visualization code to notebook
shap.initjs()

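# explain the model's predictions using SHAP
# (same syntax works for LightGBM, CatBoost, scikit-learn and spark models)
# NOTE: TreeExplainer only supports tree-based models. A Keras Sequential model
# raises "Model type not yet supported by TreeExplainer", so a neural network
# needs a different explainer (e.g. shap.KernelExplainer or shap.DeepExplainer).
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)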

# visualize the first prediction's explanation (use matplotlib=True to avoid Javascript)
shap.force_plot(explainer.expected_value, shap_values[0,:], X.iloc[0,:])

shap.summary_plot(shap_values, X, plot_type="bar")
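
`shap.TreeExplainer` targets tree ensembles, so for a neural network a model-agnostic explainer is one alternative. The following is a sketch under assumptions, not the linked example: `kernel_shap_values` and `mean_abs_shap` are hypothetical helper names, `model` is assumed to be a fitted Keras regressor with a single output, and `X` a pandas DataFrame. `shap.KernelExplainer` can be slow, so only a small background sample is passed in.

```python
import numpy as np

def kernel_shap_values(predict_fn, X_background, X_explain, nsamples=200):
    """Model-agnostic SHAP values via shap.KernelExplainer (single output)."""
    # Imported lazily so mean_abs_shap below works without shap installed
    import shap

    def flat_predict(data):
        # Keras returns shape (n, 1); flatten to (n,) so that shap
        # returns a single 2-D array of shape (n_samples, n_features)
        return np.asarray(predict_fn(data)).reshape(len(data), -1)[:, 0]

    explainer = shap.KernelExplainer(flat_predict, X_background)
    return np.asarray(explainer.shap_values(X_explain, nsamples=nsamples))

def mean_abs_shap(shap_values):
    """Global importance per feature: mean absolute SHAP value over samples."""
    return np.abs(np.asarray(shap_values, dtype=float)).mean(axis=0)

# Intended usage (assumes a fitted `model` and a DataFrame `X`):
# background = X.values[:50]
# sv = kernel_shap_values(model.predict, background, X.values[:100])
# shap.summary_plot(sv, X.iloc[:100], plot_type="bar")

# Small check of the aggregation step with made-up SHAP values
demo_importance = mean_abs_shap([[1.0, -2.0], [-3.0, 0.0]])
```

The bar variant of `shap.summary_plot` shows the same mean absolute SHAP value per feature that `mean_abs_shap` computes, so the helper is mainly useful when the numbers are needed without a plot.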

【Comments】:

Error when using DeepExplainer: keras is no longer supported, please use tf.keras instead.
Error when using TreeExplainer: SHAPError: Model type not yet supported by TreeExplainer: <class 'tensorflow.python.keras.engine.sequential.Sequential'>
@HashRocketSyntax I assume you are trying to use Keras's Sequential. Could you try importing Sequential this way? from tensorflow.keras import Sequential
@jarrettyeo, from tensorflow.keras import Sequential still does not work. I get the error: Exception: Model type not yet supported by TreeExplainer: <class 'tensorflow.python.keras.engine.sequential.Sequential'>
@user5305519 Could you provide a solution to any of the issues above? I am also getting this error: Exception: Model type not yet supported by TreeExplainer: