Sklearn's SimpleImputer can't retrieve imputation values when in pipeline

【Question Title】: Sklearn's SimpleImputer can't retrieve imputation values when in pipeline
【Posted】: 2026-02-07 02:25:01
【Question Description】:

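I am trying to print all of the imputation values after fitting a SimpleImputer. When using SimpleImputer on its own, I can retrieve them from the instance's statistics_ attribute.

This works fine: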

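from sklearn.impute import SimpleImputer

# Fit a standalone imputer; the per-column means end up in statistics_
s = SimpleImputer(strategy='mean')
s.fit(df[['feature_1', 'feature_2']])
print(s.statistics_)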

However, I am unable to do the same when the SimpleImputer is used inside a pipeline.

This does not work:

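from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Assumed: the same numeric columns as in the standalone example above
numeric_features = ['feature_1', 'feature_2']
numeric_transformer = Pipeline(steps=[
    ('simple_imputer', SimpleImputer(strategy='mean')),
    ('scaler', StandardScaler())])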

categorical_features = ['feature_3']
categorical_transformer = Pipeline(steps=[
    ('simple_imputer', SimpleImputer(strategy='most_frequent')),
    ('one_hot', OneHotEncoder(handle_unknown='ignore'))])

preprocessor = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, numeric_features),
        ('cat', categorical_transformer, categorical_features)])

clf = Pipeline(steps=[('preprocessor', preprocessor),
                      ('classifier', RandomForestClassifier(n_estimators=100))])

clf.fit(df[numeric_features + categorical_features], df['target'])

print(clf.named_steps['preprocessor'].transformers[0][1].named_steps['simple_imputer'].statistics_)

I get the following error:

AttributeError                            Traceback (most recent call last)
<ipython-input-523-7390eac0d9d6> in <module>
     19 clf.fit(df[numeric_features + categorical_features], df['target'])
     20 
---> 21 print(clf.named_steps['preprocessor'].transformers[0][1].named_steps['simple_imputer'].statistics_)

AttributeError: 'SimpleImputer' object has no attribute 'statistics_'

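I believe I am grabbing the correct instance of the right SimpleImputer object. Why can't I retrieve its statistics_ attribute to print the imputation values?

【Question Discussion】:

Tags: python scikit-learn pipeline sklearn-pandas imputation

【Solution 1】: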

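I find it easier to use "dot" notation when working with sklearn pipelines, especially since you can use autocompletion to help you navigate the pipeline's structure and attributes. It also has the added benefit (in my opinion) of being more readable.

You can use the following line to access the statistics_ attribute of the SimpleImputer: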

clf.named_steps.preprocessor.named_transformers_.num.named_steps.simple_imputer.statistics_
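
As for why the original lookup fails: ColumnTransformer clones the transformers it is given when it is fitted. The transformers attribute keeps the original, unfitted objects, while the fitted clones are exposed through transformers_ and named_transformers_. Below is a minimal sketch of that difference. The toy DataFrame and its values are made up for illustration, and the categorical branch from the question is left out to keep the example short.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data; only the column names follow the question.
df = pd.DataFrame({
    'feature_1': [1.0, 2.0, np.nan, 4.0],
    'feature_2': [10.0, np.nan, 30.0, 40.0],
    'target': [0, 1, 0, 1],
})
numeric_features = ['feature_1', 'feature_2']

numeric_transformer = Pipeline(steps=[
    ('simple_imputer', SimpleImputer(strategy='mean')),
    ('scaler', StandardScaler())])

preprocessor = ColumnTransformer(
    transformers=[('num', numeric_transformer, numeric_features)])

clf = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('classifier', RandomForestClassifier(n_estimators=10, random_state=0))])

clf.fit(df[numeric_features], df['target'])

# `transformers` still holds the original, unfitted pipeline.
unfitted_imputer = (clf.named_steps['preprocessor']
                    .transformers[0][1]
                    .named_steps['simple_imputer'])
print(hasattr(unfitted_imputer, 'statistics_'))

# `named_transformers_` holds the clone that was actually fitted.
fitted_imputer = (clf.named_steps.preprocessor
                  .named_transformers_.num
                  .named_steps.simple_imputer)
print(fitted_imputer.statistics_)
```

The index-based equivalent of the dot-notation line is clf.named_steps['preprocessor'].transformers_[0][1].named_steps['simple_imputer'].statistics_, that is, the question's expression with transformers_ in place of transformers.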

【Discussion】: