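【Question title】: 'DataFrame' object has no attribute '_check_fit_params'
【Posted】: 2021-07-14 16:24:03
【Question description】:

To see how accuracy changes when each feature is dropped in turn, I wrote the following code. Here 'ABCDEFGHIJKLMNO' are the columns (features), 15 features in total.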

features = 'ABCDEFGHIJKLMNO'

for i in range(0,len(features)):
    
    pipeline = PMMLPipeline
    ([
    ('mapper', DataFrameMapper([(X_train.columns.drop([features[i:i+1]]).values)])),
    ('pca', PCA(n_components=3)),
    ('classifier', DecisionTreeClassifier())
    ])
    
    pipeline.fit(training_data.drop([features[i:i+1]],axis=1),training_data['Class'])
    
    result = pipeline.predict(X_test)
    actual = np.concatenate(y_test.values)
    
    print("Dropped feature: {}, Accuracy: {}".format(features[i:i+1], metrics.accuracy_score(actual,result)))

I am using the sklearn2pmml.pipeline library, but fitting the data raises the error shown in the title. I can't figure out why.

【Discussion】:

    Tags: python python-3.x dataframe scikit-learn pmml


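    【Solution 1】:

    It looks like your PMMLPipeline call is laid out incorrectly. Because the argument list starts on the line after `pipeline = PMMLPipeline`, `pipeline` is bound to the class itself rather than to an instance, and the list on the following lines is evaluated and discarded. `pipeline.fit(X, y)` then runs with the DataFrame in the place of `self`, which is why the error names a 'DataFrame' object. Most likely you don't need DataFrameMapper either, because it is (according to the help page):

    DataFrameMapper, a class for mapping pandas data frame columns to different sklearn transformations

    You are not applying different transformations to different columns, so it isn't needed here.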
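
To illustrate the line-break problem without installing anything, here is a minimal sketch. `TinyPipeline` and `FakeFrame` are hypothetical stand-ins invented for this example; they are not the real PMMLPipeline or pandas DataFrame and only mimic the method names that appear in the error message.

```python
class TinyPipeline:
    """Hypothetical stand-in for PMMLPipeline; not the real class."""

    def __init__(self, steps):
        self.steps = steps

    def _check_fit_params(self, **fit_params):
        return fit_params

    def fit(self, X, y=None, **fit_params):
        # Fails if `self` is not actually a pipeline instance.
        self._check_fit_params(**fit_params)
        return self


class FakeFrame:
    """Hypothetical stand-in for a pandas DataFrame."""


# Broken layout: the assignment ends at the line break, so `broken` is the
# class itself. The parenthesised list below is a separate, unused expression.
broken = TinyPipeline
([
    ("classifier", "any estimator"),
])

frame, labels = FakeFrame(), [0, 1]

try:
    # Equivalent to TinyPipeline.fit(self=frame, X=labels).
    broken.fit(frame, labels)
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)

# Fixed layout: the opening parenthesis follows the class name directly,
# so the class is called and `fixed` is an instance.
fixed = TinyPipeline([
    ("classifier", "any estimator"),
])
fitted = fixed.fit(frame, labels)
```

The same mechanism applies to any class: keeping the opening parenthesis on the same line as the class name is what turns the statement into a call.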

    Set up an example dataset like this:

    from sklearn2pmml.pipeline import PMMLPipeline
    from sklearn.decomposition import PCA
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score
    import pandas as pd
    import numpy as np
    
    features = 'ABCDEFGHIJKLMNO'
    
    # 50 rows of random data, one column per feature letter
    X = pd.DataFrame(np.random.uniform(0, 1, (50, 15)),
                     columns=list(features))
    # Random binary labels
    y = np.random.binomial(1, 0.5, 50)
    
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
    

    Running the corrected code works (the data are random, so the accuracies will differ from run to run):

    for feature in features:
        # The opening parenthesis must sit on the same line as PMMLPipeline,
        # otherwise the class is assigned instead of being called.
        pipeline = PMMLPipeline([
            ('pca', PCA(n_components=3)),
            ('classifier', DecisionTreeClassifier())
        ])
        
        # Drop the same column from both the training and the test data.
        pipeline.fit(X_train.drop(columns=[feature]), y_train)
        result = pipeline.predict(X_test.drop(columns=[feature]))
        
        print("Dropped feature: {}, Accuracy: {}".format(
            feature, accuracy_score(y_test, result)))
    
    
    Dropped feature: A, Accuracy: 0.9333333333333333
    Dropped feature: B, Accuracy: 0.6
    Dropped feature: C, Accuracy: 0.7333333333333333
    Dropped feature: D, Accuracy: 0.6
    Dropped feature: E, Accuracy: 0.6666666666666666
    Dropped feature: F, Accuracy: 0.6666666666666666
    Dropped feature: G, Accuracy: 0.6
    Dropped feature: H, Accuracy: 0.8
    Dropped feature: I, Accuracy: 0.6666666666666666
    Dropped feature: J, Accuracy: 0.6666666666666666
    Dropped feature: K, Accuracy: 0.7333333333333333
    Dropped feature: L, Accuracy: 0.8
    Dropped feature: M, Accuracy: 0.6
    Dropped feature: N, Accuracy: 0.8
    Dropped feature: O, Accuracy: 0.6666666666666666
    

    【Discussion】:
