【Posted】: 2020-07-18 21:47:32
【Question】:
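As the title says, self inside a method of my class is not the class instance itself.
This happens when I use a custom class inside a scikit-learn pipeline, but not when I use the same custom class on its own.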
import pandas as pd
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

class multi_feature_OHE(BaseEstimator, TransformerMixin):
    ''' Encode multiple redundant features as one, using One-Hot-Encoder. '''
    def __init__(self):
        self.encoder = OneHotEncoder(handle_unknown='ignore')

    def fit(self, X):
        #self.encoder.fit(X)
        print(type(self)) # <--- We print the type of self here!
        return self

    def transform(self, X):
        ...
This prints <class 'numpy.ndarray'>:
pipeline = make_pipeline(...,
                         multi_feature_OHE)
pipeline.fit(data)
Inside the fit method, self is the input data (what X should have been), and X is None.
But this prints <class '__main__.multi_feature_OHE'>:
a = multi_feature_OHE()
a.fit(data)
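One possible explanation, judging only from the snippet above: the pipeline call passes multi_feature_OHE without parentheses, that is, the class rather than an instance. The sketch below does not use scikit-learn at all. It uses a hypothetical stand-in class named Probe to show what plain Python binds to self when a method is called on the class itself; whether the pipeline in the question reaches fit in exactly this way is an assumption.

```python
class Probe:
    """Hypothetical stand-in for the custom transformer in the question."""

    def fit(self, X, y=None):
        # Report what Python actually bound to `self`, and what X received.
        return type(self).__name__, X


data = [1, 2, 3]

# Called on an instance: `self` is the Probe object, X is the data.
on_instance = Probe().fit(data)

# Called on the class: the first positional argument becomes `self`,
# so `self` is the data and X receives whatever was passed second.
on_class = Probe.fit(data, None)

print(on_instance)  # ('Probe', [1, 2, 3])
print(on_class)     # ('list', None)
```

If this is what is happening, the second call mirrors the symptom described (self is the data, X is None), and passing multi_feature_OHE() with parentheses to make_pipeline would presumably give fit a real instance as self.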
【Discussion】:
Tags: python python-3.x numpy scikit-learn