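【Posted】: 2021-06-17 08:36:36
【Question】:
I'm trying to use Pipeline and ColumnTransformer from sklearn correctly, but I always end up with an error. I've reproduced it in the example below.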
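import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder

# Data to reproduce the error: '?' marks a missing value
X = pd.DataFrame([[1, 2, 3, 1],
                  [1, '?', 2, 0],
                  [4, 5, 6, '?']],
                 columns=['A', 'B', 'C', 'D'])
# SimpleImputer to replace the '?' values with the mode of the column
impute = SimpleImputer(missing_values='?', strategy='most_frequent')
# Simple one-hot encoder (newer scikit-learn versions name this argument sparse_output)
ohe = OneHotEncoder(handle_unknown='ignore', sparse=False)
col_transfo = ColumnTransformer(
    transformers=[
        ('missing_vals', impute, ['B', 'D']),
        ('one_hot', ohe, ['A', 'B'])],
    remainder='passthrough'
)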
Then I call the transformer as follows:
col_transfo.fit_transform(X)
which returns the following error:
TypeError: Encoders require their input to be uniformly strings or numbers. Got ['int', 'str']
【Discussion】:
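A likely cause, sketched under assumptions: ColumnTransformer hands each transformer its columns from the original input and concatenates the results; it does not chain the transformers. Column 'B' therefore reaches the OneHotEncoder with '?' still in it, next to integers. One possible fix is to chain the imputer and the encoder in a Pipeline for 'B'. The names `impute_then_encode` and `Xt` are made up for this sketch, and `sparse_threshold=0` stands in for the question's `sparse=False`, whose name differs across scikit-learn versions.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Same data as in the question: '?' marks a missing value
X = pd.DataFrame([[1, 2, 3, 1],
                  [1, '?', 2, 0],
                  [4, 5, 6, '?']],
                 columns=['A', 'B', 'C', 'D'])

# Hypothetical helper pipeline: impute first, then one-hot encode,
# so the encoder never sees the '?' placeholder.
impute_then_encode = Pipeline(steps=[
    ('impute', SimpleImputer(missing_values='?', strategy='most_frequent')),
    ('one_hot', OneHotEncoder(handle_unknown='ignore')),
])

col_transfo = ColumnTransformer(
    transformers=[
        # 'A' has no missing values, so it is only encoded
        ('one_hot_A', OneHotEncoder(handle_unknown='ignore'), ['A']),
        # 'B' needs both steps, applied in sequence
        ('impute_one_hot_B', impute_then_encode, ['B']),
        # 'D' is only imputed
        ('impute_D', SimpleImputer(missing_values='?', strategy='most_frequent'), ['D']),
    ],
    remainder='passthrough',  # 'C' is passed through unchanged
    sparse_threshold=0,       # assumption: always return a dense array
)

Xt = col_transfo.fit_transform(X)
print(Xt.shape)
```

With this layout, 'A' and 'B' each expand into two indicator columns, 'D' is imputed, and 'C' passes through, so the result has shape (3, 6). Each column appears in exactly one transformer, which avoids feeding the same raw column to two independent steps.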
Tags: pandas scikit-learn preprocessor one-hot-encoding