【Posted】: 2019-01-12 21:09:50
【Question】:
I am learning about Pipelines and FeatureUnions in scikit-learn and am wondering whether "make_union" can be applied repeatedly to a single class.
Consider the following code:
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.linear_model import LogisticRegression
import sklearn.datasets as d


class IrisDataManupulation(BaseEstimator, TransformerMixin):
    """
    Raise the feature matrix to the given power.
    """

    def __init__(self, power=2):
        self.power = power

    def fit(self, X, y=None):
        # Stateless transformer: nothing to learn from the data.
        return self

    def transform(self, X):
        # Element-wise power of every feature.
        return np.power(X, self.power)


iris_data = d.load_iris()
X, y = iris_data.data, iris_data.target

# feature union:
fu = FeatureUnion(transformer_list=[('squared', IrisDataManupulation(power=2)),
                                    ('third', IrisDataManupulation(power=3))])
Question: is there a neat way to create a FeatureUnion without repeating the same transformer, by passing a list of parameters instead?
For example:
# Hypothetical syntax: FeatureUnion does not accept a param_grid argument
fu_new = FeatureUnion(transformer_list=[('raise_power', IrisDataManupulation())],
                      param_grid={'raise_power__power': [2, 3]})
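One possible way to get this effect, offered as a sketch rather than a built-in feature, is to generate the transformer list from the parameter values with a comprehension. The helper name `build_power_union` and the step names `power_2`, `power_3` are made up for illustration, and a tiny array stands in for the iris data so the result is easy to check by hand.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import FeatureUnion, make_union


class IrisDataManupulation(BaseEstimator, TransformerMixin):
    """Raise the feature matrix to the given power (as in the question)."""

    def __init__(self, power=2):
        self.power = power

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return np.power(X, self.power)


def build_power_union(powers):
    # Hypothetical helper: one uniquely named step per exponent.
    return FeatureUnion(
        transformer_list=[
            (f"power_{p}", IrisDataManupulation(power=p)) for p in powers
        ]
    )


# Small stand-in for the iris features.
X = np.array([[1.0, 2.0], [3.0, 4.0]])

fu_new = build_power_union([2, 3])
X_new = fu_new.fit_transform(X)  # squared columns followed by cubed columns

# make_union accepts the same generated transformers and names the steps itself.
fu_auto = make_union(*[IrisDataManupulation(power=p) for p in [2, 3]])
X_auto = fu_auto.fit_transform(X)
```

Each generated step still exposes its parameter under the usual double-underscore name (here `power_2__power`), so the individual exponents stay adjustable through `set_params` after the union is built.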
【Discussion】:
Tags: python scikit-learn pipeline