【Question Title】:Invalid Parameters for Sklearn GridSearchCV
【Posted】:2020-01-16 17:45:39
【Question Description】:

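Every line in my grid gives me ValueError: Invalid parameter...

I tried removing the grid options one line at a time until the grid was empty. I copied and pasted the parameter names from pipeline.get_params() to make sure they were not misspelled.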

from sklearn.model_selection import train_test_split
x_in, x_out, y_in, y_out = train_test_split(X, Y, test_size=0.2, stratify=Y)

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, chi2, f_classif
from sklearn.svm import LinearSVC
from sklearn.model_selection import GridSearchCV

grid = {
    'TF-IDF__ngram_range':[(1,2),(2,3)],
    'TF-IDF__stop_words': [None, 'english'],
    'SelectKBest__k': [10000, 15000],
    'SelectKBest__score_func': [f_classif, chi2],
    'linearSVC__penalty': ['l1', 'l2']
}

pipeline = Pipeline([('tfidf', TfidfVectorizer(sublinear_tf=True)),
                     ('selectkbest', SelectKBest()),
                     ('linearscv', LinearSVC(max_iter=10000, dual=False))])

grid_search = GridSearchCV(pipeline, param_grid=grid, scoring='accuracy', n_jobs=-1, cv=5)
grid_search.fit(X=x_in, y=y_in)

【Question Comments】:

  • I have edited the code in the question to show that x_in comes from an earlier train_test_split

Tags: python scikit-learn pipeline gridsearchcv


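【Solution 1】:

I think the grid is not referring to the pipeline steps by their correct names. The name you assign to each step in the pipeline (tfidf, selectkbest, linearscv) must be exactly the prefix used in the grid keys, including case, followed by a double underscore and the parameter name. I would do it like this: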

# The step names chosen here become the prefixes of the grid keys.
pipeline = Pipeline([('tfidf', TfidfVectorizer(sublinear_tf=True)),
                     ('selectkbest', SelectKBest()),
                     ('linearscv', LinearSVC(max_iter=10000, dual=False))])

# Each key has the form '<step name>__<parameter name>'.
grid = {
    'tfidf__ngram_range': [(1, 2), (2, 3)],
    'tfidf__stop_words': [None, 'english'],
    'selectkbest__k': [10000, 15000],
    'selectkbest__score_func': [f_classif, chi2],
    'linearscv__penalty': ['l1', 'l2']
}
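
One way to catch this kind of mismatch before starting a search is to compare the grid keys with what pipeline.get_params() reports. The following is only a sketch: the helper name invalid_grid_keys is made up for illustration and is not part of scikit-learn, and question_grid / answer_grid are just the two grids from this page.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2, f_classif
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC


def invalid_grid_keys(estimator, param_grid):
    """Return the grid keys that estimator.get_params() does not list.

    Hypothetical helper for illustration; not a scikit-learn function.
    """
    valid = set(estimator.get_params(deep=True))
    return sorted(key for key in param_grid if key not in valid)


pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(sublinear_tf=True)),
    ("selectkbest", SelectKBest()),
    ("linearscv", LinearSVC(max_iter=10000, dual=False)),
])

# Keys as written in the question: the prefixes do not match the step names.
question_grid = {
    "TF-IDF__ngram_range": [(1, 2), (2, 3)],
    "TF-IDF__stop_words": [None, "english"],
    "SelectKBest__k": [10000, 15000],
    "SelectKBest__score_func": [f_classif, chi2],
    "linearSVC__penalty": ["l1", "l2"],
}

# Keys as written in the answer: the prefixes equal the step names.
answer_grid = {
    "tfidf__ngram_range": [(1, 2), (2, 3)],
    "tfidf__stop_words": [None, "english"],
    "selectkbest__k": [10000, 15000],
    "selectkbest__score_func": [f_classif, chi2],
    "linearscv__penalty": ["l1", "l2"],
}

bad_question_keys = invalid_grid_keys(pipeline, question_grid)
bad_answer_keys = invalid_grid_keys(pipeline, answer_grid)

print("question grid, unknown keys:", bad_question_keys)
print("answer grid, unknown keys:", bad_answer_keys)
```

With the pipeline above, all of the question's keys should be reported as unknown and none of the answer's keys.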

【Comments】:

  • Thanks. I had run pipeline.get_params() before I changed the step names. That is an important thing to check.
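
Since the comment above is about get_params() going stale after a rename, one possible way to avoid that is to build the grid keys from the pipeline's current step names instead of typing the prefixes by hand. This is a sketch of that idea, not something the answer proposes; the variable names vec_name, sel_name and clf_name are illustrative.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2, f_classif
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(sublinear_tf=True)),
    ("selectkbest", SelectKBest()),
    ("linearscv", LinearSVC(max_iter=10000, dual=False)),
])

# Read the step names back from the pipeline, in order, so that renaming
# a step above automatically changes the prefixes used below.
vec_name, sel_name, clf_name = [name for name, _ in pipeline.steps]

grid = {
    f"{vec_name}__ngram_range": [(1, 2), (2, 3)],
    f"{vec_name}__stop_words": [None, "english"],
    f"{sel_name}__k": [10000, 15000],
    f"{sel_name}__score_func": [f_classif, chi2],
    f"{clf_name}__penalty": ["l1", "l2"],
}

# Sanity check against the current parameter names before searching.
unknown = [key for key in grid if key not in pipeline.get_params()]
print("unknown keys:", unknown)
```

The grid can then be passed to GridSearchCV as param_grid, exactly as in the question.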