【Question title】: How to use Recursive Feature Elimination?
【Posted】: 2020-04-05 20:31:49
【Question description】:

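I'm new to ML and have been trying to use the RFE method for feature selection. My dataset has 5K records, and it is a binary classification problem. Here is the code I wrote, based on an online tutorial: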
# Candidate numbers of features to keep
nof_list = np.arange(1, 13)
high_score = 0
# Variable to store the optimum number of features
nof = 0
score_list = []
for n in range(len(nof_list)):
    X_train, X_test, y_train, y_test = train_test_split(X,y, test_size = 0.3, random_state = 0)
    model = RandomForestClassifier()
    rfe = RFE(model,nof_list[n])
    X_train_rfe = rfe.fit_transform(X_train,y_train)
    X_test_rfe = rfe.transform(X_test)
    model.fit(X_train_rfe,y_train)
    score = model.score(X_test_rfe,y_test)
    score_list.append(score)
    if(score>high_score):
        high_score = score
        nof = nof_list[n]
print("Optimum number of features: %d" %nof)
print("Score with %d features: %f" % (nof, high_score))

I get the following error. Can anyone help?

TypeError                                 Traceback (most recent call last)
<ipython-input-332-a23dfb331001> in <module>
      9     model = RandomForestClassifier()
     10     rfe = RFE(model,nof_list[n])
---> 11     X_train_rfe = rfe.fit_transform(X_train,y_train)
     12     X_test_rfe = rfe.transform(X_test)
     13     model.fit(X_train_rfe,y_train)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\base.py in fit_transform(self, X, y, **fit_params)
    554             Training set.
    555 
--> 556         y : numpy array of shape [n_samples]
    557             Target values.
    558 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\feature_selection\_base.py in transform(self, X)
     75         X = check_array(X, dtype=None, accept_sparse='csr',
     76                         force_all_finite=not tags.get('allow_nan', True))
---> 77         mask = self.get_support()
     78         if not mask.any():
     79             warn("No features were selected: either the data is"

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\feature_selection\_base.py in get_support(self, indices)
     44             values are indices into the input feature vector.
     45         """
---> 46         mask = self._get_support_mask()
     47         return mask if not indices else np.where(mask)[0]
     48 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\feature_selection\_rfe.py in _get_support_mask(self)
    269 
    270     def _get_support_mask(self):
--> 271         check_is_fitted(self)
    272         return self.support_
    273 

TypeError: check_is_fitted() missing 1 required positional argument: 'attributes'

【Question discussion】:

    Tags: machine-learning scikit-learn feature-extraction feature-selection rfe


    【Solution 1】:

    Which version of sklearn are you using?

    The following (using artificial data) should work fine:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_selection import RFE
    from sklearn.model_selection import train_test_split

    # Artificial data: every label is 1, so any classifier scores 1.0
    X = np.random.rand(100, 20)
    y = np.ones(X.shape[0])

    # The split does not depend on the loop variable, so do it once
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0)

    nof_list = np.arange(1, 13)  # candidate numbers of features to keep
    high_score = 0
    nof = 0  # best number of features found so far
    score_list = []
    for n_features in nof_list:
        model = RandomForestClassifier()
        # Keyword form works on old releases; recent releases require it
        rfe = RFE(model, n_features_to_select=n_features)
        X_train_rfe = rfe.fit_transform(X_train, y_train)
        X_test_rfe = rfe.transform(X_test)
        model.fit(X_train_rfe, y_train)
        score = model.score(X_test_rfe, y_test)
        score_list.append(score)
        if score > high_score:
            high_score = score
            nof = n_features
    print("Optimum number of features: %d" % nof)
    print("Score with %d features: %f" % (nof, high_score))
    

    Optimum number of features: 1

    Score with 1 features: 1.000000
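Because every label in the artificial data is 1, any classifier scores 1.0 no matter which features are kept, so this output only confirms that the code runs. A minimal sketch of the same search on two-class synthetic data follows. The use of `make_classification`, its parameter values, and the names `scores` and `best_n` are assumptions made for illustration; none of them come from the original question.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split

# Synthetic two-class data (assumed for illustration): 20 features, 5 informative.
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           n_redundant=2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

scores = {}
for n_features in range(1, 13):
    estimator = RandomForestClassifier(n_estimators=25, random_state=0)
    rfe = RFE(estimator, n_features_to_select=n_features)
    rfe.fit(X_train, y_train)
    # RFE.score reduces X_test to the selected features, then scores
    # with the estimator that RFE refit on those features.
    scores[n_features] = rfe.score(X_test, y_test)

best_n = max(scores, key=scores.get)
print("Best number of features:", best_n)
print("Test accuracy: %.3f" % scores[best_n])
```

Picking the feature count from the test-set score, as both this sketch and the original loop do, tends to give an optimistic estimate; choosing it by cross-validation on the training data is less biased.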

    Tested versions:

    sklearn.__version__
    '0.20.4'
    
    sklearn.__version__
    '0.21.3'
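The manual loop over feature counts can also be handed to scikit-learn's `RFECV`. This is a different technique from the loop in the answer: it chooses the number of features by cross-validation instead of a single test-set score. A hedged sketch, again on assumed synthetic data with illustrative parameter values:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV

# Synthetic two-class data (assumed for illustration): 12 features, 4 informative.
X, y = make_classification(n_samples=300, n_features=12, n_informative=4,
                           random_state=0)

selector = RFECV(
    estimator=RandomForestClassifier(n_estimators=25, random_state=0),
    step=1,              # drop one feature per elimination round
    cv=3,                # 3-fold (stratified) cross-validation
    scoring="accuracy",
)
selector.fit(X, y)

print("Selected number of features:", selector.n_features_)
print("Mask of kept features:", selector.support_)
print("Feature ranking (1 = kept):", selector.ranking_)
```

After fitting, `selector.transform(X)` returns only the kept columns, so the selector can be used in place of the `rfe` object from the loop.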
    

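    【Discussion】:

    • What did you change?
    • There are two potential issues: 1) your sklearn version, and 2) the shapes of X and y. Which sklearn version do you have?
    • Does the code above work when you run it? If not, I'll update sklearn and try to reproduce the error, in case it is a version bug.
    • I just ran the code with Python 3 and sklearn 0.21.3 without errors. I suggest trying to reinstall sklearn.
    • Unfortunately, I don't have admin rights to reinstall it.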
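In the traceback, `_rfe.py` calls `check_is_fitted(self)` with no `attributes` argument, yet the `check_is_fitted` that actually runs still requires it. One possible explanation, offered as an assumption rather than a confirmed diagnosis, is an installation that mixes files from two scikit-learn releases. A small sketch for checking which version is loaded, where it lives, and what signature `check_is_fitted` has (the variable name `attributes_required` is made up for this example):

```python
import inspect

import sklearn
from sklearn.utils.validation import check_is_fitted

# Which release is imported, and from which directory?
print("sklearn version:", sklearn.__version__)
print("loaded from:", sklearn.__file__)

# Does the installed check_is_fitted still demand an `attributes` argument?
signature = inspect.signature(check_is_fitted)
attributes_param = signature.parameters.get("attributes")
attributes_required = (
    attributes_param is not None
    and attributes_param.default is inspect.Parameter.empty
)
print("check_is_fitted signature:", signature)
print("`attributes` is required:", attributes_required)
```

If `attributes` turns out to be required while files such as `_rfe.py` call the function without it, the package files are inconsistent and a clean reinstall is the usual remedy. Without admin rights, `pip install --user --upgrade scikit-learn` installs into the per-user site directory, although this may not suit every conda setup.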