【Question title】: f1-score always ~0.75?
【Posted】: 2019-11-14 02:50:52
【Question description】:

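I am working on (what I think is) a simple binary classification problem. I am getting a strange result from a parameter grid search: whatever the parameters are, the model always returns an f1 score of ~0.75. I am not sure whether this a) reflects a misunderstanding on my part of the f1 score as a metric, b) is due to a problem with the data or the model (I am using XGBoost) that needs to be corrected, or c) simply shows that the model parameters are basically irrelevant and an f1 score of about 0.75 is as good as I am going to get.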

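Even more confusing, I get the same result for two completely different sets of predictors for the same problem (for example, if I were predicting real estate values, one set would use neighborhood prices and the other would use house characteristics: different sets of predictors for the same problem). For one set the scores range from about 0.67 to 0.82 with roughly normal variance, whereas for the second set (shown below) the f1-score is almost exactly the same, 0.7477, for every parameter set.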

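To go into a bit more detail, the current dataset has about 30,000 examples, with one class making up about 60% of them (and the other 40%). I have not yet dug into this new dataset, but for the previous one, when I examined a model more closely I found reasonable precision and recall values that varied somewhat across parameter sets, which undermined my worry that the model was just guessing the more prevalent class.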

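I am using XGBoost together with scikit-learn's GridSearchCV. Omitting the data loading, the grid search code is:

from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

# Parameter grid: 5 x 4 x 3 = 60 candidates
grid_values = {'n_estimators': [50, 100, 200, 500, 1000], 'max_depth': [1, 3, 5, 8], 'min_child_weight': range(1, 6, 2)}

clf = XGBClassifier()

# scoring='f1' reports the F1 score of the positive class (label 1) only
grid_clf = GridSearchCV(clf, param_grid=grid_values, scoring='f1', verbose=10)
grid_clf.fit(game_records, hora)  # features and binary labels, defined elsewhere

print('Grid best score (f1): ', grid_clf.best_score_)
print('Grid best parameter (max. f1): ', grid_clf.best_params_)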

The full output is at https://pastebin.com/NSB0yaNi; part (most) of it is shown here:

Fitting 3 folds for each of 60 candidates, totalling 180 fits
[CV] max_depth=1, min_child_weight=1, n_estimators=50 ................
[CV]  max_depth=1, min_child_weight=1, n_estimators=50, score=0.7477603583426652, total=  11.1s
[CV] max_depth=1, min_child_weight=1, n_estimators=50 ................
[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:   11.4s remaining:    0.0s
[CV]  max_depth=1, min_child_weight=1, n_estimators=50, score=0.74772504549909, total=  11.3s
[CV] max_depth=1, min_child_weight=1, n_estimators=50 ................
[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:   23.1s remaining:    0.0s
[CV]  max_depth=1, min_child_weight=1, n_estimators=50, score=0.7477773888694436, total=  11.2s
[CV] max_depth=1, min_child_weight=1, n_estimators=100 ...............
[Parallel(n_jobs=1)]: Done   3 out of   3 | elapsed:   34.8s remaining:    0.0s
[CV]  max_depth=1, min_child_weight=1, n_estimators=100, score=0.7477603583426652, total=  21.4s
[CV] max_depth=1, min_child_weight=1, n_estimators=100 ...............
[Parallel(n_jobs=1)]: Done   4 out of   4 | elapsed:   56.8s remaining:    0.0s
[CV]  max_depth=1, min_child_weight=1, n_estimators=100, score=0.74772504549909, total=  21.3s
[CV] max_depth=1, min_child_weight=1, n_estimators=100 ...............
[Parallel(n_jobs=1)]: Done   5 out of   5 | elapsed:  1.3min remaining:    0.0s
[CV]  max_depth=1, min_child_weight=1, n_estimators=100, score=0.7477773888694436, total=  21.0s
[CV] max_depth=1, min_child_weight=1, n_estimators=200 ...............
[Parallel(n_jobs=1)]: Done   6 out of   6 | elapsed:  1.7min remaining:    0.0s
[CV]  max_depth=1, min_child_weight=1, n_estimators=200, score=0.7477603583426652, total=  41.3s
[CV] max_depth=1, min_child_weight=1, n_estimators=200 ...............
[Parallel(n_jobs=1)]: Done   7 out of   7 | elapsed:  2.4min remaining:    0.0s
[CV]  max_depth=1, min_child_weight=1, n_estimators=200, score=0.74772504549909, total=  41.1s
[CV] max_depth=1, min_child_weight=1, n_estimators=200 ...............
[Parallel(n_jobs=1)]: Done   8 out of   8 | elapsed:  3.1min remaining:    0.0s
[CV]  max_depth=1, min_child_weight=1, n_estimators=200, score=0.7477773888694436, total=  41.1s
[CV] max_depth=1, min_child_weight=1, n_estimators=500 ...............
[Parallel(n_jobs=1)]: Done   9 out of   9 | elapsed:  3.7min remaining:    0.0s
[CV]  max_depth=1, min_child_weight=1, n_estimators=500, score=0.7477603583426652, total= 1.7min
[CV] max_depth=1, min_child_weight=1, n_estimators=500 ...............
[CV]  max_depth=1, min_child_weight=1, n_estimators=500, score=0.74772504549909, total= 1.8min
[CV] max_depth=1, min_child_weight=1, n_estimators=500 ...............
[CV]  max_depth=1, min_child_weight=1, n_estimators=500, score=0.7477773888694436, total= 1.7min
[CV] max_depth=1, min_child_weight=1, n_estimators=1000 ..............
[CV]  max_depth=1, min_child_weight=1, n_estimators=1000, score=0.7477603583426652, total= 3.4min
[CV] max_depth=1, min_child_weight=1, n_estimators=1000 ..............
[CV]  max_depth=1, min_child_weight=1, n_estimators=1000, score=0.74772504549909, total= 3.4min

...

[CV] max_depth=3, min_child_weight=1, n_estimators=50 ................
[CV]  max_depth=3, min_child_weight=1, n_estimators=50, score=0.7477773888694436, total=  10.9s
[CV] max_depth=3, min_child_weight=1, n_estimators=100 ...............
[CV]  max_depth=3, min_child_weight=1, n_estimators=100, score=0.7477603583426652, total=  21.2s
[CV] max_depth=3, min_child_weight=1, n_estimators=100 ...............
[CV]  max_depth=3, min_child_weight=1, n_estimators=100, score=0.74772504549909, total=  21.0s
[CV] max_depth=3, min_child_weight=1, n_estimators=100 ...............
[CV]  max_depth=3, min_child_weight=1, n_estimators=100, score=0.7477773888694436, total=  20.9s
[CV] max_depth=3, min_child_weight=1, n_estimators=200 ...............
[CV]  max_depth=3, min_child_weight=1, n_estimators=200, score=0.7477603583426652, total=  41.0s
[CV] max_depth=3, min_child_weight=1, n_estimators=200 ...............
[CV]  max_depth=3, min_child_weight=1, n_estimators=200, score=0.74772504549909, total=  41.2s
[CV] max_depth=3, min_child_weight=1, n_estimators=200 ...............
[CV]  max_depth=3, min_child_weight=1, n_estimators=200, score=0.7477773888694436, total=  41.4s
[CV] max_depth=3, min_child_weight=1, n_estimators=500 ...............
[CV]  max_depth=3, min_child_weight=1, n_estimators=500, score=0.7477603583426652, total= 1.7min
[CV] max_depth=3, min_child_weight=1, n_estimators=500 ...............
[CV]  max_depth=3, min_child_weight=1, n_estimators=500, score=0.74772504549909, total= 1.7min
[CV] max_depth=3, min_child_weight=1, n_estimators=500 ...............
[CV]  max_depth=3, min_child_weight=1, n_estimators=500, score=0.7477773888694436, total= 1.7min
[CV] max_depth=3, min_child_weight=1, n_estimators=1000 ..............
[CV]  max_depth=3, min_child_weight=1, n_estimators=1000, score=0.7477603583426652, total= 3.4min
[CV] max_depth=3, min_child_weight=1, n_estimators=1000 ..............
[CV]  max_depth=3, min_child_weight=1, n_estimators=1000, score=0.74772504549909, total= 3.4min
[CV] max_depth=3, min_child_weight=1, n_estimators=1000 ..............
[CV]  max_depth=3, min_child_weight=1, n_estimators=1000, score=0.7477773888694436, total= 3.4min
[CV] max_depth=3, min_child_weight=3, n_estimators=50 ................
[CV]  max_depth=3, min_child_weight=3, n_estimators=50, score=0.7477603583426652, total=  10.9s
[CV] max_depth=3, min_child_weight=3, n_estimators=50 ................
[CV]  max_depth=3, min_child_weight=3, n_estimators=50, score=0.74772504549909, total=  11.0s
[CV] max_depth=3, min_child_weight=3, n_estimators=50 ................
[CV]  max_depth=3, min_child_weight=3, n_estimators=50, score=0.7477773888694436, total=  10.9s
[CV] max_depth=3, min_child_weight=3, n_estimators=100 ...............
[CV]  max_depth=3, min_child_weight=3, n_estimators=100, score=0.7477603583426652, total=  20.9s
[CV] max_depth=3, min_child_weight=3, n_estimators=100 ...............
[CV]  max_depth=3, min_child_weight=3, n_estimators=100, score=0.74772504549909, total=  21.0s
[CV] max_depth=3, min_child_weight=3, n_estimators=100 ...............
[CV]  max_depth=3, min_child_weight=3, n_estimators=100, score=0.7477773888694436, total=  21.0s
[CV] max_depth=3, min_child_weight=3, n_estimators=200 ...............
[CV]  max_depth=3, min_child_weight=3, n_estimators=200, score=0.7477603583426652, total=  41.2s
[CV] max_depth=3, min_child_weight=3, n_estimators=200 ...............
[CV]  max_depth=3, min_child_weight=3, n_estimators=200, score=0.74772504549909, total=  41.2s
[CV] max_depth=3, min_child_weight=3, n_estimators=200 ...............
[CV]  max_depth=3, min_child_weight=3, n_estimators=200, score=0.7477773888694436, total=  41.2s
[CV] max_depth=3, min_child_weight=3, n_estimators=500 ...............
[CV]  max_depth=3, min_child_weight=3, n_estimators=500, score=0.7477603583426652, total= 1.7min
[CV] max_depth=3, min_child_weight=3, n_estimators=500 ...............
[CV]  max_depth=3, min_child_weight=3, n_estimators=500, score=0.74772504549909, total= 1.7min
[CV] max_depth=3, min_child_weight=3, n_estimators=500 ...............
[CV]  max_depth=3, min_child_weight=3, n_estimators=500, score=0.7477773888694436, total= 1.7min
[CV] max_depth=3, min_child_weight=3, n_estimators=1000 ..............
[CV]  max_depth=3, min_child_weight=3, n_estimators=1000, score=0.7477603583426652, total= 3.4min
[CV] max_depth=3, min_child_weight=3, n_estimators=1000 ..............
[CV]  max_depth=3, min_child_weight=3, n_estimators=1000, score=0.74772504549909, total= 3.4min
[CV] max_depth=3, min_child_weight=3, n_estimators=1000 ..............
[CV]  max_depth=3, min_child_weight=3, n_estimators=1000, score=0.7477773888694436, total= 3.4min
[CV] max_depth=3, min_child_weight=5, n_estimators=50 ................
[CV]  max_depth=3, min_child_weight=5, n_estimators=50, score=0.7477603583426652, total=  11.0s
[CV] max_depth=3, min_child_weight=5, n_estimators=50 ................
[CV]  max_depth=3, min_child_weight=5, n_estimators=50, score=0.74772504549909, total=  10.9s
[CV] max_depth=3, min_child_weight=5, n_estimators=50 ................
[CV]  max_depth=3, min_child_weight=5, n_estimators=50, score=0.7477773888694436, total=  10.9s
[CV] max_depth=3, min_child_weight=5, n_estimators=100 ...............
[CV]  max_depth=3, min_child_weight=5, n_estimators=100, score=0.7477603583426652, total=  21.2s
[CV] max_depth=3, min_child_weight=5, n_estimators=100 ...............
[CV]  max_depth=3, min_child_weight=5, n_estimators=100, score=0.74772504549909, total=  21.0s
[CV] max_depth=3, min_child_weight=5, n_estimators=100 ...............
[CV]  max_depth=3, min_child_weight=5, n_estimators=100, score=0.7477773888694436, total=  21.0s
[CV] max_depth=3, min_child_weight=5, n_estimators=200 ...............
[CV]  max_depth=3, min_child_weight=5, n_estimators=200, score=0.7477603583426652, total=  41.1s
[CV] max_depth=3, min_child_weight=5, n_estimators=200 ...............
[CV]  max_depth=3, min_child_weight=5, n_estimators=200, score=0.74772504549909, total=  41.3s
[CV] max_depth=3, min_child_weight=5, n_estimators=200 ...............
[CV]  max_depth=3, min_child_weight=5, n_estimators=200, score=0.7477773888694436, total=  41.0s
[CV] max_depth=3, min_child_weight=5, n_estimators=500 ...............
[CV]  max_depth=3, min_child_weight=5, n_estimators=500, score=0.7477603583426652, total= 1.7min
[CV] max_depth=3, min_child_weight=5, n_estimators=500 ...............
[CV]  max_depth=3, min_child_weight=5, n_estimators=500, score=0.74772504549909, total= 1.7min
[CV] max_depth=3, min_child_weight=5, n_estimators=500 ...............
[CV]  max_depth=3, min_child_weight=5, n_estimators=500, score=0.7477773888694436, total= 1.7min
[CV] max_depth=3, min_child_weight=5, n_estimators=1000 ..............
[CV]  max_depth=3, min_child_weight=5, n_estimators=1000, score=0.7477603583426652, total= 3.4min
[CV] max_depth=3, min_child_weight=5, n_estimators=1000 ..............
[CV]  max_depth=3, min_child_weight=5, n_estimators=1000, score=0.74772504549909, total= 3.4min
[CV] max_depth=3, min_child_weight=5, n_estimators=1000 ..............
[CV]  max_depth=3, min_child_weight=5, n_estimators=1000, score=0.7477773888694436, total= 3.4min
[CV] max_depth=5, min_child_weight=1, n_estimators=50 ................
[CV]  max_depth=5, min_child_weight=1, n_estimators=50, score=0.7477603583426652, total=  10.9s
[CV] max_depth=5, min_child_weight=1, n_estimators=50 ................
[CV]  max_depth=5, min_child_weight=1, n_estimators=50, score=0.74772504549909, total=  10.9s
[CV] max_depth=5, min_child_weight=1, n_estimators=50 ................
[CV]  max_depth=5, min_child_weight=1, n_estimators=50, score=0.7477773888694436, total=  10.9s
[CV] max_depth=5, min_child_weight=1, n_estimators=100 ...............
[CV]  max_depth=5, min_child_weight=1, n_estimators=100, score=0.7477603583426652, total=  21.0s
[CV] max_depth=5, min_child_weight=1, n_estimators=100 ...............
[CV]  max_depth=5, min_child_weight=1, n_estimators=100, score=0.74772504549909, total=  21.1s
[CV] max_depth=5, min_child_weight=1, n_estimators=100 ...............
[CV]  max_depth=5, min_child_weight=1, n_estimators=100, score=0.7477773888694436, total=  21.0s
[CV] max_depth=5, min_child_weight=1, n_estimators=200 ...............
[CV]  max_depth=5, min_child_weight=1, n_estimators=200, score=0.7477603583426652, total=  41.3s
[CV] max_depth=5, min_child_weight=1, n_estimators=200 ...............
[CV]  max_depth=5, min_child_weight=1, n_estimators=200, score=0.74772504549909, total=  41.1s
[CV] max_depth=5, min_child_weight=1, n_estimators=200 ...............
[CV]  max_depth=5, min_child_weight=1, n_estimators=200, score=0.7477773888694436, total=  41.1s
[CV] max_depth=5, min_child_weight=1, n_estimators=500 ...............
[CV]  max_depth=5, min_child_weight=1, n_estimators=500, score=0.7477603583426652, total= 1.7min
[CV] max_depth=5, min_child_weight=1, n_estimators=500 ...............
[CV]  max_depth=5, min_child_weight=1, n_estimators=500, score=0.74772504549909, total= 1.7min
[CV] max_depth=5, min_child_weight=1, n_estimators=500 ...............
[CV]  max_depth=5, min_child_weight=1, n_estimators=500, score=0.7477773888694436, total= 1.7min
[CV] max_depth=5, min_child_weight=1, n_estimators=1000 ..............
[CV]  max_depth=5, min_child_weight=1, n_estimators=1000, score=0.7477603583426652, total= 3.4min
[CV] max_depth=5, min_child_weight=1, n_estimators=1000 ..............
[CV]  max_depth=5, min_child_weight=1, n_estimators=1000, score=0.74772504549909, total= 3.4min
[CV] max_depth=5, min_child_weight=1, n_estimators=1000 ..............
[CV]  max_depth=5, min_child_weight=1, n_estimators=1000, score=0.7477773888694436, total= 3.4min
[CV] max_depth=5, min_child_weight=3, n_estimators=50 ................
[CV]  max_depth=5, min_child_weight=3, n_estimators=50, score=0.7477603583426652, total=  10.9s
[CV] max_depth=5, min_child_weight=3, n_estimators=50 ................
[CV]  max_depth=5, min_child_weight=3, n_estimators=50, score=0.74772504549909, total=  10.9s
[CV] max_depth=5, min_child_weight=3, n_estimators=50 ................
[CV]  max_depth=5, min_child_weight=3, n_estimators=50, score=0.7477773888694436, total=  11.0s
[CV] max_depth=5, min_child_weight=3, n_estimators=100 ...............
[CV]  max_depth=5, min_child_weight=3, n_estimators=100, score=0.7477603583426652, total=  21.3s
[CV] max_depth=5, min_child_weight=3, n_estimators=100 ...............
[CV]  max_depth=5, min_child_weight=3, n_estimators=100, score=0.74772504549909, total=  20.9s
[CV] max_depth=5, min_child_weight=3, n_estimators=100 ...............
[CV]  max_depth=5, min_child_weight=3, n_estimators=100, score=0.7477773888694436, total=  20.9s
[CV] max_depth=5, min_child_weight=3, n_estimators=200 ...............
[CV]  max_depth=5, min_child_weight=3, n_estimators=200, score=0.7477603583426652, total=  41.1s
[CV] max_depth=5, min_child_weight=3, n_estimators=200 ...............
[CV]  max_depth=5, min_child_weight=3, n_estimators=200, score=0.74772504549909, total=  41.4s
[CV] max_depth=5, min_child_weight=3, n_estimators=200 ...............
[CV]  max_depth=5, min_child_weight=3, n_estimators=200, score=0.7477773888694436, total=  41.1s
[CV] max_depth=5, min_child_weight=3, n_estimators=500 ...............
[CV]  max_depth=5, min_child_weight=3, n_estimators=500, score=0.7477603583426652, total= 1.7min
[CV] max_depth=5, min_child_weight=3, n_estimators=500 ...............
[CV]  max_depth=5, min_child_weight=3, n_estimators=500, score=0.74772504549909, total= 1.7min
[CV] max_depth=5, min_child_weight=3, n_estimators=500 ...............
[CV]  max_depth=5, min_child_weight=3, n_estimators=500, score=0.7477773888694436, total= 1.7min
[CV] max_depth=5, min_child_weight=3, n_estimators=1000 ..............
[CV]  max_depth=5, min_child_weight=3, n_estimators=1000, score=0.7477603583426652, total= 3.4min
[CV] max_depth=5, min_child_weight=3, n_estimators=1000 ..............
[CV]  max_depth=5, min_child_weight=3, n_estimators=1000, score=0.74772504549909, total= 3.4min
[CV] max_depth=5, min_child_weight=3, n_estimators=1000 ..............
[CV]  max_depth=5, min_child_weight=3, n_estimators=1000, score=0.7477773888694436, total= 3.4min
[CV] max_depth=5, min_child_weight=5, n_estimators=50 ................
[CV]  max_depth=5, min_child_weight=5, n_estimators=50, score=0.7477603583426652, total=  11.0s
[CV] max_depth=5, min_child_weight=5, n_estimators=50 ................
[CV]  max_depth=5, min_child_weight=5, n_estimators=50, score=0.74772504549909, total=  11.0s
[CV] max_depth=5, min_child_weight=5, n_estimators=50 ................
[CV]  max_depth=5, min_child_weight=5, n_estimators=50, score=0.7477773888694436, total=  10.9s
[CV] max_depth=5, min_child_weight=5, n_estimators=100 ...............
[CV]  max_depth=5, min_child_weight=5, n_estimators=100, score=0.7477603583426652, total=  21.0s
[CV] max_depth=5, min_child_weight=5, n_estimators=100 ...............
[CV]  max_depth=5, min_child_weight=5, n_estimators=100, score=0.74772504549909, total=  21.0s
[CV] max_depth=5, min_child_weight=5, n_estimators=100 ...............
[CV]  max_depth=5, min_child_weight=5, n_estimators=100, score=0.7477773888694436, total=  21.8s
[CV] max_depth=5, min_child_weight=5, n_estimators=200 ...............
[CV]  max_depth=5, min_child_weight=5, n_estimators=200, score=0.7477603583426652, total=  41.2s
[CV] max_depth=5, min_child_weight=5, n_estimators=200 ...............
[CV]  max_depth=5, min_child_weight=5, n_estimators=200, score=0.74772504549909, total=  41.6s
[CV] max_depth=5, min_child_weight=5, n_estimators=200 ...............
[CV]  max_depth=5, min_child_weight=5, n_estimators=200, score=0.7477773888694436, total=  41.2s
[CV] max_depth=5, min_child_weight=5, n_estimators=500 ...............
[CV]  max_depth=5, min_child_weight=5, n_estimators=500, score=0.7477603583426652, total= 1.7min
[CV] max_depth=5, min_child_weight=5, n_estimators=500 ...............
[CV]  max_depth=5, min_child_weight=5, n_estimators=500, score=0.74772504549909, total= 1.7min
[CV] max_depth=5, min_child_weight=5, n_estimators=500 ...............
[CV]  max_depth=5, min_child_weight=5, n_estimators=500, score=0.7477773888694436, total= 1.7min
[CV] max_depth=5, min_child_weight=5, n_estimators=1000 ..............
[CV]  max_depth=5, min_child_weight=5, n_estimators=1000, score=0.7477603583426652, total= 3.4min
[CV] max_depth=5, min_child_weight=5, n_estimators=1000 ..............
[CV]  max_depth=5, min_child_weight=5, n_estimators=1000, score=0.74772504549909, total= 3.4min
[CV] max_depth=5, min_child_weight=5, n_estimators=1000 ..............
[CV]  max_depth=5, min_child_weight=5, n_estimators=1000, score=0.7477773888694436, total= 3.4min
[CV] max_depth=8, min_child_weight=1, n_estimators=50 ................
[CV]  max_depth=8, min_child_weight=1, n_estimators=50, score=0.7477603583426652, total=  10.9s
[CV] max_depth=8, min_child_weight=1, n_estimators=50 ................
[CV]  max_depth=8, min_child_weight=1, n_estimators=50, score=0.74772504549909, total=  10.9s
[CV] max_depth=8, min_child_weight=1, n_estimators=50 ................
[CV]  max_depth=8, min_child_weight=1, n_estimators=50, score=0.7477773888694436, total=  10.9s
[CV] max_depth=8, min_child_weight=1, n_estimators=100 ...............
[CV]  max_depth=8, min_child_weight=1, n_estimators=100, score=0.7477603583426652, total=  21.2s
[CV] max_depth=8, min_child_weight=1, n_estimators=100 ...............
[CV]  max_depth=8, min_child_weight=1, n_estimators=100, score=0.74772504549909, total=  21.0s
[CV] max_depth=8, min_child_weight=1, n_estimators=100 ...............
[CV]  max_depth=8, min_child_weight=1, n_estimators=100, score=0.7477773888694436, total=  20.9s
[CV] max_depth=8, min_child_weight=1, n_estimators=200 ...............
[CV]  max_depth=8, min_child_weight=1, n_estimators=200, score=0.7477603583426652, total=  41.0s
[CV] max_depth=8, min_child_weight=1, n_estimators=200 ...............
[CV]  max_depth=8, min_child_weight=1, n_estimators=200, score=0.74772504549909, total=  41.4s
[CV] max_depth=8, min_child_weight=1, n_estimators=200 ...............
[CV]  max_depth=8, min_child_weight=1, n_estimators=200, score=0.7477773888694436, total=  41.0s
[CV] max_depth=8, min_child_weight=1, n_estimators=500 ...............
[CV]  max_depth=8, min_child_weight=1, n_estimators=500, score=0.7477603583426652, total= 1.7min
[CV] max_depth=8, min_child_weight=1, n_estimators=500 ...............
[CV]  max_depth=8, min_child_weight=1, n_estimators=500, score=0.74772504549909, total= 1.7min
[CV] max_depth=8, min_child_weight=1, n_estimators=500 ...............
[CV]  max_depth=8, min_child_weight=1, n_estimators=500, score=0.7477773888694436, total= 1.7min
[CV] max_depth=8, min_child_weight=1, n_estimators=1000 ..............
[CV]  max_depth=8, min_child_weight=1, n_estimators=1000, score=0.7477603583426652, total= 3.4min
[CV] max_depth=8, min_child_weight=1, n_estimators=1000 ..............
[CV]  max_depth=8, min_child_weight=1, n_estimators=1000, score=0.74772504549909, total= 3.4min
[CV] max_depth=8, min_child_weight=1, n_estimators=1000 ..............
[CV]  max_depth=8, min_child_weight=1, n_estimators=1000, score=0.7477773888694436, total= 3.4min
[CV] max_depth=8, min_child_weight=3, n_estimators=50 ................
[CV]  max_depth=8, min_child_weight=3, n_estimators=50, score=0.7477603583426652, total=  10.9s
[CV] max_depth=8, min_child_weight=3, n_estimators=50 ................
[CV]  max_depth=8, min_child_weight=3, n_estimators=50, score=0.74772504549909, total=  10.9s
[CV] max_depth=8, min_child_weight=3, n_estimators=50 ................
[CV]  max_depth=8, min_child_weight=3, n_estimators=50, score=0.7477773888694436, total=  10.9s
[CV] max_depth=8, min_child_weight=3, n_estimators=100 ...............
[CV]  max_depth=8, min_child_weight=3, n_estimators=100, score=0.7477603583426652, total=  20.9s
[CV] max_depth=8, min_child_weight=3, n_estimators=100 ...............
[CV]  max_depth=8, min_child_weight=3, n_estimators=100, score=0.74772504549909, total=  21.0s
[CV] max_depth=8, min_child_weight=3, n_estimators=100 ...............
[CV]  max_depth=8, min_child_weight=3, n_estimators=100, score=0.7477773888694436, total=  20.9s
[CV] max_depth=8, min_child_weight=3, n_estimators=200 ...............
[CV]  max_depth=8, min_child_weight=3, n_estimators=200, score=0.7477603583426652, total=  41.3s
[CV] max_depth=8, min_child_weight=3, n_estimators=200 ...............
[CV]  max_depth=8, min_child_weight=3, n_estimators=200, score=0.74772504549909, total=  41.1s
[CV] max_depth=8, min_child_weight=3, n_estimators=200 ...............
[CV]  max_depth=8, min_child_weight=3, n_estimators=200, score=0.7477773888694436, total=  41.2s
[CV] max_depth=8, min_child_weight=3, n_estimators=500 ...............
[CV]  max_depth=8, min_child_weight=3, n_estimators=500, score=0.7477603583426652, total= 1.7min
[CV] max_depth=8, min_child_weight=3, n_estimators=500 ...............
[CV]  max_depth=8, min_child_weight=3, n_estimators=500, score=0.74772504549909, total= 1.7min
[CV] max_depth=8, min_child_weight=3, n_estimators=500 ...............
[CV]  max_depth=8, min_child_weight=3, n_estimators=500, score=0.7477773888694436, total= 1.7min
[CV] max_depth=8, min_child_weight=3, n_estimators=1000 ..............
[CV]  max_depth=8, min_child_weight=3, n_estimators=1000, score=0.7477603583426652, total= 3.4min
[CV] max_depth=8, min_child_weight=3, n_estimators=1000 ..............
[CV]  max_depth=8, min_child_weight=3, n_estimators=1000, score=0.74772504549909, total= 3.4min
[CV] max_depth=8, min_child_weight=3, n_estimators=1000 ..............
[CV]  max_depth=8, min_child_weight=3, n_estimators=1000, score=0.7477773888694436, total= 3.4min
[CV] max_depth=8, min_child_weight=5, n_estimators=50 ................
[CV]  max_depth=8, min_child_weight=5, n_estimators=50, score=0.7477603583426652, total=  10.9s
[CV] max_depth=8, min_child_weight=5, n_estimators=50 ................
[CV]  max_depth=8, min_child_weight=5, n_estimators=50, score=0.74772504549909, total=  10.9s
[CV] max_depth=8, min_child_weight=5, n_estimators=50 ................
[CV]  max_depth=8, min_child_weight=5, n_estimators=50, score=0.7477773888694436, total=  10.9s
[CV] max_depth=8, min_child_weight=5, n_estimators=100 ...............
[CV]  max_depth=8, min_child_weight=5, n_estimators=100, score=0.7477603583426652, total=  20.9s
[CV] max_depth=8, min_child_weight=5, n_estimators=100 ...............
[CV]  max_depth=8, min_child_weight=5, n_estimators=100, score=0.74772504549909, total=  21.4s
[CV] max_depth=8, min_child_weight=5, n_estimators=100 ...............
[CV]  max_depth=8, min_child_weight=5, n_estimators=100, score=0.7477773888694436, total=  21.0s
[CV] max_depth=8, min_child_weight=5, n_estimators=200 ...............
[CV]  max_depth=8, min_child_weight=5, n_estimators=200, score=0.7477603583426652, total=  41.2s
[CV] max_depth=8, min_child_weight=5, n_estimators=200 ...............
[CV]  max_depth=8, min_child_weight=5, n_estimators=200, score=0.74772504549909, total=  41.3s
[CV] max_depth=8, min_child_weight=5, n_estimators=200 ...............
[CV]  max_depth=8, min_child_weight=5, n_estimators=200, score=0.7477773888694436, total=  41.0s
[CV] max_depth=8, min_child_weight=5, n_estimators=500 ...............
[CV]  max_depth=8, min_child_weight=5, n_estimators=500, score=0.7477603583426652, total= 1.7min
[CV] max_depth=8, min_child_weight=5, n_estimators=500 ...............
[CV]  max_depth=8, min_child_weight=5, n_estimators=500, score=0.74772504549909, total= 1.7min
[CV] max_depth=8, min_child_weight=5, n_estimators=500 ...............
[CV]  max_depth=8, min_child_weight=5, n_estimators=500, score=0.7477773888694436, total= 1.7min
[CV] max_depth=8, min_child_weight=5, n_estimators=1000 ..............
[CV]  max_depth=8, min_child_weight=5, n_estimators=1000, score=0.7477603583426652, total= 3.4min
[CV] max_depth=8, min_child_weight=5, n_estimators=1000 ..............
[CV]  max_depth=8, min_child_weight=5, n_estimators=1000, score=0.74772504549909, total= 3.4min
[CV] max_depth=8, min_child_weight=5, n_estimators=1000 ..............
[CV]  max_depth=8, min_child_weight=5, n_estimators=1000, score=0.7477773888694436, total= 3.4min
[Parallel(n_jobs=1)]: Done 180 out of 180 | elapsed: 227.8min finished
Grid best score (f1):  0.7477542636024276
Grid best parameter (max. f1):  {'max_depth': 1, 'min_child_weight': 1, 'n_estimators': 50}

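【Comments】:

  • How many samples are there in your dataset?
  • ~30,000. Also, I just thought of some more useful information to add to the main post; doing that now.
  • What about the confusion matrix? Did you check it?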

Tags: python machine-learning scikit-learn data-science xgboost


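【Solution 1】:

Suppose your classifier predicts everything as the majority class, and that the majority class is the positive class (label 1), which is the class scoring='f1' evaluates for binary labels. Then, per 100 samples with a 60/40 split:

precision = tp/(tp+fp) = 60/(60+40) = 0.6
recall = tp/(tp+fn) = 60/(60+0) = 1

and your f1 score:

f1 = 2*precision*recall/(precision+recall) = 2*0.6*1/(0.6+1)
   = 1.2/1.6 = 0.75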

So most likely your classifier is always predicting the majority class.
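The same arithmetic can be written for any class balance. If every sample is predicted positive and a fraction p of the samples is truly positive, then precision is p and recall is 1, so f1 = 2p/(1+p). The sketch below inverts that relation to see which class balance would produce the reported best score. The function names are made up for illustration, and the calculation assumes the constant-prediction scenario described above; it does not prove that this is what the model is doing.

```python
def f1_all_positive(p):
    """F1 of a classifier that labels every sample positive,
    when a fraction p of the samples is truly positive."""
    precision = p      # tp / (tp + fp) = p / (p + (1 - p))
    recall = 1.0       # no positives are missed
    return 2 * precision * recall / (precision + recall)


def implied_prevalence(f1):
    """Invert f1 = 2p / (1 + p) to recover p."""
    return f1 / (2 - f1)


# Best score reported by the grid search in the question
best_score = 0.7477542636024276
p = implied_prevalence(best_score)

print(f"F1 at a 60/40 split: {f1_all_positive(0.6):.4f}")
print(f"Positive fraction implied by the reported score: {p:.4f}")
```

The implied positive fraction comes out at roughly 0.597, which is consistent with the "about 60%" class balance mentioned in the question.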

To check your confusion matrix, you can use something like the following:

from sklearn.metrics import confusion_matrix
print(confusion_matrix(y_true, y_pred))
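The snippet above leaves y_true and y_pred undefined. One possible way to obtain them is with out-of-fold predictions from cross_val_predict, sketched below. The data here is synthetic and invented for the example, and DummyClassifier(strategy="most_frequent") stands in for the XGBoost model purely to show what the confusion matrix of a constant predictor looks like; with the real data you would pass your own estimator, features and labels.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import confusion_matrix, f1_score
from sklearn.model_selection import cross_val_predict

# Synthetic stand-in data: 600 positives, 400 negatives, noise features
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y_true = np.array([1] * 600 + [0] * 400)

# Stand-in model that always predicts the most frequent training class
model = DummyClassifier(strategy="most_frequent")

# Each prediction comes from a fold in which that sample was held out
y_pred = cross_val_predict(model, X, y_true, cv=3)

# Rows are true classes (0, 1); columns are predicted classes (0, 1)
cm = confusion_matrix(y_true, y_pred)
print(cm)
print("f1:", f1_score(y_true, y_pred))
```

If the column for predicted class 0 is entirely zero, as it is for this stand-in model, the classifier never predicts the minority class, and the f1 score lands at about 0.75 for a 60/40 split.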

【Discussion】:
