【Question Title】: Why does the KS curve start at (0,0)?
【Posted】: 2020-04-25 14:11:04
【Question】:

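The y-axis of a KS curve shows tpr, fpr, and (tpr - fpr); the x-axis is the threshold.

tpr = tp / (tp + fn).

When threshold = 0, every sample is predicted as 1, so tp = number of positive samples and fn = 0.

Therefore, tpr = 1.

But every KS curve I can find online starts at (0,0). Shouldn't it be (0,1)? I'm very confused. Thanks for any answers!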

    Tags: machine-learning statistics data-mining data-analysis


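    【Answer 1】:
    • TP: number of positive predictions whose actual label is 1
    • FP: number of positive predictions whose actual label is 0
    • TN: number of negative predictions whose actual label is 0
    • FN: number of negative predictions whose actual label is 1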

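    When threshold = 0, the model predicts only positives, so FN = TN = 0. Then FPR = FP/(FP+TN) = 1 and TPR = TP/(TP+FN) = 1, so this point, in (FPR, TPR) coordinates, should be (1,1). That is where you made a mistake.

    When threshold = 1, the model predicts only negatives, so TP = FP = 0. Then FPR = FP/(FP+TN) = 0 and TPR = TP/(TP+FN) = 0, so this point should be (0,0).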
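The two limiting cases can be checked by hand with a minimal sketch. The helper `rates_at_threshold`, the toy labels, and the toy scores are all hypothetical and not part of the original answer. The sketch assumes a sample is predicted positive when its score is greater than or equal to the threshold, and that all scores lie strictly between 0 and 1, so a threshold of 1 really does leave no positive predictions.

```python
def rates_at_threshold(y_true, scores, threshold):
    """Return (fpr, tpr), predicting positive when score >= threshold."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif label == 0:
            tn += 1
        else:
            fn += 1
    fpr = fp / (fp + tn)
    tpr = tp / (tp + fn)
    return fpr, tpr


# Hypothetical toy data: three negatives, three positives,
# with every score strictly between 0 and 1.
y_true = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]

low = rates_at_threshold(y_true, scores, 0.0)   # everything predicted positive
high = rates_at_threshold(y_true, scores, 1.0)  # nothing predicted positive
print('threshold 0 -> (fpr, tpr) =', low)
print('threshold 1 -> (fpr, tpr) =', high)
```

If a score can be exactly 1, a threshold of 1 under the `>=` rule still yields some positive predictions, so the (0,0) point then needs a threshold above the largest score.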

    # roc curve and auc
    from sklearn.datasets import make_classification
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import roc_curve
    from sklearn.metrics import roc_auc_score
    from matplotlib import pyplot
    import pandas as pd
    
    # generate 2 class dataset
    X, y = make_classification(n_samples=1000, n_classes=2, weights=[1,1], random_state=1)
    # split into train/test sets
    trainX, testX, trainy, testy = train_test_split(X, y, test_size=0.5, random_state=2)
    # fit a model
    model = KNeighborsClassifier(n_neighbors=3)
    model.fit(trainX, trainy)
    # predict probabilities
    probs = model.predict_proba(testX)
    # keep probabilities for the positive outcome only
    probs = probs[:, 1]
    # calculate AUC
    auc = roc_auc_score(testy, probs)
    print('AUC: %.3f' % auc)
    # calculate roc curve
    fpr, tpr, thresholds = roc_curve(testy, probs)
    # plot no skill
    pyplot.plot([0, 1], [0, 1], linestyle='--')
    # plot the roc curve for the model
    pyplot.plot(fpr, tpr, marker='.')
    # show the plot
    pyplot.show()
    # tabulate the ROC points behind the curve
    print(pd.DataFrame({'fpr':fpr,'tpr':tpr,'thresholds':thresholds}))
    

    Output:

         fpr        tpr         thresholds
    0   0.000000    0.000000    2.000000
    1   0.054264    0.561983    1.000000
    2   0.217054    0.884298    0.666667
    3   0.406977    0.975207    0.333333
    4   1.000000    1.000000    0.000000
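Since the question is about the KS curve, here is a rough sketch of how the KS statistic could be read off these numbers, taking it as the largest gap between tpr and fpr. The three lists are copied from the output table above; the names `gaps`, `ks`, and `best` are illustrative only. Note that the first row's threshold of 2.0 is an artificial value that `roc_curve` prepends (the largest score plus 1 in older scikit-learn releases, `inf` in newer ones) so that no sample is predicted positive and the curve starts at (0,0).

```python
# Values copied from the output table above.
fpr = [0.000000, 0.054264, 0.217054, 0.406977, 1.000000]
tpr = [0.000000, 0.561983, 0.884298, 0.975207, 1.000000]
thresholds = [2.000000, 1.000000, 0.666667, 0.333333, 0.000000]

# The KS curve plots tpr - fpr; the KS statistic is its maximum.
gaps = [t - f for t, f in zip(tpr, fpr)]
ks = max(gaps)
best = gaps.index(ks)

print('KS: %.6f at threshold %.6f' % (ks, thresholds[best]))
```

The gap is 0 at both ends of the table: at the artificial top threshold tpr and fpr are both 0, and at threshold 0 both are 1.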
    
