[Posted]: 2017-11-12 02:01:25
[Question]:
I wrote a simple function that uses average_precision_score from scikit-learn to compute average precision.
My code:
def compute_average_precision(predictions, gold):
    gold_predictions = np.zeros(predictions.size, dtype=np.int)
    for idx in range(gold):
        gold_predictions[idx] = 1
    return average_precision_score(predictions, gold_predictions)
When the function runs, it raises the following error.
Traceback (most recent call last):
  File "test.py", line 91, in <module>
    total_avg_precision += compute_average_precision(np.asarray(probs), len(gold_candidates))
  File "test.py", line 29, in compute_average_precision
    return average_precision_score(predictions, gold_predictions)
  File "/if5/wua4nw/anaconda3/lib/python3.5/site-packages/sklearn/metrics/ranking.py", line 184, in average_precision_score
    average, sample_weight=sample_weight)
  File "/if5/wua4nw/anaconda3/lib/python3.5/site-packages/sklearn/metrics/base.py", line 81, in _average_binary_score
    raise ValueError("{0} format is not supported".format(y_type))
ValueError: continuous format is not supported
If I print the two numpy arrays, predictions and gold_predictions, they look fine to me. Here is one example:
[ 0.40865014 0.26047812 0.07588802 0.26604077 0.10586583 0.17118802
0.26797949 0.34618672 0.33659923 0.22075308 0.42288553 0.24908153
0.26506338 0.28224747 0.32942101 0.19986877 0.39831917 0.23635269
0.34715138 0.39831917 0.23635269 0.35822859 0.12110706]
[1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
What am I doing wrong here? What does the error mean?
[Comments]:
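- What does predictions represent? Is it the output of some estimator's predict() method, does it hold the probability of the positive class, or is it perhaps the output of predict_proba()? In any case, y_true (your gold_predictions) needs to be the first argument and predictions the second.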
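To make the comment's point concrete, here is a minimal sketch of the function with the arguments swapped so that the binary labels come first and the scores second. It assumes, as the question's code does, that the first `gold` positions are the positive items. The `scores` and `labels` arrays are illustrative values shortened from the question's example, and `dtype=int` replaces `np.int`, which newer NumPy releases no longer provide. The `type_of_target` calls are only there to show why scikit-learn reports "continuous" when float scores are passed as `y_true`.

```python
import numpy as np
from sklearn.metrics import average_precision_score
from sklearn.utils.multiclass import type_of_target


def compute_average_precision(predictions, gold):
    """Average precision, assuming the first `gold` entries are the positives."""
    # Binary ground-truth labels: 1 for the first `gold` positions, 0 elsewhere.
    gold_predictions = np.zeros(predictions.size, dtype=int)
    gold_predictions[:gold] = 1
    # y_true (the labels) goes first, the continuous scores go second.
    return average_precision_score(gold_predictions, predictions)


# Illustrative data only: the first four scores from the question's example.
scores = np.array([0.40865014, 0.26047812, 0.07588802, 0.26604077])
labels = np.array([1, 0, 0, 0])

# How scikit-learn classifies each array when it is used as a target.
print(type_of_target(scores))
print(type_of_target(labels))

ap = compute_average_precision(scores, 1)
print(ap)
```

With the original argument order, the float scores land in the `y_true` slot, which is what the `ValueError` in the traceback is complaining about.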
Tags: python scikit-learn