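【Posted】:2019-01-19 18:27:13
【Question】:
I wrote some code based on this site and built several different multi-label classifiers.
I want to evaluate my models by per-class accuracy and per-class F1 measure.
The problem is that I get exactly the same value for accuracy and F1 measure in every model.
I suspect I am doing something wrong. I would like to know under what circumstances this can happen.
The code is exactly the same as on the site, and I compute the F1 measure like this: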
print('Logistic Test accuracy is {} '.format(accuracy_score(test[category], prediction)))
print('Logistic f1 measurement is {} '.format(f1_score(test[category], prediction, average='micro')))
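For reference, micro-averaging pools the true positives, false positives and false negatives of every class before computing a single F1. When each sample has exactly one label (here 0 or 1 for a single category), every wrong prediction counts once as a false positive for one class and once as a false negative for the other, so precision, recall and F1 all reduce to accuracy. The sketch below re-implements both metrics in plain Python to show this. The helper names and the toy labels are made up for illustration, and this is not scikit-learn's actual implementation.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP and FN over all classes, then compute F1 once."""
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for label in labels:
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1
            elif p == label and t != label:
                fp += 1
            elif p != label and t == label:
                fn += 1
    if tp == 0:
        return 0.0  # no correct predictions at all
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels for one binary category (not taken from the dataset).
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 1, 0, 0]

print(accuracy(y_true, y_pred))  # 0.625
print(micro_f1(y_true, y_pred))  # 0.625
```

With these toy labels 5 of 8 predictions are correct, so the pooled counts are TP = 5, FP = 3, FN = 3, and both metrics come out to 0.625.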
Update 1
Here is the whole code:
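import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

df = pd.read_csv("finalupdatedothers.csv")
categories = ['ADR', 'WD', 'EF', 'INF', 'SSI', 'DI', 'others']
train, test = train_test_split(df, random_state=42, test_size=0.3, shuffle=True)
X_train = train.sentences
X_test = test.sentences
# stop_words is assumed to be defined earlier; its definition is not shown in the question.
NB_pipeline = Pipeline([
    ('tfidf', TfidfVectorizer(stop_words=stop_words)),
    ('clf', OneVsRestClassifier(MultinomialNB(fit_prior=True, class_prior=None))),
])
# Fit and evaluate one binary classifier per category column.
for category in categories:
    print('processing {} '.format(category))
    NB_pipeline.fit(X_train, train[category])
    prediction = NB_pipeline.predict(X_test)
    print('NB test accuracy is {} '.format(accuracy_score(test[category], prediction)))
    print('NB f1 measurement is {} '.format(f1_score(test[category], prediction, average='micro')))
    print("\n")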
This is the output:
processing ADR
NB test accuracy is 0.821963394343
NB f1 measurement is 0.821963394343
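If the goal is an F1 score for the positive label of a single category (what scikit-learn reports with `average='binary'`, the default for `f1_score`), only the positive class contributes to the counts, and the result generally differs from accuracy. The following is a minimal plain-Python sketch of that calculation. The function name `binary_f1` and the toy labels are hypothetical and not part of the original code.

```python
def binary_f1(y_true, y_pred, pos_label=1):
    """F1 computed for the positive class only."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == pos_label and p == pos_label for t, p in pairs)
    fp = sum(t != pos_label and p == pos_label for t, p in pairs)
    fn = sum(t == pos_label and p != pos_label for t, p in pairs)
    if tp == 0:
        return 0.0  # no positive sample was predicted correctly
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels for one binary category (not taken from the dataset).
toy_true = [1, 0, 0, 1, 0, 0, 1, 0]
toy_pred = [1, 0, 0, 0, 0, 1, 0, 0]

toy_accuracy = sum(t == p for t, p in zip(toy_true, toy_pred)) / len(toy_true)
print(toy_accuracy)                              # 0.625
print(round(binary_f1(toy_true, toy_pred), 3))   # 0.4
```

Here the positive class has TP = 1, FP = 1 and FN = 2, so precision is 0.5 and recall is 1/3, which gives an F1 of about 0.4 even though 5 of 8 predictions are correct.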
This is what my data looks like:
,sentences,ADR,WD,EF,INF,SSI,DI,others
0,"extreme weight gain, short-term memory loss, hair loss.",1,0,0,0,0,0,0
1,I am detoxing from Lexapro now.,0,0,0,0,0,0,1
2,I slowly cut my dosage over several months and took vitamin supplements to help.,0,0,0,0,0,0,1
3,I am now 10 days completely off and OMG is it rough.,0,0,0,0,0,0,1
4,"I have flu-like symptoms, dizziness, major mood swings, lots of anxiety, tiredness.",0,1,0,0,0,0,0
5,I have no idea when this will end.,1,0,0,0,0,0,1
Why am I getting the same numbers?
Thanks.
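【Comments】:
-
Can you share the code you wrote and the output you got?
-
Thanks for your comment. Sure, I am updating the question.
-
@user2906838 Updated with one model and that model's output. Thanks for your attention :)
-
Do you get the same accuracy and f1_score for all categories, or only for this one?
-
@AkshayNevrekar Thanks for your comment. I get the same accuracy and F1 for all models. For example, with the SVM I get ... processing ADR SVM linear test accuracy is 0.814753189129 SVM linear f1 measurement is 0.814753189129
Tags: python machine-learning scikit-learn svm multilabel-classification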