【发布时间】:2017-08-29 03:39:06
【问题描述】:
我正在使用 nltk 的朴素贝叶斯分类器进行情感分析。我只是插入一个包含单词及其标签的 csv 文件作为训练集,尚未对其进行测试。我正在寻找每个句子的情绪,然后在最后找到所有句子的情绪平均值。我的文件包含以下格式的单词:
good,0.6
amazing,0.95
great,0.8
awesome,0.95
love,0.7
like,0.5
better,0.4
beautiful,0.6
bad,-0.6
worst,-0.9
hate,-0.8
sad,-0.4
disappointing,-0.6
angry,-0.7
happy,0.7
但是文件没有得到训练,并且出现了上述错误。这是我的python代码:
import nltk.classify.util
from nltk.classify import NaiveBayesClassifier
from nltk.corpus import stopwords
from nltk.tokenize import sent_tokenize
from nltk.classify.api import ClassifierI
operators=set(('not','never','no'))
stop_words=set(stopwords.words("english"))-operators
text="this restaurant is good but i hate it ."
sent=0.0
x=0
text2=""
xyz=[]
dot=0
if "but" in text:
i=text.find("but")
text=text[:i]+"."+text[i+3:]
if "whereas" in text:
i=text.find("whereas")
text=text[:i]+"."+text[i+7:]
if "while" in text:
i=text.find("while")
text=text[:i]+"."+text[i+5:]
a=open('C:/Users/User/train_words.csv','r')
for w in text.split():
if w in stop_words:
continue
else:
text2=text2+" "+w
print (text2)
cl=nltk.NaiveBayesClassifier.train(a)
xyz=sent_tokenize(text2)
print(xyz)
for s in xyz:
x=x+1
print(s)
if "not" in s or "n't" in s:
print(float(cl.classify(s))*-1)
sent=sent+(float(cl.classify(s))*-1)
else:
print(cl.classify(s))
sent=sent+float(cl.classify(s))
print("sentiment of the overall document:",sent/x)
错误:
runfile('C:/Users/User/Documents/untitled1.py', wdir='C:/Users /User/Documents')
restaurant good . hate .
Traceback (most recent call last):
File "<ipython-input-8-d03fac6844c7>", line 1, in <module>
runfile('C:/Users/User/Documents/untitled1.py', wdir='C:/Users/User/Documents')
File "C:\ProgramData\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
execfile(filename, namespace)
File "C:\ProgramData\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "C:/Users/User/Documents/untitled1.py", line 37, in <module>
cl = nltk.NaiveBayesClassifier.train(a)
File "C:\ProgramData\Anaconda3\lib\site-packages\nltk\classify\naivebayes.py", line 194, in train
for featureset, label in labeled_featuresets:
ValueError: too many values to unpack (expected 2)
【问题讨论】:
-
你能发布完整的堆栈跟踪吗?
-
不,堆栈跟踪。完整的错误输出。请编辑您的答案并添加此信息,以便我们更好地诊断问题。
-
哦,对不起。好的
-
这可能是因为您将文件对象传递给
.train(),但它需要一个带有第一个元素hashable的元组。 -
如果有帮助可以参考这个答案:stackoverflow.com/questions/20827741/…
标签: python