【Posted】: 2025-12-12 12:15:02
【Question】:
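I am using scikit-learn's MultinomialNB and a CountVectorizer to build a model that predicts whether a review is good or bad.
After training on the labeled data, how do I use the model to predict a new review (or an existing one)? The code below produces the error message that follows it.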
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.cross_validation import train_test_split
from sklearn.naive_bayes import MultinomialNB
X = vectorizer.fit_transform(df.quote)
X = X.tocsc()
Y = (df.fresh == 'fresh').values.astype(np.int)
xtrain, xtest, ytrain, ytest = train_test_split(X, Y)
clf = MultinomialNB().fit(xtrain, ytrain)
new_review = ['this is a new review, movie was awesome']
new_review = vectorizer.fit_transform(new_review)
print df.quote[15]
print(clf.predict(df.quote[10])) #predict existing review in dataframe
print(clf.predict(new_review)) #predict new review
Technically, Toy Story is nearly flawless.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-91-27a0698bbd1f> in <module>()
15
16 print df.quote[15]
---> 17 print(clf.predict(df.quote[10])) #predict existing quote in dataframe
18 print(clf.predict(new_review)) #predict new review
//anaconda/lib/python2.7/site-packages/sklearn/naive_bayes.pyc in predict(self, X)
60 Predicted target values for X
61 """
---> 62 jll = self._joint_log_likelihood(X)
63 return self.classes_[np.argmax(jll, axis=1)]
64
//anaconda/lib/python2.7/site-packages/sklearn/naive_bayes.pyc in _joint_log_likelihood(self, X)
439 """Calculate the posterior log probability of the samples X"""
440 X = atleast2d_or_csr(X)
--> 441 return (safe_sparse_dot(X, self.feature_log_prob_.T)
442 + self.class_log_prior_)
443
//anaconda/lib/python2.7/site-packages/sklearn/utils/extmath.pyc in safe_sparse_dot(a, b, dense_output)
178 return ret
179 else:
--> 180 return fast_dot(a, b)
181
182
TypeError: Cannot cast array data from dtype('float64') to dtype('S32') according to the rule 'safe'
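The traceback is consistent with `clf.predict` receiving a raw string (`df.quote[10]`) instead of a vectorized feature matrix: `dtype('S32')` is a byte-string dtype, which cannot be multiplied with the float64 log-probabilities. Separately, calling `fit_transform` on the new review would re-learn the vocabulary, so its columns would no longer line up with the features the classifier was trained on. The sketch below shows one way to handle both cases. It uses a small made-up dataset in place of `df.quote` and `df.fresh` (the names `quotes` and `labels` are assumptions, not from the question) and is meant as an illustration rather than a drop-in fix.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical stand-in for df.quote and df.fresh (toy data, not from the question).
quotes = [
    "a wonderful, funny and moving film",
    "awesome movie with a great cast",
    "dull, boring and far too long",
    "a terrible script and awful acting",
]
labels = [1, 1, 0, 0]  # 1 = fresh, 0 = rotten

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(quotes)  # learn the vocabulary once, on the training text
clf = MultinomialNB().fit(X, labels)

# New text: reuse the fitted vocabulary with transform(), not fit_transform().
new_review = ["this is a new review, movie was awesome"]
new_X = vectorizer.transform(new_review)
prediction = clf.predict(new_X)

# Existing text: wrap the single string in a list and vectorize it as well,
# rather than passing the raw string to predict().
existing_X = vectorizer.transform([quotes[2]])
existing_prediction = clf.predict(existing_X)

print(prediction, existing_prediction)
```

The key point is that the same fitted `vectorizer` object turns every piece of text into features, so training data and anything passed to `predict` share one column layout.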
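【Comments】:
-
FYI, this is called sentiment analysis. Also, your question is unrelated to your error.
-
Thanks, yes, this is called sentiment analysis. I am trying to predict a new review using clf.predict(). I may have left something out; let me know and I can clarify. @keyser
Tags: python machine-learning scikit-learn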