【Posted】: 2018-07-27 14:53:57
【Question】:
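I'm a Python newbie and I'm having no luck vectorising a string. So far I extract the text of an article from a URL, and now I'm trying to classify that article, but it isn't working.
(I keep getting the error: raise AttributeError(attr + " not found") AttributeError: lower not found)
Nothing I've tried seems to help either.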
import pickle

from goose import Goose
from sklearn.feature_extraction.text import CountVectorizer

# On Python 2.7, raw_input returns the pasted URL as a plain string
# (input would try to evaluate it as an expression)
url = raw_input("Paste the website containing the article you want to analyse here: ")
print "Analysing Webpage"
# Gets the URL from the extension
# Goose loaded
g = Goose()
# Extract the text and feed it to the classifier
article = g.extract(url=url)
article = article.cleaned_text
article = clean(article)  # clean() is not defined in the snippet as posted
article = str(article)
print "Vectorising Text"
article = article.split()
vect = CountVectorizer(min_df=0., max_df=1.0)
X = vect.fit_transform(article)
X.toarray()
X = vect.transform(X).toarray()
print X
print "Predicting Political Bias"
loaded_model = pickle.load(open("text_clf_svm.pkl", 'rb'))
predicted_svm = loaded_model.predict(X)
print predicted_svm
Any kind of help or pointers would be very welcome and appreciated =)
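One likely reading of the traceback: `fit_transform` already returns the count matrix, and the following line passes that matrix back into `vect.transform`, which expects raw strings and calls `.lower()` on each item it receives. The sketch below illustrates this under stated assumptions. The article text, the training sentences, the labels, and the `LinearSVC` pipeline are all hypothetical stand-ins, and it assumes that `text_clf_svm.pkl` holds a `Pipeline` with its own vectorizer, which the question does not confirm.

```python
from __future__ import print_function  # lets the sketch run on Python 2.7 as well

import pickle

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Hypothetical stand-in for the cleaned article text.
article = "The senate passed the budget bill after the senate debate"

# 1) One document is one string, so wrap the article in a list.
vect = CountVectorizer()
X = vect.fit_transform([article])       # sparse count matrix with a single row
print(X.shape)

# 2) Splitting first turns every word into its own "document".
X_words = CountVectorizer().fit_transform(article.split())
print(X_words.shape)                    # one row per word

# 3) transform() expects strings; handing it the matrix fails on .lower().
try:
    vect.transform(X)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
print(error_message)

# 4) Hypothetical training data and labels, only to have a fitted model.
train_texts = [
    "cut taxes and shrink the government",
    "lower taxes help small business",
    "expand public healthcare for everyone",
    "raise the minimum wage for workers",
]
train_labels = ["right", "right", "left", "left"]

# Assumption: the pickled model is a Pipeline that contains its own vectorizer.
pipeline = Pipeline([("vect", CountVectorizer()), ("clf", LinearSVC())])
pipeline.fit(train_texts, train_labels)

# Round trip through pickle, mirroring the load step in the question.
loaded_model = pickle.loads(pickle.dumps(pipeline))

# The pipeline vectorises the raw text itself, using the training vocabulary.
predicted = loaded_model.predict([article])
print(predicted)
```

If the pickle instead holds only the classifier, the vectorizer fitted at training time would need to be saved and loaded as well, since a `CountVectorizer` fitted on the article alone produces columns that do not line up with the ones the model was trained on.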
【Comments】:
Tags: python-2.7 svm sklearn-pandas