【Posted】:2021-01-10 13:57:13
【Question】:
word_vectorizer = CountVectorizer(ngram_range=(2,2), analyzer='word')
for each in (train_incidents_word_issue["Summary"].index):
    text_issue_list = [data_word_issue["Summary"][each]]
    sparse_matrix = word_vectorizer.fit_transform(text_issue_list)
    frequencies = sum(sparse_matrix).toarray()[0]
    bi_grams_issue_df = pd.DataFrame(frequencies, index=word_vectorizer.get_feature_names(), columns=['frequency'])
    data_word_issue["data_issue_count"][each] = bi_grams_issue_df[bi_grams_issue_df.index.str.contains("^data issue$")]["frequency"].sum()
I get the following error:
ValueError
      5 for each in (train_incidents_word_issue["Summary"].index):
      6     text_issue_list = [data_word_issue["Summary"][each]]
----> 7     sparse_matrix = word_vectorizer.fit_transform(text_issue_list)
      8     frequencies = sum(sparse_matrix).toarray()[0]
      9     bi_grams_issue_df = pd.DataFrame(frequencies, index=word_vectorizer.get_feature_names(), columns=['frequency'])

ValueError: empty vocabulary; perhaps the documents only contain stop words
Please help me understand the error and suggest a solution. I have only just started using Python.
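A likely cause, sketched here without scikit-learn so it runs on its own: the loop fits the vectorizer on one summary at a time, and a summary with fewer than two usable tokens produces no bigrams at all, which leaves the vocabulary empty. The regex below is assumed to mirror the documented default `token_pattern` of `CountVectorizer`; the helper names `word_bigrams` and `count_bigram` and the sample summaries are hypothetical and are not part of the question's code.

```python
import re

# Assumption: this regex mirrors CountVectorizer's documented default
# token_pattern, which keeps only tokens of two or more word characters.
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")


def word_bigrams(text):
    """Return the lowercase word bigrams of `text` as space-joined strings."""
    tokens = TOKEN_PATTERN.findall(text.lower())
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]


def count_bigram(text, target="data issue"):
    """Count how often `target` occurs among the bigrams of `text`."""
    return word_bigrams(text).count(target)


# Hypothetical summaries standing in for data_word_issue["Summary"].
summaries = [
    "Data issue in report, another data issue found",
    "Outage",  # one token, so no bigrams: the case that empties the vocabulary
    "a b",     # single-character tokens are dropped by the default pattern
]

bigram_lists = [word_bigrams(s) for s in summaries]
counts = [count_bigram(s) for s in summaries]
```

With these inputs the second and third summaries yield empty bigram lists, which is the situation in which a per-row `fit_transform` has nothing to build a vocabulary from. Within scikit-learn itself, fitting once on the whole `Summary` column, or passing a fixed `vocabulary=["data issue"]` to `CountVectorizer`, should avoid the per-row empty-vocabulary check, though that is not verified against the asker's data.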
【Comments】:
-
Using the code below, word_vectorizer.fit_transform(text_issue_list.split('\n')), I get the following error: AttributeError: 'list' object has no attribute 'split'
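The AttributeError in this comment appears to come from calling `split` on the list rather than on the string it contains, since `split` is a `str` method. A small sketch, using a hypothetical one-row value in place of the real data:

```python
# Hypothetical stand-in for one row's [data_word_issue["Summary"][each]].
text_issue_list = ["Data issue\nOutage"]

try:
    text_issue_list.split("\n")  # lists have no split method
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Splitting the string inside the list works, but each piece is still a
# separate short document.
lines = text_issue_list[0].split("\n")
```

Note that splitting the string would only remove the AttributeError. Any resulting piece with fewer than two tokens still has no bigrams, so the empty-vocabulary error could remain.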
Tags: python