【Posted】: 2017-08-04 22:15:20
【Question】:
I don't understand why this isn't working:
import nltk
from nltk.corpus import stopwords
import string
with open('moby.txt', 'r') as f:
    moby_raw = f.read()
stop = set(stopwords.words('english'))
moby_tokens = nltk.word_tokenize(moby_raw)
text_no_stop_words_punct = [t for t in moby_tokens if t not in stop or t not in string.punctuation]
print(text_no_stop_words_punct)
Looking at the output, I get this:
[...';', 'surging', 'from', 'side', 'to', 'side', ';', 'spasmodically', 'dilating', 'and', 'contracting',...]
The punctuation still seems to be there. What am I doing wrong?
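A likely cause is the `or` in the filter condition. `t not in stop or t not in string.punctuation` is false only for a token that is both a stopword and punctuation, so practically every token is kept. By De Morgan's law, "not a stopword and not punctuation" needs `and`. The sketch below illustrates this without NLTK data: `stop` and `tokens` are small hypothetical stand-ins for `set(stopwords.words('english'))` and the output of `nltk.word_tokenize`, and `punct` is a name introduced here for illustration.

```python
import string

# Hypothetical stand-ins so the sketch runs without NLTK data:
# `stop` mimics set(stopwords.words('english')),
# `tokens` mimics the output of nltk.word_tokenize.
stop = {'from', 'to', 'and'}
tokens = [';', 'surging', 'from', 'side', 'to', 'side', ';',
          'spasmodically', 'dilating', 'and', 'contracting']

# string.punctuation is a str, so `t in string.punctuation` is a substring
# test; a set gives exact single-character membership instead.
punct = set(string.punctuation)

# Original condition: false only when a token is BOTH a stopword and
# punctuation, so nothing in this sample is filtered out.
with_or = [t for t in tokens if t not in stop or t not in punct]

# Keep a token only if it is neither a stopword nor punctuation.
with_and = [t for t in tokens if t not in stop and t not in punct]

print(with_or)
print(with_and)
```

Note that this filter only matches single-character punctuation tokens; a multi-character token such as `--` would still pass through and would need separate handling.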
【Discussion】:
Tags: python nltk punctuation