【Posted】:2019-12-14 09:29:55
【Question】:
I want to tokenize a text corpus using the NLTK library.
My corpus looks like:
['Did you hear about the Native American man that drank 200 cups of tea?',
"What's the best anti diarrheal prescription?",
'What do you call a person who is outside a door and has no arms nor legs?',
'Which Star Trek character is a member of the magic circle?',
"What's the difference between a bullet and a human?",
I tried:
tok_corp = [nltk.word_tokenize(sent.decode('utf-8')) for sent in corpus]
which raises:
AttributeError: 'str' object has no attribute 'decode'
Any help would be appreciated. Thanks.
【Comments】:
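The error message suggests this is running on Python 3, where a `str` is already Unicode text and only `bytes` objects have a `.decode()` method. If that assumption holds, dropping the `.decode('utf-8')` call should be enough. Below is a minimal sketch of that idea. The helper names `to_text` and `tokenize_corpus` are hypothetical (not part of NLTK), and `str.split` is used only as a dependency-free stand-in for the tokenizer so the snippet runs without NLTK installed.

```python
def to_text(sent, encoding="utf-8"):
    """Return `sent` as str, decoding only when it is actually bytes."""
    if isinstance(sent, bytes):
        return sent.decode(encoding)
    return sent  # already str on Python 3, nothing to decode


def tokenize_corpus(corpus, tokenize):
    """Apply `tokenize` (any callable str -> list of tokens) to each sentence."""
    return [tokenize(to_text(sent)) for sent in corpus]


corpus = [
    'Did you hear about the Native American man that drank 200 cups of tea?',
    "What's the best anti diarrheal prescription?",
]

# Stand-in tokenizer (assumption for illustration): plain whitespace split.
tok_corp = tokenize_corpus(corpus, str.split)
print(tok_corp[0][:3])
```

With NLTK installed and its tokenizer data downloaded, the same call would presumably be `tokenize_corpus(corpus, nltk.word_tokenize)`, or simply `[nltk.word_tokenize(sent) for sent in corpus]` when every item is known to be a `str`.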
Tags: python pandas numpy recommendation-engine