[Posted]: 2021-07-26 00:45:48
[Question]:
I am using the code below to load stopwords from a Jupyter notebook. Jupyter is hosted on a Linux server, and I work in the notebooks there.
python3 -m nltk.downloader stopwords
python3 -m nltk.downloader words
python3 -m nltk.downloader punkt
python3
>>> from nltk.corpus import stopwords
>>> stop_words = set(stopwords.words("english"))
>>> print(stop_words)
This works when run in the python terminal, but when I try the following in a Jupyter notebook, it fails with an error.
from nltk.corpus import stopwords
stop_words = set(stopwords.words("english"))
print(stop_words)
---------------------------------------------------------------------------
LookupError Traceback (most recent call last)
/usr/local/lib/python3.7/site-packages/nltk/corpus/util.py in __load(self)
82 try:
---> 83 root = nltk.data.find("{}/{}".format(self.subdir, zip_name))
84 except LookupError:
/usr/local/lib/python3.7/site-packages/nltk/data.py in find(resource_name, paths)
582 resource_not_found = "\n%s\n%s\n%s\n" % (sep, msg, sep)
--> 583 raise LookupError(resource_not_found)
584
LookupError:
**********************************************************************
Resource stopwords not found.
Please use the NLTK Downloader to obtain the resource:
When I try to download the resource in the python3 terminal, I see that it is already up to date.
>>> import nltk
>>> nltk.download('stopwords')
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data] Package stopwords is already up-to-date!
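But when I download it through JupyterHub, the attempt times out. Ideally, if the package is already up to date, no download should be needed. So is there a configuration in JupyterHub to handle this?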
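One way to narrow this down is to run the same small check in both the python3 terminal and a notebook cell, then compare the output. The snippet below is only a diagnostic sketch: the helper names (`current_user`, `nltk_search_paths`, `dirs_with_stopwords`) are made up for illustration and are not part of NLTK. It rests on the unconfirmed assumption that the notebook kernel may run as a different user, or with a different interpreter, than the terminal session that downloaded the data to /root/nltk_data.

```python
import getpass
import os
import sys


def current_user():
    """Return the account name the interpreter runs as, or 'unknown'."""
    try:
        return getpass.getuser()
    except Exception:  # e.g. no passwd entry for the uid in some containers
        return "unknown"


def nltk_search_paths():
    """Return the directories NLTK searches, or [] if nltk is not importable."""
    try:
        import nltk.data
    except ImportError:
        return []
    return list(nltk.data.path)


def dirs_with_stopwords(paths):
    """Return the entries of `paths` that hold the stopwords corpus (zipped or unpacked)."""
    hits = []
    for base in paths:
        corpora = os.path.join(base, "corpora")
        if os.path.isdir(os.path.join(corpora, "stopwords")) or os.path.isfile(
            os.path.join(corpora, "stopwords.zip")
        ):
            hits.append(base)
    return hits


if __name__ == "__main__":
    paths = nltk_search_paths()
    print("interpreter:", sys.executable)
    print("user:", current_user())
    print("home:", os.path.expanduser("~"))
    print("NLTK_DATA:", os.environ.get("NLTK_DATA"))
    print("search paths:", paths)
    print("paths containing stopwords:", dirs_with_stopwords(paths))
```

If the notebook reports a different user or home directory than the terminal, and the last line is an empty list there, the data under /root/nltk_data is probably not on the kernel's search path. NLTK consults the `NLTK_DATA` environment variable when building `nltk.data.path`, so that variable is one place where the two environments can differ.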
[Discussion]:
Tags: python-3.x jupyter-notebook nltk stop-words jupyterhub