【Posted】: 2017-05-25 14:50:08
【Question】:
I want to use spaCy's NLP features in my Flask app. I have been looking through examples on the official sites: (for spaCy) https://spacy.io/docs/usage/tutorials
and (for Flask) https://realpython.com/blog/python/flask-by-example-part-3-text-processing-with-requests-beautifulsoup-nltk/
In my web app, I have this code to post the NLP analysis result returned by parse_news_from:
@app.route('/submit', methods=['POST'])
def submit_textarea():
    if parse_news_from(format(request.form["text"])):
        print("The news is parsed successfully!")
    return talk_title
At the moment parse_news_from uses the NLTK library, but I want to switch to spaCy.
This is the spaCy code I took from the official source:
from spacy.en import English
import _regex
parser = English()
# Test Data
multiSentence = "There is an art, it says, or rather, a knack to flying. " \
                "The knack lies in learning how to throw yourself at the ground and miss. " \
                "In the beginning the Universe was created. This has made a lot of people " \
                "very angry and been widely regarded as a bad move."
# all you have to do to parse text is this:
#note: the first time you run spaCy in a file it takes a little while to load up its modules
parsedData = parser(multiSentence)
# Let's look at the tokens
# All you have to do is iterate through the parsedData
# Each token is an object with lots of different properties
# A property with an underscore at the end returns the string representation
# while a property without the underscore returns an index (int) into spaCy's vocabulary
# The probability estimate is based on counts from a 3 billion word
# corpus, smoothed using the Simple Good-Turing method.
for i, token in enumerate(parsedData):
    print("original:", token.orth, token.orth_)
    print("lowercased:", token.lower, token.lower_)
    print("lemma:", token.lemma, token.lemma_)
    print("shape:", token.shape, token.shape_)
    print("prefix:", token.prefix, token.prefix_)
    print("suffix:", token.suffix, token.suffix_)
    print("log probability:", token.prob)
    print("Brown cluster id:", token.cluster)
    print("----------------------------------------")
    if i > 1:
        break
Running it fails with this error:
File "/home/xxx/anaconda3/lib/python3.6/site-packages/_regex_core.py", line 21, in <module>
import _regex
ImportError: /home/xxx/anaconda3/lib/python3.6/site-packages/_regex.cpython-36m-x86_64-linux-gnu.so: undefined symbol: PySlice_AdjustIndices
Is there a working example that shows how to get started? What am I doing wrong? Thanks.
【Discussion】:
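One thing that may be worth checking: `PySlice_AdjustIndices` is a C-API function that was added in Python 3.6.1, so an "undefined symbol" error for it can mean the compiled `_regex` extension was built against a newer 3.6.x than the interpreter that is loading it. The sketch below is only a diagnostic and assumes that is the cause here; the helper names `has_c_api_symbol` and `diagnose` are made up for illustration and are not part of spaCy, Flask, or regex.

```python
import ctypes
import sys


def has_c_api_symbol(name):
    """Return True if the running interpreter exports the C-API symbol `name`."""
    # ctypes.pythonapi raises AttributeError when the symbol is not exported,
    # which hasattr() turns into False.
    return hasattr(ctypes.pythonapi, name)


def diagnose():
    """Describe whether this interpreter provides PySlice_AdjustIndices."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    if has_c_api_symbol("PySlice_AdjustIndices"):
        return ("Python %s exports PySlice_AdjustIndices; "
                "the interpreter itself is probably not the problem." % version)
    return ("Python %s does not export PySlice_AdjustIndices; "
            "the extension was likely built for a newer Python." % version)


if __name__ == "__main__":
    print(diagnose())
```

If the symbol turns out to be missing, updating the interpreter in that Anaconda environment, or reinstalling the regex package so that it matches the interpreter, would be the usual things to try.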
Tags: python ios web-applications nlp