[Posted]: 2016-11-06 19:21:20
[Question]:
I am trying to extract information of the following form from paragraphs with the structure shown below:
women_ran  men_ran  kids_ran  walked
1          2        1         3
2          4        3         1
3          6        5         2
text = ["On Tuesday, one women ran on the street while 2 men ran and 1 child ran on the sidewalk. Also, there were 3 people walking.", "One person was walking yesterday, but there were 2 women running as well as 4 men and 3 kids running.", "The other day, there were three women running and also 6 men and 5 kids running on the sidewalk. Also, there were 2 people walking in the park."]
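I am using Python's spaCy as my NLP library. I am new to NLP and would appreciate some guidance on the best way to extract this tabular information from sentences like these.
If the task were only to identify whether anyone was running or walking, I would fit a classification model with sklearn, but the information I need to extract is clearly more granular than that (I am trying to retrieve each subcategory and its value). Any guidance would be greatly appreciated.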
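To make the target concrete, here is a minimal baseline sketch using only the standard library `re` module rather than spaCy. It is a keyword heuristic, not a real parser: it assumes that women, men, and kids are always runners and that "people"/"person" are always walkers, which happens to hold for the three sample paragraphs but not in general. The names `NUMBER_WORDS`, `NOUN_TO_COLUMN`, and `extract_counts` are hypothetical and exist only for this illustration.

```python
import re

# Hypothetical lookup table, sized for the three sample paragraphs only.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "six": 6}

# Assumption: women/men/kids always run and "people"/"person" always walk.
# The verb is never inspected, so this mapping is the weak point of the sketch.
NOUN_TO_COLUMN = {
    "woman": "women_ran", "women": "women_ran",
    "man": "men_ran", "men": "men_ran",
    "kid": "kids_ran", "kids": "kids_ran",
    "child": "kids_ran", "children": "kids_ran",
    "person": "walked", "people": "walked",
}

COLUMNS = ["women_ran", "men_ran", "kids_ran", "walked"]

# Match "<count> <noun>", where the count is digits or a known number word.
COUNT_NOUN = re.compile(
    r"\b(\d+|{numbers})\s+({nouns})\b".format(
        numbers="|".join(NUMBER_WORDS),
        nouns="|".join(sorted(NOUN_TO_COLUMN, key=len, reverse=True)),
    ),
    re.IGNORECASE,
)


def extract_counts(paragraph):
    """Return one table row (column name -> count) for a paragraph."""
    row = dict.fromkeys(COLUMNS, 0)
    for count, noun in COUNT_NOUN.findall(paragraph):
        count = count.lower()
        value = int(count) if count.isdigit() else NUMBER_WORDS[count]
        row[NOUN_TO_COLUMN[noun.lower()]] += value
    return row


text = [
    "On Tuesday, one women ran on the street while 2 men ran and 1 child ran on the sidewalk. Also, there were 3 people walking.",
    "One person was walking yesterday, but there were 2 women running as well as 4 men and 3 kids running.",
    "The other day, there were three women running and also 6 men and 5 kids running on the sidewalk. Also, there were 2 people walking in the park.",
]

rows = [extract_counts(paragraph) for paragraph in text]
for row in rows:
    print(row)
```

On the sample data this reproduces the three rows of the table above. It would break as soon as, say, "2 women walked" appears, because the count is never tied to its verb; handling that case would presumably need a dependency parse, for example from spaCy, to link each number to its noun and each noun to the verb it governs.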
[Discussion]:
Tags: python nlp information-extraction spacy