【Posted】: 2022-01-02 20:59:17
【Question】:
I have the following text and want to isolate the part of the sentence related to each keyword; in this example, keywords = ['pizza', 'chips'].
text = "The pizza is great but the chips aren't the best"
Expected output:
{'pizza': 'The pizza is great'}
{'chips': "the chips aren't the best"}
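As a point of comparison for the expected output above, here is a minimal baseline sketch that does not use a parser at all. It assumes clauses are separated by commas, semicolons, "but" or "and", which is a simplification; the names `CLAUSE_BREAK` and `clauses_by_keyword` are hypothetical, and the result is a single dict rather than one dict per keyword.

```python
import re

# Assumed clause separators: commas, semicolons, and the conjunctions "but"/"and".
# This is a heuristic, not a grammatical analysis.
CLAUSE_BREAK = re.compile(r"\s*(?:[,;]|\bbut\b|\band\b)\s*", flags=re.IGNORECASE)


def clauses_by_keyword(text, keywords):
    """Map each keyword to the first text segment that contains it."""
    segments = [s.strip() for s in CLAUSE_BREAK.split(text) if s.strip()]
    result = {}
    for kw in keywords:
        kw_re = re.compile(rf"\b{re.escape(kw)}\b", flags=re.IGNORECASE)
        for seg in segments:
            if kw_re.search(seg):
                result[kw] = seg
                break
    return result


text = "The pizza is great but the chips aren't the best"
print(clauses_by_keyword(text, ["pizza", "chips"]))
```

This splits too eagerly when "and" joins two nouns (for example "pizzas and chips taste good"), which is one reason a dependency parse is attractive for this task.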
I tried using the spaCy DependencyMatcher, but I admit I'm not quite sure how it works. I tried the following pattern for chips, but it produced no matches.
import spacy
from spacy.matcher import DependencyMatcher

nlp = spacy.load("en_core_web_sm")

pattern = [
    {
        "RIGHT_ID": "chips_id",
        "RIGHT_ATTRS": {"ORTH": "chips"}
    },
    {
        "LEFT_ID": "chips_id",
        "REL_OP": "<<",
        "RIGHT_ID": "other_words",
        "RIGHT_ATTRS": {"POS": '*'}
    }
]

matcher = DependencyMatcher(nlp.vocab)
matcher.add("chips", [pattern])

doc = nlp("The pizza is great but the chips aren't the best")
for id_, (_, other_words) in matcher(doc):
    print(doc[other_words])
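Two things may explain the empty result. In spaCy's token patterns, attribute values are compared literally, so `'*'` is probably not treated as a wildcard (in the token-based Matcher, an empty dict `{}` matches any token). Also, `A << B` means that B is an ancestor of A, so it walks up from "chips" rather than collecting the words of its clause. The pure-Python sketch below illustrates that difference on a hand-annotated toy parse. The `HEADS` and `DEPS` values are assumptions written by hand, not real parser output, and `clause_for` is a hypothetical helper that assumes the keyword attaches directly to its clause's verb.

```python
# Hand-annotated toy parse (an assumption; a real parser may label it differently).
WORDS = ["The", "pizza", "is", "great", "but", "the", "chips", "are", "n't", "the", "best"]
HEADS = [1, 2, 2, 2, 2, 6, 7, 2, 7, 10, 7]  # head index per token; the root points to itself
DEPS = ["det", "nsubj", "ROOT", "acomp", "cc", "det", "nsubj", "conj", "neg", "det", "attr"]


def ancestors(i):
    """Indices of every head above token i (the tokens `<<` can reach)."""
    out = []
    while HEADS[i] != i:
        i = HEADS[i]
        out.append(i)
    return out


def descendants(i, skip_deps=()):
    """Indices below token i, without descending into the given relations."""
    out = []
    for child, head in enumerate(HEADS):
        if head == i and child != i and DEPS[child] not in skip_deps:
            out.append(child)
            out.extend(descendants(child, skip_deps))
    return sorted(out)


def clause_for(keyword):
    """Tokens of the clause headed by the keyword's head, minus coordinated clauses."""
    k = WORDS.index(keyword)
    verb = HEADS[k]  # assumption: the keyword depends directly on the clause's verb
    span = sorted([verb] + descendants(verb, skip_deps=("cc", "conj")))
    return [WORDS[t] for t in span]


print([WORDS[i] for i in ancestors(WORDS.index("chips"))])  # ancestors only: ['are', 'is']
print(clause_for("pizza"))
print(clause_for("chips"))
```

On this toy tree, going up from "chips" yields only the two verbs, while taking the subtree of the keyword's head and skipping `cc`/`conj` children yields the clause tokens for each keyword.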
Edit:
Additional example sentences:
example_sentences = [
    "The pizza's are just OK, the chips is stiff and the service mediocre",
    "Then the mains came and the pizza - these we're really average - chips had loads of oil and was poor",
    "Nice pizza freshly made to order food is priced well, but chips are not so keenly priced.",
    "The pizzas and chips taste really good and the Tango Ice Blast was refreshing"
]
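These sentences mention the keywords in several surface forms ("pizza", "pizzas", "pizza's"), so an exact match on the keyword would miss some of them. A small sketch of one way to locate the mentions, assuming that an optional "s" or "'s" suffix covers the variants of interest; `keyword_pattern` and `find_mentions` are hypothetical names, and a lemma-based match would be the more general option.

```python
import re


def keyword_pattern(keyword):
    """Regex for the keyword with an optional plural or apostrophe-s suffix (assumed variants)."""
    return re.compile(rf"\b{re.escape(keyword)}(?:'?s)?\b", flags=re.IGNORECASE)


def find_mentions(sentence, keywords):
    """Return the surface forms found in the sentence for each keyword."""
    return {kw: [m.group(0) for m in keyword_pattern(kw).finditer(sentence)] for kw in keywords}


samples = [
    "The pizza's are just OK, the chips is stiff and the service mediocre",
    "The pizzas and chips taste really good and the Tango Ice Blast was refreshing",
]
for sentence in samples:
    print(find_mentions(sentence, ["pizza", "chips"]))
```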
【Comments】:
-
Are the sentences you need to process structurally similar to the example you used?
-
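Yes, the example sentences provided are a good representation of the text I need to process. I have updated the question with more example sentences.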
-
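May I post a preliminary solution so we can both check it? My solution works for the first sentence you gave and for some of the example sentences, but we may need to modify some of the other example sentences in some way before spaCy can process them effectively.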
-
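It looks like you are simplifying sentences for aspect-based sentiment analysis. spaCy gives you the tools to do this, but it's a bit involved if you aren't already familiar with these problems. To get started, I'd suggest looking at the sections on dependency parsing and sentiment analysis in Jurafsky and Martin's book (free online): web.stanford.edu/~jurafsky/slp3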