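Yes. I can think of two approaches:
1. Find all articles that mention some subset of the words, then return only those where the number of words mentioned equals the size of the word list you provided.
2. Get the :Word nodes for the given list of words, then get the articles that mention all of those words.
Here's an example graph to test with: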
// CREATE is used here because MERGE accepts only a single pattern, not a comma-separated list
CREATE (a1:Article {name:'a1'}),
(a2:Article {name:'a2'}),
(a3:Article {name:'a3'})
CREATE (w1:Word {name:'orange'}),
(w2:Word {name:'apple'}),
(w3:Word {name:'pineapple'}),
(w4:Word {name:'banana'})
CREATE (a1)-[:MENTIONED]->(w1),
(a1)-[:MENTIONED]->(w2),
(a1)-[:MENTIONED]->(w3),
(a1)-[:MENTIONED]->(w4),
(a2)-[:MENTIONED]->(w1),
(a2)-[:MENTIONED]->(w4),
(a3)-[:MENTIONED]->(w1),
(a3)-[:MENTIONED]->(w2),
(a3)-[:MENTIONED]->(w3)
Approach 1, which compares the size of the word list with the number of words mentioned in the article, looks like this:
WITH ["orange", "apple"] as words
MATCH (word:Word)<-[:MENTIONED]-(article:Article)
WHERE word.name IN words
WITH words, article, COUNT(word) as wordCount
WHERE wordCount = SIZE(words)
RETURN article
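This only works if there is at most one :MENTIONED relationship between an article and a given word, no matter how many times the word is mentioned in the article. Duplicate relationships would inflate the count, so the comparison with SIZE(words) would no longer be reliable.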
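If that guarantee cannot be made, one possible variant is to count distinct :Word nodes instead. This is only a sketch: it assumes the words list itself contains no duplicates and that each word is stored as a single :Word node.

```cypher
WITH ["orange", "apple"] as words
MATCH (word:Word)<-[:MENTIONED]-(article:Article)
WHERE word.name IN words
// DISTINCT collapses repeated :MENTIONED relationships to the same word
WITH words, article, COUNT(DISTINCT word) as wordCount
WHERE wordCount = SIZE(words)
RETURN article
```

On the sample graph above, this should return a1 and a3, since a2 does not mention apple.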
Approach 2 uses ALL() on the collection of :Word nodes to ensure we only match articles that mention all of the words:
WITH ["orange", "apple"] as words
MATCH (word:Word)
WHERE word.name in words
WITH COLLECT(word) as words
MATCH (article:Article)
WHERE ALL (word in words WHERE (word)<-[:MENTIONED]-(article))
RETURN article
You can try running each of these with PROFILE to see which performs best for your data set.
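As a minimal sketch of what that looks like, PROFILE is simply prefixed to the query. Approach 1 is shown here; the same prefix applies to approach 2. The query is executed, and the result includes the execution plan with rows and database hits per operator.

```cypher
// PROFILE runs the query and reports the plan actually used
PROFILE
WITH ["orange", "apple"] as words
MATCH (word:Word)<-[:MENTIONED]-(article:Article)
WHERE word.name IN words
WITH words, article, COUNT(word) as wordCount
WHERE wordCount = SIZE(words)
RETURN article
```

Comparing total database hits between the two approaches on a realistic data set is usually more informative than comparing them on the small sample graph.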