[Posted]: 2018-02-02 12:34:38
[Question]:
Given the input "quick brown fox jumped", I want to create every possible token combination of the words. So the example string would be tokenized as
[
"quick", "quick brown", "quick fox", "quick jumped",
"brown", "brown quick", "brown fox", "brown jumped",
...,
"jumped quick", "jumped brown", "jumped fox", "jumped"
]
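To make the target output concrete, here is a minimal sketch in plain Python. It runs client-side and is not an Elasticsearch analyzer. The function name `word_pair_tokens` is hypothetical, and it assumes the elided part of the list follows the same pattern: each word on its own, plus every ordered pair of two different words. The order of tokens within each group may differ from the list above.

```python
def word_pair_tokens(text):
    """Return each word alone plus every ordered pair of two different words."""
    words = text.split()
    tokens = []
    for i, first in enumerate(words):
        # The single word itself.
        tokens.append(first)
        # The word followed by every other word, adjacent or not.
        tokens.extend(
            f"{first} {second}" for j, second in enumerate(words) if j != i
        )
    return tokens
```

Calling `word_pair_tokens("quick brown fox jumped")` yields 16 tokens (4 single words and 12 ordered pairs), including non-adjacent pairs such as "quick fox" and reversed pairs such as "brown quick".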
I can use the shingle token filter, but it only creates new tokens by concatenating adjacent terms, so I end up with:
[
"quick", "quick brown", "quick brown fox", "quick brown fox jumped",
"brown", "brown fox", "brown fox jumped",
"fox", "fox jumped",
"jumped"
]
This is a step in the right direction, but it is not what I'm looking for.
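The adjacency restriction can be illustrated with a small Python simulation. This only approximates the shingle behavior described in the question and is not Elasticsearch code. The function name `adjacent_shingles` and the `max_size` parameter are hypothetical, with `max_size=4` chosen to match the four-word output listed in the question.

```python
def adjacent_shingles(text, max_size=4):
    """Return every run of 1..max_size consecutive words, grouped by start word."""
    words = text.split()
    shingles = []
    for start in range(len(words)):
        for size in range(1, max_size + 1):
            # Only emit runs that fit inside the word list.
            if start + size <= len(words):
                shingles.append(" ".join(words[start:start + size]))
    return shingles
```

For "quick brown fox jumped" this produces 10 tokens, all built from consecutive words. A pair such as "quick fox" can never appear, because "brown" sits between the two words.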
[Comments]:
-
Can you explain the use case you're after?
-
@Val Long story short: I want to generate terms aggregations not only on single terms (["quick", "brown", "fox", "jumped"]) but also on combinations of those words.
Tags: elasticsearch combinations elasticsearch-5 elasticsearch-2.0