【Posted】: 2018-05-28 13:15:49
【Question】:
Suppose we have the following mapping in Elasticsearch:
PUT /synonyms_test/
{
  "settings": {
    "index": {
      "max_result_window": "5000000",
      "queries.cache.enabled": true,
      "requests.cache.enable": true
    },
    "analysis": {
      "filter": {
        "synonym_filter": {
          "type": "synonym",
          "synonyms": [
            "USA, America, United States of America, The United States"
          ],
          "tokenizer": "keyword"
        }
      },
      "analyzer": {
        "synonyms_analyzer": {
          "filter": [
            "synonym_filter",
            "lowercase"
          ],
          "tokenizer": "standard"
        }
      }
    }
  },
  "mappings": {
    "synonyms_index": {
      "properties": {
        "full_text": {
          "type": "text",
          "analyzer": "synonyms_analyzer",
          "search_analyzer": "synonyms_analyzer"
        }
      }
    }
  }
}
Below are three indexed documents that each contain one of the synonyms.
POST synonyms_test/synonyms_index/1
{
  "full_text": "Washington is capital of USA"
}
POST synonyms_test/synonyms_index/2
{
  "full_text": "Washington is capital of the America"
}
POST synonyms_test/synonyms_index/3
{
  "full_text": "Washington is capital of the United States of America"
}
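Searching with multi-word synonyms does not work. I expect "United States of America" in the query to be expanded to its synonyms by Elasticsearch, so that the query below matches all three documents.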
GET synonyms_test/synonyms_index/_search
{
  "query": {
    "match": {
      "full_text": {
        "query": "Washington United States of America",
        "operator": "And"
      }
    }
  }
}
If I change the tokenizer in synonym_filter to standard, then even typing just "States" returns all three results, which I do not want.
【Comments】:
Tags: elasticsearch
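To see which tokens the analyzer actually emits in each configuration, the _analyze API can be run against the index. This is only a minimal sketch, assuming the synonyms_test index above has been created; the sample text is simply the content of document 3 and can be swapped for the query string or for "USA".

```
GET synonyms_test/_analyze
{
  "analyzer": "synonyms_analyzer",
  "text": "Washington is capital of the United States of America"
}
```

Comparing the token list for this text with the one for "Washington is capital of USA" should show whether the multi-word synonym rule was applied, and which single-word tokens (such as "states") end up in the index when the filter's tokenizer is set to standard.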