【Question title】:Elasticsearch not highlighting all matches
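【Posted】:2020-08-19 22:03:12
【Question】:

I'm having a hard time understanding why the following query object does not make ES highlight all of the matched words in the _source fields.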

{
    _source: [
        'baseline',
        'cdrp',
        'date',
        'description',
        'dev_status',
        'element',
        'event',
        'id'
    ],
    track_total_hits: true,
    query: {
        bool: {
            filter: [],
            should: [
                {
                    multi_match:{
                        query: "imposed calcs",
                        fields: ["cdrp","description","narrative.*","title","cop"]
                    }
                }
            ]
        } 
    },
    highlight: { fields: { '*': {} } },
    sort: [],
    from: 0,
    size: 50
}

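Running this query returns the following highlight object. Note that only the word "calcs" is highlighted. How should I construct the highlight object so that ES also highlights "Imposed"?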

"highlight": {
    "description": [
        "GAP Sub-window conn ONe-e: heve PP-BE Defined ASST requirem RV confsng, des MAN Imposed <em>calcs</em> mising"
    ]
} 
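
One quick way to check which words a fragment actually marks is to pull out the text between the highlight tags. The following is only a minimal sketch: it assumes the default `<em>` tags seen in the response above, and `highlighted_terms` is a hypothetical helper name, not part of any Elasticsearch client.

```python
import re

def highlighted_terms(fragment):
    """Return the text wrapped in <em>...</em> tags, in order of appearance."""
    return re.findall(r"<em>(.*?)</em>", fragment)

# Fragment copied from the highlight object in the response above.
fragment = ("GAP Sub-window conn ONe-e: heve PP-BE Defined ASST requirem "
            "RV confsng, des MAN Imposed <em>calcs</em> mising")

terms = highlighted_terms(fragment)
print(terms)  # ['calcs']
```

For this fragment only "calcs" comes back, which confirms that "Imposed" was not treated as a match.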

I'm using the following mapping for "description":

"description": {
    "type": "text",
    "analyzer": "search_synonyms"
},



"analysis": {
    "analyzer": {
        "search_synonyms": {
            "tokenizer": "whitespace",
            "filter": [
                "graph_synonyms"
            ],
            "normalizer": [
                "normalizer_1"
            ]
        }
    },
    "filter": {
        "graph_synonyms": {
            "type": "synonym_graph",
            "synonyms_path": "synonym.txt"
        }
    },
    "normalizer": {
        "normalizer_1": {
            "type": "custom",
            "char_filter": [],
            "filter": ["lowercase", "asciifolding"]
        }
    }
}

【Comments】:

    Tags: elasticsearch highlight word synonym multiple-matches


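    【Solution 1】:

    Edit

    I think your graph_synonyms filter is overriding the normalizer's filters, so "Imposed" is never lowercased and the lowercase query term "imposed" finds no matching token. Try listing the filters directly on the analyzer: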

    PUT highlighter
    {
      "settings": {
        "analysis": {
          "analyzer": {
            "search_synonyms": {
              "tokenizer": "whitespace",
              "filter": [
                "graph_synonyms",
                "lowercase",
                "asciifolding"
              ]
            }
          },
          "filter": {
            "graph_synonyms": {
              "type": "synonym_graph",
              "synonyms_path": "synonym.txt"
            }
          }
        }
      },
      "mappings": {
        "properties": {
          "description": {
            "type": "text",
            "analyzer": "search_synonyms"
          }
        }
      }
    }
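
To illustrate why the missing lowercase step would hide "Imposed", here is a minimal Python sketch. It is not Elasticsearch code: it only imitates a whitespace tokenizer with and without lowercase and ASCII-folding filters, ignores synonyms entirely, and all function names are hypothetical.

```python
import unicodedata

def whitespace_tokens(text):
    """Split on whitespace only, keeping case (like a whitespace tokenizer)."""
    return text.split()

def lowercase_asciifold(token):
    """Rough stand-in for the lowercase + asciifolding token filters."""
    decomposed = unicodedata.normalize("NFKD", token.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def matching_terms(doc, query, token_filter=None):
    """Return the query terms that equal some indexed token after analysis."""
    def analyze(text):
        tokens = whitespace_tokens(text)
        return [token_filter(t) for t in tokens] if token_filter else tokens

    indexed = set(analyze(doc))
    return [term for term in analyze(query) if term in indexed]

# Tail of the document text from the question, and the question's query.
doc = "des MAN Imposed calcs mising"
query = "imposed calcs"

without_filters = matching_terms(doc, query)
with_filters = matching_terms(doc, query, lowercase_asciifold)
print(without_filters)  # ['calcs']
print(with_filters)     # ['imposed', 'calcs']
```

Without the filters, only "calcs" lines up with an indexed token, which mirrors the highlight in the question. With them, both query terms match.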
    

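    Original

    I suspect some setting in your mapping is preventing the match, because I could not reproduce the problem with a semi-default mapping: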

    PUT highlighter
    {
      "settings": {
        "analysis": {
          "analyzer": {
            "my_analyzer": {
              "tokenizer": "standard",
              "filter": [
                "lowercase"
              ]
            }
          }
        }
      },
      "mappings": {
        "properties": {
          "description": {
            "type": "text",
            "fields": {
              "lowercase": {
                "type": "text",
                "analyzer": "my_analyzer"
              }
            }
          }
        }
      }
    }
    
    POST highlighter/_doc
    {
      "description": "GAP Sub-window conn ONe-e: heve PP-BE Defined ASST requirem RV confsng, des MAN Imposed calcs mising"
    }
    

    Plugging in your query

    GET highlighter/_search
    {
      "_source": [
        "baseline",
        "cdrp",
        "date",
        "description",
        "dev_status",
        "element",
        "event",
        "id"
      ],
      "track_total_hits": true,
      "query": {
        "bool": {
          "filter": [],
          "should": [
            {
              "multi_match": {
                "query": "imposed calcs",
                "fields": [
                  "cdrp",
                  "description.lowercase",
                  "narrative.*",
                  "title",
                  "cop"
                ]
              }
            }
          ]
        }
      },
      "highlight": {
        "fields": {
          "*": {}
        }
      },
      "sort": [],
      "from": 0,
      "size": 50
    }
    

    yields

    [
      {
        "_index":"highlighter",
        "_type":"_doc",
        "_id":"Bf5F5HEBW-D5QnrWwTyh",
        "_score":0.5753642,
        "_source":{
          "description":"GAP Sub-window conn ONe-e: heve PP-BE Defined ASST requirem RV confsng, des MAN Imposed calcs mising"
        },
        "highlight":{
          "description":[
            "GAP Sub-window conn ONe-e: heve PP-BE Defined ASST requirem RV confsng, des MAN <em>Imposed</em> <em>calcs</em> mising"
          ]
        }
      }
    ]
    

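    【Discussion】:

    • Please see my updated answer. Do I need to update my search_synonym analyzer?
    • Thanks for the reply. Unfortunately, it doesn't work. I can't get rid of normalizer_1 because other fields are using it.
    • Then don't remove it; instead, add "lowercase", "asciifolding" to the analyzer's filter[].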