How to retrieve the field which triggered a hit for an Elasticsearch query

【Question title】: How to retrieve the field which triggered a hit for an Elasticsearch query
【Posted】: 2017-08-15 09:39:17
【Question】:

I run a Wagtail site (1.11) that uses Elasticsearch (5.5) as the search backend and indexes several fields, for example:

search_fields = Page.search_fields + [
    index.SearchField('body'),
    index.SearchField('get_post_type_display'),
    index.SearchField('document_excerpt', boost=2),
    index.SearchField('get_dark_data_full_text'),
]

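In my search results template I would like to indicate which field the search "hit" occurred in (or, even better, show a snippet of the hit, but that seems to be a separate question).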

This question seems to address my problem, but I don't know how to integrate it into my Wagtail site.

Any hints on how to get this information and how to integrate it into the Wagtail search?

【Question comments】:

Tags: django elasticsearch wagtail

【Solution 1】:

Elasticsearch has an Explain API, which explains how it internally scores the field hits for a specific record with a specific id.

Here is the documentation:

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-explain.html

It will definitely tell you how each field is boosted and how the score is built up.

For example, if the max_score of your hits is 2.0588222 and you want to know which fields contributed to that score, you can use the Explain API.
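As a rough sketch of what such a call could look like from Python, the snippet below builds (but does not send) an Explain request using only the standard library. It assumes the Elasticsearch 5.x URL layout `/{index}/{type}/{id}/_explain`; newer versions use a different path. The helper name `build_explain_request`, the `http://localhost:9200` address, and the `multi_match` query are illustrative assumptions, not something taken from the original setup; the index, type, and id are taken from the sample response in this answer.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_explain_request(base_url, index, doc_type, doc_id, query):
    """Build (but do not send) an Explain API request for a single document.

    Assumes the Elasticsearch 5.x URL layout: /{index}/{type}/{id}/_explain.
    """
    url = "{}/{}/{}/{}/_explain".format(
        base_url.rstrip("/"),
        quote(index, safe=""),
        quote(doc_type, safe=""),
        quote(doc_id, safe=""),
    )
    # The body carries the same query you would send to _search.
    body = json.dumps({"query": query}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical query: the search terms and the field list are assumptions.
query = {"multi_match": {"query": "merge doe", "fields": ["firstName", "lastName"]}}
request = build_explain_request(
    "http://localhost:9200",  # assumed local node
    "customer_test",
    "customer",
    "597f2b3a79c404fafefcd46e",
    query,
)
print(request.get_method(), request.full_url)
```

To actually run it against a node, pass `request` to `urllib.request.urlopen` and decode the body with `json.load`.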

Here is an example of an explain query response, in which you can see that the field firstName contributed 1.2321436 to the max score and lastName contributed 0.8266786:

{
  "_index" : "customer_test",
  "_type" : "customer",
  "_id" : "597f2b3a79c404fafefcd46e",
  "matched" : true,
  "explanation" : {
    "value" : 2.0588222,
    "description" : "sum of:",
    "details" : [ {
      "value" : 2.0588222,
      "description" : "sum of:",
      "details" : [ {
        "value" : 1.2321436,
        "description" : "weight(firstName:merge in 23) [PerFieldSimilarity], result of:",
        "details" : [ {
          "value" : 1.2321436,
          "description" : "score(doc=23,freq=1.0 = termFreq=1.0\n), product of:",
          "details" : [ {
            "value" : 1.2321436,
            "description" : "idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:",
            "details" : [ {
              "value" : 3.0,
              "description" : "docFreq",
              "details" : [ ]
            }, {
              "value" : 11.0,
              "description" : "docCount",
              "details" : [ ]
            } ]
          }, {
            "value" : 1.0,
            "description" : "tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:",
            "details" : [ {
              "value" : 1.0,
              "description" : "termFreq=1.0",
              "details" : [ ]
            }, {
              "value" : 1.2,
              "description" : "parameter k1",
              "details" : [ ]
            }, {
              "value" : 0.75,
              "description" : "parameter b",
              "details" : [ ]
            }, {
              "value" : 1.0,
              "description" : "avgFieldLength",
              "details" : [ ]
            }, {
              "value" : 1.0,
              "description" : "fieldLength",
              "details" : [ ]
            } ]
          } ]
        } ]
      }, {
        "value" : 0.8266786,
        "description" : "weight(lastName:doe in 23) [PerFieldSimilarity], result of:",
        "details" : [ {
          "value" : 0.8266786,
          "description" : "score(doc=23,freq=1.0 = termFreq=1.0\n), product of:",
          "details" : [ {
            "value" : 0.8266786,
            "description" : "idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:",
            "details" : [ {
              "value" : 3.0,
              "description" : "docFreq",
              "details" : [ ]
            }, {
              "value" : 7.0,
              "description" : "docCount",
              "details" : [ ]
            } ]
          }, {
            "value" : 1.0,
            "description" : "tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:",
            "details" : [ {
              "value" : 1.0,
              "description" : "termFreq=1.0",
              "details" : [ ]
            }, {
              "value" : 1.2,
              "description" : "parameter k1",
              "details" : [ ]
            }, {
              "value" : 0.75,
              "description" : "parameter b",
              "details" : [ ]
            }, {
              "value" : 1.0,
              "description" : "avgFieldLength",
              "details" : [ ]
            }, {
              "value" : 1.0,
              "description" : "fieldLength",
              "details" : [ ]
            } ]
          } ]
        } ]
      } ]
    }, {
      "value" : 0.0,
      "description" : "match on required clause, product of:",
      "details" : [ {
        "value" : 0.0,
        "description" : "# clause",
        "details" : [ ]
      }, {
        "value" : 1.0,
        "description" : "_type:customer, product of:",
        "details" : [ {
          "value" : 1.0,
          "description" : "boost",
          "details" : [ ]
        }, {
          "value" : 1.0,
          "description" : "queryNorm",
          "details" : [ ]
        } ]
      } ]
    } ]
  }
}

Regarding Wagtail: I have no experience with it. But you can definitely access the REST API and parse the JSON of the Explain query.
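Parsing that JSON could look roughly like the following sketch. It walks the `explanation` tree and sums the `value` of every node whose description starts with `weight(<field>:`, which is the shape seen in the response above. The function name `field_contributions` is made up for illustration, and the approach is a simplification: it ignores how parent nodes combine their children (for instance a "max of:" node would only count the best field), so treat the result as a hint about which fields matched rather than an exact score breakdown.

```python
import re
from collections import defaultdict

# Matches descriptions such as
# "weight(firstName:merge in 23) [PerFieldSimilarity], result of:"
WEIGHT_RE = re.compile(r"^weight\((?P<field>[^:()]+):")


def field_contributions(explanation):
    """Sum the score of every weight(<field>:...) node, grouped by field name."""
    totals = defaultdict(float)

    def walk(node):
        match = WEIGHT_RE.match(node.get("description", ""))
        if match:
            # A weight(...) node already holds the score of one term in one
            # field, so there is no need to descend into its details.
            totals[match.group("field")] += node["value"]
            return
        for child in node.get("details", []):
            walk(child)

    walk(explanation)
    return dict(totals)


# Trimmed-down version of the response above (leaf details omitted).
response = {
    "matched": True,
    "explanation": {
        "value": 2.0588222,
        "description": "sum of:",
        "details": [
            {
                "value": 1.2321436,
                "description": "weight(firstName:merge in 23) [PerFieldSimilarity], result of:",
                "details": [],
            },
            {
                "value": 0.8266786,
                "description": "weight(lastName:doe in 23) [PerFieldSimilarity], result of:",
                "details": [],
            },
        ],
    },
}

contributions = field_contributions(response["explanation"])
best_field = max(contributions, key=contributions.get)
print(contributions, best_field)
```

The field names returned this way are the names stored in the index, which may differ from the model attribute names declared in search_fields.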

【Comments】:

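That sounds great @gil.fernandes, but I don't know how to use this feature with Wagtail's built-in search and Elasticsearch as the backend. If anyone can, I would be glad to be pointed to an example implementation.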