Why is queryWeight included for some result scores, but not others, in the same query? Answers

【Question title】: Why is queryWeight included for some result scores, but not others, in the same query?
【Posted】: 2014-02-08 10:32:18
【Question】:

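I'm executing a query_string query across multiple fields, _all and tags.name, and trying to understand the scoring. Query: {"query":{"query_string":{"query":"animal","fields":["_all","tags.name"]}}}. Here are the documents returned by the query: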

- Document 1 has an exact match on tags.name, but not on _all.
- Document 8 has an exact match on both tags.name and _all.

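Document 8 should win, and it does, but I'm confused by how the scores work out. It looks like Document 1 is penalized by having its tags.name score multiplied by the IDF twice, whereas Document 8's tags.name score is multiplied by the IDF only once. In short: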

- Both have a component weight(tags.name:animal in 0) [PerFieldSimilarity].
- In Document 1, we have weight = score = queryWeight x fieldWeight.
- In Document 8, we have weight = fieldWeight!

Since queryWeight contains idf, and fieldWeight contains it as well, Document 1 ends up being penalized by its idf twice.

Can anyone make sense of this?
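The two shapes of the tags.name explanation can be reproduced with a few lines of arithmetic. This is a minimal sketch that plugs in the values from the explain output further down; the function names are made up for illustration, and it makes no claim about why Lucene reports one form rather than the other.

```python
IDF = 0.30685282   # idf(docFreq=1, maxDocs=1), copied from the explain output
QUERY_NORM = 1.0   # queryNorm, copied from the explain output


def field_weight(tf, idf, field_norm):
    """fieldWeight = tf * idf * fieldNorm, as listed in the explain tree."""
    return tf * idf * field_norm


def full_score(tf, idf, field_norm, query_norm):
    """score = queryWeight * fieldWeight, where queryWeight = idf * queryNorm."""
    query_weight = idf * query_norm
    return query_weight * field_weight(tf, idf, field_norm)


# Document 1: tags.name is reported as queryWeight x fieldWeight (idf appears twice).
doc1 = full_score(tf=1.0, idf=IDF, field_norm=0.625, query_norm=QUERY_NORM)

# Document 8: tags.name is reported as fieldWeight only (idf appears once).
doc8 = field_weight(tf=1.0, idf=IDF, field_norm=0.5)

print(f"doc 1 tags.name: {doc1:.9f}")
print(f"doc 8 tags.name: {doc8:.9f}")
```

Both results agree with the values in the explain output to within rounding, so the extra factor in Document 1 is exactly idf x queryNorm.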

Additional information

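- If I remove _all from the query's fields, queryWeight disappears from the explain output entirely.
- Adding "use_dis_max":true as an option has no effect.
  - However, additionally adding "tie_breaker":0.7 (or any value) does affect Document 8: it then uses the more complicated formula we see in Document 1.
- Thoughts: it seems plausible that a boolean query (which this is) might do this on purpose, to give more weight to documents that match more than one sub-query. However, that makes no sense for a dis_max query, which is supposed to return just the maximum of its sub-queries.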
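For reference, the combination rule that a dis_max query is documented to use can be sketched as follows: the best sub-query score, plus tie_breaker times the sum of the remaining sub-query scores. The helper name is hypothetical, and the sub-scores are taken from the Document 8 explain output purely as sample inputs. Since setting tie_breaker reportedly also changes how those sub-scores themselves are computed, this does not predict the actual score.

```python
def dis_max_score(sub_scores, tie_breaker=0.0):
    """Combine sub-query scores the way a disjunction-max query does:
    max(sub_scores) + tie_breaker * (sum of the other sub_scores)."""
    if not sub_scores:
        return 0.0
    best = max(sub_scores)
    return best + tie_breaker * (sum(sub_scores) - best)


# Sample inputs: the _all and tags.name sub-scores shown for Document 8.
sub_scores = [0.033902764, 0.15342641]

print(dis_max_score(sub_scores))                   # tie_breaker 0: plain maximum
print(dis_max_score(sub_scores, tie_breaker=0.7))  # other matches add 70% of their score
```

With tie_breaker left at 0 the result is just the maximum, which is what the "max of:" node in the explain output shows.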

Here are the relevant explain requests. Look for the embedded comments.

Document 1 (matches tags.name only):

curl -XGET 'http://localhost:9200/questions/question/1/_explain?pretty' -d '{"query":{"query_string":{"query":"animal","fields":["_all","tags.name"]}}}':

{
  "ok" : true,
  "_index" : "questions_1390104463",
  "_type" : "question",
  "_id" : "1",
  "matched" : true,
  "explanation" : {
    "value" : 0.058849156,
    "description" : "max of:",
    "details" : [ {
      "value" : 0.058849156,
      "description" : "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:",
      // weight = score = queryWeight x fieldWeight
      "details" : [ {
        // score and queryWeight are NOT a part of the other explain!
        "value" : 0.058849156,
        "description" : "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:",
        "details" : [ {
          "value" : 0.30685282,
          "description" : "queryWeight, product of:",
          "details" : [ {
            // This idf is NOT a part of the other explain!
            "value" : 0.30685282,
            "description" : "idf(docFreq=1, maxDocs=1)"
          }, {
            "value" : 1.0,
            "description" : "queryNorm"
          } ]
        }, {
          "value" : 0.19178301,
          "description" : "fieldWeight in 0, product of:",
          "details" : [ {
            "value" : 1.0,
            "description" : "tf(freq=1.0), with freq of:",
            "details" : [ {
              "value" : 1.0,
              "description" : "termFreq=1.0"
            } ]
          }, {
            "value" : 0.30685282,
            "description" : "idf(docFreq=1, maxDocs=1)"
          }, {
            "value" : 0.625,
            "description" : "fieldNorm(doc=0)"
          } ]
        } ]
      } ]
    } ]
  }
}
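The idf and tf values in this output are consistent with Lucene's classic TF-IDF similarity, where idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq). The sketch below assumes that this similarity is in effect (the output does not say which one PerFieldSimilarity delegates to) and recomputes the numbers; the function names are illustrative.

```python
import math


def classic_idf(doc_freq, max_docs):
    """idf as defined by Lucene's classic TF-IDF similarity (assumed here)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq):
    """tf under the same similarity: the square root of the frequency."""
    return math.sqrt(freq)


idf = classic_idf(doc_freq=1, max_docs=1)
tf = classic_tf(1.0)
field_norm = 0.625  # fieldNorm(doc=0), copied from the explain output

print(f"idf(docFreq=1, maxDocs=1) = {idf:.8f}")
print(f"tf(freq=1.0)              = {tf:.1f}")
print(f"fieldWeight               = {tf * idf * field_norm:.8f}")
```

With a single document in the shard, docFreq + 1 exceeds maxDocs, so the logarithm is negative and idf drops below 1. That is why multiplying by idf a second time lowers the score.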

Document 8 (matches both _all and tags.name):

curl -XGET 'http://localhost:9200/questions/question/8/_explain?pretty' -d '{"query":{"query_string":{"query":"animal","fields":["_all","tags.name"]}}}':

{
  "ok" : true,
  "_index" : "questions_1390104463",
  "_type" : "question",
  "_id" : "8",
  "matched" : true,
  "explanation" : {
    "value" : 0.15342641,
    "description" : "max of:",
    "details" : [ {
      "value" : 0.033902764,
      "description" : "btq, product of:",
      "details" : [ {
        "value" : 0.033902764,
        "description" : "weight(_all:anim in 0) [PerFieldSimilarity], result of:",
        "details" : [ {
          "value" : 0.033902764,
          "description" : "fieldWeight in 0, product of:",
          "details" : [ {
            "value" : 0.70710677,
            "description" : "tf(freq=0.5), with freq of:",
            "details" : [ {
              "value" : 0.5,
              "description" : "phraseFreq=0.5"
            } ]
          }, {
            "value" : 0.30685282,
            "description" : "idf(docFreq=1, maxDocs=1)"
          }, {
            "value" : 0.15625,
            "description" : "fieldNorm(doc=0)"
          } ]
        } ]
      }, {
        "value" : 1.0,
        "description" : "allPayload(...)"
      } ]
    }, {
      "value" : 0.15342641,
      "description" : "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:",
      // weight = fieldWeight
      // No score or queryWeight in sight!
      "details" : [ {
        "value" : 0.15342641,
        "description" : "fieldWeight in 0, product of:",
        "details" : [ {
          "value" : 1.0,
          "description" : "tf(freq=1.0), with freq of:",
          "details" : [ {
            "value" : 1.0,
            "description" : "termFreq=1.0"
          } ]
        }, {
          "value" : 0.30685282,
          "description" : "idf(docFreq=1, maxDocs=1)"
        }, {
          "value" : 0.5,
          "description" : "fieldNorm(doc=0)"
        } ]
      } ]
    } ]
  }
}
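One way to sanity-check an explain tree like the two above is to walk it and confirm that every "product of:" node equals the product of its children and every "max of:" node equals their maximum. The sketch below does this for Document 8's explanation, re-entered by hand as Python data with the // comments dropped. The helpers `n` and `check` are illustrative names, not part of any Elasticsearch client.

```python
import math


def n(value, description, *details):
    """Build one explain node; details are child nodes."""
    node = {"value": value, "description": description}
    if details:
        node["details"] = list(details)
    return node


def check(node, tol=1e-6):
    """Return (description, expected, actual) for every node whose value
    disagrees with what its description says about its children."""
    details = node.get("details", [])
    values = [d["value"] for d in details]
    desc = node["description"]
    expected = None
    if values and desc.endswith("product of:"):
        expected = math.prod(values)
    elif values and desc.endswith("max of:"):
        expected = max(values)
    elif len(values) == 1 and desc.endswith("result of:"):
        expected = values[0]
    problems = []
    if expected is not None and abs(expected - node["value"]) > tol:
        problems.append((desc, expected, node["value"]))
    for child in details:
        problems.extend(check(child, tol))
    return problems


doc8 = n(0.15342641, "max of:",
    n(0.033902764, "btq, product of:",
        n(0.033902764, "weight(_all:anim in 0) [PerFieldSimilarity], result of:",
            n(0.033902764, "fieldWeight in 0, product of:",
                n(0.70710677, "tf(freq=0.5), with freq of:", n(0.5, "phraseFreq=0.5")),
                n(0.30685282, "idf(docFreq=1, maxDocs=1)"),
                n(0.15625, "fieldNorm(doc=0)"))),
        n(1.0, "allPayload(...)")),
    n(0.15342641, "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:",
        n(0.15342641, "fieldWeight in 0, product of:",
            n(1.0, "tf(freq=1.0), with freq of:", n(1.0, "termFreq=1.0")),
            n(0.30685282, "idf(docFreq=1, maxDocs=1)"),
            n(0.5, "fieldNorm(doc=0)"))))

problems = check(doc8)
print(problems if problems else "all product/max nodes are consistent")
```

The check only confirms that the tree is internally consistent. It cannot say why the queryWeight factor is missing from Document 8's tags.name branch.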

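【Question discussion】:

Hi, did you ever find the answer yourself? Or do you have any material I could study? I'm suffering from the same lack of understanding. In our case this seriously affects some hits, and I need to understand why, and how to adjust our queries.
No, unfortunately I never found an answer. I'd love to know if you hear anything.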

Tags: elasticsearch lucene

【Answer 1】:

I don't have an answer. I just want to mention that I posted the question on the Elasticsearch forum: https://groups.google.com/forum/#!topic/elasticsearch/xBKlFkq0SP0 I'll post an update here when I get an answer.

【Discussion】: