【Title】: should + distance_function in ElasticSearch
【Posted】: 2025-11-30 09:20:15
【Question】:

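I am trying to weight geographic proximity in the results returned by Elasticsearch. I want close proximity to matter less than certain fields (e.g. legal_name) but more than others.

From the documentation, it looks like the current way to do this is with distance_feature. However, the should clause I created never changes the results in any meaningful way. In fact, if I take out the must clauses, the score appears to be based on greater distance: higher scores go with farther distances. What I want, of course, is for close proximity to raise the score. Any suggestions on what I might be doing wrong are appreciated...

(Note: the 'coordinate' field is of type geo_point.)

Simplified document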

{
_index: "organizations",
_type: "_doc",
_id: "3",
_version: 9,
_seq_no: 16944,
_primary_term: 5,
found: true,
_source: {
   id: 3,
   legal_name: "Air Canada",
   operating_name: "Air Canada",
   ...
   coordinate: "43.85133,-79.36572",
}
}

Query

{
    "from": 0,
    "size": 100,
    "query": {
        "bool": {
            "must": [{
                "multi_match": {
                    "query": "Air Canada",
                    "fields": ["legal_name^7","operating_name^7","interest_areas^4","city^3", "description","state","country"
                    ]
                }},
                {"term" : { "organization_type.keyword": "Sponsor" }},
                {"term" : { "approved" : true }}
            ],
            "should": {
              "distance_feature": {
                "field": "coordinate",
                "pivot": "25km",
                "origin": [43.63, -79.3716],
                "boost": 5.0
              }
            }
        }
    }
}
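
One thing worth checking in the query above: Elasticsearch reads a geo_point written as an array in [lon, lat] order, whereas the string form stored in the document ("43.85133,-79.36572") is "lat,lon". The sketch below is plain Python with no Elasticsearch client. It estimates the distance_feature contribution as boost * pivot / (pivot + distance), the formula given in the Elasticsearch documentation, using a haversine great-circle distance. The helper names and the spherical-Earth radius are assumptions made for illustration, and Elasticsearch's own distance calculation may differ slightly.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean Earth radius, spherical model


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_feature_score(distance_km, pivot_km=25.0, boost=5.0):
    """Score contribution: boost * pivot / (pivot + distance)."""
    return boost * pivot_km / (pivot_km + distance_km)


# Document location from the question, stored as the string "lat,lon".
doc_lat, doc_lon = 43.85133, -79.36572

# Origin as probably intended: lat 43.63, lon -79.3716.
intended_km = haversine_km(doc_lat, doc_lon, 43.63, -79.3716)

# Origin if the array [43.63, -79.3716] is read as [lon, lat]:
# lon 43.63, lat -79.3716.
as_written_km = haversine_km(doc_lat, doc_lon, -79.3716, 43.63)

intended_score = distance_feature_score(intended_km)
as_written_score = distance_feature_score(as_written_km)

print(f"intended origin:   {intended_km:8.1f} km, score {intended_score:.3f}")
print(f"origin as written: {as_written_km:8.1f} km, score {as_written_score:.3f}")
```

With the intended origin the document sits roughly 25 km away, close to the pivot, so it would receive about half of the boost. Read as [lon, lat], the same numbers place the origin more than 10,000 km away and the contribution is close to zero, which would be consistent with the should clause never changing the ranking. If that is what happened here, writing the origin as `{ "lat": 43.63, "lon": -79.3716 }` or as `[-79.3716, 43.63]` should remove the ambiguity.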

【Comments】:

    Tags: elasticsearch


    【Solution 1】:

    In the end I gave up on distance_feature and used a gauss decay function instead.

    {
      "from": 0,
      "size": 20,
      "query": {
        "function_score": {
          "query": {
            "bool": {
              "must": [
                {
                  "multi_match": {
                    "query": "national bank",
                    "fields": ["legal_name^0.7", "operating_name^0.7", "interest_areas^0.4", "city^0.4", "description^0.4", "state^0.1", "country^0.1"]
                  }
                },
                {"term": {"organization_type.keyword": "Sponsor"}},
                {"term": {"approved": true}}
              ]
            }
          },
          "boost": "1",
          "boost_mode": "sum",
          "functions": [{
            "gauss": {
              "coordinate": {
                "origin": { "lat": 43.63, "lon": -79.3716 },
                "offset": "500km",
                "decay": 0.5,
                "scale": "100km"
              }
            },
            "weight": 1
          }]
        }
      }
    }
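
To see what the gauss parameters in this answer do to the score, here is a minimal Python sketch of the decay curve as described in the Elasticsearch function_score documentation: the value is 1.0 anywhere within `offset` of the origin and drops to `decay` at a distance of `offset + scale`. The function name and the choice to work in kilometres are assumptions for illustration; this is not Elasticsearch code.

```python
import math


def gauss_decay(distance_km, offset_km=500.0, scale_km=100.0, decay=0.5):
    """Gauss decay: 1.0 inside `offset`, equal to `decay` at offset + scale."""
    # sigma^2 is chosen so that the curve equals `decay` at `scale` past the offset.
    sigma_sq = -(scale_km ** 2) / (2.0 * math.log(decay))
    beyond = max(0.0, distance_km - offset_km)
    return math.exp(-(beyond ** 2) / (2.0 * sigma_sq))


# Print the decay value at a few sample distances from the origin.
for d in (0, 25, 500, 600, 800):
    print(f"{d:4d} km -> {gauss_decay(d):.3f}")
```

Under these settings every organization within 500 km of the origin gets the same full value, and with `"boost_mode": "sum"` and `"weight": 1` that adds at most 1 to the text score. If nearby results need to be told apart from each other, a smaller `offset` may be worth trying.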

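    【Comments】:

    • I noticed a performance drop when applying gauss to a time field. Also, in your original query, the base query with its three "must" clauses may dominate the final score, so the additional "should" clause barely moves the base score. You could try lowering the boosts inside "fields" (as you did in your answer).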