Postprocessing Elastic results with another search (migrating from Solr): answer

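【Question title】: Postprocessing Elastic results with another search (migrating from Solr)
【Posted】: 2026-02-18 06:05:03
【Question description】:

I am currently migrating an application from Solr to Elastic and stumbled over an interesting Solr feature that I cannot reproduce in Elastic: a query to Solr returns a postprocessing flag that performs a quality check on each result, indicating whether all query tokens were found in the result field.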

q  = some_field:(the brown fox)
fl = some_field, full_match:exists(query({!edismax v='some_field:(the brown fox)' mm='100%'}))

The Solr result looks like this:

{
    "response": {
        "docs": [
            {
                "some_field": "The Brown Bear",
                "full_match": false
            },
            {
                "some_field": "The Quick Brown Fox",
                "full_match": true
            }
        ]
    }
}

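The client uses this flag to further process the result documents, independently of the score (which I left out of the example). I found this clever, because it uses Solr's tokenization and distributed computing power instead of doing everything in the client.

In Elastic, I assume this should be done in the script_fields block, but I have no idea how to run a subquery from a Painless script, and after two days of investigation I doubt it is possible at all: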

{
    "query": {
        "match": {
            "some_field": "the brown fox"
        }
    },
    "_source": [
        "some_field"
    ],
    "script_fields": {
        "full_match": {
            "script": "???" <-- Search with Painless script?
        }
    }
}

Any creative ideas are welcome.

【Comments】:

Tags: elasticsearch solr elasticsearch-painless

【Solution 1】:

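How about using Elasticsearch's named queries together with the minimum_should_match parameter, set to 100%, so that the named clause matches only documents in which all tokens are found?

The response then tells you, for each hit, which named queries matched, so you can detect the documents where all tokens matched. You can also set "boost": 0 on that clause to avoid affecting the score of the main query.

Here is an example request: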

{
    "query": {
        "bool": {
            "should": [
                {
                    "match": {
                        "message": {
                            "query": "the brown fox",
                            "_name": "main_query"
                        }
                    }
                },
                {
                    "match": {
                        "message": {
                            "query": "the brown fox",
                            "_name": "all_tokens_match",
                            "minimum_should_match": "100%",
                            "boost": 0
                        }
                    }
                }
            ]
        }
    }
}
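
To reuse this request for other fields or query strings, the body can be built programmatically. The following is a minimal Python sketch under stated assumptions: the helper name build_full_match_query is hypothetical, and it only constructs the request dictionary shown above without contacting a cluster.

```python
import json


def build_full_match_query(field: str, text: str) -> dict:
    """Build the request body: a scoring clause plus a zero-boost, named 100% clause.

    Hypothetical helper for illustration; it only assembles a dict.
    """
    return {
        "query": {
            "bool": {
                "should": [
                    # Main clause: drives matching and scoring.
                    {"match": {field: {"query": text, "_name": "main_query"}}},
                    # Flag clause: matches only when every token is found.
                    # boost 0 keeps it from contributing to the score.
                    {
                        "match": {
                            field: {
                                "query": text,
                                "_name": "all_tokens_match",
                                "minimum_should_match": "100%",
                                "boost": 0,
                            }
                        }
                    },
                ]
            }
        },
        "_source": [field],
    }


if __name__ == "__main__":
    print(json.dumps(build_full_match_query("some_field", "the brown fox"), indent=4))
```

The resulting dictionary can be serialized to JSON and sent as the body of a search request with whichever client you already use.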

You will then get a response that looks roughly like this:

{
    "hits": [
        {
            "_score": 0.99938476,
            "_source": {
                "message": "The Quick Brown Fox"
            },
            "matched_queries": [
                "main_query",
                "all_tokens_match"
            ]
        },
        {
            "_score": 0.38727614,
            "_source": {
                "message": "The Brown Bear"
            },
            "matched_queries": [
                "main_query"
            ]
        }
    ]
}

Documents in which all query tokens matched will have all_tokens_match in the matched_queries section of their hit.

【Discussion】:
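
On the client side, the Solr-style full_match flag can be derived from matched_queries. The following is a minimal Python sketch, assuming the hits have the abbreviated shape shown above (in a real search response the hit list sits under hits.hits); the function name with_full_match is hypothetical.

```python
def with_full_match(hits: list) -> list:
    """Return each hit's _source with an added boolean full_match flag.

    Hypothetical helper: the flag is True when the named query
    "all_tokens_match" appears in the hit's matched_queries.
    """
    docs = []
    for hit in hits:
        doc = dict(hit.get("_source", {}))
        # matched_queries may be missing if no named query matched.
        doc["full_match"] = "all_tokens_match" in hit.get("matched_queries", [])
        docs.append(doc)
    return docs


# Sample hits copied from the abbreviated response above.
sample_hits = [
    {
        "_score": 0.99938476,
        "_source": {"message": "The Quick Brown Fox"},
        "matched_queries": ["main_query", "all_tokens_match"],
    },
    {
        "_score": 0.38727614,
        "_source": {"message": "The Brown Bear"},
        "matched_queries": ["main_query"],
    },
]

for doc in with_full_match(sample_hits):
    print(doc)
```

This reproduces the per-document flag from the Solr response while leaving tokenization and matching to Elasticsearch.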

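Thanks, works like a charm! Unlike in Solr, the flag is now computed in the search phase rather than in the fetch phase, but that does not seem to be a major performance problem.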