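[Posted]: 2021-09-19 07:42:45
[Question]:
I am using Elasticsearch 6.3 and working on an aggregation with several sub-aggregations, in which I need to sort the top-level aggregation buckets by the doc_count of a lower-level reverse_nested aggregation.
This is how my index is created: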
PUT /myindex
{
  "mappings": {
    "default": {
      "properties": {
        "items": {
          "type": "nested",
          "properties": {
            "subitems": {
              "type": "nested",
              "properties": {
                "id": {
                  "type": "long"
                },
                "name": {
                  "type": "keyword"
                }
              }
            }
          }
        },
        "name": {
          "type": "keyword"
        }
      }
    }
  }
}
These are the sample documents I indexed:
{
  "name": "Document #1",
  "items": [
    {
      "subitems": [
        {
          "id": 1,
          "name": "Subitem #1"
        },
        {
          "id": 2,
          "name": "Subitem #2"
        }
      ]
    },
    {
      "subitems": [
        {
          "id": 2,
          "name": "Subitem #2"
        },
        {
          "id": 3,
          "name": "Subitem #3"
        }
      ]
    }
  ]
}
{
  "name": "Document #2",
  "items": [
    {
      "subitems": [
        {
          "id": 2,
          "name": "Subitem #2"
        }
      ]
    }
  ]
}
{
  "name": "Document #3",
  "items": [
    {
      "subitems": [
        {
          "id": 3,
          "name": "Subitem #3"
        }
      ]
    },
    {
      "subitems": [
        {
          "id": 2,
          "name": "Subitem #2"
        }
      ]
    }
  ]
}
{
  "name": "Document #4",
  "items": [
    {
      "subitems": [
        {
          "id": 2,
          "name": "Subitem #2"
        },
        {
          "id": 5,
          "name": "Subitem #5"
        }
      ]
    }
  ]
}
{
  "name": "Document #5",
  "items": [
    {
      "subitems": [
        {
          "id": 2,
          "name": "Subitem #2"
        }
      ]
    },
    {
      "subitems": [
        {
          "id": 2,
          "name": "Subitem #2"
        }
      ]
    },
    {
      "subitems": [
        {
          "id": 2,
          "name": "Subitem #2"
        }
      ]
    },
    {
      "subitems": [
        {
          "id": 2,
          "name": "Subitem #2"
        }
      ]
    },
    {
      "subitems": [
        {
          "id": 2,
          "name": "Subitem #2"
        }
      ]
    },
    {
      "subitems": [
        {
          "id": 2,
          "name": "Subitem #2"
        }
      ]
    }
  ]
}
{
  "name": "Document #6",
  "items": [
    {
      "subitems": [
        {
          "id": 3,
          "name": "Subitem #3"
        }
      ]
    }
  ]
}
{
  "name": "Document #7",
  "items": [
    {
      "subitems": [
        {
          "id": 3,
          "name": "Subitem #3"
        }
      ]
    }
  ]
}
{
  "name": "Document #8",
  "items": [
    {
      "subitems": [
        {
          "id": 3,
          "name": "Subitem #3"
        }
      ]
    }
  ]
}
{
  "name": "Document #9",
  "items": [
    {
      "subitems": [
        {
          "id": 3,
          "name": "Subitem #3"
        }
      ]
    }
  ]
}
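I need my aggregation to return the number of documents that contain each subitem id/name pair (assume that a given subitem id always corresponds to the same subitem name). That is: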
id | name | count
---+------------+------
2 | Subitem #2 | 5
3 | Subitem #3 | 6
1 | Subitem #1 | 1
5 | Subitem #5 | 1
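As a sanity check on the table above, the expected counts can be reproduced outside Elasticsearch. This is only a sketch in plain Python: the `docs` literal is a condensed, hand-written encoding of the nine sample documents (each document is a list of items, each item a list of subitem ids), and `count_parent_docs` and `count_nested_subitems` are hypothetical helper names, not part of any Elasticsearch client.

```python
from collections import Counter

# Condensed form of the nine sample documents: document -> items -> subitem ids.
docs = {
    "Document #1": [[1, 2], [2, 3]],
    "Document #2": [[2]],
    "Document #3": [[3], [2]],
    "Document #4": [[2, 5]],
    "Document #5": [[2], [2], [2], [2], [2], [2]],
    "Document #6": [[3]],
    "Document #7": [[3]],
    "Document #8": [[3]],
    "Document #9": [[3]],
}


def count_parent_docs(docs):
    """Count top-level documents that contain each subitem id at least once."""
    counts = Counter()
    for items in docs.values():
        # Deduplicate within one document so repeated subitems count only once.
        counts.update({sid for item in items for sid in item})
    return counts


def count_nested_subitems(docs):
    """Count individual nested subitem objects per id, repeats included."""
    return Counter(sid for items in docs.values() for item in items for sid in item)


parent_counts = count_parent_docs(docs)
nested_counts = count_nested_subitems(docs)
print(dict(parent_counts))
print(dict(nested_counts))
```

For id 2 the two numbers differ (5 parent documents versus 11 nested subitem objects), which is why sorting on one count does not give the same order as sorting on the other.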
This is the original aggregation query:
GET /myindex/default/_search
{
  "size": 0,
  "aggregations": {
    "my_nested_agg": {
      "nested": {
        "path": "items.subitems"
      },
      "aggregations": {
        "subitem_id": {
          "terms": {
            "field": "items.subitems.id"
          },
          "aggregations": {
            "subitem_name": {
              "terms": {
                "field": "items.subitems.name"
              },
              "aggregations": {
                "my_rev_agg": {
                  "reverse_nested": {}
                }
              }
            }
          }
        }
      }
    }
  }
}
The aggregation appears to return all the data I need:
{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 9,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "my_nested_agg": {
      "doc_count": 19,
      "subitem_id": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          {
            "key": 2,
            "doc_count": 11,
            "subitem_name": {
              "doc_count_error_upper_bound": 0,
              "sum_other_doc_count": 0,
              "buckets": [
                {
                  "key": "Subitem #2",
                  "doc_count": 11,
                  "my_rev_agg": {
                    "doc_count": 5
                  }
                }
              ]
            }
          },
          {
            "key": 3,
            "doc_count": 6,
            "subitem_name": {
              "doc_count_error_upper_bound": 0,
              "sum_other_doc_count": 0,
              "buckets": [
                {
                  "key": "Subitem #3",
                  "doc_count": 6,
                  "my_rev_agg": {
                    "doc_count": 6
                  }
                }
              ]
            }
          },
          {
            "key": 1,
            "doc_count": 1,
            "subitem_name": {
              "doc_count_error_upper_bound": 0,
              "sum_other_doc_count": 0,
              "buckets": [
                {
                  "key": "Subitem #1",
                  "doc_count": 1,
                  "my_rev_agg": {
                    "doc_count": 1
                  }
                }
              ]
            }
          },
          {
            "key": 5,
            "doc_count": 1,
            "subitem_name": {
              "doc_count_error_upper_bound": 0,
              "sum_other_doc_count": 0,
              "buckets": [
                {
                  "key": "Subitem #5",
                  "doc_count": 1,
                  "my_rev_agg": {
                    "doc_count": 1
                  }
                }
              ]
            }
          }
        ]
      }
    }
  }
}
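However, the buckets are sorted in descending order by the doc_count of the "subitem_id" sub-aggregation, that is, by the number of nested subitem objects.
Instead, I need the buckets sorted in descending order by the doc_count of the reverse_nested sub-aggregation, like this: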
id | name | count
---+------------+------
3 | Subitem #3 | 6
2 | Subitem #2 | 5
1 | Subitem #1 | 1
5 | Subitem #5 | 1
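As a stopgap, this order can be produced client-side from the response of the original query. The following is a minimal Python sketch, assuming the response has already been parsed into a dict. The `response` literal is trimmed to the fields actually used, and `rows_by_parent_count` is a hypothetical helper name. It is only correct when the terms aggregation returns every bucket, as it does here with four ids, which is why I would still prefer to sort inside Elasticsearch.

```python
def _bucket(key, name, nested_count, parent_count):
    """Build one trimmed subitem_id bucket in the shape of the response shown earlier."""
    return {
        "key": key,
        "doc_count": nested_count,
        "subitem_name": {
            "buckets": [
                {
                    "key": name,
                    "doc_count": nested_count,
                    "my_rev_agg": {"doc_count": parent_count},
                }
            ]
        },
    }


# Trimmed copy of the aggregation response, in the order Elasticsearch returned it.
response = {
    "aggregations": {
        "my_nested_agg": {
            "doc_count": 19,
            "subitem_id": {
                "buckets": [
                    _bucket(2, "Subitem #2", 11, 5),
                    _bucket(3, "Subitem #3", 6, 6),
                    _bucket(1, "Subitem #1", 1, 1),
                    _bucket(5, "Subitem #5", 1, 1),
                ]
            },
        }
    }
}


def rows_by_parent_count(response):
    """Flatten to (id, name, count) rows sorted by the reverse_nested doc_count."""
    buckets = response["aggregations"]["my_nested_agg"]["subitem_id"]["buckets"]
    rows = [
        (id_bucket["key"], name_bucket["key"], name_bucket["my_rev_agg"]["doc_count"])
        for id_bucket in buckets
        for name_bucket in id_bucket["subitem_name"]["buckets"]
    ]
    # sorted() is stable, so rows with equal counts keep their returned order.
    return sorted(rows, key=lambda row: row[2], reverse=True)


for row in rows_by_parent_count(response):
    print(row)
```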
I tried to achieve this with the following query:
GET /myindex/default/_search
{
  "size": 0,
  "aggregations": {
    "my_nested_agg": {
      "nested": {
        "path": "items.subitems"
      },
      "aggregations": {
        "subitem_id": {
          "terms": {
            "field": "items.subitems.id",
            "order": [
              {
                "subitem_name>my_rev_agg._count": "desc"
              }
            ]
          },
          "aggregations": {
            "subitem_name": {
              "terms": {
                "field": "items.subitems.name"
              },
              "aggregations": {
                "my_rev_agg": {
                  "reverse_nested": {}
                }
              }
            }
          }
        }
      }
    }
  }
}
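Then I get the error:
Invalid aggregation order path [subitem_name>my_rev_agg._count]. Buckets can only be sorted on a sub-aggregator path that is built out of zero or more single-bucket aggregations within the path and a final single-bucket or a metrics aggregation at the path end. Sub-path [subitem_name] points to non single-bucket aggregation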
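To make sure I am reading the rule in that message correctly, here is a small Python sketch that walks an order path through a tree of sub-aggregations and reports the first element that breaks the rule. It is my own illustration, not Elasticsearch code: `first_invalid_step` and `agg_type` are hypothetical helpers, the two type sets are partial lists that I assume are sufficient for this query, and any type in neither set is treated as a metrics aggregation.

```python
# Aggregation types that produce exactly one bucket (partial list, an assumption).
SINGLE_BUCKET_TYPES = {"filter", "global", "missing", "nested", "reverse_nested"}
# Aggregation types that produce many buckets (partial list, an assumption).
MULTI_BUCKET_TYPES = {"terms", "histogram", "date_histogram", "range", "filters"}


def agg_type(agg_body):
    """Return the type key of one aggregation definition, skipping sub-aggregations."""
    return next(k for k in agg_body if k not in ("aggregations", "aggs"))


def first_invalid_step(sub_aggs, order_path):
    """Return the first path element that violates the rule, or None if none does.

    Rule as stated in the error message: every element inside the path must be a
    single-bucket aggregation; the last one may be single-bucket or a metric.
    """
    names = order_path.split(">")
    names[-1] = names[-1].split(".")[0]  # drop a trailing "._count" style suffix
    for index, name in enumerate(names):
        body = sub_aggs[name]
        kind = agg_type(body)
        is_last = index == len(names) - 1
        if kind in MULTI_BUCKET_TYPES:
            return name
        if not is_last and kind not in SINGLE_BUCKET_TYPES:
            return name
        sub_aggs = body.get("aggregations", body.get("aggs", {}))
    return None


# Sub-aggregations of "subitem_id", copied from the failing query.
sub_aggs_of_subitem_id = {
    "subitem_name": {
        "terms": {"field": "items.subitems.name"},
        "aggregations": {"my_rev_agg": {"reverse_nested": {}}},
    }
}

print(first_invalid_step(sub_aggs_of_subitem_id, "subitem_name>my_rev_agg._count"))
```

The sketch flags `subitem_name`, the same sub-path the error names, because it is a terms aggregation and therefore multi-bucket.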
Any advice would be appreciated. Thank you very much.
[Comments]:
Tags: elasticsearch nested reverse aggregation