【Question title】: ElasticSearch multiple terms aggregation order
【Posted】: 2016-03-20 02:13:00
【Question】:

I have a document structure that describes a container. Some of its fields are:

containerId -> Unique Id, String
containerManufacturer -> String
containerValue -> Double
estContainerWeight -> Double
actualContainerWeight -> Double

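I want to run a search aggregation that has two levels of terms aggregations on the two weight fields, with the buckets in descending order of the weight fields, like this: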

{
  "size": 0,
  "aggs": {
    "by_manufacturer": {
      "terms": {
        "field": "containerManufacturer",
        "size": 10,
        "order": {"estContainerWeight": "desc"} //Cannot do this
      },
      "aggs": {
        "by_est_weight": {
          "terms": {
            "field": "estContainerWeight",
            "size": 10,
            "order": { "actualContainerWeight": "desc"} //Cannot do this
          },
          "aggs": {
            "by_actual_weight": {
              "terms": {
                "field": "actualContainerWeight",
                "size": 10
              },
              "aggs" : {
                "container_value_sum" : {"sum" : {"field" : "containerValue"}}
              }
            }
          }
        }
      }
    }
  }
}

Sample documents:

{"containerId":1,"containerManufacturer":"A","containerValue":12,"estContainerWeight":5.0,"actualContainerWeight":5.1}
{"containerId":2,"containerManufacturer":"A","containerValue":24,"estContainerWeight":5.0,"actualContainerWeight":5.2}
{"containerId":3,"containerManufacturer":"A","containerValue":23,"estContainerWeight":5.0,"actualContainerWeight":5.2}
{"containerId":4,"containerManufacturer":"A","containerValue":32,"estContainerWeight":6.0,"actualContainerWeight":6.2}
{"containerId":5,"containerManufacturer":"A","containerValue":26,"estContainerWeight":6.0,"actualContainerWeight":6.3}
{"containerId":6,"containerManufacturer":"A","containerValue":23,"estContainerWeight":6.0,"actualContainerWeight":6.2}

Expected output (incomplete):

{
  "by_manufacturer": {
    "buckets": [
      {
        "key": "A",
        "by_est_weight": {
          "buckets": [
            {
              "key" : 5.0,
              "by_actual_weight" : {
                "buckets" : [
                  {
                    "key" : 5.2,
                    "container_value_sum" : {
                      "value" : 1234 //Not actual sum
                    }
                  },
                  {
                    "key" : 5.1,
                    "container_value_sum" : {
                      "value" : 1234 //Not actual sum
                    }
                  }
                ]
              }
            },
            {
              "key" : 6.0,
              "by_actual_weight" : {
                "buckets" : [
                  {
                    "key" : 6.2,
                    "container_value_sum" : {
                      "value" : 1234 //Not actual sum
                    }
                  },
                  {
                    "key" : 6.3,
                    "container_value_sum" : {
                      "value" : 1234 //Not actual sum
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    ]
  }
}

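However, I cannot order by the nested aggregations. (Error: Terms buckets can only be sorted on a sub-aggregator path that is built out of zero or more single-bucket aggregations within the path and a final single-bucket or a metrics aggregation...)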
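As a rough illustration of the rule in that error message: a terms `order` key has to be a built-in key such as `_count` or `_term`, or a path to a single-bucket or metric sub-aggregation declared under the same aggregation. A raw field name such as `estContainerWeight` is neither. The Python sketch below builds the request body as a dict and checks it with `invalid_order_paths`, a hypothetical helper that only approximates the check Elasticsearch performs (it looks at the first path segment and nothing else). The `max_est_weight` sub-aggregation is an assumed name, introduced here to show one order path that fits the rule.

```python
BUILTIN_ORDER_KEYS = {"_count", "_term", "_key"}


def invalid_order_paths(aggs):
    """Return (agg_name, order_key) pairs whose order key is neither built in
    nor the name of a sub-aggregation declared directly under that aggregation.

    Simplified approximation: only the first segment of a path is checked.
    """
    problems = []
    for name, body in aggs.items():
        sub_aggs = body.get("aggs", {})
        order = body.get("terms", {}).get("order", {})
        for key in order:
            # For "a>b.value" style paths, look at "a" only.
            first_segment = key.split(">")[0].split(".")[0]
            if key not in BUILTIN_ORDER_KEYS and first_segment not in sub_aggs:
                problems.append((name, key))
        problems.extend(invalid_order_paths(sub_aggs))
    return problems


# Ordering by a raw field name, as attempted in the question.
rejected = {
    "by_manufacturer": {
        "terms": {
            "field": "containerManufacturer",
            "size": 10,
            "order": {"estContainerWeight": "desc"},
        },
    }
}

# Ordering by a metric sub-aggregation (max_est_weight is an assumed name).
accepted = {
    "by_manufacturer": {
        "terms": {
            "field": "containerManufacturer",
            "size": 10,
            "order": {"max_est_weight": "desc"},
        },
        "aggs": {"max_est_weight": {"max": {"field": "estContainerWeight"}}},
    }
}

print(invalid_order_paths(rejected))  # [('by_manufacturer', 'estContainerWeight')]
print(invalid_order_paths(accepted))  # []
```

Ordering by a `max` metric like this would rank manufacturers by their heaviest estimated weight; whether that matches the intended ranking depends on the data.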

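For example, for the sample output above, if I introduce a size on the terms aggregations (which I will have to do if my data is large), I have no control over which buckets are generated, so I want to get only the top N weights for each terms aggregation.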

Is there a way to do this?

【Comments】:

  • Could you post some sample documents, along with the aggregation output you expect for them?

Tags: elasticsearch


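【Solution 1】:

If I understand your question correctly, you want to sort the manufacturer terms in descending order of the containers' estimated weight, and then sort each "estimated weight" bucket in descending order of actual weight.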

{
  "size": 0,
  "aggs": {
    "by_manufacturer": {
      "terms": {
        "field": "containerManufacturer",
        "size": 10
      },
      "aggs": {
        "by_est_weight": {
          "terms": {
            "field": "estContainerWeight",
            "size": 10,
            "order": {
              "_term": "desc" // change to this
            }
          },
          "aggs": {
            "by_actual_weight": {
              "terms": {
                "field": "actualContainerWeight",
                "size": 10,
                "order": { "_term": "desc" } // change to this
              },
              "aggs": {
                "container_value_sum": {
                  "sum": {
                    "field": "containerValue"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
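To see what ordering by bucket key does to the sample documents from the question, here is a local Python simulation of the nested aggregation. It does not talk to Elasticsearch: `nested_top_buckets` and its helpers are hypothetical names that group the documents in memory, sort each level's keys in descending order, and keep the first `size` of them, which is the behaviour the `_term` order is expected to give within each parent bucket. Note that `_term` was deprecated in Elasticsearch 6.0 in favor of `_key`.

```python
from collections import defaultdict

# Sample documents copied from the question.
docs = [
    {"containerId": 1, "containerManufacturer": "A", "containerValue": 12, "estContainerWeight": 5.0, "actualContainerWeight": 5.1},
    {"containerId": 2, "containerManufacturer": "A", "containerValue": 24, "estContainerWeight": 5.0, "actualContainerWeight": 5.2},
    {"containerId": 3, "containerManufacturer": "A", "containerValue": 23, "estContainerWeight": 5.0, "actualContainerWeight": 5.2},
    {"containerId": 4, "containerManufacturer": "A", "containerValue": 32, "estContainerWeight": 6.0, "actualContainerWeight": 6.2},
    {"containerId": 5, "containerManufacturer": "A", "containerValue": 26, "estContainerWeight": 6.0, "actualContainerWeight": 6.3},
    {"containerId": 6, "containerManufacturer": "A", "containerValue": 23, "estContainerWeight": 6.0, "actualContainerWeight": 6.2},
]


def group_by(rows, field):
    """Group rows by the value of one field (one bucket per distinct value)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[field]].append(row)
    return groups


def top_keys_desc(groups, size):
    """Bucket keys in descending order, truncated to `size`."""
    return sorted(groups, reverse=True)[:size]


def nested_top_buckets(rows, size):
    """Manufacturer -> [(est weight, [(actual weight, sum of containerValue)])]."""
    result = {}
    for manufacturer, m_rows in group_by(rows, "containerManufacturer").items():
        est_groups = group_by(m_rows, "estContainerWeight")
        result[manufacturer] = []
        for est in top_keys_desc(est_groups, size):
            actual_groups = group_by(est_groups[est], "actualContainerWeight")
            actual_buckets = [
                (actual, sum(r["containerValue"] for r in actual_groups[actual]))
                for actual in top_keys_desc(actual_groups, size)
            ]
            result[manufacturer].append((est, actual_buckets))
    return result


print(nested_top_buckets(docs, size=2))
# {'A': [(6.0, [(6.3, 26), (6.2, 55)]), (5.0, [(5.2, 47), (5.1, 12)])]}
```

With `size=1` the same helper keeps only the heaviest estimated weight (6.0) and, inside it, the heaviest actual weight (6.3).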

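【Comments】:

  • This is very close to what I want, but is there no way to order the terms aggregations in descending order of the terms themselves? My problem is that, because the weight fields are double values, the number of unique buckets generated could be effectively infinite (or at least very large), so I want the terms aggregations to return only the top x weights (estimated and actual).
  • Would you mind updating your question with some sample values and describing the result you expect?
  • Updated with sample data.
  • Thanks. Why not simply order both aggregations by {"order": {"_term": "desc"}}?
  • OK, that seems to work. The documentation is a bit unclear on this, but does this mean that, for a terms aggregation, the _term order applies only to the specific terms being aggregated?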
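On that last comment: as far as the terms aggregation documentation describes it, `_term` sorts the buckets of the aggregation it is set on by their own key, so each level is ordered independently of its parent and children. The Python sketch below is a local simulation, not Elasticsearch itself, that contrasts this with the default order (document count, descending) when `size` cuts the list. `top_buckets` is a hypothetical helper, and breaking count ties by ascending key is an assumption of this sketch.

```python
from collections import Counter

# actualContainerWeight values of the six sample documents in the question.
actual_weights = [5.1, 5.2, 5.2, 6.2, 6.3, 6.2]


def top_buckets(values, size, order):
    """Return the first `size` (key, doc_count) buckets under the given order."""
    counts = Counter(values)
    if order == "count_desc":
        # Default terms order: most frequent first (ties by ascending key, assumed).
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    elif order == "key_desc":
        # Equivalent of {"_term": "desc"}: largest bucket key first.
        ranked = sorted(counts.items(), key=lambda kv: kv[0], reverse=True)
    else:
        raise ValueError(f"unknown order: {order}")
    return ranked[:size]


print(top_buckets(actual_weights, 2, "count_desc"))  # [(5.2, 2), (6.2, 2)]
print(top_buckets(actual_weights, 2, "key_desc"))    # [(6.3, 1), (6.2, 2)]
```

With the default order the heaviest weight (6.3) is dropped because it occurs only once; ordering by key keeps it.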