【Question title】: Sum and count aggregations over Elasticsearch fields
【Posted】: 2018-06-26 19:32:42
【Question】:

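I am new to Elasticsearch and want to run some aggregations on fields in an Elasticsearch 5.x index. I have an index whose documents contain the fields langs (which has a nested structure) and docLang. Both fields are dynamically mapped. Here are some sample documents.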

Document 1:

{
   "_index":"A",
   "_type":"document",
   "_id":"1",
   "_source":{
      "text":"This is a test sentence.",
      "langs":{
         "X":{
            "en":1,
            "es":2,
            "zh":3
         },
        "Y":{
            "en":4,
            "es":5,
            "zh":6
         } 
      },
      "docLang": "en"
   }
}

Document 2:

{
   "_index":"A",
   "_type":"document",
   "_id":"2",
   "_source":{
      "text":"This is a test sentence.",
      "langs":{
         "X":{
            "en":1,
            "es":2
         },
         "Y":{
            "en":3,
            "es":4
         } 
      },
      "docLang": "es"
   }
}

Document 3:

{
   "_index":"A",
   "_type":"document",
   "_id":"3",
   "_source":{
      "text":"This is a test sentence.",
      "langs":{
         "X":{
            "en":1
         },
         "Y":{
            "en":2
         } 
      },
      "docLang": "en"
   }
}

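I want to run a sum aggregation on the langs field so that, for each key (X/Y) and each language, I get the total across all documents in the index. I also want to get the number of documents per language from the docLang field.

For example, for the 3 documents above, the sum aggregation over the langs field would look like this: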

"langs":{  
      "X":{  
         "en":3,
         "es":4,
         "zh":3
      },
      "Y":{  
         "en":9,
         "es":9,
         "zh":6
      }
   }

The docLang counts would look like this:

 "docLang":{
    "en" : 2,
    "es" : 1
   }

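Also, due to some production environment constraints, I cannot use scripts in Elasticsearch. So I would like to know whether this can be done for the fields above using only field-based aggregations.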
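
To make the expected output concrete, here is a minimal Python sketch that computes the same totals client-side from the three sample _source bodies. It only checks the arithmetic shown above; it is not a substitute for an Elasticsearch aggregation, and the helper name summarize is made up for illustration.

```python
from collections import Counter, defaultdict

# The three sample _source bodies from the question (text field omitted).
docs = [
    {"langs": {"X": {"en": 1, "es": 2, "zh": 3},
               "Y": {"en": 4, "es": 5, "zh": 6}}, "docLang": "en"},
    {"langs": {"X": {"en": 1, "es": 2},
               "Y": {"en": 3, "es": 4}}, "docLang": "es"},
    {"langs": {"X": {"en": 1},
               "Y": {"en": 2}}, "docLang": "en"},
]


def summarize(docs):
    """Sum langs.<key>.<lang> over all docs and count docs per docLang."""
    sums = defaultdict(Counter)
    counts = Counter()
    for doc in docs:
        for key, per_lang in doc["langs"].items():
            # Counter.update with a mapping adds the values key by key.
            sums[key].update(per_lang)
        counts[doc["docLang"]] += 1
    return {
        "langs": {key: dict(total) for key, total in sums.items()},
        "docLang": dict(counts),
    }


result = summarize(docs)
print(result)
```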

【Question comments】:

    Tags: elasticsearch elasticsearch-plugin elasticsearch-5


    【Solution 1】:
    {
      "size": 0,
      "aggs": {
        "X": {
          "nested": {
            "path": "langs.X"
          },
          "aggs": {
            "X_sum_en": {
              "sum": {
                "field": "langs.X.en"
              }
            },
            "X_sum_es": {
              "sum": {
                "field": "langs.X.es"
              }
            },
            "X_sum_zh": {
              "sum": {
                "field": "langs.X.zh"
              }
            }
          }
        },
        "Y": {
          "nested": {
            "path": "langs.Y"
          },
          "aggs": {
            "Y_sum_en": {
              "sum": {
                "field": "langs.Y.en"
              }
            },
            "Y_sum_es": {
              "sum": {
                "field": "langs.Y.es"
              }
            },
            "Y_sum_zh": {
              "sum": {
                "field": "langs.Y.zh"
              }
            }
          }
        },
        "sum_docLang": {
          "terms": {
            "field": "docLang.keyword",
            "size": 10
          }
        }
      }
    }
    

    You did not mention the mapping, but I think it matters. For the query above, I mapped X and Y as nested fields:

        "langs": {
          "properties": {
            "X": {
              "type": "nested",
              "properties": {
                "en": {
                  "type": "long"
                },
                "es": {
                  "type": "long"
                },
                "zh": {
                  "type": "long"
                }
              }
            },
            "Y": {
              "type": "nested",
              "properties": {
                "en": {
                  "type": "long"
                },
                "es": {
                  "type": "long"
                },
                "zh": {
                  "type": "long"
                }
              }
            }
          }
        }
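
The nested query at the top of this answer returns one entry per nested path, with the individual sums inside it. Below is a rough Python sketch for folding such a response into the structure requested in the question. It assumes the usual aggregations layout (a value per sum, a buckets list for terms); the sample response is hand-written for illustration rather than captured from a cluster, and reshape_nested is a hypothetical helper name.

```python
def reshape_nested(aggregations, keys=("X", "Y")):
    """Turn the nested-aggregation response into {"langs": ..., "docLang": ...}."""
    langs = {}
    for key in keys:
        prefix = f"{key}_sum_"
        # Each nested entry also carries "doc_count"; keep only the sum entries.
        langs[key] = {
            name[len(prefix):]: int(body["value"])
            for name, body in aggregations[key].items()
            if name.startswith(prefix)
        }
    doc_lang = {
        bucket["key"]: bucket["doc_count"]
        for bucket in aggregations["sum_docLang"]["buckets"]
    }
    return {"langs": langs, "docLang": doc_lang}


# Hand-written sample shaped like the "aggregations" part of a search response.
sample = {
    "X": {"doc_count": 3,
          "X_sum_en": {"value": 3.0},
          "X_sum_es": {"value": 4.0},
          "X_sum_zh": {"value": 3.0}},
    "Y": {"doc_count": 3,
          "Y_sum_en": {"value": 9.0},
          "Y_sum_es": {"value": 9.0},
          "Y_sum_zh": {"value": 6.0}},
    "sum_docLang": {"buckets": [{"key": "en", "doc_count": 2},
                                {"key": "es", "doc_count": 1}]},
}

reshaped = reshape_nested(sample)
print(reshaped)
```

The int() conversion is there because sum aggregations report floating-point values; drop it if the fields can hold non-integers.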
    

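    However, if your fields are not nested at all (and by nested I mean the actual Elasticsearch nested field type), a plain aggregation like this is enough. With dynamic mapping, which is what you describe, inner objects are mapped as the object type rather than nested, so this is most likely your case: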

    {
      "size": 0,
      "aggs": {
        "X_sum_en": {
          "sum": {
            "field": "langs.X.en"
          }
        },
        "X_sum_es": {
          "sum": {
            "field": "langs.X.es"
          }
        },
        "X_sum_zh": {
          "sum": {
            "field": "langs.X.zh"
          }
        },
        "Y_sum_en": {
          "sum": {
            "field": "langs.Y.en"
          }
        },
        "Y_sum_es": {
          "sum": {
            "field": "langs.Y.es"
          }
        },
        "Y_sum_zh": {
          "sum": {
            "field": "langs.Y.zh"
          }
        },
        "sum_docLang": {
          "terms": {
            "field": "docLang.keyword",
            "size": 10
          }
        }
      }
    }
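
Each sum aggregation names exactly one field, so without scripts the keys and languages have to be listed up front. Writing every entry by hand gets tedious as the lists grow, so here is a small Python sketch that generates the same request body as above. The function name build_aggs and its parameters are assumptions for illustration, not part of any Elasticsearch client.

```python
import json


def build_aggs(keys, langs, doc_lang_field="docLang.keyword", size=10):
    """Build the flat (non-nested) request body: one sum per key/language pair."""
    aggs = {}
    for key in keys:
        for lang in langs:
            aggs[f"{key}_sum_{lang}"] = {"sum": {"field": f"langs.{key}.{lang}"}}
    # Terms aggregation that counts documents per docLang value.
    aggs["sum_docLang"] = {"terms": {"field": doc_lang_field, "size": size}}
    # "size": 0 suppresses the hits so only the aggregations come back.
    return {"size": 0, "aggs": aggs}


body = build_aggs(["X", "Y"], ["en", "es", "zh"])
print(json.dumps(body, indent=2))
```

The printed JSON can be used as the body of a _search request. Raise size if docLang can take more than 10 distinct values.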
    

    【Comments】:
