【Question title】: Compute over results of Elasticsearch aggregations
【Posted】: 2016-08-16 15:28:20
【Question】:

I have documents with the following structure:

{
  "ga:bounces": "1",
  "timestamp": "20160811",
  "viewId": "125287857",
  "ga:percentNewSessions": "100.0",
  "ga:bounceRate": "100.0",
  "ga:avgSessionDuration": "0.0",
  "ga:sessions": "1",
  "user": "xxcgf",
  "ga:pageviewsPerSession": "1.0",
  "webPropertyId": "UA-80489737-1",
  "ga:pageviews": "1",
  "dimension": "date",
  "ga:users": "1",
  "accountId": "80489737"
}

I am applying two aggregations with this query:

{
  "size": 0,
  "aggs": {
    "total-new-sessions": {
      "sum": {
        "script": "doc['ga:percentNewSessions'].value/100*doc['ga:sessions'].value"
      }
    },
    "total-sessions": {
      "sum": {
        "field": "ga:sessions"
      }
    }
  }
}

This is the output I get, and it is what I expect:

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 32,
    "max_score": 0,
    "hits": [ ]
  },
  "aggregations": {
    "total-new-sessions": {
      "value": 386.0000003814697
    },
    "total-sessions": {
      "value": 516
    }
  }
}

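Now I want to divide the result of one aggregation by the other (total-new-sessions / total-sessions). How can I do that within the query above? The final ratio is the only output I need.

Update: I tried this query: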

{
  "size": 0,
  "aggs": {
    "total-new-sessions": {
      "sum": {
        "script": "doc['ga:percentNewSessions'].value/100*doc['ga:sessions'].value"
      }
    },
    "total-sessions": {
      "sum": {
        "field": "ga:sessions"
      }
    },
    "sessions": {
      "bucket_script": {
        "buckets_path": {
          "total_new": "total-new-sessions",
          "total": "total-sessions"
        },
        "script": "total_new / total"
      }
    }
  }
}

But I get this error: "reason": "Invalid pipeline aggregation named [sessions] of type [bucket_script]. Only sibling pipeline aggregations are allowed at the top level"

【Comments】:

    Tags: elasticsearch elasticsearch-aggregation


    【Solution 1】:

    You can use a bucket_script aggregation to do this:

    {
      "size": 0,
      "aggs": {
        "all": {
          "date_histogram": {
            "field": "timestamp",
            "interval": "year"
          },
          "aggs": {
            "total-new-sessions": {
              "sum": {
                "script": "doc['ga:percentNewSessions'].value/100*doc['ga:sessions'].value"
              }
            },
            "total-sessions": {
              "sum": {
                "field": "ga:sessions"
              }
            },
            "ratio": {
              "bucket_script": {
                "buckets_path": {
                  "total_new": "total-new-sessions",
                  "total": "total-sessions"
                },
                "script": "total_new / total"
              }
            }
          }
        }
      }
    }
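
To show how the ratio could be read back from such a request, here is a minimal Python sketch. It assumes the response has already been parsed into a dict; the helper `extract_ratios` and the `SAMPLE_RESPONSE` are hypothetical, the latter hand-built from the sums shown in the question rather than captured from a real cluster. The script uses the `ga:`-prefixed field names from the document in the question.

```python
import json

# The answer's query as a Python dict, ready to be serialized as a request body.
QUERY = {
    "size": 0,
    "aggs": {
        "all": {
            # Parent multi-bucket aggregation that bucket_script runs inside.
            "date_histogram": {"field": "timestamp", "interval": "year"},
            "aggs": {
                "total-new-sessions": {
                    "sum": {
                        "script": "doc['ga:percentNewSessions'].value/100"
                                  "*doc['ga:sessions'].value"
                    }
                },
                "total-sessions": {"sum": {"field": "ga:sessions"}},
                "ratio": {
                    "bucket_script": {
                        "buckets_path": {
                            "total_new": "total-new-sessions",
                            "total": "total-sessions",
                        },
                        "script": "total_new / total",
                    }
                },
            },
        }
    },
}


def extract_ratios(response):
    """Return {bucket label: ratio} for every bucket that carries a ratio."""
    ratios = {}
    for bucket in response["aggregations"]["all"]["buckets"]:
        ratio = bucket.get("ratio")
        # A bucket may lack the ratio entry, e.g. when it has no documents.
        if ratio is not None and ratio.get("value") is not None:
            label = bucket.get("key_as_string", bucket["key"])
            ratios[label] = ratio["value"]
    return ratios


# Hypothetical response (assumption): one yearly bucket with the question's sums.
SAMPLE_RESPONSE = {
    "aggregations": {
        "all": {
            "buckets": [
                {
                    "key_as_string": "2016-01-01T00:00:00.000Z",
                    "key": 1451606400000,
                    "doc_count": 32,
                    "total-new-sessions": {"value": 386.0000003814697},
                    "total-sessions": {"value": 516.0},
                    "ratio": {"value": 386.0000003814697 / 516.0},
                }
            ]
        }
    }
}

if __name__ == "__main__":
    print(json.dumps(QUERY, indent=2))
    print(extract_ratios(SAMPLE_RESPONSE))
```

With a yearly histogram, data that spans several years yields one ratio per year, so the caller still has to pick the bucket it cares about.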
    

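    【Discussion】:

    • I tried this: { "size": 0, "aggs": { "total-new-sessions": { "sum": { "script": "doc['ga:percentNewSessions'].value/100*doc['ga:sessions'].value" } }, "total-sessions": { "sum": { "field": "ga:sessions" } }, "bucket_script": { "buckets_path": { "total_new": "total-new-sessions", "total": "total-sessions" }, "script": "total_new / total*100" } } } but I get the error: "reason": "Could not find aggregator type [buckets_path] in [bucket_script]"
    • Which version of ES are you running?
    • My ES version is 2.2.1.
    • There was a small typo in my earlier query. The actual error is: `"reason": "Invalid pipeline aggregation named [sessions] of type [bucket_script]. Only sibling pipeline aggregations are allowed at the top level"`
    • I have updated my answer and it should work now. Note that bucket_script needs a parent multi-bucket aggregation to work, so I added a yearly date_histogram on the timestamp field as the parent aggregation.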
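
Since a single ratio is all that is needed, another option (not part of the answer above) is to keep the two top-level sum aggregations from the question and do the division in application code. The sketch below assumes the response is available as a parsed dict; `ratio_from_response` is a hypothetical helper name, and `RESPONSE` repeats only the `aggregations` part of the output shown in the question.

```python
def ratio_from_response(response,
                        numerator="total-new-sessions",
                        denominator="total-sessions"):
    """Divide one top-level metric aggregation by another.

    Returns None when the denominator is missing a value or is zero,
    so callers never hit a ZeroDivisionError.
    """
    aggs = response["aggregations"]
    total = aggs[denominator]["value"]
    if not total:
        return None
    return aggs[numerator]["value"] / total


# The "aggregations" section from the response in the question.
RESPONSE = {
    "aggregations": {
        "total-new-sessions": {"value": 386.0000003814697},
        "total-sessions": {"value": 516},
    }
}

if __name__ == "__main__":
    print(ratio_from_response(RESPONSE))
```

This avoids the parent bucket entirely, at the cost of one extra step on the client.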