【Question Title】:Elasticsearch aggregation with date_histogram gives wrong result for buckets
【Posted】:2015-03-07 10:06:57
【Question】:

I have data with timestamps, and I want to run a date_histogram aggregation on it.

When I run the query, it returns a total of 13 hits, which is correct, but the aggregation shows one record on 2014-10-10, and I cannot find that record in the returned data.

curl http://localhost:9200/test/test/_search -X POST -d '{"fields":
 ["creation_time"],
  "query" :
      {"filtered":
          {"query":
              {"match":
                  {"type": "test.type"}
              }
          }
      },
  "aggs":
      {"group_by_created_at":
          {"date_histogram":
              {"field":"creation_time", "interval": "1d"}
          }
      }
 }' | python -m json.tool
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2083  100  1733  100   350   234k  48590 --:--:-- --:--:-- --:--:--  241k
{
    "_shards": {
        "failed": 0,
        "successful": 5,
        "total": 5
    },
    "aggregations": {
        "group_by_created_at": {
            "buckets": [
                {
                    "doc_count": 12,
                    "key": 1412812800000,
                    "key_as_string": "2014-10-09T00:00:00.000Z"
                },
                {
                    "doc_count": 1,
                    "key": 1412899200000,
                    "key_as_string": "2014-10-10T00:00:00.000Z"
                }
            ]
        }
    },
    "hits": {
        "hits": [
            {
                "_id": "qk5EGDqUSoW-ckZU9bnSsA",
                "_index": "test",
                "_score": 3.730029,
                "_type": "test",
                "fields": {
                    "creation_time": [
                        "2014-10-09T16:35:39.535389"
                    ]
                }
            },
            {
                "_id": "GnglI_3xRYii_oE5q91FUg",
                "_index": "test",
                "_score": 3.6149597,
                "_type": "test",
                "fields": {
                    "creation_time": [
                        "2014-10-09T17:16:55.677919"
                    ]
                }
            },
            {
                "_id": "ELP1f_-IS8SJiT4i4Vh6_g",
                "_index": "test",
                "_score": 2.974081,
                "_type": "test",
                "fields": {
                    "creation_time": [
                        "2014-10-09T01:21:21.691270"
                    ]
                }
            },
            {
                "_id": "ySlIV4vWRvm_q0-9p87dEQ",
                "_index": "test",
                "_score": 2.974081,
                "_type": "test",
                "fields": {
                    "creation_time": [
                        "2014-10-09T01:33:51.291644"
                    ]
                }
            },
            {
                "_id": "swXVnMmJSsmNW30zeJvCoQ",
                "_index": "test",
                "_score": 2.974081,
                "_type": "test",
                "fields": {
                    "creation_time": [
                        "2014-10-09T17:08:45.738821"
                    ]
                }
            },
            {
                "_id": "h0j6L-VGTnyChSIevtt2og",
                "_index": "test",
                "_score": 2.974081,
                "_type": "test",
                "fields": {
                    "creation_time": [
                        "2014-10-09T22:35:16.908080"
                    ]
                }
            },
            {
                "_id": "ANoTEXIgRgml6gLD4YKtIg",
                "_index": "test",
                "_score": 2.9459102,
                "_type": "test",
                "fields": {
                    "creation_time": [
                        "2014-10-09T01:25:18.869175"
                    ]
                }
            },
            {
                "_id": "FSCPBsogT5OXghBUmKXidQ",
                "_index": "test",
                "_score": 2.9459102,
                "_type": "test",
                "fields": {
                    "creation_time": [
                        "2014-10-09T01:42:49.000599"
                    ]
                }
            },
            {
                "_id": "VEw6XbIySvW7h7GF7h4ynA",
                "_index": "test",
                "_score": 2.9459102,
                "_type": "test",
                "fields": {
                    "creation_time": [
                        "2014-10-09T16:45:51.563595"
                    ]
                }
            },
            {
                "_id": "J9NfffAvRPmFxtOBZ6IsCA",
                "_index": "test",
                "_score": 2.9169223,
                "_type": "test",
                "fields": {
                    "creation_time": [
                        "2014-10-09T01:23:30.546353"
                    ]
                }
            }
        ],
        "max_score": 3.730029,
        "total": 13
    },
    "timed_out": false,
    "took": 4
}

As the output above shows, none of the returned hits is dated 10-10, yet the aggregation reports one record in that bucket.

【Comments】:

    Tags: elasticsearch date-histogram


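    【Solution 1】:

    Aggregations run over all documents that match the query.

    You did not set size, so you get the default of 10 hits. Change size to 13 (or more) and your 2014-10-10 document should show up.

    When you have more results, checking them all by hand becomes impractical. You can also use top_hits as a sub-aggregation to get a peek into each bucket (it has a size option as well).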
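
As a rough sketch of the top_hits idea, the snippet below builds a request body that nests a top_hits sub-aggregation under the daily date_histogram. The function name `build_query_with_top_hits`, the sub-aggregation name `sample_docs`, the per-bucket size of 3, and the ascending sort are assumptions made for illustration; the query and field names come from the question. It only builds and prints the JSON and does not contact a cluster.

```python
import json


def build_query_with_top_hits(docs_per_bucket=3):
    """Build a search body: daily date_histogram plus a top_hits peek per bucket."""
    return {
        # Skip the top-level hits; the per-bucket samples come from top_hits.
        "size": 0,
        "query": {"filtered": {"query": {"match": {"type": "test.type"}}}},
        "aggs": {
            "group_by_created_at": {
                "date_histogram": {"field": "creation_time", "interval": "1d"},
                "aggs": {
                    # "sample_docs" is a hypothetical name chosen for this sketch.
                    "sample_docs": {
                        "top_hits": {
                            "size": docs_per_bucket,
                            "sort": [{"creation_time": {"order": "asc"}}],
                        }
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    # Print the body so it can be passed to curl -d or another HTTP client.
    print(json.dumps(build_query_with_top_hits(), indent=2))
```

Each bucket in the response should then carry its own small list of hits under `sample_docs`, which shows which documents landed in which day.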

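    【Discussion】:

    • Is there any way to find out which element fell into which bucket?
    • Yes, with the top_hits aggregation you get exactly that. You choose how many documents to include in each bucket and how to sort them.
    【Solution 2】: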

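    If you count the hits, you will find only 10 objects. That is because, by default, Elasticsearch returns only the first ten hits.

    However, every document that matches the query is taken into account when the aggregations are computed, even if it does not appear in hits.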
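
The mismatch can be checked mechanically. The following is a minimal sketch, assuming the response shape shown in the question (an integer `hits.total` and an aggregation named `group_by_created_at`); the helper name `summarize` and the placeholder `_id` values in the sample are made up for illustration.

```python
def summarize(response):
    """Compare how many hits were returned with how many documents were aggregated."""
    hits = response["hits"]
    buckets = response["aggregations"]["group_by_created_at"]["buckets"]
    returned = len(hits["hits"])  # limited by size (10 by default)
    total = hits["total"]  # every document matching the query
    bucketed = sum(bucket["doc_count"] for bucket in buckets)
    return {
        "returned": returned,
        "total": total,
        "bucketed": bucketed,
        "not_returned": total - returned,
    }


# Trimmed-down stand-in for the response in the question; the _id values are placeholders.
SAMPLE_RESPONSE = {
    "hits": {"total": 13, "hits": [{"_id": str(i)} for i in range(10)]},
    "aggregations": {
        "group_by_created_at": {
            "buckets": [
                {"key": 1412812800000, "doc_count": 12},
                {"key": 1412899200000, "doc_count": 1},
            ]
        }
    },
}

if __name__ == "__main__":
    print(summarize(SAMPLE_RESPONSE))
```

With the numbers from the question, the bucket counts add up to the total of 13, while only 10 hits are listed, so three matching documents (including the one from 10-10) are simply not shown.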

    Try updating your query to:

    {
      "size": 13,
      "fields": ["creation_time"],
      "query" :
          {"filtered":
              {"query":
                  {"match":
                      {"type": "test.type"}
                  }
              }
          },
      "aggs":
          {"group_by_created_at":
              {"date_histogram":
                  {"field":"creation_time", "interval": "1d"}
              }
          }
     }
    

    You will then see the document created on 10-10.
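
To check by hand which daily bucket a returned creation_time belongs to, the bucket key can be recomputed. This is only a sketch: it assumes the timestamps are interpreted as UTC and that the histogram uses no time zone or offset setting, and the helper name `daily_bucket_key_ms` is hypothetical.

```python
from datetime import datetime, timezone


def daily_bucket_key_ms(creation_time):
    """Floor a timestamp string to the start of its UTC day, in epoch milliseconds."""
    parsed = datetime.strptime(creation_time, "%Y-%m-%dT%H:%M:%S.%f")
    # Assumption: the stored value is UTC, so midnight UTC starts the bucket.
    day_start = parsed.replace(
        hour=0, minute=0, second=0, microsecond=0, tzinfo=timezone.utc
    )
    return int(day_start.timestamp()) * 1000


if __name__ == "__main__":
    # One of the creation_time values returned in the question.
    print(daily_bucket_key_ms("2014-10-09T16:35:39.535389"))
```

For the hits listed in the question this yields 1412812800000, the key of the 2014-10-09 bucket. A document keyed 1412899200000 would be the one in the 2014-10-10 bucket.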

    【Discussion】:
