【Question title】: Incorrect output result of "aggs" query
【Posted】: 2018-02-10 09:26:09
【Question】:

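I have a query that counts the entries within a given datetime window (i.e. between 2017-02-17T15:00:00.000 and 2017-02-17T16:00:00.000). When I run this query I get an incorrect result (or rather, an unexpected one):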

POST /myindex/_search
{
  "size": 0,
  "aggs": {
    "range": {
        "date_range": {
            "field": "Datetime",
            "ranges": [
                { "to": "2017-02-17T16:00:00||-1H/H" }, 
                { "from": "2017-02-17T16:00:00||/H" } 
            ]
        }
    }
}
}

This is the output:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 11,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "range": {
      "buckets": [
        {
          "key": "*-2017-02-17T15:00:00.000Z",
          "to": 1487343600000,
          "to_as_string": "2017-02-17T15:00:00.000Z",
          "doc_count": 0
        },
        {
          "key": "2017-02-17T16:00:00.000Z-*",
          "from": 1487347200000,
          "from_as_string": "2017-02-17T16:00:00.000Z",
          "doc_count": 0
        }
      ]
    }
  }
}

In myindex I have two entries with the following Datetime values:

2017-02-17T15:15:00.000Z
2017-02-17T15:02:00.000Z

So the result should be 2.

I don't understand how to interpret the current output. Which fields give the number of entries?
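
On the question of which field carries the count: each bucket of a date_range aggregation reports its own doc_count, while hits.total is the number of documents matched by the query as a whole. A minimal Python sketch of reading those fields, using a trimmed copy of the response above as a literal (the helper name bucket_counts is made up for illustration):

```python
import json

# Trimmed copy of the response shown above, kept as a string literal.
RESPONSE = """
{
  "hits": {"total": 11},
  "aggregations": {
    "range": {
      "buckets": [
        {"key": "*-2017-02-17T15:00:00.000Z", "to": 1487343600000, "doc_count": 0},
        {"key": "2017-02-17T16:00:00.000Z-*", "from": 1487347200000, "doc_count": 0}
      ]
    }
  }
}
"""


def bucket_counts(response, agg_name):
    """Map each bucket key of the named aggregation to its doc_count."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


response = json.loads(RESPONSE)
counts = bucket_counts(response, "range")
for key, count in counts.items():
    print(key, "->", count)
print("documents matched by the query:", response["hits"]["total"])
```

Read this way, the output contains two open-ended buckets (everything before 15:00 and everything from 16:00 on), each holding zero documents.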

Update:

Data structure:

PUT /myindex
{
    "mappings": {
      "intensity": {
      "_all": {
        "enabled": false
      },
        "properties": {
          "Country_Id": {
            "type":"keyword"
          },
          "Datetime": {
            "type":"date"
          }
        }
      }
    }
}

Sample data:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 1,
    "hits": [
      {
        "_index": "myindex",
        "_type": "intensity",
        "_id": "4",
        "_score": 1,
        "_source": {
          "Country_Id": "1",
          "Datetime": "2017-02-18T15:01:00.000Z"
        }
      },
      {
        "_index": "myindex",
        "_type": "intensity",
        "_id": "6",
        "_score": 1,
        "_source": {
          "Country_Id": "1",
          "Datetime": "2017-03-16T16:15:00.000Z"
        }
      },
      {
        "_index": "myindex",
        "_type": "intensity",
        "_id": "1",
        "_score": 1,
        "_source": {
          "Country_Id": "1",
          "Datetime": "2017-02-17T15:15:00.000Z"
        }
      },
      {
        "_index": "myindex",
        "_type": "intensity",
        "_id": "7",
        "_score": 1,
        "_source": {
          "Country_Id": "1",
          "Datetime": "2017-03-16T16:18:00.000Z"
        }
      },
      {
        "_index": "myindex",
        "_type": "intensity",
        "_id": "3",
        "_score": 1,
        "_source": {
          "Country_Id": "1",
          "Datetime": "2017-02-17T15:02:00.000Z"
        }
      }
    ]
  }
}

The response I get:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 11,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "range": {
      "buckets": [
        {
          "key": "2017-02-17T15:00:00.000Z-2017-02-17T16:00:00.000Z",
          "from": 1487343600000,
          "from_as_string": "2017-02-17T15:00:00.000Z",
          "to": 1487347200000,
          "to_as_string": "2017-02-17T16:00:00.000Z",
          "doc_count": 0
        }
      ]
    }
  }
}

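【Comments】:

  • If I'm not mistaken, the first range ends at 15:00 and the second range starts at 16:00, so 15:15 and 15:02 fall right in between.
  • @Val: I want to get the number of records between 2017-02-17T15:00:00.000 and 2017-02-17T16:00:00.000. What is wrong with my query?
  • Why not simply use a date_histogram with an hourly interval instead?
  • @Val: Because my task requires it. I don't need the whole histogram.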
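
The point made in the first comment can be checked by hand. The sketch below, in Python, resolves the two date math expressions from the original query and tests the two expected timestamps against the resulting open-ended ranges. The resolve function is a hypothetical helper that handles only the narrow "||-1H/H" style used here (hour offsets and rounding down to the hour); it is not Elasticsearch's own parser.

```python
import re
from datetime import datetime, timedelta, timezone


def resolve(expr):
    """Resolve 'ANCHOR||[+-]N(H|h)' with an optional '/H' rounding step.

    Illustrative subset only: hour offsets, and rounding down to the hour.
    """
    anchor, _, math = expr.partition("||")
    value = datetime.strptime(anchor.rstrip("Z"), "%Y-%m-%dT%H:%M:%S")
    value = value.replace(tzinfo=timezone.utc)
    match = re.fullmatch(r"(?:([+-])(\d+)[Hh])?(/[Hh])?", math)
    if match is None:
        raise ValueError("unsupported date math: " + math)
    sign, amount, rounding = match.groups()
    if sign:
        delta = timedelta(hours=int(amount))
        value = value + delta if sign == "+" else value - delta
    if rounding:
        value = value.replace(minute=0, second=0, microsecond=0)
    return value


# First range has only "to": it covers everything strictly before this instant.
upper_of_first = resolve("2017-02-17T16:00:00||-1H/H")
# Second range has only "from": it covers everything from this instant on.
lower_of_second = resolve("2017-02-17T16:00:00||/H")

docs = [
    datetime(2017, 2, 17, 15, 15, tzinfo=timezone.utc),
    datetime(2017, 2, 17, 15, 2, tzinfo=timezone.utc),
]
in_first = [d for d in docs if d < upper_of_first]
in_second = [d for d in docs if d >= lower_of_second]

print("first range:  * to", upper_of_first.isoformat(), "->", len(in_first))
print("second range:", lower_of_second.isoformat(), "to * ->", len(in_second))
```

The resolved boundaries agree with the "to" and "from" values in the response, and neither range contains 15:02 or 15:15, which is consistent with both buckets reporting a doc_count of 0.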

Tags: elasticsearch lucene elasticsearch-5


【Answer 1】:

Your ranges are wrong; change them to this:

POST /myindex/_search
{
  "size": 0,
  "aggs": {
    "range": {
        "date_range": {
            "field": "Datetime",
            "ranges": [
                { 
                   "from": "2017-02-17T16:00:00Z||-1H/H",
                   "to": "2017-02-17T16:00:00Z||/H" 
                }
            ]
        }
    }
}
}
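
As a local sanity check of what this single range should match, the Python sketch below applies the bucket's bounds (2017-02-17T15:00:00.000Z inclusive to 2017-02-17T16:00:00.000Z exclusive, as reported in the response posted in the question) to the five sample documents from the question's update. It only mirrors the bucket semantics with plain datetime comparisons; it does not query Elasticsearch and does not by itself explain a doc_count of 0 coming back from a cluster.

```python
from datetime import datetime, timezone


def parse(ts):
    """Parse a timestamp such as 2017-02-17T15:15:00.000Z as UTC."""
    parsed = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


# _id -> Datetime, copied from the sample data in the question.
SAMPLE = {
    "4": "2017-02-18T15:01:00.000Z",
    "6": "2017-03-16T16:15:00.000Z",
    "1": "2017-02-17T15:15:00.000Z",
    "7": "2017-03-16T16:18:00.000Z",
    "3": "2017-02-17T15:02:00.000Z",
}

start = parse("2017-02-17T15:00:00.000Z")  # "from" is inclusive
end = parse("2017-02-17T16:00:00.000Z")    # "to" is exclusive

matching_ids = sorted(
    doc_id for doc_id, ts in SAMPLE.items() if start <= parse(ts) < end
)
print(matching_ids)
```

For these five documents, the ones with _id 1 and 3 fall inside the window.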

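【Comments】:

  • I would like to use -1H. Is that possible?
  • I have updated the answer.
  • OK, thanks. I checked it and I get "doc_count": 0. I can't understand why this happens. By the way, you missed a comma at the end of "from".
  • Please try again. Any luck?
  • I tested your latest query, but the answer is still doc_count equal to 0. Please see my update. I posted some data.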