【Question Title】: Retrieve most recent versions of each document
【Posted】: 2016-09-20 17:48:32
【Question】:

Elasticsearch does not support versioning, so I implemented it myself using approach #3 from this excellent answer: https://stackoverflow.com/a/8226684/4769188

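Now I want to retrieve all versions of a given type that fall within a date range [from..to], and keep only the most recent version of each document. How can I do this?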

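【Question Discussion】:

  • If you implemented #3, your latest versions live only in a separate index, right? If you only care about the latest version, why retrieve all versions? Or do you mean fetching all versions that fall within a date range and, among those possibly older versions, picking the most recent one?
  • @jay I mean fetching all versions that fall within a date range and picking the most recent one among them.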

Tags: elasticsearch


【Solution 1】:

See if this helps...

I indexed the following documents (shown here as they appear in a search response):

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 1,
    "hits": [
      {
        "_index": "test_index",
        "_type": "test",
        "_id": "2",
        "_score": 1,
        "_source": {
          "doc_id": 123,
          "version": 2,
          "text": "Foo Bar",
          "date": "2011-09-01",
          "current": false
        }
      },
      {
        "_index": "test_index",
        "_type": "test",
        "_id": "4",
        "_score": 1,
        "_source": {
          "doc_id": 123,
          "version": 4,
          "text": "Foo Bar",
          "date": "2011-07-01",
          "current": false
        }
      },
      {
        "_index": "test_index",
        "_type": "test",
        "_id": "1",
        "_score": 1,
        "_source": {
          "doc_id": 123,
          "version": 1,
          "text": "Foo Bar",
          "date": "2011-10-01",
          "current": true
        }
      },
      {
        "_index": "test_index",
        "_type": "test",
        "_id": "3",
        "_score": 1,
        "_source": {
          "doc_id": 123,
          "version": 3,
          "text": "Foo Bar",
          "date": "2011-08-01",
          "current": false
        }
      }
    ]
  }
}

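Use the following query. The range filter keeps only versions 2 and 3, and sorting each bucket by "version" in descending order should return version 3 of the document. The "size" parameter inside "top_hits" controls how many documents you get per bucket (currently 1).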

{
    "size" : 0,
    "query" : {
        "filtered" : {
            "query" : {
                "match_all" : {}
            },
            "filter" : {
                "range" : {
                    "date" : {
                        "gte" : "2011-07-02",
                        "lte" : "2011-09-01"
                    }
                }
            }
        }
    },
    "aggs" : {
        "doc_id_groups" : {
            "terms" : {
                "field" : "doc_id",
                "size" : "10",
                "order" : {
                    "top_score" : "desc"
                }
            },
            "aggs" : {
                "top_score" : {
                    "max" : {
                        "script" : "_score"
                    }
                },
                "docs" : {
                    "top_hits" : {
                        "size" : 1,
                        "sort" : {
                            "version" : {
                                "order" : "desc"
                            }
                        },
                        "fields" : ["doc_id", "version", "date"]
                    }
                }
            }
        }
    }
}

Response:

{
  "took": 12,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "doc_id_groups": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": 123,
          "doc_count": 2,
          "docs": {
            "hits": {
              "total": 2,
              "max_score": null,
              "hits": [
                {
                  "_index": "test_index",
                  "_type": "test",
                  "_id": "3",
                  "_score": null,
                  "fields": {
                    "date": [
                      "2011-08-01"
                    ],
                    "doc_id": [
                      123
                    ],
                    "version": [
                      3
                    ]
                  },
                  "sort": [
                    3
                  ]
                }
              ]
            }
          },
          "top_score": {
            "value": 1
          }
        }
      ]
    }
  }
}
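
To make the logic of the query easier to follow, here is a minimal pure-Python sketch that mimics the three steps on the sample documents: filter by date range, group by `doc_id`, and keep the top document per group. It does not talk to Elasticsearch; the function name `latest_in_range` and its `sort_field` parameter are hypothetical and only serve this illustration.

```python
# Sample documents copied from the answer (only the fields the query uses).
DOCS = [
    {"doc_id": 123, "version": 2, "date": "2011-09-01"},
    {"doc_id": 123, "version": 4, "date": "2011-07-01"},
    {"doc_id": 123, "version": 1, "date": "2011-10-01"},
    {"doc_id": 123, "version": 3, "date": "2011-08-01"},
]


def latest_in_range(docs, date_from, date_to, sort_field="version"):
    """Return {doc_id: top document} for documents dated within [date_from, date_to].

    Hypothetical helper: it imitates the range filter, the terms aggregation
    on doc_id, and a top_hits of size 1 sorted descending on sort_field.
    """
    best = {}
    for doc in docs:
        # Range filter; ISO yyyy-mm-dd strings compare correctly as plain strings.
        if not (date_from <= doc["date"] <= date_to):
            continue
        # One bucket per doc_id, keeping the highest sort_field value seen so far.
        current = best.get(doc["doc_id"])
        if current is None or doc[sort_field] > current[sort_field]:
            best[doc["doc_id"]] = doc
    return best


if __name__ == "__main__":
    result = latest_in_range(DOCS, "2011-07-02", "2011-09-01")
    print(result[123]["version"])  # prints 3, matching the response above
```

Note that the query ranks by `version`, not by `date`. In the sample data the two disagree (version 2 is dated later than version 3), so calling `latest_in_range(DOCS, "2011-07-02", "2011-09-01", sort_field="date")` picks version 2 instead. If "most recent" should mean the newest date, the corresponding change in the real query would be to sort `top_hits` on `date`.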

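【Discussion】:

  • Thanks, that should work. But why do I need "order" : { "top_score" : "desc" } and the top_score aggregation? I get the expected result even without them.
  • You're right. That part has nothing to do with getting the latest version. You can remove it.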
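
Following that last comment, here is a sketch of the same search body with the `top_score` sub-aggregation and the `order` clause removed, built as a Python dict so the date range can be passed in. The function name `latest_version_query` and its parameters are hypothetical. The `use_bool_filter` branch is an assumption for newer Elasticsearch releases, where the `filtered` query is no longer available and `bool` with a `filter` clause is the usual replacement; it has not been checked against the asker's cluster.

```python
import json


def latest_version_query(date_from, date_to, max_groups=10, use_bool_filter=False):
    """Build the search body from the answer, minus the optional top_score ordering."""
    date_range = {"range": {"date": {"gte": date_from, "lte": date_to}}}
    if use_bool_filter:
        # Assumption: a newer Elasticsearch release without the "filtered" query.
        query = {"bool": {"filter": date_range}}
        # Assumption: on such releases, select returned fields via "_source".
        field_key = "_source"
    else:
        # Same form as the query shown in the answer.
        query = {"filtered": {"query": {"match_all": {}}, "filter": date_range}}
        field_key = "fields"
    return {
        "size": 0,  # no top-level hits; only the aggregation result is needed
        "query": query,
        "aggs": {
            "doc_id_groups": {
                "terms": {"field": "doc_id", "size": max_groups},
                "aggs": {
                    "docs": {
                        "top_hits": {
                            "size": 1,  # one document per doc_id bucket
                            "sort": {"version": {"order": "desc"}},
                            field_key: ["doc_id", "version", "date"],
                        }
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    body = latest_version_query("2011-07-02", "2011-09-01")
    print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request against the index. `max_groups` caps the number of `doc_id` buckets returned, so it needs to be large enough to cover every document expected in the date range.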