【Question title】: Elasticsearch: Generating an aggregate field during search query
【Posted】: 2018-07-13 07:23:19
【Question】:

ES newbie here. I am trying to implement a search engine over a single index whose documents have the following schema:

index:paper
{
"title": string,
"author": string,
"id": string,
"references": [string:another_paper.id, string:another_paper.id, ...],
"pubDate": date
}

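Suppose I want to search for all papers authored by "A. Smith" and published between 9 January 2017 and 30 January 2017.

How would I design my search query so that each result carries a generated field stating how many times that document is cited by other documents in their "references" field? Is this even possible in ES?

Execution speed is not critical and I can tolerate relatively slow queries, but I do not want to update existing documents whenever a new document is uploaded.

Thanks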

【Comments】:

  • Please use code formatting for the code snippets in your question.
  • Added code formatting. Thanks.

Tags: elasticsearch search aggregation


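【Answer 1】:

You can certainly filter results by author name and date range. With the query below you also get, for every paper ID that appears in the "references" field of the matching documents, the number of matching documents that reference it.

In short, you get reference counts computed from the documents that match the query.

For example, suppose you index these 3 documents: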

{
  "title": "title1",
  "author": "bob",
  "id": "id1",
  "references": [
    "id1",
    "id2",
    "id3"
  ],
  "pubDate": "2018-01-02"
},
{
  "title": "title2",
  "author": "harry",
  "id": "id2",
  "references": [
    "id1",
    "id3",
    "id7",
    "id8"
  ],
  "pubDate": "01-02-2018"
},
{
  "title": "title3",
  "author": "bob",
  "id": "id3",
  "references": [
    "id1",
    "id4",
    "id7",
    "id9"
  ],
  "pubDate": "2018-01-03"
}

After indexing them, you can run this query:

GET test_stackoverflow_agg/type1/_search
{
  "query": {
    "query_string": {
      "query": "author:bob AND pubDate:[2018-01-02 TO 2018-01-04]"
    }
  },
  "aggs": {
    "agg1": {
      "terms": {
        "field": "references",
        "size": 10
      }
    }
  }
}
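
If the request is issued from application code, the same body can be assembled for any author and date range. The following is a minimal sketch: the function name build_search_body and its parameters are hypothetical, and it only builds the JSON body shown above without sending it anywhere.

```python
import json


def build_search_body(author, start_date, end_date, agg_size=10):
    """Build the request body shown above for a given author and date range."""
    # In query_string syntax, the range [a TO b] is inclusive on both ends.
    query = f"author:{author} AND pubDate:[{start_date} TO {end_date}]"
    return {
        "query": {"query_string": {"query": query}},
        "aggs": {
            # Same terms aggregation as above, keyed "agg1".
            "agg1": {"terms": {"field": "references", "size": agg_size}}
        },
    }


body = build_search_body("bob", "2018-01-02", "2018-01-04")
print(json.dumps(body, indent=2))
```

A multi-word author such as "A. Smith" would need to be wrapped in double quotes inside the query string (author:"A. Smith") so that it is treated as a phrase.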

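The query part determines which documents are matched, and the aggregation part names the field to bucket on: here it counts, for each unique ID present in the "references" field, how many of the matching documents contain it.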
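
To make the two parts concrete, here is a small pure-Python emulation of what this request computes on the three sample documents. It is only a sketch of the semantics, not how Elasticsearch works internally. The pubDate values for id1 and id3 are the ones that appear in the search response for this example; the pubDate for id2 is an assumption, and it does not affect the outcome because the author filter excludes that document.

```python
from collections import Counter
from datetime import date

# Sample documents (pubDate of id2 is assumed, see the note above).
DOCS = [
    {"id": "id1", "author": "bob", "pubDate": date(2018, 1, 2),
     "references": ["id1", "id2", "id3"]},
    {"id": "id2", "author": "harry", "pubDate": date(2018, 1, 2),
     "references": ["id1", "id3", "id7", "id8"]},
    {"id": "id3", "author": "bob", "pubDate": date(2018, 1, 3),
     "references": ["id1", "id4", "id7", "id9"]},
]


def search_with_terms_agg(docs, author, start, end, size=10):
    # "query" part: keep only documents by this author inside the date range.
    hits = [d for d in docs
            if d["author"] == author and start <= d["pubDate"] <= end]
    # "aggs" part: a document counts once per distinct value it holds.
    counts = Counter()
    for d in hits:
        counts.update(set(d["references"]))
    # Buckets ordered by doc_count descending, ties broken by key ascending.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:size]
    return hits, [{"key": k, "doc_count": n} for k, n in ordered]


hits, buckets = search_with_terms_agg(
    DOCS, "bob", date(2018, 1, 2), date(2018, 1, 4))
print([d["id"] for d in hits])  # ['id1', 'id3']
print(buckets)
```

Note that the buckets are computed only from the documents that matched the query: the references made by id2 (author "harry") are not counted.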

This is what the result looks like:

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1.0460204,
    "hits": [
      {
        "_index": "test_stackoverflow_agg",
        "_type": "type1",
        "_id": "id3",
        "_score": 1.0460204,
        "_source": {
          "title": "title3",
          "author": "bob",
          "id": "id3",
          "references": [
            "id1",
            "id4",
            "id7",
            "id9"
          ],
          "pubDate": "2018-01-03"
        }
      },
      {
        "_index": "test_stackoverflow_agg",
        "_type": "type1",
        "_id": "id1",
        "_score": 1.0460204,
        "_source": {
          "title": "title1",
          "author": "bob",
          "id": "id1",
          "references": [
            "id1",
            "id2",
            "id3"
          ],
          "pubDate": "2018-01-02"
        }
      }
    ]
  },
  "aggregations": {
    "agg1": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "id1",
          "doc_count": 2
        },
        {
          "key": "id2",
          "doc_count": 1
        },
        {
          "key": "id3",
          "doc_count": 1
        },
        {
          "key": "id4",
          "doc_count": 1
        },
        {
          "key": "id7",
          "doc_count": 1
        },
        {
          "key": "id9",
          "doc_count": 1
        }
      ]
    }
  }
}
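
Keep in mind that these buckets count references made by the matching documents only, so they include IDs that are not hits (id2, id4, ...) and ignore citations coming from papers by other authors. If what you need is a "cited by" count per hit across the whole index, one possible approach (an assumption on my part, not something verified against your data) is a second request that searches all documents whose "references" contain one of the hit IDs and restricts the terms aggregation to those IDs via its "include" option. The sketch below only builds that second body and merges the counts into the hits. The function names and the citedBy field are hypothetical, and it assumes "references" is indexed as an exact-value (keyword) field.

```python
def build_citation_count_body(hit_ids):
    """Body for a second request counting citations of hit_ids index-wide."""
    ids = sorted(hit_ids)
    return {
        "size": 0,  # only the aggregation is needed, not the documents
        "query": {"terms": {"references": ids}},
        "aggs": {
            "cited_by": {
                "terms": {
                    "field": "references",
                    "include": ids,  # drop buckets for unrelated papers
                    "size": max(len(ids), 1),
                }
            }
        },
    }


def attach_citation_counts(hits, buckets):
    """Return each hit's _source with a generated citedBy field added."""
    counts = {b["key"]: b["doc_count"] for b in buckets}
    return [dict(h["_source"], citedBy=counts.get(h["_id"], 0)) for h in hits]


# Hits from the first response, trimmed to the fields used here.
first_hits = [
    {"_id": "id3", "_source": {"id": "id3", "title": "title3"}},
    {"_id": "id1", "_source": {"id": "id1", "title": "title1"}},
]
second_body = build_citation_count_body(h["_id"] for h in first_hits)

# Buckets worked out by hand for the three sample documents:
# id1 is referenced by id1, id2 and id3; id3 is referenced by id1 and id2.
second_buckets = [
    {"key": "id1", "doc_count": 3},
    {"key": "id3", "doc_count": 2},
]
enriched = attach_citation_counts(first_hits, second_buckets)
print(enriched)
```

This costs one extra round trip per search but leaves existing documents untouched when new papers are indexed. In the sample data id1 lists itself in its own references, so it contributes to its own count.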
