【Question title】:Elasticsearch - How to return distinct documents for certain fields
【Posted】:2019-08-29 17:14:43
【Question】:

I have the following Elasticsearch query, and I need to know how to get distinct results for only certain fields (like SQL's SELECT DISTINCT column1, column2, ... FROM table_name;).

This is my query:

{
  "_source": ["part", "manufacturer", "shortdesc"],
  "query": {
    "match": {
      "part": "2n2222"
    }
  }
}

This is the result I get:

{
    "_index" : "my_index",
    "_type" : "_doc",
    "_id" : "13921",
    "_score" : 207.16005,
    "_source" : {
      "part" : "2N2222A",
      "manufacturer" : "Microsemi Corporation"
    }
  },
  {
    "_index" : "my_index",
    "_type" : "_doc",
    "_id" : "13923",
    "_score" : 207.16005,
    "_source" : {
      "part" : "2N2222A",
      "manufacturer" : "Microsemi Corporation"
    }
  },
  {
    "_index" : "my_index",
    "_type" : "_doc",
    "_id" : "811202",
    "_score" : 202.03964,
    "_source" : {
      "part" : "2N2222A",
      "manufacturer" : "Microsemi Corporation"
    }
  },
  {
    "_index" : "my_index",
    "_type" : "_doc",
    "_id" : "534059",
    "_score" : 202.03964,
    "_source" : {
      "part" : "2N2222A",
      "manufacturer" : "Microsemi Corporation"
    }
  },
  {
    "_index" : "my_index",
    "_type" : "_doc",
    "_id" : "534062",
    "_score" : 202.03964,
    "_source" : {
      "part" : "2N2222A",
      "manufacturer" : "Microsemi Corporation"
    }
  },
  {
    "_index" : "my_index",
    "_type" : "_doc",
    "_id" : "144303",
    "_score" : 202.03964,
    "_source" : {
      "part" : "2N2222A",
      "manufacturer" : "Microsemi Corporation"
    }
  },
  {
    "_index" : "my_index",
    "_type" : "_doc",
    "_id" : "557240",
    "_score" : 202.03964,
    "_source" : {
      "part" : "2N2222A",
      "manufacturer" : "Infineon"
    }
  },
  {
    "_index" : "my_index",
    "_type" : "_doc",
    "_id" : "13924",
    "_score" : 201.24086,
    "_source" : {
      "part" : "2N2222A",
      "manufacturer" : "Microsemi Corporation"
    }
  },
  {
    "_index" : "my_index",
    "_type" : "_doc",
    "_id" : "557235",
    "_score" : 201.24086,
    "_source" : {
      "part" : "2N2222A",
      "manufacturer" : "Microsemi Corporation"
    }
  },
  {
    "_index" : "my_index",
    "_type" : "_doc",
    "_id" : "55566",
    "_score" : 201.24086,
    "_source" : {
      "part" : "2N2222A",
      "manufacturer" : "Microsemi Corporation"
    }
  },
  {
    "_index" : "my_index",
    "_type" : "_doc",
    "_id" : "50873",
    "_score" : 201.24086,
    "_source" : {
      "part" : "2N2222A",
      "manufacturer" : "Microsemi Corporation"
    }
  },
  {
    "_index" : "my_index",
    "_type" : "_doc",
    "_id" : "13915",
    "_score" : 199.76857,
    "_source" : {
      "part" : "2N2222A",
      "manufacturer" : "Microsemi Corporation"
    }
  },
  {
    "_index" : "my_index",
    "_type" : "_doc",
    "_id" : "591924",
    "_score" : 199.76857,
    "_source" : {
      "part" : "2N2222A",
      "manufacturer" : "Microsemi Corporation"
    }
  },
  {
    "_index" : "my_index",
    "_type" : "_doc",
    "_id" : "526043",
    "_score" : 199.76857,
    "_source" : {
      "part" : "2N2222A",
      "manufacturer" : "Microsemi Corporation"
    }
  },
  {
    "_index" : "my_index",
    "_type" : "_doc",
    "_id" : "423282",
    "_score" : 198.89282,
    "_source" : {
      "part" : "2N2222A",
      "manufacturer" : "Microsemi Corporation"
    }
  },
  {
    "_index" : "my_index",
    "_type" : "_doc",
    "_id" : "565951",
    "_score" : 193.51782,
    "_source" : {
      "part" : "P2N2222A",
      "manufacturer" : "ON Semiconductor"
    }
  },
  {
    "_index" : "my_index",
    "_type" : "_doc",
    "_id" : "13920",
    "_score" : 192.1505,
    "_source" : {
      "part" : "P2N2222A",
      "manufacturer" : "ON Semiconductor"
    }
  },
  {
    "_index" : "my_index",
    "_type" : "_doc",
    "_id" : "2885944",
    "_score" : 191.28773,
    "_source" : {
      "part" : "Q2N2222A",
      "manufacturer" : "Freescale Semiconductor"
    }
  },
  {
    "_index" : "my_index",
    "_type" : "_doc",
    "_id" : "765656",
    "_score" : 191.28773,
    "_source" : {
      "part" : "2N2222AL",
      "manufacturer" : "Microsemi"
    }
  },
  {
    "_index" : "my_index",
    "_type" : "_doc",
    "_id" : "491090",
    "_score" : 190.78474,
    "_source" : {
      "part" : "2N2222AUB",
      "manufacturer" : "Microsemi Corporation"
    }
  }

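A record is considered a duplicate if it has the same part and manufacturer as another record. I need to get the distinct values for these fields.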
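To make the duplicate rule concrete, here is a minimal client-side sketch in Python. It only de-duplicates hits that have already been fetched, so it is not a server-side DISTINCT, and the helper name `distinct_hits` and the three sample hits are illustrative assumptions modeled on the result above.

```python
def distinct_hits(hits, fields=("part", "manufacturer")):
    """Keep the first hit for each distinct combination of `fields`."""
    seen = set()
    unique = []
    for hit in hits:
        source = hit.get("_source", {})
        # The tuple of field values is the duplicate key.
        key = tuple(source.get(field) for field in fields)
        if key not in seen:
            seen.add(key)
            unique.append(hit)
    return unique


# Three hits shaped like the ones in the result above (trimmed for brevity).
sample_hits = [
    {"_id": "13921", "_source": {"part": "2N2222A", "manufacturer": "Microsemi Corporation"}},
    {"_id": "13923", "_source": {"part": "2N2222A", "manufacturer": "Microsemi Corporation"}},
    {"_id": "557240", "_source": {"part": "2N2222A", "manufacturer": "Infineon"}},
]

for hit in distinct_hits(sample_hits):
    print(hit["_id"], hit["_source"])
```

Because hits arrive sorted by score, keeping the first occurrence keeps the best-scoring document for each pair.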

Thank you very much for your help.

【Comments】:

    Tags: elasticsearch


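    【Solution 1】:

    I believe you need to use aggregations in your query to get distinct-pair behavior. For an example of a distinct-values query, see this question.

    The main difference between the linked question and your case is that you have two fields and you need all the distinct pairs, not the distinct values of each field separately.

    EDIT: I just tested this and it seems to behave the way you want. You can refine it by removing/disabling the doc_count of the terms aggregation and by using _source as you did in your question. You can also add a query with a match clause to filter down to a given part/manufacturer.

    EDIT2: Added the query/match from the request in the question.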

    GET YOURINDEX/_search
    {
      "query": {
        "match": {
          "part": "2n2222"
        }
      },
      "size": 0,
      "aggs": {
        "parts": {
          "terms": {
            "field": "part.keyword"
          },
          "aggs": {
            "manufacturers": {
              "terms": {
                "field": "manufacturer.keyword"
              }
            }
          }
        }
      }
    }
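
The request above returns nested buckets rather than documents, so the distinct pairs still have to be read out of the response. A minimal Python sketch of that step follows. The helper `distinct_pairs` is a hypothetical name, the aggregation names are parameters (pass whichever names your request used), and the sample response is made up in the shape a two-level terms aggregation returns, with illustrative counts.

```python
def distinct_pairs(response, outer, inner):
    """Flatten a two-level terms aggregation into (outer_key, inner_key) pairs.

    `outer` and `inner` are the aggregation names used in the request.
    """
    pairs = []
    for outer_bucket in response["aggregations"][outer]["buckets"]:
        for inner_bucket in outer_bucket[inner]["buckets"]:
            # doc_count is ignored on purpose; only the keys matter here.
            pairs.append((outer_bucket["key"], inner_bucket["key"]))
    return pairs


# Made-up response fragment (assumption), not real output from the index.
sample_response = {
    "aggregations": {
        "parts": {
            "buckets": [
                {
                    "key": "2N2222A",
                    "doc_count": 15,
                    "manufacturers": {
                        "buckets": [
                            {"key": "Microsemi Corporation", "doc_count": 14},
                            {"key": "Infineon", "doc_count": 1},
                        ]
                    },
                },
                {
                    "key": "P2N2222A",
                    "doc_count": 2,
                    "manufacturers": {
                        "buckets": [{"key": "ON Semiconductor", "doc_count": 2}]
                    },
                },
            ]
        }
    }
}

for part, manufacturer in distinct_pairs(sample_response, "parts", "manufacturers"):
    print(part, "|", manufacturer)
```

Keep in mind that a terms aggregation returns only the top 10 buckets by default, so a larger "size" on each terms block may be needed to see every pair.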
    

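    【Discussion】:

    • Thank you very much, my friend
    • Thank you very much, my friend, but it isn't working yet. I've updated my question
    • What result do you expect? The query above returns a list of manufacturers for each part, so you can parse the response and read off each distinct part -> manufacturer pair in your data (ignoring the counts and everything else in the response). Or do you want an actual document returned for each distinct pair of values, that is, a single document per pair even when several documents share the same values? In that case I'd suggest looking at the top hits aggregation.
    • I want distinct results for part and manufacturer. Records with the same part and manufacturer count as duplicates.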
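
The top hits suggestion in the comments could look roughly like the sketch below, which nests a top_hits aggregation with size 1 under the two terms aggregations so that each distinct pair comes back with one representative document. It is untested against the asker's index. The names `build_request`, `one_doc_per_pair`, `parts`, `manufacturers`, `one_doc`, and the `bucket_limit` of 100 are all assumptions, as is the presence of `.keyword` sub-fields in the mapping.

```python
import json


def build_request(query_text, bucket_limit=100):
    """Build a search body returning one document per (part, manufacturer) pair."""
    return {
        "size": 0,  # no top-level hits; documents come from top_hits below
        "query": {"match": {"part": query_text}},
        "aggs": {
            "parts": {
                "terms": {"field": "part.keyword", "size": bucket_limit},
                "aggs": {
                    "manufacturers": {
                        "terms": {"field": "manufacturer.keyword", "size": bucket_limit},
                        "aggs": {
                            "one_doc": {
                                "top_hits": {
                                    "size": 1,
                                    "_source": ["part", "manufacturer", "shortdesc"],
                                }
                            }
                        },
                    }
                },
            }
        },
    }


def one_doc_per_pair(response):
    """Collect the single top hit stored under each innermost bucket."""
    docs = []
    for part_bucket in response["aggregations"]["parts"]["buckets"]:
        for maker_bucket in part_bucket["manufacturers"]["buckets"]:
            hits = maker_bucket["one_doc"]["hits"]["hits"]
            if hits:
                docs.append(hits[0])
    return docs


# Print the body so it can be pasted into a _search request.
print(json.dumps(build_request("2n2222"), indent=2))
```

The printed body can be sent to the index's _search endpoint, and `one_doc_per_pair` then yields one hit per distinct pair from the parsed JSON response.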