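【Posted】: 2021-07-20 12:44:01
【Question】:
Well... I'm very new to ES, so when it comes to aggregations, there is no word in the dictionary to describe my level :p
Today I ran into a problem: I'm trying to write a query that does something similar to a SQL DISTINCT, but combined with filters. I have this document (an abstraction of the real one, of course):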
{
"id": "1",
"createdAt": 1626783747,
"updatedAt": 1626783747,
"isAvailable": true,
"kind": "document",
"classification": {
"id": 1,
"name": "a_name_for_id_1"
},
"structure": {
"material": "cartoon",
"thickness": 5
},
"shared": true,
"objective": "stackoverflow"
}
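Since every value in the document above can vary, some values may repeat across documents, for example classification.id, kind, and structure.material.
So, to meet my requirement, I want to "group by" these 3 fields and get each unique combination of them. Going a bit further, with the following data I should get the following possibilities: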
[{
"id": "1",
"createdAt": 1626783747,
"updatedAt": 1626783747,
"isAvailable": true,
"kind": "document",
"classification": {
"id": 1,
"name": "a_name_for_id_1"
},
"structure": {
"material": "cartoon",
"thickness": 5
},
"shared": true,
"objective": "stackoverflow"
},
{
"id": "2",
"createdAt": 1626783747,
"updatedAt": 1626783747,
"isAvailable": true,
"kind": "document",
"classification": {
"id": 2,
"name": "a_name_for_id_2"
},
"structure": {
"material": "iron",
"thickness": 3
},
"shared": true,
"objective": "linkedin"
},
{
"id": "3",
"createdAt": 1626783747,
"updatedAt": 1626783747,
"isAvailable": false,
"kind": "document",
"classification": {
"id": 2,
"name": "a_name_for_id_2"
},
"structure": {
"material": "paper",
"thickness": 1
},
"shared": false,
"objective": "tiktok"
},
{
"id": "4",
"createdAt": 1626783747,
"updatedAt": 1626783747,
"isAvailable": true,
"kind": "document",
"classification": {
"id": 3,
"name": "a_name_for_id_3"
},
"structure": {
"material": "cartoon",
"thickness": 5
},
"shared": false,
"objective": "snapchat"
},
{
"id": "5",
"createdAt": 1626783747,
"updatedAt": 1626783747,
"isAvailable": true,
"kind": "document",
"classification": {
"id": 3,
"name": "a_name_for_id_3"
},
"structure": {
"material": "paper",
"thickness": 1
},
"shared": true,
"objective": "twitter"
},
{
"id": "6",
"createdAt": 1626783747,
"updatedAt": 1626783747,
"isAvailable": false,
"kind": "document",
"classification": {
"id": 3,
"name": "a_name_for_id_3"
},
"structure": {
"material": "iron",
"thickness": 3
},
"shared": true,
"objective": "facebook"
}
]
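Based on the above, I should get the following results in the "buckets":
- document 1 cartoon
- document 2 iron
- document 2 paper
- document 3 cartoon
- document 3 paper
- document 3 iron
Of course, this is just for the sake of the example (to keep things simple, there are no duplicates yet).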
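To make the intended grouping concrete, here is a minimal plain-Python sketch (not an Elasticsearch query) that reduces the six sample documents to the three grouping fields and collects the distinct triples. The function name `distinct_combinations` is made up for illustration.

```python
# Reduced copies of the six sample documents: only the three grouping fields are kept.
docs = [
    {"kind": "document", "classification": {"id": 1}, "structure": {"material": "cartoon"}},
    {"kind": "document", "classification": {"id": 2}, "structure": {"material": "iron"}},
    {"kind": "document", "classification": {"id": 2}, "structure": {"material": "paper"}},
    {"kind": "document", "classification": {"id": 3}, "structure": {"material": "cartoon"}},
    {"kind": "document", "classification": {"id": 3}, "structure": {"material": "paper"}},
    {"kind": "document", "classification": {"id": 3}, "structure": {"material": "iron"}},
]


def distinct_combinations(documents):
    """Count documents per (kind, classification.id, structure.material) triple.

    Hypothetical helper: it mimics what one bucket per unique combination means.
    Keys keep first-seen order because dicts preserve insertion order.
    """
    counts = {}
    for doc in documents:
        key = (doc["kind"], doc["classification"]["id"], doc["structure"]["material"])
        counts[key] = counts.get(key, 0) + 1
    return counts


buckets = distinct_combinations(docs)
for (kind, classification_id, material), doc_count in buckets.items():
    print(kind, classification_id, material, doc_count)
```

With the sample data every triple occurs exactly once, so this yields six buckets, each with a document count of 1.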
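On top of that, though, I need a few "pre-filters":
- documents that are available: isAvailable=true
- documents whose structure thickness is between 2 and 4, inclusive: 2 <= structure.thickness <= 4
- documents that are shared: shared=true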
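One possible way to express these three pre-filters is a `bool` query with `filter` clauses (`term` for the two booleans, `range` with `gte`/`lte` for the inclusive thickness bounds). The sketch below builds that clause as a Python dict and pairs it with an equivalent local predicate. It assumes the fields are mapped as boolean and numeric types, which the post does not state; `passes_pre_filter` is a hypothetical helper.

```python
import json

# Filter context only: no scoring is needed, the clauses just include or exclude documents.
# Assumption: isAvailable/shared are boolean fields and structure.thickness is numeric.
pre_filter = {
    "bool": {
        "filter": [
            {"term": {"isAvailable": True}},
            {"term": {"shared": True}},
            {"range": {"structure.thickness": {"gte": 2, "lte": 4}}},
        ]
    }
}


def passes_pre_filter(doc):
    """Local equivalent of the clause above, handy for checking expectations by hand."""
    return (
        doc["isAvailable"] is True
        and doc["shared"] is True
        and 2 <= doc["structure"]["thickness"] <= 4
    )


print(json.dumps(pre_filter, indent=2))
```

Note that the bounds read 2 <= thickness <= 4, so a thickness of exactly 2 or 4 still passes.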
Compared to the first set of results, only one combination should remain (document 2 iron):
- document 1 cartoon -> not a valid result, thickness > 4
- document 2 iron
- document 2 paper -> not a valid result, isAvailable != true
- document 3 cartoon -> not a valid result, thickness > 4
- document 3 paper -> not a valid result, thickness < 2
- document 3 iron -> not a valid result, isAvailable != true
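If you're still reading... thank you! xD
So, as you can see, I need all possible combinations of these fields following the static pattern kind <> classification_id <> structure_material, restricted to the documents that match the filters on isAvailable, thickness, and shared.
As for the output, the hits don't matter to me, since I don't need the documents themselves, only the combinations kind <> classification_id <> structure_material :)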
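A hedged sketch of one request shape that fits this description: `"size": 0` suppresses the hits, the `bool` filter applies the pre-filters, and a `composite` aggregation with three `terms` sources returns one bucket per combination. It assumes `kind` and `structure.material` are mapped as `keyword` (with a default dynamic mapping they may need a `.keyword` suffix). The names `build_request`, `extract_combinations`, and `combinations` are made up, and `sample_response` is hand-written to mirror the composite response layout, not captured from a cluster.

```python
# Source name in the bucket key -> document field. Assumes keyword/numeric mappings.
GROUP_FIELDS = {
    "kind": "kind",
    "classification_id": "classification.id",
    "structure_material": "structure.material",
}


def build_request(page_size=100, after_key=None):
    """Build a search body: no hits, pre-filters, one bucket per field combination."""
    composite = {
        "size": page_size,
        "sources": [
            {name: {"terms": {"field": field}}} for name, field in GROUP_FIELDS.items()
        ],
    }
    if after_key is not None:
        # Composite aggregations paginate: pass the previous page's after_key here.
        composite["after"] = after_key
    return {
        "size": 0,  # the hits are not needed, only the buckets
        "query": {
            "bool": {
                "filter": [
                    {"term": {"isAvailable": True}},
                    {"term": {"shared": True}},
                    {"range": {"structure.thickness": {"gte": 2, "lte": 4}}},
                ]
            }
        },
        "aggs": {"combinations": {"composite": composite}},
    }


def extract_combinations(response):
    """Turn composite buckets into (kind, classification_id, structure_material) tuples."""
    buckets = response["aggregations"]["combinations"]["buckets"]
    return [tuple(bucket["key"][name] for name in GROUP_FIELDS) for bucket in buckets]


# Hand-written illustration of the response layout for the six sample documents.
sample_response = {
    "aggregations": {
        "combinations": {
            "buckets": [
                {
                    "key": {
                        "kind": "document",
                        "classification_id": 2,
                        "structure_material": "iron",
                    },
                    "doc_count": 1,
                }
            ]
        }
    }
}

print(extract_combinations(sample_response))
```

The composite aggregation is chosen here because it can page through every combination via `after`; nested `terms` aggregations or `multi_terms` (available in newer Elasticsearch versions) would be alternatives worth comparing.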
Thanks for your help :)
Max
【Discussion】:
Tags: elasticsearch elasticsearch-aggregation