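【Question Title】: Search with more than one parameter over more than one field in Elasticsearch
【Posted】: 2020-04-18 18:55:18
【Question】:

I only want this course returned if 'Grade' = 'G6' and Type = 'OPEN' match within the SAME audience entry; both values must be present in the same entry for the course to be returned. Currently the course is returned even when G6 and OPEN are found in different audiences, which is not what I want. The results are wrong: I need the query to be applied to each audience individually and to return data only when both conditions are true within the same audience.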

Here is my JSON:

{
"took": 1,
"timed_out": false,
"_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
},
"hits": {
    "total": 71,
    "max_score": 3.3118114,
    "hits": [
        {
            "_index": "courses",
            "_type": "course",
            "_id": "LBTBWdzyRw-jgiiYssjv8A",
            "_score": 3.3118114,
            "_source": {
                "id": "LBTBWdzyRw-jgiiYssjv8A",
                "title": "1503 regression testing",
                "shortDescription": "asdf",
                "description": "asdf",
                "learningOutcomes": "",
                "modules": [],
                "learningProvider": {
                    "id": "ig2-zIY_QkSpMC4O0Lm0hw",
                    "name": null,
                    "termsAndConditions": [],
                    "cancellationPolicies": []
                },
                "audiences": [
                    {
                        "id": "VfDpsS_5SXi8iZubzTkUBQ",
                        "name": "comm",
                        "areasOfWork": [
                            "Communications"
                        ],
                        "departments": [],
                        "grades": [
                            "G6"
                        ],
                        "interests": [],
                        "requiredBy": null,
                        "frequency": null,
                        "type": "OPEN",
                        "eventId": null
                    },
                    {
                        "id": "eZPPPqTqRdiDAE3xCPlJMQ",
                        "name": "analysis",
                        "areasOfWork": [
                            "Analysis"
                        ],
                        "departments": [],
                        "grades": [
                            "G6"
                        ],
                        "interests": [],
                        "requiredBy": null,
                        "frequency": null,
                        "type": "REQUIRED",
                        "eventId": null
                    }
                ],
                "preparation": "",
                "owner": {
                    "scope": "LOCAL",
                    "organisationalUnit": "co",
                    "profession": 63,
                    "supplier": ""
                },
                "visibility": "PUBLIC",
                "status": "Published",
                "topicId": ""
            }
        }
    ]
}

}

My ES code:

 BoolQueryBuilder boolQuery = boolQuery();

    boolQuery.should(QueryBuilders.matchQuery("audiences.departments.keyword", department));
    boolQuery.should(QueryBuilders.matchQuery("audiences.areasOfWork.keyword", areaOfWork));
    boolQuery.should(QueryBuilders.matchQuery("audiences.interests.keyword", interest));

    BoolQueryBuilder filterQuery = boolQuery();
    filterQuery.must(QueryBuilders.matchQuery("audiences.grades.keyword", "G6"));
    filterQuery.must(QueryBuilders.matchQuery("audiences.type", "OPEN"));

Here is the index mapping:

{
  "media": {
    "aliases": {}
  },
  "courses": {
    "aliases": {}
  },
  "feedback": {
    "aliases": {}
  },
  "learning-providers": {
    "aliases": {}
  },
  "resources": {
    "aliases": {}
  },
  "courses-0.4.0": {
    "aliases": {}
  },
  ".security-6": {
    "aliases": {
      ".security": {}
    }
  },
  "payments": {
    "aliases": {}
  }
}

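【Comments】:

  • Welcome to Stack Overflow. Please add more details to your question to get a quicker and better response from community members. Go through the How to Ask section and update your question with a minimal reproducible example.
  • I'm happy to provide any clarification on my question.
  • @HypeScript Can you update the question with your index mapping?
  • I have updated the original question with more code above.
  • Here is the mapping:

Tags: elasticsearch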


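【Solution 1】:

Since you want your query to apply in each audience and only return data if it is true in the same audience, you need to map the audiences field with the nested data type. Otherwise Elasticsearch stores it as a plain object and has no concept of inner objects, because it flattens the object hierarchy into a simple list of field names and values. See https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html for more details.

Taking your example, suppose this is your document: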

    "audiences": [
            {
                "id": "1",
                "field": "comm"
            },
           {
                "id": "2",
                "field": "arts"
           }
   ]

Elasticsearch flattens it into the following form:

{
   "audiences.id": ["1", "2"],
   "audiences.field": ["comm", "arts"]
}

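Now if you run a query saying that an audience must have id:1 and field:arts, the document above will also match, even though no single audience has both values.

To avoid this, objects like these should be mapped as nested objects. Elasticsearch then indexes each object separately instead of flattening it, so each object is searched independently.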
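
The difference between the two storage models can be simulated in plain Java, without Elasticsearch. This is only an illustrative sketch: the class `NestedVsFlattened`, the `Audience` holder and both method names are hypothetical, and each audience is reduced to a single grade and type.

```java
import java.util.ArrayList;
import java.util.List;

public class NestedVsFlattened {

    // Hypothetical holder for one audience, reduced to the two fields used in the question.
    public static final class Audience {
        final String grade;
        final String type;

        public Audience(String grade, String type) {
            this.grade = grade;
            this.type = type;
        }
    }

    // "object" behaviour: grades and types of all audiences are pooled into two
    // independent value lists, so the pairing between a grade and a type is lost.
    public static boolean matchesFlattened(List<Audience> audiences, String grade, String type) {
        List<String> grades = new ArrayList<>();
        List<String> types = new ArrayList<>();
        for (Audience a : audiences) {
            grades.add(a.grade);
            types.add(a.type);
        }
        return grades.contains(grade) && types.contains(type);
    }

    // "nested" behaviour: both conditions are checked against one audience at a time.
    public static boolean matchesNested(List<Audience> audiences, String grade, String type) {
        for (Audience a : audiences) {
            if (a.grade.equals(grade) && a.type.equals(type)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Audience> course = new ArrayList<>();
        course.add(new Audience("G6", "OPEN"));
        course.add(new Audience("G7", "REQUIRED"));

        System.out.println(matchesFlattened(course, "G7", "OPEN")); // true (false positive)
        System.out.println(matchesNested(course, "G7", "OPEN"));    // false
        System.out.println(matchesNested(course, "G6", "OPEN"));    // true
    }
}
```

The flattened check accepts G7 + OPEN because the two values come from different audiences, which is the unwanted behaviour described in the question.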

The mapping for your document above should be:

Mapping

{
    "mappings": {
        "properties": {
            "shortDescription": {
                "type": "text"
            },
            "audiences": {
                "type": "nested"
            },
            "description": {
                "type": "text"
            },
            "modules": {
                "type": "text"
            },
            "preparation": {
                "type": "text"
            },
            "owner": {
                "properties": {
                    "scope": {
                        "type": "text"
                    },
                    "organisationalUnit": {
                        "type": "text"
                    },
                    "profession": {
                        "type": "text"
                    },
                    "supplier": {
                        "type": "text"
                    }
                }
            },
            "learningProvider": {
                "properties": {
                    "id": {
                        "type": "text"
                    },
                    "name": {
                        "type": "text"
                    },
                    "termsAndConditions": {
                        "type": "text"
                    },
                    "cancellationPolicies": {
                        "type": "text"
                    }
                }
            },
            "visibility": {
                "type": "text"
            },
            "status": {
                "type": "text"
            },
            "topicId": {
                "type": "text"
            }
        }
    }
}

Now, if we index this document:

Document

{
    "shortDescription": "asdf",
    "description": "asdf",
    "learningOutcomes": "",
    "modules": [],
    "learningProvider": {
        "id": "ig2-zIY_QkSpMC4O0Lm0hw",
        "name": null,
        "termsAndConditions": [],
        "cancellationPolicies": []
    },
    "audiences": [
        {
            "id": "VfDpsS_5SXi8iZubzTkUBQ",
            "name": "comm",
            "areasOfWork": [
                "Communications"
            ],
            "departments": [],
            "grades": [
                "G6"
            ],
            "interests": [],
            "requiredBy": null,
            "frequency": null,
            "type": "OPEN",
            "eventId": null
        },
        {
            "id": "eZPPPqTqRdiDAE3xCPlJMQ",
            "name": "analysis",
            "areasOfWork": [
                "Analysis"
            ],
            "departments": [],
            "grades": [
                "G7"
            ],
            "interests": [],
            "requiredBy": null,
            "frequency": null,
            "type": "REQUIRED",
            "eventId": null
        }
    ],
    "preparation": "",
    "owner": {
        "scope": "LOCAL",
        "organisationalUnit": "co",
        "profession": 63,
        "supplier": ""
    },
    "visibility": "PUBLIC",
    "status": "Published",
    "topicId": ""
}

And your search query is:

Search Query 1

{
"query": {
    "nested": {
        "path": "audiences",
        
        "query": {
            "bool": {
                "must": [
                    {
                        "match": {
                            "audiences.type.keyword": "OPEN"
                        }
                        
                    },
                     {
                        "match": {
                            "audiences.grades.keyword": "G6"
                        }
                        
                    }
                ]
            }
        }
       
    }
}

}

Result

"hits": [
        {
            "_index": "product",
            "_type": "_doc",
            "_id": "1",
            "_score": 0.9343092,
            "_source": {
                "shortDescription": "asdf",
                "description": "asdf",
                "learningOutcomes": "",
                "modules": [],
                "learningProvider": {
                    "id": "ig2-zIY_QkSpMC4O0Lm0hw",
                    "name": null,
                    "termsAndConditions": [],
                    "cancellationPolicies": []
                },
                "audiences": [
                    {
                        "id": "VfDpsS_5SXi8iZubzTkUBQ",
                        "name": "comm",
                        "areasOfWork": [
                            "Communications"
                        ],
                        "departments": [],
                        "grades": [
                            "G6"
                        ],
                        "interests": [],
                        "requiredBy": null,
                        "frequency": null,
                        "type": "OPEN",
                        "eventId": null
                    },
                    {
                        "id": "eZPPPqTqRdiDAE3xCPlJMQ",
                        "name": "analysis",
                        "areasOfWork": [
                            "Analysis"
                        ],
                        "departments": [],
                        "grades": [
                            "G7"
                        ],
                        "interests": [],
                        "requiredBy": null,
                        "frequency": null,
                        "type": "REQUIRED",
                        "eventId": null
                    }
                ],
                "preparation": "",
                "owner": {
                    "scope": "LOCAL",
                    "organisationalUnit": "co",
                    "profession": 63,
                    "supplier": ""
                },
                "visibility": "PUBLIC",
                "status": "Published",
                "topicId": ""
            }
        }
    ]

But if your search query is:

Search Query 2:

{
    "query": {
        "nested": {
            "path": "audiences",
            
            "query": {
                "bool": {
                    "must": [
                        {
                            "match": {
                                "audiences.type.keyword": "OPEN"
                            }
                            
                        },
                         {
                            "match": {
                                "audiences.grades.keyword": "G7"
                            }
                            
                        }
                    ]
                }
            }
           
        }
    }
}

Result:

"hits": {
        "total": {
            "value": 0,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
}

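So, in short, you need to change the data type of the audiences field in the mapping, and rewrite the queries against it so that they search the nested data type.

So, instead of this code snippet: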

BoolQueryBuilder filterQuery = boolQuery();
filterQuery.must(QueryBuilders.matchQuery("audiences.grades.keyword", "G6"));
filterQuery.must(QueryBuilders.matchQuery("audiences.type", "OPEN"));
        

you should use this nested query:

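import org.apache.lucene.search.join.ScoreMode;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.NestedQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
BoolQueryBuilder filterQuery = new BoolQueryBuilder();
filterQuery.must(QueryBuilders.matchQuery("audiences.grades.keyword", "G6"));
filterQuery.must(QueryBuilders.matchQuery("audiences.type", "OPEN"));
NestedQueryBuilder nested = new NestedQueryBuilder("audiences", filterQuery, ScoreMode.None);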
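
Once audiences is mapped as nested, its fields are only reachable through a nested query, so the should clauses from the question (on audiences.departments.keyword and so on) would need the same wrapping as the filter. As a rough sketch of the resulting request body, the following plain-Java helper assembles the JSON as a string. The class `NestedQueryBody` and its methods are hypothetical, only one should clause is shown, and values are assumed to contain no characters that need JSON escaping.

```java
public class NestedQueryBody {

    // Builds a match clause on one field; assumes field and value need no JSON escaping.
    public static String match(String field, String value) {
        return "{\"match\":{\"" + field + "\":\"" + value + "\"}}";
    }

    // Wraps an inner query so it is evaluated per audience object.
    public static String nested(String innerQuery) {
        return "{\"nested\":{\"path\":\"audiences\",\"query\":" + innerQuery + "}}";
    }

    // Hard constraints (grade and type in the same audience) go into "filter";
    // the optional area-of-work preference goes into "should" and only affects scoring.
    public static String build(String grade, String type, String areaOfWork) {
        String filter = nested("{\"bool\":{\"must\":["
                + match("audiences.grades.keyword", grade) + ","
                + match("audiences.type.keyword", type) + "]}}");
        String should = nested(match("audiences.areasOfWork.keyword", areaOfWork));
        return "{\"query\":{\"bool\":{\"filter\":[" + filter + "],\"should\":[" + should + "]}}}";
    }

    public static void main(String[] args) {
        System.out.println(build("G6", "OPEN", "Communications"));
    }
}
```

In a real client, the query builders shown above are preferable to string concatenation, since they handle escaping; the sketch is only meant to show the shape of the combined query.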

【Comments】:

  • Brilliant explanation, very clear and illustrative, thank you!
  • @HypeScript Glad to know this helped you. Could you also upvote the answer? Thanks.