【Posted】: 2019-01-21 09:52:16
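【Problem description】:
I have a scenario where I need to retrieve millions of records from Elasticsearch.
I am a beginner with Elasticsearch and cannot yet use it very effectively.
I index an Author model in Elasticsearch as shown below, and I use the NEST client to work with Elasticsearch from a .NET application.
My model is described below.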
Author
--------------------------------
AuthorKey string
List<Study> Nested
Study
---------------------------------
PMID int
PublicationDate date
PublicationType string
MeshTerms string
Content string
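The model above can be restated as typed records, which makes the nesting explicit. This is only an illustrative sketch: the class names `StudyDoc` and `AuthorDoc` are hypothetical, the field names follow the sample JSON and mapping in this post (for example `PMId` and `content`), and `PublicationType` is assumed to be a list because the sample data stores it as one.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class StudyDoc:  # hypothetical name for one nested Study entry
    PMId: int
    PublicationDate: datetime
    PublicationType: List[str]  # assumption: a list, as in the sample JSON
    MeshTerms: str
    content: str


@dataclass
class AuthorDoc:  # hypothetical name for one Author document
    AuthorKey: str
    AuthorName: str
    AuthorLastName: str
    Study: List[StudyDoc] = field(default_factory=list)  # nested studies


# Build one author with a single study, using values from the sample data.
author = AuthorDoc(
    AuthorKey="Author3",
    AuthorName="Nilesh",
    AuthorLastName="Mistrey",
    Study=[
        StudyDoc(
            PMId=3000,
            PublicationDate=datetime(2012, 1, 16, 5, 55, 14),
            PublicationType=["ClinicalTrial", "Medical"],
            MeshTerms="karan2,dharan2,nilesh2,manan2,mehul sir2,manoj2",
            content="this is dummy content for author 2 .how can i solve this",
        )
    ],
)
print(author.AuthorKey, len(author.Study))
```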
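We have nearly 10 million authors, and each author has completed at least 3 studies.
So roughly 30 million study records are available in the Elasticsearch index.
Now I want to fetch the author data together with each author's total study count.
Here is sample JSON data: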
{
"Authors": [
{
"AuthorKey": "Author1",
"AuthorName": "karan",
"AuthorLastName": "shah",
"Study": [
{
"PMId": 1000,
"PublicationDate": "2019-01-17T06:35:52.178Z",
"content": "this is dummy content.how can i solve this",
"MeshTerms": "karan,dharan,nilesh,manan,mehul sir,manoj",
"PublicationType": [
"ClinicalTrial",
"Medical"
]
},
{
"PMId": 1001,
"PublicationDate": "2019-01-16T05:55:14.947Z",
"content": "this is dummy content.how can i solve this",
"MeshTerms": "karan1,dharan1,nilesh1,manan1,mehul1 sir,manoj1",
"PublicationType": [
"ClinicalTrial",
"Medical"
]
},
{
"PMId": 1002,
"PublicationDate": "2019-01-15T05:55:14.947Z",
"content": "this is dummy content for record2.how can i solve this",
"MeshTerms": "karan2,dharan2,nilesh2,manan2,mehul2 sir,manoj2",
"PublicationType": [
"ClinicalTrial1",
"Medical2"
]
},
{
"PMId": 1003,
"PublicationDate": "2011-01-15T05:55:14.947Z",
"content": "this is dummy content for record3.how can i solve this",
"MeshTerms": "karan3,dharan3,nilesh3,manan3,mehul3 sir,manoj3",
"PublicationType": [
"ClinicalTrial1",
"Medical3"
]
}
]
},
{
"AuthorKey": "Author2",
"AuthorName": "dharan",
"AuthorLastName": "shah",
"Study": [
{
"PMId": 2001,
"PublicationDate": "2011-01-16T05:55:14.947Z",
"content": "this is dummy content for author 2.how can i solve this",
"MeshTerms": "karan1,dharan1,nilesh1,manan1,mehul1 sir,manoj1",
"PublicationType": [
"ClinicalTrial",
"Medical"
]
},
{
"PMId": 2002,
"PublicationDate": "2019-01-15T05:55:14.947Z",
"content": "this is dummy content for author 2.how can i solve this",
"MeshTerms": "karan2,dharan2,nilesh2,manan2,mehul2 sir,manoj2",
"PublicationType": [
"ClinicalTrial1",
"Medical2"
]
},
{
"PMId": 2003,
"PublicationDate": "2015-01-15T05:55:14.947Z",
"content": "this is dummy content for record2.how can i solve this",
"MeshTerms": "karan3,dharan3,nilesh3,manan3,mehul3 sir,manoj3",
"PublicationType": [
"ClinicalTrial1",
"Medical3"
]
}
]
},
{
"AuthorKey": "Author3",
"AuthorName": "Nilesh",
"AuthorLastName": "Mistrey",
"Study": [
{
"PMId": 3000,
"PublicationDate": "2012-01-16T05:55:14.947Z",
"content": "this is dummy content for author 2 .how can i solve this",
"MeshTerms": "karan2,dharan2,nilesh2,manan2,mehul sir2,manoj2",
"PublicationType": [
"ClinicalTrial",
"Medical"
]
}
]
}
]
}
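How can I retrieve all authors together with their total study count, sorted by that count in descending order?
Expected output: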
{
"Authors": [
{
"AuthorKey": "Author1",
"AuthorName": "karan",
"AuthorLastName": "shah",
"StudyCount": 4
},
{
"AuthorKey": "Author2",
"AuthorName": "dharan",
"AuthorLastName": "shah",
"StudyCount": 3
},
{
"AuthorKey": "Author3",
"AuthorName": "Nilesh",
"AuthorLastName": "Mistrey",
"StudyCount": 1
}
]
}
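To pin down the transformation being asked for, here is a minimal client-side sketch in Python that turns data shaped like the sample input into the expected output. It is not an Elasticsearch query: the function name `authors_with_study_count` is hypothetical, and the sample is trimmed so that each study keeps only its `PMId`.

```python
from typing import Any, Dict, List


def authors_with_study_count(payload: Dict[str, Any]) -> Dict[str, List[Dict[str, Any]]]:
    """Keep each author's top-level fields, add StudyCount, sort by it descending."""
    rows = [
        {
            "AuthorKey": a["AuthorKey"],
            "AuthorName": a["AuthorName"],
            "AuthorLastName": a["AuthorLastName"],
            # An author without a Study array counts as zero studies.
            "StudyCount": len(a.get("Study", [])),
        }
        for a in payload["Authors"]
    ]
    rows.sort(key=lambda r: r["StudyCount"], reverse=True)
    return {"Authors": rows}


# Trimmed copy of the sample data: only PMId is kept for each study.
sample = {
    "Authors": [
        {"AuthorKey": "Author3", "AuthorName": "Nilesh", "AuthorLastName": "Mistrey",
         "Study": [{"PMId": 3000}]},
        {"AuthorKey": "Author1", "AuthorName": "karan", "AuthorLastName": "shah",
         "Study": [{"PMId": 1000}, {"PMId": 1001}, {"PMId": 1002}, {"PMId": 1003}]},
        {"AuthorKey": "Author2", "AuthorName": "dharan", "AuthorLastName": "shah",
         "Study": [{"PMId": 2001}, {"PMId": 2002}, {"PMId": 2003}]},
    ]
}

result = authors_with_study_count(sample)
for row in result["Authors"]:
    print(row["AuthorKey"], row["StudyCount"])
```

The same count could also be computed once at index time and stored as an ordinary top-level field on each author document, which is an assumption about a possible design and not something the current mapping contains.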
Below is the mapping of the index:
{
"authorindex": {
"mappings": {
"_doc": {
"properties": {
"AuthorKey": {
"type": "keyword"
},
"AuthorLastName": {
"type": "keyword"
},
"AuthorName": {
"type": "keyword"
},
"Study": {
"type": "nested",
"properties": {
"MeshTerms": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"PMId": {
"type": "long"
},
"PublicationDate": {
"type": "date"
},
"PublicationType": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"content": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
}
}
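Given this mapping, one possible direction is a `terms` aggregation on `AuthorKey` with a `nested` sub-aggregation on the `Study` path, ordered by that sub-aggregation's document count. The sketch below only builds and prints the request body as a Python dictionary; it has not been run against a cluster, the aggregation names `by_author` and `studies` are hypothetical, and the bucket `size` of 100 is an arbitrary assumption. A `terms` aggregation returns only the top buckets, so with roughly 10 million authors this is unlikely to be a way to page through every author.

```python
import json

# Request body sketch for a search against the author index.
# "by_author" and "studies" are hypothetical aggregation names.
search_body = {
    "size": 0,  # no hits needed, only aggregation buckets
    "aggs": {
        "by_author": {
            "terms": {
                "field": "AuthorKey",          # keyword field in the mapping
                "size": 100,                   # assumption: top 100 authors only
                "order": {"studies": "desc"},  # sort buckets by nested doc count
            },
            "aggs": {
                # Single-bucket aggregation whose doc_count is the number
                # of nested Study objects for the author in this bucket.
                "studies": {"nested": {"path": "Study"}},
            },
        }
    },
}

print(json.dumps(search_body, indent=2))
```

Each returned bucket would carry the author key plus the nested document count, so `AuthorName` and `AuthorLastName` would still need to be fetched separately, for example with a second lookup by key.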
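【Discussion】:
-
Could you provide the mapping you are using? Have you already tried to solve the problem? If so, how?
-
@NikolayVasiliev I have tried, but I could not work out how to write a query that meets this requirement.
-
Please don't add things like this in the comments; use the edit link instead.
Tags: elasticsearch querydsl elasticsearch-query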