【Posted】: 2021-07-21 21:13:09
【Question】:
I really hope someone can help me here, this problem is driving me crazy :-D
I have a large number of documents (100,000+) that look like this:
{
"_id" : ObjectId("60e5ae42fcc92f14c3a41208"),
"userId" : "xxxx",
"projectCreator" : {
"userId" : "xxx|xxxx"
},
"hashTags" : [
"Spring",
"Java"
],
"projectCategories" : {
"60d76ef0597444095b8ab4b2" : "Backend",
"60d76ef0597444095b8ab232" : "Infrastructure"
},
"createdDate" : ISODate("2021-07-07T13:38:10.655Z"),
"updatedAt" : ISODate("2021-07-08T11:48:36.200Z"),
"_class" : "xxxx.model.project.Project"
}
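I want a query that does the following:
- Extract every unique projectCategories value (the string values, not the ids) across all documents in the collection
- Count the number of occurrences of each value
So the result should look like this: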
Backend : NUMBER OF OCCURRENCES
FrontEnd : NUMBER OF OCCURRENCES
Infrastructure: NUMBER OF OCCURRENCES
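To make the desired result concrete, here is a minimal in-memory sketch of the counting logic in plain JavaScript. It is not a MongoDB query; the function name countCategories and the sample documents are hypothetical and exist only to illustrate the two steps listed above.

```javascript
// Hypothetical sample documents, trimmed to the one relevant field.
const sampleDocs = [
  { projectCategories: { "60d76ef0597444095b8ab4b2": "Backend", "60d76ef0597444095b8ab232": "Infrastructure" } },
  { projectCategories: { "60d76ef0597444095b8ab4b2": "Backend" } },
  { projectCategories: null }, // skipped: no categories
  {},                          // skipped: field missing
];

// Count how often each category name (the map's values, not its id keys) occurs.
function countCategories(docs) {
  const counts = new Map();
  for (const doc of docs) {
    const categories = doc.projectCategories;
    if (categories == null) continue; // covers both null and undefined
    for (const name of Object.values(categories)) {
      counts.set(name, (counts.get(name) || 0) + 1);
    }
  }
  return Object.fromEntries(counts);
}

console.log(countCategories(sampleDocs)); // { Backend: 2, Infrastructure: 1 }
```

The same idea, turning the id-to-name map into a list of values and then tallying by name, is what the aggregation has to express on the server side.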
I "think" I need an aggregation that groups the values and then counts them, but honestly I can't wrap my head around it.
I tried this query:
db.projects.aggregate([
  { $match: { isDeleted: { $ne: true } } },
  { $match: { projectCategories: { $exists: true, $ne: null } } },
  { $project: { result: { $objectToArray: "$projectCategories" } } },
  { $unwind: "$result" }
])
This returns:
{ "_id" : ObjectId("60c313e2905d344c7dd117f1"), "result" : { "k" : "60d76f295974444b818ab4bc", "v" : "Apps" } }
{ "_id" : ObjectId("60c313e2905d344c7dd117f1"), "result" : { "k" : "60d76f1759744461468ab4b8", "v" : "Development Tools" } }
{ "_id" : ObjectId("60c313e2905d344c7dd117f1"), "result" : { "k" : "60d76eeb597444b9da8ab4b1", "v" : "Frontend" } }
{ "_id" : ObjectId("60cfb59f30b2647610a6c931"), "result" : { "k" : "60d76eeb597444b9da8ab4b1", "v" : "Frontend" } }
{ "_id" : ObjectId("60cfb59f30b2647610a6c931"), "result" : { "k" : "60d76ef659744422d68ab4b3", "v" : "Fullstack" } }
{ "_id" : ObjectId("60cfb69730b2647610a6c932"), "result" : { "k" : "60d76f295974444b818ab4bc", "v" : "Apps" } }
{ "_id" : ObjectId("60df83e84d8b6341d49cff4e"), "result" : { "k" : "60d76ef0597444095b8ab4b2", "v" : "Backend" } }
{ "_id" : ObjectId("60df83e84d8b6341d49cff4e"), "result" : { "k" : "60d76eeb597444b9da8ab4b1", "v" : "Frontend" } }
{ "_id" : ObjectId("60df83e84d8b6341d49cff4e"), "result" : { "k" : "60d76ef659744422d68ab4b3", "v" : "Fullstack" } }
{ "_id" : ObjectId("60e5ae42fcc92f14c3a41208"), "result" : { "k" : "60d76ef0597444095b8ab4b2", "v" : "Backend" } }
{ "_id" : ObjectId("60f0abf9f5c82b27af712ad7"), "result" : { "k" : "60d76f2559744477168ab4bb", "v" : "Games" } }
{ "_id" : ObjectId("60f0abf9f5c82b27af712ad7"), "result" : { "k" : "60d76ef659744422d68ab4b3", "v" : "Fullstack" } }
{ "_id" : ObjectId("60f68d2df9710f58c1e9c872"), "result" : { "k" : "60d76f295974444b818ab4bc", "v" : "Apps" } }
{ "_id" : ObjectId("60f68d2df9710f58c1e9c872"), "result" : { "k" : "60d76f0e5974448f038ab4b7", "v" : "Open Source" } }
{ "_id" : ObjectId("60f68d2df9710f58c1e9c872"), "result" : { "k" : "60d76eeb597444b9da8ab4b1", "v" : "Frontend" } }
So where I'm stuck now is how to get from this unwound result to output like the following:
Backend : NUMBER OF OCCURRENCES
FrontEnd : NUMBER OF OCCURRENCES
Infrastructure: NUMBER OF OCCURRENCES
Can someone help me?
Thanks!
Update: I've managed to get close with this query:
db.projects.aggregate([
  { $match: { isDeleted: { $ne: true } } },
  { $match: { projectCategories: { $exists: true, $ne: null } } },
  { $project: { result: { $objectToArray: "$projectCategories" } } },
  { $unwind: "$result" },
  { $group: { _id: "$result.v", count: { $sum: 1 } } }
])
But the output now looks like this:
{ "_id" : "Development Tools", "count" : 1 }
{ "_id" : "Games", "count" : 1 }
{ "_id" : "Fullstack", "count" : 3 }
{ "_id" : "Open Source", "count" : 1 }
{ "_id" : "Frontend", "count" : 4 }
{ "_id" : "Apps", "count" : 3 }
{ "_id" : "Backend", "count" : 2 }
Is it possible to remove the _id field?
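One approach that might work, sketched here under assumptions: append a $project stage that suppresses _id and copies its value into a differently named field. The output field name category is an assumption chosen for illustration, and a flat { category, count } shape is assumed to be acceptable. The typeof guard only lets the snippet load outside the mongo shell, where db is not defined.

```javascript
// The pipeline from the update above, plus one final $project stage.
const pipeline = [
  { $match: { isDeleted: { $ne: true } } },
  { $match: { projectCategories: { $exists: true, $ne: null } } },
  { $project: { result: { $objectToArray: "$projectCategories" } } },
  { $unwind: "$result" },
  { $group: { _id: "$result.v", count: { $sum: 1 } } },
  // Hide _id and expose its value as "category" (assumed field name).
  { $project: { _id: 0, category: "$_id", count: 1 } }
];

// Only runs inside the mongo shell, where `db` and `printjson` are defined.
if (typeof db !== "undefined") {
  db.projects.aggregate(pipeline).forEach(printjson);
}
```

With this stage in place, each result document should carry category and count fields instead of _id and count.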
【Comments】:
Tags: mongodb aggregation-framework