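【Posted】: 2019-09-23 14:01:02
【Question】:
I have an Item collection that can hold anywhere from a few thousand to a few hundred thousand documents. I want to run geospatial queries on that collection. With Mongoose there are two options: find() and the Aggregation Pipeline. My implementations of both are shown below.
Mongoose Model
First, here are the relevant properties of my Mongoose model: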
// Define the schema
const itemSchema = new mongoose.Schema({
  // Firebase UID (in addition to the Mongo ObjectID)
  owner: {
    type: String,
    required: true,
    ref: 'User'
  },
  // ... Some more fields
  numberOfViews: {
    type: Number,
    required: true,
    default: 0
  },
  numberOfLikes: {
    type: Number,
    required: true,
    default: 0
  },
  location: {
    type: {
      type: 'String',
      default: 'Point',
      required: true
    },
    coordinates: {
      type: [Number],
      required: true,
    },
  }
}, {
  timestamps: true
});
// 2dsphere index
itemSchema.index({ "location": "2dsphere" });
// Create the model
const Item = mongoose.model('Item', itemSchema);
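Since the `location` field is a GeoJSON Point, the coordinate order matters: GeoJSON stores longitude first, then latitude. A minimal sketch of a helper that builds such a value is shown here; `makePoint` is a hypothetical name that is not part of the original code, and the range checks are an assumption about what counts as valid input.

```javascript
// Hypothetical helper (not in the original code): builds a GeoJSON Point
// in the shape the `location` field of the schema expects.
function makePoint(lng, lat) {
  if (!Number.isFinite(lng) || lng < -180 || lng > 180) {
    throw new RangeError(`Invalid longitude: ${lng}`);
  }
  if (!Number.isFinite(lat) || lat < -90 || lat > 90) {
    throw new RangeError(`Invalid latitude: ${lat}`);
  }
  // GeoJSON order is [longitude, latitude].
  return { type: 'Point', coordinates: [lng, lat] };
}

// Usage with example coordinates.
const point = makePoint(-110.8571443, 35.4586858);
console.log(JSON.stringify(point));
```

Swapping the two arguments for these example coordinates would fail the latitude check, which is one way to catch the common lat/lng mix-up early.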
Find Query
// These variables are populated based on URL Query Parameters.
const match = {};
const sort = {};
// Query to make.
const query = {
  location: {
    $near: {
      $maxDistance: parseInt(req.query.maxDistance, 10),
      $geometry: {
        type: 'Point',
        // parseFloat keeps fractional degrees; parseInt would truncate them.
        coordinates: [parseFloat(req.query.lng), parseFloat(req.query.lat)]
      }
    }
  },
  ...match
};
// Pagination and Sorting
const options = {
  limit: parseInt(req.query.limit, 10),
  skip: parseInt(req.query.skip, 10),
  sort
};
const items = await Item.find(query, undefined, options).lean().exec();
res.send(items);
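The query above reads every value straight from the URL, so the way the strings are converted to numbers affects the result. The following is a minimal sketch of that conversion step, assuming all five parameters are required; `parseGeoParams` is a hypothetical helper, not part of the original code.

```javascript
// Hypothetical helper: converts raw URL query strings into numbers.
// Assumption: all five parameters are required.
function parseGeoParams(raw) {
  const params = {
    lng: Number.parseFloat(raw.lng),
    lat: Number.parseFloat(raw.lat),
    maxDistance: Number.parseFloat(raw.maxDistance), // meters for GeoJSON points
    limit: Number.parseInt(raw.limit, 10),
    skip: Number.parseInt(raw.skip, 10)
  };
  for (const [name, value] of Object.entries(params)) {
    if (Number.isNaN(value)) {
      throw new TypeError(`Query parameter "${name}" is not a number`);
    }
  }
  return params;
}

const parsed = parseGeoParams({
  lng: '-110.8571443',
  lat: '35.4586858',
  maxDistance: '5000',
  limit: '20',
  skip: '0'
});
console.log(parsed.lng); // -110.8571443
console.log(Number.parseInt('-110.8571443', 10)); // -110
```

The last line shows why `parseInt` is a poor fit for coordinates: it drops the fractional part, which would move the search point by up to a full degree.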
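Aggregation Pipeline
Suppose the distance needs to be calculated:
// These variables are populated based on URL Query Parameters.
const query = {};
const sort = {};
const geoSpatialQuery = {
  $geoNear: {
    near: {
      type: 'Point',
      // parseFloat keeps fractional degrees; parseInt would truncate them.
      coordinates: [parseFloat(req.query.lng), parseFloat(req.query.lat)]
    },
    distanceField: "distance",
    maxDistance: parseInt(req.query.maxDistance, 10),
    query,
    spherical: true
  }
};
const items = await Item.aggregate([
  // $geoNear must be the first stage in the pipeline.
  geoSpatialQuery,
  // Sort before paginating, and skip before limiting, so that each page
  // is a consistent slice of the full sorted result.
  { $sort: { distance: -1, ...sort } },
  { $skip: parseInt(req.query.skip, 10) },
  { $limit: parseInt(req.query.limit, 10) }
]).exec();
res.send(items);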
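Stage order matters in a paginated pipeline: `$skip` has to run before `$limit`, otherwise the skip is applied to an already truncated result. A minimal sketch of a pipeline builder that keeps that order is shown here. `buildGeoNearPipeline` is a hypothetical helper that takes already-parsed numbers, and this variant relies on `$geoNear` returning documents nearest-first instead of adding an explicit `$sort`.

```javascript
// Hypothetical builder (not in the original code); expects numeric inputs.
function buildGeoNearPipeline({ lng, lat, maxDistance, skip, limit }, filter = {}) {
  return [
    {
      // $geoNear must be the first stage of the pipeline.
      $geoNear: {
        near: { type: 'Point', coordinates: [lng, lat] },
        distanceField: 'distance',
        maxDistance,
        query: filter,
        spherical: true
      }
    },
    // $geoNear outputs documents ordered nearest to farthest,
    // so no $sort stage is added in this variant.
    { $skip: skip },
    { $limit: limit }
  ];
}

const pipeline = buildGeoNearPipeline({
  lng: -110.8571443,
  lat: 35.4586858,
  maxDistance: 5000,
  skip: 20,
  limit: 10
});
console.log(pipeline.map((stage) => Object.keys(stage)[0]).join(' -> '));
```

The resulting array could then be passed to `Item.aggregate(pipeline)` in place of the inline stages.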
Edit - Example Document
Below is an example document containing all of the properties from the Item collection:
{
  "_id":"5cd08927c19d1dd118d39a2b",
  "imagePaths":{
    "standard":{
      "images":[
        "users/zbYmcwsGhcU3LwROLWa4eC0RRgG3/5cd08927c19d1dd118d39a2b/images/Image-aafe69c7-f93e-411e-b75d-319042068921-standard.jpg",
        "users/zbYmcwsGhcU3LwROLWa4eC0RRgG3/5cd08927c19d1dd118d39a2b/images/Image-397c95c6-fb10-4005-b511-692f991341fb-standard.jpg",
        "users/zbYmcwsGhcU3LwROLWa4eC0RRgG3/5cd08927c19d1dd118d39a2b/images/Image-e54db72e-7613-433d-8d9b-8d2347440204-standard.jpg",
        "users/zbYmcwsGhcU3LwROLWa4eC0RRgG3/5cd08927c19d1dd118d39a2b/images/Image-c767f54f-7d1e-4737-b0e7-c02ee5d8f1cf-standard.jpg"
      ],
      "profile":"users/zbYmcwsGhcU3LwROLWa4eC0RRgG3/5cd08927c19d1dd118d39a2b/images/Image-51318c32-38dc-44ac-aac3-c8cc46698cfa-standard-profile.jpg"
    },
    "thumbnail":"users/zbYmcwsGhcU3LwROLWa4eC0RRgG3/5cd08927c19d1dd118d39a2b/images/Image-51318c32-38dc-44ac-aac3-c8cc46698cfa-thumbnail.jpg",
    "medium":"users/zbYmcwsGhcU3LwROLWa4eC0RRgG3/5cd08927c19d1dd118d39a2b/images/Image-51318c32-38dc-44ac-aac3-c8cc46698cfa-medium.jpg"
  },
  "location":{
    "type":"Point",
    "coordinates":[
      -110.8571443,
      35.4586858
    ]
  },
  "numberOfViews":0,
  "numberOfLikes":0,
  "monetarySellingAmount":9000,
  "exchangeCategories":[
    "Math"
  ],
  "itemCategories":[
    "Sports"
  ],
  "title":"My title",
  "itemDescription":"A description",
  "exchangeRadius":10,
  "owner":"zbYmcwsGhcU3LwROLWa4eC0RRgG3",
  "reports":[],
  "createdAt":"2019-05-06T19:21:13.217Z",
  "updatedAt":"2019-05-06T19:21:13.217Z",
  "__v":0
}
Questions
Based on the above, I have a few questions.
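1. Is there a performance difference between my plain Mongoose find() implementation and the Aggregation Pipeline?
2. When using a 2dsphere index with GeoJSON, is it correct to say that near and geoNear are very similar to nearSphere, except that geoNear provides extra data and a default limit? That is, despite the different units, both kinds of query will conceptually return the relevant data within a given radius of a location, even though the field is called radius for nearSphere and maxDistance for near/geoNear.
3. With my example above, how can the performance penalty of using skip be mitigated while still supporting pagination in both the find() query and the aggregation?
4. The find() function accepts an optional parameter that determines which fields are returned. The Aggregation Pipeline uses a $project stage to do the same. Is there a specific position in the pipeline where $project should be placed to optimize speed/efficiency, or does it not matter?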
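For reference, the query shapes being compared in the second question can be sketched as follows. The function names are hypothetical. With a GeoJSON point, both `$near` and `$nearSphere` take `$maxDistance` in meters; the operator that takes a radius is `$centerSphere` (used inside `$geoWithin`), whose radius is expressed in radians and which does not sort by distance. The Earth radius constant is an approximation.

```javascript
// Approximate equatorial radius of the Earth, used to convert meters to radians.
const EARTH_RADIUS_METERS = 6378100;

// $near with a GeoJSON point: distance limit in meters, results sorted by distance.
function nearQuery(lng, lat, meters) {
  return {
    location: {
      $near: {
        $geometry: { type: 'Point', coordinates: [lng, lat] },
        $maxDistance: meters
      }
    }
  };
}

// $nearSphere with a GeoJSON point: same shape, same unit (meters).
function nearSphereQuery(lng, lat, meters) {
  return {
    location: {
      $nearSphere: {
        $geometry: { type: 'Point', coordinates: [lng, lat] },
        $maxDistance: meters
      }
    }
  };
}

// $geoWithin + $centerSphere: radius in radians, results are not sorted.
function centerSphereQuery(lng, lat, meters) {
  return {
    location: {
      $geoWithin: {
        $centerSphere: [[lng, lat], meters / EARTH_RADIUS_METERS]
      }
    }
  };
}

console.log(JSON.stringify(nearQuery(-110.8571443, 35.4586858, 5000)));
console.log(JSON.stringify(centerSphereQuery(-110.8571443, 35.4586858, 5000)));
```

Any of these objects could be passed as the filter to `Item.find(...)`; which one is preferable is exactly what the question asks.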
I hope this style of question is permitted under the Stack Overflow rules. Thank you.
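【Comments】:
- Can you share a sample document from your collection? An aggregation pipeline is a data flow in which documents are filtered as they pass through the stages, so the answer depends on your requirements.
- @SheshanGamage Thanks for your comment. Please see the update/edit above for the contents of a sample document.
Tags: node.js mongodb mongoose mongodb-query aggregation-framework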