Mongoose MongoDB Geospatial Query: Answers

【Question Title】: Mongoose MongoDB GeoSpatial Query
【Posted】: 2019-09-23 14:01:02
【Question Description】:

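I have an Item collection that can hold anywhere from a few thousand to hundreds of thousands of documents. I want to run geospatial queries against that collection. With Mongoose there are two options: find() and the aggregation pipeline. My implementation of each is shown below: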

Mongoose Model

First, here are the relevant properties of my Mongoose model:

// Define the schema
const itemSchema = new mongoose.Schema({
    // Firebase UID (in addition to the Mongo ObjectID)
    owner: {
        type: String,
        required: true,
        ref: 'User'
    },
    // ... Some more fields
    numberOfViews: {
        type: Number,
        required: true,
        default: 0
    },
    numberOfLikes: {
        type: Number,
        required: true, 
        default: 0
    },
    location: {
        type: {
            type: 'String',
            default: 'Point',
            required: true
        },
        coordinates: {
            type: [Number],
            required: true,
        },
    }
}, {
    timestamps: true
});

// 2dsphere index
itemSchema.index({ "location": "2dsphere" });

// Create the model
const Item = mongoose.model('Item', itemSchema);

Find Query

// These variables are populated based on URL Query Parameters.
const match = {};
const sort = {};

// Query to make.
const query = {
    location: {
        $near: {
            $maxDistance: parseInt(req.query.maxDistance),
            $geometry: {
                type: 'Point',
                // parseFloat keeps the fractional part of the coordinates
                coordinates: [parseFloat(req.query.lng), parseFloat(req.query.lat)]
            }
        }
    },
    ...match
};

// Pagination and Sorting
const options = {
    limit: parseInt(req.query.limit),
    skip: parseInt(req.query.skip),
    sort
};

const items = await Item.find(query, undefined, options).lean().exec();

res.send(items);

Aggregation Pipeline

Suppose the distance needs to be calculated:

// These variables are populated based on URL Query Parameters.
const query = {};
const sort = {};

const geoSpatialQuery = {
    $geoNear: {
        near: { 
            type: 'Point', 
            // parseFloat keeps the fractional part of the coordinates
            coordinates: [parseFloat(req.query.lng), parseFloat(req.query.lat)]
        },
        distanceField: "distance",
        maxDistance: parseInt(req.query.maxDistance),
        query,
        spherical: true
    }
};

const items = await Item.aggregate([
    geoSpatialQuery,
    // Sort first, then skip, then limit, so that each page has up to `limit` items
    { $sort: { distance: -1, ...sort } },
    { $skip: parseInt(req.query.skip, 10) },
    { $limit: parseInt(req.query.limit, 10) }
]).exec();

res.send(items);
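
Both snippets read numeric values straight from req.query. The following is a minimal sketch of parsing and validating those values before they reach the database. The helper name parseGeoParams, the default page size of 20, and the cap of 100 are assumptions made for illustration; only the parameter names come from the code above.

```javascript
// Hypothetical helper: parses and validates geo-related URL query parameters.
// The parameter names (lng, lat, maxDistance, limit, skip) mirror the question;
// the default values and the page-size cap are assumptions made for this sketch.
function parseGeoParams(query) {
    // parseFloat keeps the fractional part of a coordinate; parseInt would drop it.
    const lng = Number.parseFloat(query.lng);
    const lat = Number.parseFloat(query.lat);
    if (!Number.isFinite(lng) || lng < -180 || lng > 180) {
        throw new RangeError('lng must be a number between -180 and 180');
    }
    if (!Number.isFinite(lat) || lat < -90 || lat > 90) {
        throw new RangeError('lat must be a number between -90 and 90');
    }

    const maxDistance = Number.parseFloat(query.maxDistance);
    if (!Number.isFinite(maxDistance) || maxDistance < 0) {
        throw new RangeError('maxDistance must be a non-negative number of meters');
    }

    const limit = Number.parseInt(query.limit, 10);
    const skip = Number.parseInt(query.skip, 10);

    return {
        // GeoJSON order is [longitude, latitude].
        near: { type: 'Point', coordinates: [lng, lat] },
        maxDistance,
        // Fall back to assumed defaults when limit or skip is missing or invalid.
        limit: Number.isInteger(limit) && limit > 0 ? Math.min(limit, 100) : 20,
        skip: Number.isInteger(skip) && skip >= 0 ? skip : 0
    };
}
```

The returned near and maxDistance values could then be passed to either the $near query or the $geoNear stage shown above.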

Edit - Example Document

Below is an example document containing all of the properties from the Item collection:

{
   "_id":"5cd08927c19d1dd118d39a2b",
   "imagePaths":{
      "standard":{
         "images":[
            "users/zbYmcwsGhcU3LwROLWa4eC0RRgG3/5cd08927c19d1dd118d39a2b/images/Image-aafe69c7-f93e-411e-b75d-319042068921-standard.jpg",
            "users/zbYmcwsGhcU3LwROLWa4eC0RRgG3/5cd08927c19d1dd118d39a2b/images/Image-397c95c6-fb10-4005-b511-692f991341fb-standard.jpg",
            "users/zbYmcwsGhcU3LwROLWa4eC0RRgG3/5cd08927c19d1dd118d39a2b/images/Image-e54db72e-7613-433d-8d9b-8d2347440204-standard.jpg",
            "users/zbYmcwsGhcU3LwROLWa4eC0RRgG3/5cd08927c19d1dd118d39a2b/images/Image-c767f54f-7d1e-4737-b0e7-c02ee5d8f1cf-standard.jpg"
         ],
         "profile":"users/zbYmcwsGhcU3LwROLWa4eC0RRgG3/5cd08927c19d1dd118d39a2b/images/Image-51318c32-38dc-44ac-aac3-c8cc46698cfa-standard-profile.jpg"
      },
      "thumbnail":"users/zbYmcwsGhcU3LwROLWa4eC0RRgG3/5cd08927c19d1dd118d39a2b/images/Image-51318c32-38dc-44ac-aac3-c8cc46698cfa-thumbnail.jpg",
      "medium":"users/zbYmcwsGhcU3LwROLWa4eC0RRgG3/5cd08927c19d1dd118d39a2b/images/Image-51318c32-38dc-44ac-aac3-c8cc46698cfa-medium.jpg"
   },
   "location":{
      "type":"Point",
      "coordinates":[
         -110.8571443,
         35.4586858
      ]
   },
   "numberOfViews":0,
   "numberOfLikes":0,
   "monetarySellingAmount":9000,
   "exchangeCategories":[
      "Math"
   ],
   "itemCategories":[
      "Sports"
   ],
   "title":"My title",
   "itemDescription":"A description",
   "exchangeRadius":10,
   "owner":"zbYmcwsGhcU3LwROLWa4eC0RRgG3",
   "reports":[],
   "createdAt":"2019-05-06T19:21:13.217Z",
   "updatedAt":"2019-05-06T19:21:13.217Z",
   "__v":0
}

Questions

Based on the above, I have a few questions.

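1. Is there any performance difference between my plain Mongoose find() implementation and the aggregation pipeline?
2. When using 2dsphere indexes with GeoJSON, is it correct to say that near and geoNear are very similar to nearSphere, except that geoNear provides additional data and a default limit? That is, despite the different units, both queries conceptually return the relevant data within a given radius of a location, even though that field is called radius for nearSphere and maxDistance for near/geoNear.
3. Taking my example above, how can the performance penalty of using skip be mitigated while still supporting pagination in both the query and the aggregation?
4. The find() function accepts an optional parameter that determines which fields are returned. The aggregation pipeline uses a $project stage to do the same. Is there a particular position in the pipeline where $project should be placed to optimize speed/efficiency, or does it not matter?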

I hope this style of question is allowed under Stack Overflow's rules. Thank you.

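【Comments】:

Can you share a sample document from your collection? The aggregation pipeline is a data flow: the data is filtered as it passes through each stage, so the answer depends on your requirements.
@SheshanGamage Thank you for your comment. Please see the update/edit for the contents of a sample document.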

Tags: node.js mongodb mongoose mongodb-query aggregation-framework

【Solution 1】:

I tried the following query with a 2dsphere index, using the aggregation pipeline.

db.items.createIndex({location:"2dsphere"})

The aggregation pipeline gives you more flexibility over the result set. In addition, it can improve the performance of geo-related searches.

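db.items.aggregate([
    {
        $geoNear: {
            near: { type: "Point", coordinates: [ -110.8571443, 35.4586858 ] },
            // Name of the indexed field to search
            key: "location",
            // Output field that receives the computed distance (meters for a GeoJSON point)
            distanceField: "dist.calculated",
            // Only return documents at least 2 meters away
            minDistance: 2,
            query: { "itemDescription": "A description" }
        }
    }
])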

Regarding your question about $skip, the following question gives more insight into how $skip operates: "$skip and $limit in aggregation framework".
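
One commonly suggested alternative to $skip, when results are read nearest-first, is to resume from the distance of the last document already returned instead of skipping over earlier pages. The sketch below is one possible way to build such a pipeline, not a method taken from the linked question. The helper name buildNextPagePipeline and its parameters (lastDistance, seenIds, pageSize) are hypothetical; the $geoNear options minDistance, maxDistance, query, and distanceField are standard.

```javascript
// Hypothetical helper: builds a $geoNear pipeline for the page that follows
// the last document the client has already received.
// Assumption: the client sends back the distance of that last document
// (lastDistance) and the _id values already returned at that distance (seenIds).
function buildNextPagePipeline({
    near,
    maxDistance,
    filter = {},
    pageSize = 20,
    lastDistance = 0,
    seenIds = []
}) {
    return [
        {
            $geoNear: {
                near,
                distanceField: 'distance',
                // Resume at the distance of the last document already returned.
                minDistance: lastDistance,
                maxDistance,
                // Exclude documents already returned at that boundary distance.
                // Mongoose does not cast aggregation stages, so seenIds must
                // already have the same type as _id (for example ObjectId).
                query: { ...filter, _id: { $nin: seenIds } },
                spherical: true
            }
        },
        // $geoNear outputs nearest-first, so no $skip stage is needed.
        { $limit: pageSize }
    ];
}
```

A call such as Item.aggregate(buildNextPagePipeline({ near, maxDistance, lastDistance, seenIds })) would then fetch the next page. This approach only works while the distance order is preserved, so it does not combine with a later $sort on another field.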

You can use $project wherever you need it. In our case, we did not run into significant performance problems using $project on more than 10 million documents.
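
As an illustration of one possible placement, the sketch below appends $project after the paging stages, so that only the documents of the final page are reshaped. $geoNear has to remain the first stage of the pipeline in any case. The helper name withProjection is hypothetical, the field names are taken from the sample document in the question, and whether this placement makes a measurable difference for a given dataset is an assumption to verify with explain().

```javascript
// Hypothetical helper: appends a $project stage to the end of a pipeline so
// that only the documents of the final page are reshaped.
function withProjection(pipeline, fields) {
    const projection = {};
    for (const field of fields) {
        projection[field] = 1; // 1 means "include this field"
    }
    // Return a new array; the input pipeline is left unmodified.
    return [...pipeline, { $project: projection }];
}

// Example pipeline; the coordinates and field names come from the sample
// document in the question, while maxDistance and the page size are made up.
const basePipeline = [
    {
        $geoNear: {
            near: { type: 'Point', coordinates: [-110.8571443, 35.4586858] },
            distanceField: 'distance',
            maxDistance: 5000,
            spherical: true
        }
    },
    { $skip: 0 },
    { $limit: 20 }
];

const projected = withProjection(basePipeline, ['title', 'location', 'distance']);
```

The resulting array could be passed to Item.aggregate(projected) in place of the original pipeline.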

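【Comments】:

Thank you. What is key, and based on the answer you linked, is it better to use sort as a stage before limit?
key is the name of your field, here "location": { "type": "Point", "coordinates": [ -110.8571443, 35.4586858 ] }. Yes, it is always good to put limit at the end of the pipeline.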
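
The point about stage order can be illustrated without a database. The sketch below uses minimal in-memory stand-ins for $sort, $skip, and $limit (they are not MongoDB's implementation, and runPipeline is a hypothetical name) to show that the same three stages return different pages depending on their order.

```javascript
// Minimal in-memory stand-ins for three pipeline stages; illustrative only.
const stageHandlers = {
    // Sorts by each key in the spec; 1 is ascending, -1 is descending.
    $sort: (docs, spec) => [...docs].sort((a, b) => {
        for (const [field, direction] of Object.entries(spec)) {
            if (a[field] !== b[field]) {
                return (a[field] < b[field] ? -1 : 1) * direction;
            }
        }
        return 0;
    }),
    $skip: (docs, count) => docs.slice(count),
    $limit: (docs, count) => docs.slice(0, count)
};

// Applies the stages one after another, as a pipeline would.
function runPipeline(docs, pipeline) {
    return pipeline.reduce((current, stage) => {
        const [name, argument] = Object.entries(stage)[0];
        return stageHandlers[name](current, argument);
    }, docs);
}

const docs = [3, 1, 4, 2, 10, 9, 5, 6, 8, 7].map((distance) => ({ distance }));

// $limit first: only the first 4 input documents survive, then 2 are skipped.
const limitFirst = runPipeline(docs, [
    { $limit: 4 },
    { $skip: 2 },
    { $sort: { distance: -1 } }
]);
console.log(limitFirst.map((d) => d.distance)); // [ 4, 2 ]

// $sort, then $skip, then $limit: a full page of 4 from the sorted result.
const limitLast = runPipeline(docs, [
    { $sort: { distance: -1 } },
    { $skip: 2 },
    { $limit: 4 }
]);
console.log(limitLast.map((d) => d.distance)); // [ 8, 7, 6, 5 ]
```

With $limit placed first, the page is cut from unsorted input and then shrunk by $skip. With $limit placed last, each page holds the expected number of documents in sorted order.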