How to make a double $lookup aggregation with nested data on MongoDB?

【Question title】: How to make double $lookup aggregation with nested data on MongoDB?
【Posted】: 2022-01-02 09:29:45
【Question】:

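I have 3 models:

Study
WordSet
Category

The Study model references WordSet, and WordSet in turn references Category.

I know that to display the data normally I would use populate. But in this case I need a single query with several $lookup stages.

How can I "populate" Category from WordSet and show only the category that is repeated most often?

I would like to get a response like this: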

"stats": [
    {
        "_id": null,
        "numberOfStudies": 4,
        "averageStudyTime": 82.5,
        "allStudyTime": 330,
        "longestStudy": 120,
        "allLearnedWords": 8,
        "hardestCategory": "Work" // only this field is missing
    }
]

I tried it like this:

const stats = await Study.aggregate([
  {
    // join User table 
    $lookup: {
      from: 'User',
      let: { userId: '$user' },
      pipeline: [
        {
          $match: { $expr: { $eq: ['$_id', '$$userId'] } },
        },
      ],
      as: 'currentUser',
    },
  },
  {
    // join WordSet table
    $lookup: {
      from: 'WordSet',
      let: { wordSetId: '$learnedWordSet' },
      pipeline: [
        {
          $match: { $expr: { $eq: ['$_id', '$$wordSetId'] } },
        },
        {
          // from this point on I'm not sure how to make it work
          $lookup: {
            from: 'Category',
            let: { categoryId: '$category' },
            pipeline: [
              {
                $match: { $expr: { $in: ['$_id', '$$categoryId'] } },
              },
            ],
            as: 'category',
          },
        },
      ],
      as: 'wordSet',
    },
  },
  { // add wordset with category? this is not working
    $addFields: {
      wordSet: {
        $arrayElemAt: ['$wordSet', 0],
      },
    },
  },
  { // search by logged user
    $match: { user: new ObjectID(currentUserId) },
  },
  { 
    $group: {
      // display statistics about user's studying
      _id: null,
      numberOfStudies: { $sum: 1 },
      averageStudyTime: { $avg: '$studyTime' },
      allStudyTime: { $sum: '$studyTime' },
      longestStudy: { $max: '$studyTime' },
      allLearnedWords: { $sum: { $size: '$learnedWords' } },
      // category: check which category is repeated the most and display it
    },
  },
]);

Study

const studySchema = new mongoose.Schema({
  name: {
    type: String,
  },
  studyTime: {
    type: Number,
  },
  learnedWords: [String],
  notLearnedWords: [String],
  learnedWordSet: {
    type: mongoose.Schema.Types.ObjectId,
    ref: 'WordSet',
  },
  user: {
    type: mongoose.Schema.Types.ObjectId,
    ref: 'User',
  },
});

WordSet

const wordSetSchema = new mongoose.Schema({
  name: {
    type: String,
  },
  category: {
    type: [
      {
        type: mongoose.Schema.Types.ObjectId,
        ref: 'Category',
        required: true,
      },
    ],
  },
});

Category

const categorySchema = new mongoose.Schema({
  name: {
    type: String,
  },
});

【Question comments】:

Tags: node.js mongodb mongoose aggregation-framework mongoose-populate

【Solution 1】:

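I'm not sure I understood you correctly, but you can try the query below. I have improved the use of the stages:

$match: always try to put this stage first, so that the following stages only process the matching documents

$lookup with the User collection: there is no need for the pipeline version, you can use the localField and foreignField properties

I don't think the User document and this lookup stage serve any purpose, because you only need the statistics from the final $group stage. So you can skip this lookup stage.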

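In the WordSet lookup:
- $match on your condition
- $project to keep only the required fields
- $unwind to deconstruct the category array
- $group by category and get the total count
- $sort by count in descending order
- $limit to keep only the first element, i.e. the most used one
- $lookup with the Category collection
- $project to keep only the required fields and get the first category name
In the $group stage, hardestCategory takes the $first category name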
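As an illustration only, the counting steps above ($unwind, $group, $sort, $limit) can be mimicked in plain JavaScript on an in-memory array. This is a minimal sketch, not MongoDB code: the sample data and the function name `mostFrequentCategory` are made up for the example, and categories are plain strings here rather than ObjectId references.

```javascript
// Hypothetical sample data: word sets whose `category` arrays hold category
// names (in the real schema these would be ObjectId references).
const wordSets = [
  { name: "Office", category: ["Work", "Travel"] },
  { name: "Meetings", category: ["Work"] },
  { name: "Airport", category: ["Travel", "Work"] },
];

function mostFrequentCategory(sets) {
  // $unwind: one entry per (word set, category) pair
  const unwound = sets.flatMap((set) => set.category);

  // $group by category with count: { $sum: 1 }
  const counts = new Map();
  for (const category of unwound) {
    counts.set(category, (counts.get(category) || 0) + 1);
  }

  // $sort by count descending, then $limit 1
  const sorted = [...counts.entries()].sort((a, b) => b[1] - a[1]);
  if (sorted.length === 0) return null;
  const [category, count] = sorted[0];
  return { category, count };
}

const result = mostFrequentCategory(wordSets);
console.log(result); // { category: 'Work', count: 3 }
```

If two categories have the same count, which one comes first after the sort is not something this sketch (or a bare $sort on count) decides in a meaningful way.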

const stats = await Study.aggregate([
  { $match: { user: new ObjectID(currentUserId) } },
  {
    $lookup: {
      from: "User",
      localField: "user",
      foreignField: "_id",
      as: "currentUser"
    }
  },
  {
    $lookup: {
      from: "WordSet",
      let: { wordSetId: "$learnedWordSet" },
      pipeline: [
        { $match: { $expr: { $eq: ["$_id", "$$wordSetId"] } } },
        {
          $project: {
            _id: 0,
            category: 1
          }
        },
        { $unwind: "$category" },
        {
          $group: {
            _id: "$category",
            count: { $sum: 1 }
          }
        },
        { $sort: { count: -1 } },
        { $limit: 1 },
        {
          $lookup: {
            from: "Category",
            localField: "_id",
            foreignField: "_id",
            as: "category"
          }
        },
        {
          $project: {
            _id: 0,
            category: { $arrayElemAt: ["$category.name", 0] }
          }
        }
      ],
      as: "wordSet"
    }
  },
  {
    $group: {
      _id: null,
      numberOfStudies: { $sum: 1 },
      averageStudyTime: { $avg: "$studyTime" },
      allStudyTime: { $sum: "$studyTime" },
      longestStudy: { $max: "$studyTime" },
      allLearnedWords: {
        $sum: { $size: "$learnedWords" }
      },
      hardestCategory: {
        $first: {
          $first: "$wordSet.category"
        }
      }
    }
  }
])

Playground
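One thing to keep in mind: the sub-pipeline inside $lookup runs once per Study document, so the count there is taken within a single word set, and the outer $first picks the value from the first study in the group. If the goal is the category that occurs most often across all of the user's studies, a separate aggregation along the following lines might work. This is only a sketch and not part of the original answer: the helper name `buildHardestCategoryPipeline` is hypothetical, and the collection names "wordsets" and "categories" are assumptions based on the comments below.

```javascript
// Hypothetical helper (not from the original answer): builds a pipeline that
// counts category occurrences across ALL studies of one user.
// Assumes the collections are really named "wordsets" and "categories".
function buildHardestCategoryPipeline(userId) {
  return [
    // keep only the current user's studies
    { $match: { user: userId } },
    // join the word set of each study
    {
      $lookup: {
        from: "wordsets",
        localField: "learnedWordSet",
        foreignField: "_id",
        as: "wordSet",
      },
    },
    { $unwind: "$wordSet" },
    // one document per (study, category) pair
    { $unwind: "$wordSet.category" },
    // count how often each category id occurs
    { $group: { _id: "$wordSet.category", count: { $sum: 1 } } },
    { $sort: { count: -1 } },
    { $limit: 1 },
    // resolve the category id to its document
    {
      $lookup: {
        from: "categories",
        localField: "_id",
        foreignField: "_id",
        as: "category",
      },
    },
    {
      $project: {
        _id: 0,
        count: 1,
        hardestCategory: { $arrayElemAt: ["$category.name", 0] },
      },
    },
  ];
}

// Usage inside an async function, with Study and ObjectID as in the answer:
// const [hardest] = await Study.aggregate(
//   buildHardestCategoryPipeline(new ObjectID(currentUserId))
// );

// Print the stage order so the sketch can be checked without a database.
const stageNames = buildHardestCategoryPipeline("placeholder-user-id").map(
  (stage) => Object.keys(stage)[0]
);
console.log(stageNames.join(" -> "));
```

The result would then have to be merged with the statistics from the main query in application code.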

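【Comments】:

Wow, thanks for the explanation. One piece is missing, though: hardestCategory returns null in my response. Do you know why?
Just make sure of the Category collection name in your database.
In my database the collections are named differently than in node/mongoose. In the db I have: users, studies, categories, wordsets. But when querying and aggregating I was using, for example, "User" instead of "users".
So yes, in the lookup's from property you need to use the exact names of your collections (users, categories, ...).
Wow! You are right. I thought I had to use the model name with a capital first letter. Thank you for the help. Much appreciated.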
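A note on the fix above: by default Mongoose derives the collection name from the model name by lowercasing and pluralizing it (model "User" becomes collection "users"), and the resolved name is available as `Model.collection.name`. One way to avoid hard-coding the name is to read it from the model when building the stage. The sketch below is an assumption-laden illustration: `lookupById` is a hypothetical helper, and `FakeCategory` is a stand-in object used only so the snippet runs without a database; in a real project a Mongoose model would be passed instead.

```javascript
// Hypothetical helper: builds a simple $lookup stage whose `from` value is
// read from the model, so it always matches the actual collection name.
function lookupById(model, localField, as) {
  return {
    $lookup: {
      from: model.collection.name,
      localField,
      foreignField: "_id",
      as,
    },
  };
}

// Stand-in for a Mongoose model (assumption for this sketch only).
// With Mongoose, mongoose.model("Category", categorySchema) would expose
// the same `collection.name` property.
const FakeCategory = { collection: { name: "categories" } };

const categoryLookup = lookupById(FakeCategory, "_id", "category");
console.log(categoryLookup.$lookup.from); // categories
```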