【问题标题】:Mongo aggregate several documents into oneMongo 将多个文档聚合为一个
【发布时间】:2020-09-27 03:00:31
【问题描述】:

我正在尝试通过 mongodb.collection.aggregate() 命令合并一些复杂的文档。

假设我要合并集合的文档的 x(在以下示例中:x=2):

[
  {
    "_id": 1,
    "Data": {
      "children": {
        "1": {
          "name": "appear_only_in_first_doc",
          "cost": 1,
          "revenue": 4.5,
          "grandchildren": {
            "1t9dsqdqdvoj8pdppxjk": {
              "cost": 0,
              "revenue": 1.5
            }
          }
        },
        "2": {
          "name": "appear_in_both_docs",
          "cost": 2,
          "revenue": 7,
          "grandchildren": {
            "jesrdt5qwef2222dgt": {
              "cost": 1,
              "revenue": 3
            },
            "klh352hk5367kf": {
              "cost": 2,
              "revenue": 7
            }
          }
        }
      }
    }
  },
  {
    "_id": 2,
    "Data": {
      "children": {
        "2": {
          "name": "appear_in_both_docs___but_diff_name",
          "cost": 9,
          "revenue": 7,
          "grandchildren": {
            "aaaaaaaaa": {
              "cost": 3,
              "revenue": 2
            },
            "jesrdt5qwef2222dgt": {
              "cost": 6,
              "revenue": 5
            }
          }
        },
        "3": {
          "name": "appear_only_in_last_doc",
          "cost": 4,
          "revenue": 2,
          "grandchildren": {
            "cccccccccccc": {
              "cost": 4,
              "revenue": 2
            }
          }
        }
      }
    }
  }
]

挑战:

  1. “children”和“grandchildren”键下的键是动态的,在编写查询时是未知的。
  2. 如果子或孙子只出现在一个文档中(例如“1”、“3”、“1t9dsqdqdvoj8pdppxjk”、“klh352hk5367kf”、“aaaaaaaa”和“cccccccccccc”) - 它也应该出现在最终结果中。
  3. 如果一个孩子出现在多个文档中(例如“2”和“jesrdt5qwef2222dgt”),它应该在最终结果中显示为一个。字段“成本”和“收入”应相加,最后一个“名称”字段应取。

我见过以下解决方案:

  1. unionWith - 不相关,联合 2 个不同的集合。
  2. merge - 不相关,不能对多次出现的字段值求和(取而代之的是最后一次)。
  3. mergeObjects - 不相关,不能对多次出现的字段值求和(取而代之的是最后一次)。

最终的结果应该是这样的:

{
  "Data": {
    "children": {
      "1": {
        "name": "appear_only_in_first_doc",
        "cost": 1,
        "revenue": 4.5,
        "grandchildren": {
          "1t9dsqdqdvoj8pdppxjk": {
            "cost": 0,
            "revenue": 1.5
          }
        }
      },
      "2": {
        "name": "appear_in_both_docs___but_diff_name",
        "cost": 11,
        "revenue": 14,
        "grandchildren": {
          "aaaaaaaaa": {
            "cost": 3,
            "revenue": 2
          },
          "jesrdt5qwef2222dgt": {
            "cost": 7,
            "revenue": 8
          },
          "klh352hk5367kf": {
            "cost": 2,
            "revenue": 7
          }
        }
      },
      "3": {
        "name": "appear_only_in_last_doc",
        "cost": 4,
        "revenue": 2,
        "grandchildren": {
          "cccccccccccc": {
            "cost": 4,
            "revenue": 2
          }
        }
      }
    }
  }
}

【问题讨论】:

    标签: javascript mongodb aggregate


    【解决方案1】:

    这个过程有点长,可能会有一些简单的,我只是分享过程,

    • $projectchildren 转换为数组格式(k,v)
    • $unwind解构children数组
    • $group 按儿童键,对costrevenue 求和,最后使用$last 得到name
    • $unwind解构grandchildren数组
    • $addFieldsgrandchildren 转换为数组格式(k,v)
    • $unwind解构grandchildren数组
    db.collection.aggregate([
      { $project: { "Data.children": { $objectToArray: "$Data.children" } } },
      { $unwind: "$Data.children" },
      {
        $group: {
          _id: "$Data.children.k",
          name: { $last: "$Data.children.v.name" },
          cost: { $sum: "$Data.children.v.cost" },
          revenue: { $sum: "$Data.children.v.revenue" },
          grandchildren: { $push: "$Data.children.v.grandchildren" }
        }
      },
      { $unwind: "$grandchildren" },
      { $addFields: { grandchildren: { $objectToArray: "$grandchildren" } } },
      { $unwind: "$grandchildren" },
    
    • $group by children key 和 grandchildren key,计算孙子成本收入之和
      {
        $group: {
          _id: {
            ck: "$_id",
            gck: "$grandchildren.k"
          },
          cost: { $first: "$cost" },
          revenue: { $first: "$revenue" },
          name: { $first: "$name" },
          grandchildren_cost: { $sum: "$grandchildren.v.cost" },
          grandchildren_revenue: { $sum: "$grandchildren.v.revenue" }
        }
      },
    
    • $group by children 键并重新构造 grandchildren 数组
      {
        $group: {
          _id: "$_id.ck",
          cost: { $first: "$cost" },
          revenue: { $first: "$revenue" },
          name: { $last: "$name" },
          grandchildren: {
            $push: {
              k: "$_id.gck",
              v: {
                cost: "$grandchildren_cost",
                revenue: "$grandchildren_revenue"
              }
            }
          }
        }
      },
    
    • $group by null 并重新构造子数组并使用 $arrayToObjectgrandchildren 转换为 (k,v) 数组中的对象
      {
        $group: {
          _id: null,
          children: {
            $push: {
              k: "$_id",
              v: {
                name: "$name",
                cost: "$cost",
                revenue: "$revenue",
                grandchildren: { $arrayToObject: "$grandchildren" }
              }
            }
          }
        }
      },
    
    • $project 使用 $arrayToObjectchildren 转换为对象
      {
        $project: {
          _id: 0,
          "Data.children": { $arrayToObject: "$children" }
        }
      }
    ])
    

    Playground

    【讨论】:

    • 太棒了!你先生是一个真正的专业人士!我可以在操场上看到一切都很好,所以我将开始阅读您添加的功能 - THX!只有一件事不能正常工作:id 为“2”的孩子的成本和收入取最后一个值而不是总和(不过孙子的总和是正确的)
    • 好的抱歉,我们会尽快检查并更新您。
    猜你喜欢
    • 2018-12-24
    • 2019-05-23
    • 1970-01-01
    • 1970-01-01
    • 1970-01-01
    • 1970-01-01
    • 1970-01-01
    • 1970-01-01
    • 1970-01-01
    相关资源
    最近更新 更多