Comparing two collections in MongoDB: answers

【Question title】: Compare two collections in MongoDB
【Posted】: 2016-04-17 20:12:25
【Question】:

I have two different collections, book and music, stored as JSON documents. First, here is an example from the book collection:

{
    "_id" : ObjectId("b1"),
    "author" : [
        "Mary"
    ],
    "title" : "Book1"
}
{
    "_id" : ObjectId("b2"),
    "author" : [
        "Joe",
        "Tony",
        "Mary"
    ],
    "title" : "Book2"
}
{
    "_id" : ObjectId("b3"),
    "author" : [
        "Joe",
        "Mary"
    ],
    "title" : "Book3"
}
.......

Mary wrote 3 books, Joe wrote 2 books, and Tony wrote 1 book. Second, here is an example from the music collection:

{
    "_id" : ObjectId("m1"),
    "author" : [
        "Tony"
    ],
    "title" : "Music1"
}
{
    "_id" : ObjectId("m2"),
    "author" : [
        "Joe",
        "Tony"
    ],
    "title" : "Music2"
}
.......

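Tony has 2 pieces of music, Joe has 1, and Mary has 0.

I want to get the authors who wrote more books than music.

So Mary (3 > 0) and Joe (2 > 1) should be counted, but not Tony (1 < 2).

I wrote the following code, but I don't know how to do the comparison: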

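// Count the books per author.
db.book.aggregate([
    { $project : { _id : 0, author : 1 } },   // keep only the author array
    { $unwind : "$author" },                  // one document per (book, author) pair
    { $group : { _id : "$author", count : { $sum : 1 } } }
])

// Count the music per author in the same way.
db.music.aggregate([
    { $project : { _id : 0, author : 1 } },
    { $unwind : "$author" },
    { $group : { _id : "$author", count : { $sum : 1 } } }
])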

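Is this correct so far? How do I do the comparison from here? Thanks.

【Comments】:

Is this also your account, or are the two of you perhaps working together? This is almost an exact duplicate of this now deleted question, the only difference being that one collection is named "music" (for some strange reason) instead of "paper", as it was originally. As commented there originally, this really is just a matter of "looping the results" to compare. If you want something more performant, put all the data in "one" collection.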
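The "loop the results" idea from the comment above might look roughly like the sketch below. It is plain JavaScript rather than anything from the original thread: the helper name authorsWithMoreBooks is hypothetical, and the two input arrays are assumed to have the shape produced by the $group stages in the question (_id holding the author name, count holding the total). Sample data mirroring the counts stated in the question stands in for the real query results.

```javascript
// Hypothetical helper: compares two arrays shaped like the $group output
// from the question, e.g. [{ _id: "Mary", count: 3 }, ...].
function authorsWithMoreBooks(bookCounts, musicCounts) {
  // Index the music counts by author name for direct lookups.
  const musicByAuthor = new Map();
  for (const row of musicCounts) {
    musicByAuthor.set(row._id, row.count);
  }
  // Keep authors whose book count exceeds their music count;
  // an author absent from the music results is treated as having 0.
  return bookCounts
    .filter((row) => row.count > (musicByAuthor.get(row._id) || 0))
    .map((row) => row._id);
}

// Sample data mirroring the counts stated in the question.
const bookCounts = [
  { _id: "Mary", count: 3 },
  { _id: "Joe", count: 2 },
  { _id: "Tony", count: 1 },
];
const musicCounts = [
  { _id: "Tony", count: 2 },
  { _id: "Joe", count: 1 },
];

const result = authorsWithMoreBooks(bookCounts, musicCounts);
console.log(result);        // [ 'Mary', 'Joe' ]
console.log(result.length); // 2
```

In the shell, the two arrays would presumably come from calling .toArray() on the cursors returned by the two aggregate() calls in the question. This does the comparison on the client, so it only suits result sets that fit comfortably in memory.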

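Tags: mongodb

【Solution 1】:

To solve this, we need to use the $out stage to store the results of both queries in intermediate collections, and then join them with an aggregation query ($lookup).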

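// Count books per author and store the result in "bookAuthors".
// The question names this collection "book".
db.book.aggregate([{
        $project : {
            _id : 0,
            author : 1
        }
    }, {
        $unwind : "$author"
    }, {
        $group : {
            _id : "$author",
            count : {
                $sum : 1
            }
        }
    }, {
        // Expose the author name as a regular field for the later $lookup.
        $project : {
            _id : 0,
            author : "$_id",
            count : 1
        }
    }, {
        $out : "bookAuthors"
    }
])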

db.music.aggregate([{
        $project : {
            _id : 0,
            author : 1
        }
    }, {
        $unwind : "$author"
    }, {
        $group : {
            _id : "$author",
            count : {
                $sum : 1
            }
        }
    }, {
        $project : {
            _id : 0,
            author : "$_id",
            count : 1
        }
    }, {
        $out : "musicAuthors"
    }
])

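db.bookAuthors.aggregate([{
        $lookup : {
            from : "musicAuthors",
            localField : "author",
            foreignField : "author",
            as : "music"
        }
    }, {
        // Keep authors with no music at all (e.g. Mary); a plain
        // $unwind would drop documents whose "music" array is empty.
        $unwind : {
            path : "$music",
            preserveNullAndEmptyArrays : true
        }
    }, {
        $project : {
            _id : "$author",
            // A missing "$music.count" compares lower than any number,
            // so authors without music still pass this test.
            result : {
                $gt : ["$count", "$music.count"]
            },
            count : 1
        }
    }, {
        $match : {
            result : true
        }
    }
])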

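Edit changes:

Used the author field instead of _id

Added a logical expression on the embedded document in the $project stage:

result : { $gt : ["$count", "$music.count"] }

Any questions are welcome! Have fun!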

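【Comments】:

Hi, I followed your suggestion and ran into some problems, which I wrote up in your answer. Thanks!
I only need the number of authors who wrote more books than music, so I need to add {$group: {_id: null, number: {$sum: 1}}} after $match, right?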
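
The counting stage proposed in the last comment could be appended along these lines. This is only a sketch: the helper name withAuthorCount is hypothetical, and the short tail array merely stands in for the full pipeline from the answer. Grouping on _id: null places every incoming document in a single group, so $sum: 1 gives the number of authors that survived the $match.

```javascript
// Hypothetical helper: appends a counting stage to a pipeline whose
// output is one document per matching author.
function withAuthorCount(pipeline) {
  return pipeline.concat([
    // _id: null collapses all incoming documents into one group;
    // $sum: 1 then adds 1 per document, i.e. counts them.
    { $group: { _id: null, number: { $sum: 1 } } },
  ]);
}

// Placeholder for the last stage of the answer's pipeline.
const tail = [{ $match: { result: true } }];
const counting = withAuthorCount(tail);
console.log(JSON.stringify(counting, null, 2));
```

With the complete stage list in place of tail, the returned array would be passed to db.bookAuthors.aggregate(...). If no author matches, the $group stage emits no document at all rather than a document with number: 0.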