【Question title】:How to count occurrences of each value in an array?
【Posted】:2013-01-14 13:47:16
【Question description】:

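I have an ISSUES database in MongoDB. Some of the issues have comments, stored as an array, and each comment has a writer. How can I count the number of comments written by each writer?

I tried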

db.test.issues.group(
{
    key = "comments.username":true;
    initial: {sum:0},
    reduce: function(doc, prev) {prev.sum +=1},
    }
);

but had no luck :(

A sample:

{
        "_id" : ObjectId("50f48c179b04562c3ce2ce73"),
        "project" : "Ruby Driver",
        "key" : "RUBY-505",
        "title" : "GETMORE is sent to wrong server if an intervening query unpins the connection",
        "description" : "I've opened a pull request with a failing test case demonstrating the bug here: https://github.com/mongodb/mongo-ruby-driver/pull/134\nExcerpting that commit message, the issue is: If we do a secondary read that is large enough to require sending a GETMORE, and then do another query before the GETMORE, the secondary connection gets unpinned, and the GETMORE gets sent to the wrong server, resulting in CURSOR_NOT_FOUND, even though the cursor still exists on the server that was initially queried.",
        "status" : "Open",
        "components" : [
                "Replica Set"
        ],
        "affected_versions" : [
                "1.7.0"
        ],
        "type" : "Bug",
        "reporter" : "Nelson Elhage",
        "priority" : "major",
        "assignee" : "Tyler Brock",
        "resolution" : "Unresolved",
        "reported_on" : ISODate("2012-11-17T20:30:00Z"),
        "votes" : 3,
        "comments" : [
                {
                        "username" : "Nelson Elhage",
                        "date" : ISODate("2012-11-17T20:30:00Z"),
                        "body" : "Thinking some more"
                },
                {
                        "username" : "Brandon Black",
                        "date" : ISODate("2012-11-18T20:30:00Z"),
                        "body" : "Adding some findings of mine to this ticket."
                },
                {
                        "username" : "Nelson Elhage",
                        "date" : ISODate("2012-11-18T20:30:00Z"),
                        "body" : "I think I tracked down the 1.9 dependency."
                },
                {
                        "username" : "Nelson Elhage",
                        "date" : ISODate("2012-11-18T20:30:00Z"),
                        "body" : "Forgot to include a link"
                }
        ]
}

【Comments】:

    Tags: arrays mongodb mapreduce


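    【Answer 1】:

    You forgot the curly braces around the key value, the key must be followed by : rather than =, and that line has to end with , instead of ;.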

    db.issues.group({
        key: {"comments.username":true},
        initial: {sum:0},
        reduce: function(doc, prev) {prev.sum +=1},
    });
    

    Update

    After realizing that comments is an array: you need to use aggregate for this, so that you can "unwind" comments and then group on it:

    db.issues.aggregate(
        {$unwind: '$comments'},
        {$group: {_id: '$comments.username', sum: {$sum: 1}}}
    );
    

    For the sample document in the question, the output is:

    {
      "result": [
        {
          "_id": "Brandon Black",
          "sum": 1
        },
        {
          "_id": "Nelson Elhage",
          "sum": 3
        }
      ],
      "ok": 1
    }
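
To make the two stages concrete, here is a minimal plain-JavaScript sketch of what $unwind followed by $group with {$sum: 1} computes. It is only an in-memory emulation, not how MongoDB implements the pipeline; the helper names unwindComments and countByUsername are hypothetical, and the sample data keeps only the usernames from the question's document.

```javascript
// In-memory stand-in for the issues collection (usernames only).
const issues = [
  {
    key: "RUBY-505",
    comments: [
      { username: "Nelson Elhage" },
      { username: "Brandon Black" },
      { username: "Nelson Elhage" },
      { username: "Nelson Elhage" },
    ],
  },
];

// Emulates {$unwind: '$comments'}: one output document per array element.
// Documents with a missing or empty comments array produce no output.
function unwindComments(docs) {
  return docs.flatMap((doc) =>
    (doc.comments || []).map((comment) => ({ ...doc, comments: comment }))
  );
}

// Emulates {$group: {_id: '$comments.username', sum: {$sum: 1}}}:
// adds 1 to the counter of each username that is seen.
function countByUsername(docs) {
  const counts = new Map();
  for (const doc of docs) {
    const name = doc.comments.username;
    counts.set(name, (counts.get(name) || 0) + 1);
  }
  return [...counts].map(([_id, sum]) => ({ _id, sum }));
}

const result = countByUsername(unwindComments(issues));
console.log(result);
```

The unwind step is what turns one issue with four comments into four single-comment documents, so that each comment contributes to exactly one username group.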
    

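    【Comments】:

    • Still getting an error: SyntaxError: missing : after property id (shell):2
    • @Ace Replace the = after key with :. See the updated answer.
    • Thanks Johnny! That fixed the error, but the result is still wrong! It does not recognize "comments.username", because the result is: [ { "comments.username" : null, "sum" : 12 } ] where 12 is the total number of records!
    • @Ace My bad, I was thinking comments was an embedded object, not an array. For that you need to use aggregate; see the updated answer.
    • No! There is still a problem! :( { "result" : [ ], "ok" : 1 }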
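    【Answer 2】:

    Just a quick side answer to complement @JohnnyHK's answer: it sounds like you are new to MongoDB, so you are probably working on a recent version (if not, I would upgrade). Either way, the old group count is rather poor. For one thing, it does not work with sharding.

    In MongoDB 2.2 you can do: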

    db.col.aggregate({$unwind: "$comments"}, {$group: {_id: "$comments.username", count: {$sum: 1}}})
    

    Or something like that. You can read more here: http://docs.mongodb.org/manual/applications/aggregation/
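
One detail worth spelling out: when comments is an array, grouping on "$comments.username" without unwinding it first uses the whole array of usernames as the group key, so the counts come out per distinct array rather than per writer. The plain-JavaScript sketch below emulates that behaviour in memory; groupWithoutUnwind is a hypothetical helper and the data is a cut-down version of the question's sample.

```javascript
// In-memory stand-in for one issue document (usernames only).
const sampleIssues = [
  {
    comments: [
      { username: "Nelson Elhage" },
      { username: "Brandon Black" },
      { username: "Nelson Elhage" },
      { username: "Nelson Elhage" },
    ],
  },
];

// Emulates {$group: {_id: "$comments.username", count: {$sum: 1}}}
// applied directly to documents whose comments field is still an array.
function groupWithoutUnwind(docs) {
  const groups = new Map();
  for (const doc of docs) {
    // The path resolves to an array holding every username in the document.
    const key = (doc.comments || []).map((c) => c.username);
    const id = JSON.stringify(key); // arrays are compared by content here
    const entry = groups.get(id) || { _id: key, count: 0 };
    entry.count += 1;
    groups.set(id, entry);
  }
  return [...groups.values()];
}

const grouped = groupWithoutUnwind(sampleIssues);
console.log(JSON.stringify(grouped));
```

Here the single issue yields one group whose _id is the four-element username array, which is why the array has to be unwound before grouping to get a count per writer.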

    【Comments】:
