【Question title】: Mongo / Mongoose - Aggregating by Date
【Posted】: 2017-10-07 11:15:54
【Question】:

I have a mongo/mongoose schema which, when queried, returns documents such as:
{ "_id" : ObjectId("5907a5850b459d4fdcdf49ac"), "amount" : -33.3, "name" : "RINGGO", "method" : "VIS", "date" : ISODate("2017-04-26T23:00:00Z"), "importDate" : ISODate("2017-05-01T21:15:49.581Z"), "category" : "Not Set", "__v" : 0 }
{ "_id" : ObjectId("5907a5850b459d4fdcdf49ba"), "amount" : -61.3, "name" : "Amazon", "method" : "VIS", "date" : ISODate("2017-03-23T00:00:00Z"), "importDate" : ISODate("2017-05-01T21:15:49.592Z"), "category" : "Not Set", "__v" : 0 }
{ "_id" : ObjectId("5907a5850b459d4fdcdf49ce"), "amount" : -3.3, "name" : "Tesco", "method" : "VIS", "date" : ISODate("2017-03-15T00:00:00Z"), "importDate" : ISODate("2017-05-01T21:15:49.601Z"), "category" : "Not Set", "__v" : 0 }
{ "_id" : ObjectId("5907a5850b459d4fdcdf49cc"), "amount" : -26.3, "name" : "RINGGO", "method" : "VIS", "date" : ISODate("2017-03-16T00:00:00Z"), "importDate" : ISODate("2017-05-01T21:15:49.600Z"), "category" : "Not Set", "__v" : 0 }
{ "_id" : ObjectId("5907a5850b459d4fdcdf49f7"), "amount" : -63.3, "name" : "Sky", "method" : "VIS", "date" : ISODate("2017-03-02T00:00:00Z"), "importDate" : ISODate("2017-05-01T21:15:49.617Z"), "category" : "Not Set", "__v" : 0 }
{ "_id" : ObjectId("5907a5850b459d4fdcdf49be"), "amount" : -3.3, "name" : "RINGGO", "method" : "VIS", "date" : ISODate("2017-03-22T00:00:00Z"), "importDate" : ISODate("2017-05-01T21:15:49.593Z"), "category" : "Not Set", "__v" : 0 }

I would like to write a query that gives the yearly, monthly and weekly spend for each vendor ("name" : "Amazon"). For example, for the vendor RINGGO:

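  • In 2017 there were 3 spends, 33.3+26.3+3.3, so the total yearly spend would be 59.9
  • In 2017-03 there were two spends, 26.3+3.3, so the total monthly spend would be 26.6
  • Each spend falls in a different week, so the weekly totals would be (for example) week 12: 26.3, week 13: 3.3, week 15: 33.3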

I can write a query such as

db.statements.aggregate(
   [        
       { $group : { _id : "$name", amount: { $push: "$amount" } } }
   ]
)

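This collects all the spends (amount) by vendor name, but I can't work out how to break them down by year, month and week as described above.

Edit in response to the comments: I need the year, month, week and so on so that the query can be driven by the URL (e.g. domain.com/vendorname/2017, domain.com/vendorname/2017/3, domain.com/vendorname/2017/3/12).

I would also like both the individual spends and the total spend per year/month/week, because I want to print them on the page.

I'm not sure what shape the result should take, but ideally it would be something like this: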

{ "_id" : 
    { "year" : 2017, 
      "month" : 3, 
      "week" : 12 }, 
    "name": "RINGGO", //vendor name
    "YearlySpends": [ 33.3, 26.3, 3.3 ],
    "totalYearlyAmount" : [ 59.9 ],
    "MonthlySpends": [ 26.3, 3.3 ],
    "totalMonthlyAmount" : [ 26.6 ],
    "WeeklySpends": [ 3.3 ],
    "totalWeeklyAmount" : [ 3.3 ]
}

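【Comments】:

  • It would be great if you could add your expected result JSON to complement the sample data, so that the aggregation pipeline can be easily inferred from that expected output.
  • @chridam I have edited the question to try to show what the result JSON would look like
  • By the way, 33.3+26.3+3.3 = 62.9, not 59.9

Tags: node.js mongodb mongoose mongodb-query aggregation-framework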


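【Solution 1】:

A good approach is to break the aggregation pipeline into several steps, each of which computes the aggregates for one grouping: yearly, monthly and weekly.

I have made a modest attempt at such a pipeline. I'm not sure it is exactly what you are after, but it should give you some leads towards a solution, or better yet an optimal one. Perhaps someone else can give a better answer.

Consider the following untested pipeline: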

db.statements.aggregate([
    {
        "$group": {
            "_id": {
                "name": "$name",
                "year": { "$year": "$date" },
                "month": { "$month": "$date" },
                "week": { "$week": "$date" }
            },
            "total": { "$sum": "$amount" }
        }
    },
    {
        "$group": {
            "_id": {
                "name": "$_id.name",
                "year": "$_id.year"
            },
            "YearlySpends": { "$push": "$total" },
            "totalYearlyAmount": { "$sum": "$total" },
            "data": { "$push": "$$ROOT" }
        }
    },
    { "$unwind": "$data" },
    {
        "$group": {
            "_id": {
                "name": "$_id.name",
                // keep the year so the same month of different years is not merged
                "year": "$data._id.year",
                "month": "$data._id.month"
            },
            "YearlySpends": { "$first": "$YearlySpends" },
            "totalYearlyAmount": { "$first": "$totalYearlyAmount" },
            "MonthlySpends": { "$push": "$data.total" },
            "totalMonthlyAmount": { "$sum": "$data.total" },
            "data": { "$push": "$data" }
        }
    },
    { "$unwind": "$data" },
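    {
        "$group": {
            "_id": {
                "name": "$_id.name",
                // keep the year so the same week number of different years is not merged
                "year": "$data._id.year",
                "week": "$data._id.week"
            },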
            "YearlySpends": { "$first": "$YearlySpends" },
            "totalYearlyAmount": { "$first": "$totalYearlyAmount" },
            "MonthlySpends": { "$first": "$MonthlySpends" },
            "totalMonthlyAmount": { "$first": "$totalMonthlyAmount" },
            "WeeklySpends": { "$push": "$data.total" },
            "totalWeeklyAmount": { "$sum": "$data.total" },
            "data": { "$push": "$data" }
        }
    },
    { "$unwind": "$data" },
    {
        "$group": {
            "_id": "$data._id",
            "YearlySpends": { "$first": "$YearlySpends" },
            "totalYearlyAmount": { "$first": "$totalYearlyAmount" },
            "MonthlySpends": { "$first": "$MonthlySpends" },
            "totalMonthlyAmount": { "$first": "$totalMonthlyAmount" },
            "WeeklySpends": { "$first": "$WeeklySpends" },
            "totalWeeklyAmount": { "$first": "$totalWeeklyAmount" }
        }
    }
])

Sample output

/* 1 */
{
    "_id" : {
        "name" : "Tesco",
        "year" : 2017,
        "month" : 3,
        "week" : 11
    },
    "YearlySpends" : [ 
        -3.3
    ],
    "totalYearlyAmount" : -3.3,
    "MonthlySpends" : [ 
        -3.3
    ],
    "totalMonthlyAmount" : -3.3,
    "WeeklySpends" : [ 
        -3.3
    ],
    "totalWeeklyAmount" : -3.3
}

/* 2 */
{
    "_id" : {
        "name" : "RINGGO",
        "year" : 2017,
        "month" : 4,
        "week" : 17
    },
    "YearlySpends" : [ 
        -3.3, 
        -26.3, 
        -33.3
    ],
    "totalYearlyAmount" : -62.9,
    "MonthlySpends" : [ 
        -33.3
    ],
    "totalMonthlyAmount" : -33.3,
    "WeeklySpends" : [ 
        -33.3
    ],
    "totalWeeklyAmount" : -33.3
}

/* 3 */
{
    "_id" : {
        "name" : "RINGGO",
        "year" : 2017,
        "month" : 3,
        "week" : 12
    },
    "YearlySpends" : [ 
        -3.3, 
        -26.3, 
        -33.3
    ],
    "totalYearlyAmount" : -62.9,
    "MonthlySpends" : [ 
        -3.3, 
        -26.3
    ],
    "totalMonthlyAmount" : -29.6,
    "WeeklySpends" : [ 
        -3.3
    ],
    "totalWeeklyAmount" : -3.3
}

/* 4 */
{
    "_id" : {
        "name" : "RINGGO",
        "year" : 2017,
        "month" : 3,
        "week" : 11
    },
    "YearlySpends" : [ 
        -3.3, 
        -26.3, 
        -33.3
    ],
    "totalYearlyAmount" : -62.9,
    "MonthlySpends" : [ 
        -3.3, 
        -26.3
    ],
    "totalMonthlyAmount" : -29.6,
    "WeeklySpends" : [ 
        -26.3
    ],
    "totalWeeklyAmount" : -26.3
}

/* 5 */
{
    "_id" : {
        "name" : "Sky",
        "year" : 2017,
        "month" : 3,
        "week" : 9
    },
    "YearlySpends" : [ 
        -63.3
    ],
    "totalYearlyAmount" : -63.3,
    "MonthlySpends" : [ 
        -63.3
    ],
    "totalMonthlyAmount" : -63.3,
    "WeeklySpends" : [ 
        -63.3
    ],
    "totalWeeklyAmount" : -63.3
}

/* 6 */
{
    "_id" : {
        "name" : "Amazon",
        "year" : 2017,
        "month" : 3,
        "week" : 12
    },
    "YearlySpends" : [ 
        -61.3
    ],
    "totalYearlyAmount" : -61.3,
    "MonthlySpends" : [ 
        -61.3
    ],
    "totalMonthlyAmount" : -61.3,
    "WeeklySpends" : [ 
        -61.3
    ],
    "totalWeeklyAmount" : -61.3
}
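
The week numbers in the sample output can be cross-checked outside the database. The following is a minimal sketch, assuming dates are interpreted in UTC and that $week follows its documented rule (weeks start on Sunday, and days before the first Sunday of the year fall in week 0). The helpers mongoWeek and groupSpends are hypothetical names introduced only for this illustration; they mimic the first $group stage on the three RINGGO documents from the question.

```javascript
// Approximates MongoDB's $week for a UTC date: weeks start on Sunday and
// days before the first Sunday of the year are in week 0.
function mongoWeek(date) {
  const startOfYear = Date.UTC(date.getUTCFullYear(), 0, 1);
  const dayOfYear = Math.floor((date.getTime() - startOfYear) / 86400000); // 0-based
  return Math.floor((dayOfYear + 7 - date.getUTCDay()) / 7);
}

// Hypothetical helper: sums amounts per vendor/year/month/week,
// similar to the first $group stage of the pipeline.
function groupSpends(docs) {
  const totals = new Map();
  for (const doc of docs) {
    const key = [
      doc.name,
      doc.date.getUTCFullYear(),
      doc.date.getUTCMonth() + 1, // $month is 1-based
      mongoWeek(doc.date),
    ].join("/");
    totals.set(key, (totals.get(key) || 0) + doc.amount);
  }
  return totals;
}

// The three RINGGO documents from the question.
const ringgo = [
  { name: "RINGGO", amount: -33.3, date: new Date("2017-04-26T23:00:00Z") },
  { name: "RINGGO", amount: -26.3, date: new Date("2017-03-16T00:00:00Z") },
  { name: "RINGGO", amount: -3.3, date: new Date("2017-03-22T00:00:00Z") },
];

const totals = groupSpends(ringgo);
console.log([...totals.keys()]);
// keys: RINGGO/2017/4/17, RINGGO/2017/3/11, RINGGO/2017/3/12
```

These keys match the week values 17, 11 and 12 shown for RINGGO in the sample output.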

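Update

If you want to include a filter in the aggregation, I suggest using a $match query as the first pipeline stage. Note, however, that with an initial $match step the preceding steps change slightly, because you are then aggregating the filtered documents, which is quite different from aggregating all documents first and then applying the filter to the result.

If you take the filter-first-then-aggregate route, consider running an aggregate operation that uses $match as the first step to filter the documents by vendor, followed by a $redact pipeline step that further filters the documents on the month part of the date field. The rest would then be the $group stages: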

Statements.aggregate([
    { "$match": { "name": req.params.vendor } },
    {
        "$redact": {
            "$cond": [
                { "$eq": [{ "$month": "$date" }, parseInt(req.params.month) ]},
                "$$KEEP",
                "$$PRUNE"
            ]
        }
    },
    .....
    /*
        add the remaining pipeline steps after
    */
], function(err, data){
    if (err) throw err;
    console.log(data);
})
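
As an alternative sketch for the filter-first route (not part of the original answer), the month filter can also be expressed as a plain date range inside the $match itself. The helper name buildMonthMatch is hypothetical, and the sketch assumes the month boundaries should be taken in UTC, which is how $month interprets dates when no timezone is given. A range condition on date is an ordinary query predicate, so it can make use of an index on that field if one exists.

```javascript
// Hypothetical helper: builds a $match stage for one vendor and one
// calendar month, using UTC month boundaries.
function buildMonthMatch(vendor, year, month) {
  const y = parseInt(year, 10);
  const m = parseInt(month, 10); // 1-12, as it appears in the URL
  const start = new Date(Date.UTC(y, m - 1, 1));
  const end = new Date(Date.UTC(y, m, 1)); // first instant of the next month
  return { $match: { name: vendor, date: { $gte: start, $lt: end } } };
}

// Example with values that would normally come from req.params.
const stage = buildMonthMatch("RINGGO", "2017", "3");
console.log(JSON.stringify(stage));
```

The resulting stage would take the place of the $match and $redact pair at the start of the pipeline. Note that a document dated 2017-04-26T23:00:00Z counts as April under this UTC assumption, the same as in the sample output.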

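If you take the group-first-then-filter route, the filter goes after the last pipeline stage that produces the grouped result, but it is applied to different fields, because the documents flowing through that part of the pipeline no longer have the original schema.

This route is not efficient, because the aggregation starts with every document in the collection and only filters afterwards: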

Statements.aggregate([
    .....
    /*
        place the initial pipeline steps from 
        the original query above here
    */
    .....
    { 
        "$match": { 
            "_id.name": req.params.vendor,
            "_id.month": parseInt(req.params.month)
        } 
    }
], function(err, data){
    if (err) throw err;
    console.log(data);
})

For multiple date filter parameters, the $redact operator would be

{
    "$redact": {
        "$cond": [
            {
                "$and": [
                     { "$eq": [{ "$year": "$date" },  parseInt(req.params.year)  ]},
                     { "$eq": [{ "$month": "$date" }, parseInt(req.params.month) ]},
                     { "$eq": [{ "$week": "$date" },  parseInt(req.params.week)  ]}
                ]
            },
            "$$KEEP",
            "$$PRUNE"
        ]
    }
}
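
Since the URL may carry only a year, a year and a month, or all three parts, the $redact stage above can also be assembled from whichever parameters are present. This is a minimal sketch; buildDateRedact is a hypothetical helper name, and params stands in for req.params. All comparisons are placed inside a single $and so that $cond still receives exactly three arguments.

```javascript
// Hypothetical helper: builds the $redact stage from whichever of
// year / month / week are present in the route parameters.
function buildDateRedact(params) {
  const operators = { year: "$year", month: "$month", week: "$week" };
  const conditions = [];
  for (const part of Object.keys(operators)) {
    if (params[part] !== undefined) {
      conditions.push({
        $eq: [{ [operators[part]]: "$date" }, parseInt(params[part], 10)],
      });
    }
  }
  // $cond takes exactly three arguments, so every comparison
  // goes inside one $and expression.
  return { $redact: { $cond: [{ $and: conditions }, "$$KEEP", "$$PRUNE"] } };
}

// Example for a URL such as domain.com/vendorname/2017/3 (no week part).
const redactStage = buildDateRedact({ vendor: "RINGGO", year: "2017", month: "3" });
console.log(JSON.stringify(redactStage, null, 2));
```

Including the year in the conditions is what keeps, say, week 12 of 2017 apart from week 12 of 2018.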

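【Comments】:

  • Looks awesome, thanks! Is it valid for me to add ,{ $match : { "_id.month" : 3, "_id.week" :12, "name" : "RINGGO" } } so that I can use req.params.vendor, req.params.month etc. to drive the results? Also, of the three matches above only _id.month and _id.week return successfully. As soon as I add "name" : "RINGGO" (even if that is all there is in the match) I get no results, even though I can see there are records with the name RINGGO
  • Thanks. Agreed that it is best to filter first and then aggregate. I tried modifying it so that I can query a particular month in a year: "$cond": [ { "$eq": [{ "$year": "$date" }, parseInt(req.params.year) ]}, { "$eq": [{ "$month": "$date" }, parseInt(req.params.month) ]}, "$$KEEP", "$$PRUNE" ] but got "errmsg" : "Expression $cond takes exactly 3 arguments. 4 were passed in."
  • If there were only one year (2017) this would be fine, because I could assume wk12 is unique, but once 2018 arrives, how do I pass all the required details (for the week, I suppose) to make sure I only get data for a particular month in a year, or a particular week in a particular month of a particular year?
  • @StuartBrown Updated
  • Thank you so much! This is the most wonderfully answered question!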