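【Title】: How to group data by every hour
【Posted】: 2020-05-15 18:55:15
【Question】:

How can I get counts grouped by each hour of a 24-hour day even when an hour has no data, i.e. an hour with no documents should be returned with a count of 0?

MongoDB 3.6

Input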

[
  {
    "_id": ObjectId("5ccbb96706d1d47a4b2ced4b"),
    "date": "2019-05-03T10:39:53.108Z",
    "id": 166,
    "update_at": "2019-05-03T02:45:36.208Z",
    "type": "image"
  },
  {
    "_id": ObjectId("5ccbb96706d1d47a4b2ced4c"),
    "date": "2019-05-03T10:39:53.133Z",
    "id": 166,
    "update_at": "2019-05-03T02:45:36.208Z",
    "type": "image"
  },
  {
    "_id": ObjectId("5ccbb96706d1d47a4b2ced4d"),
    "date": "2019-05-03T10:39:53.180Z",
    "id": 166,
    "update_at": "2019-05-03T20:45:36.208Z",
    "type": "image"
  },
  {
    "_id": ObjectId("5ccbb96706d1d47a4b2ced7a"),
    "date": "2019-05-10T10:39:53.218Z",
    "id": 166,
    "update_at": "2019-12-04T10:45:36.208Z",
    "type": "image"
  },
  {
    "_id": ObjectId("5ccbb96706d1d47a4b2ced7b"),
    "date": "2019-05-03T10:39:53.108Z",
    "id": 166,
    "update_at": "2019-05-05T10:45:36.208Z",
    "type": "image"
  },
  {
    "_id": ObjectId("5ccbb96706d1d47a4b2cedae"),
    "date": "2019-05-03T10:39:53.133Z",
    "id": 166,
    "update_at": "2019-05-05T10:45:36.208Z",
    "type": "image"
  },
  {
    "_id": ObjectId("5ccbb96706d1d47a4b2cedad"),
    "date": "2019-05-03T10:39:53.180Z",
    "id": 166,
    "update_at": "2019-05-06T10:45:36.208Z",
    "type": "image"
  },
  {
    "_id": ObjectId("5ccbb96706d1d47a4b2cedab"),
    "date": "2019-05-10T10:39:53.218Z",
    "id": 166,
    "update_at": "2019-12-06T10:45:36.208Z",
    "type": "image"
  }
]

Implementation

db.collection.aggregate({
  $match: {
    update_at: {
      "$gte": "2019-05-03T00:00:00.0Z",
      "$lt": "2019-05-05T00:00:00.0Z"
    },
    id: {
      "$in": [
        166
      ]
    }
  }
},
{
  $group: {
    _id: {
      $substr: [
        "$update_at",
        11,
        2
      ]
    },
    count: {
      "$sum": 1
    }
  },

},
{
  $project: {
    _id: 0,
    hour: "$_id",
    count: "$count"
  }
},
{
  $sort: {
    hour: 1
  }
})

Actual output

{
    "count": 2,
    "hour": "02"
  },
  {
    "count": 1,
    "hour": "20"
  }

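I expect the result to list all 24 hours, with a count of 0 or null for hours that have no events, and to convert the hour label, for example "02" to "02 AM" and "13" to "01 PM":

Expected output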

  {
    "count": 0,
    "hour": "01" // 01 AM
  },
  {
    "count": 2,
    "hour": "02"
  },
  {
    "count": 0,
    "hour": "03"
  },
  {
    "count": 0,
    "hour": "04"
  },
  {
    "count": 0,
    "hour": "05"
  },
  {
    "count": 1,
    "hour": "20" // to 08 pm
  }

【Comments】:

  • Why do you store date values as strings rather than as proper Date objects?

Tags: mongodb mongoose mongodb-query aggregation-framework


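【Solution 1】:

Try this solution:

Explanation

We group by hour to count the uploaded images.

Then we add an extra field, hour, that lists every hour of the day (if you are on v4.x, there is a better solution).

We unwind the hour field (which creates new documents), use its first two characters to match against the count, and use its last two characters to build the AM/PM label.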


db.collection.aggregate([
  {
    $match: {
      update_at: {
        "$gte": "2019-05-03T00:00:00.0Z",
        "$lt": "2019-05-05T00:00:00.0Z"
      },
      id: {
        "$in": [
          166
        ]
      }
    }
  },
  {
    $group: {
      _id: {
        $substr: [
          "$update_at",
          11,
          2
        ]
      },
      count: {
        "$sum": 1
      }
    }
  },
  {
    $addFields: {
      hour: [
        "0000",
        "0101",
        "0202",
        "0303",
        "0404",
        "0505",
        "0606",
        "0707",
        "0808",
        "0909",
        "1010",
        "1111",
        "1212",
        "1301",
        "1402",
        "1503",
        "1604",
        "1705",
        "1806",
        "1907",
        "2008",
        "2109",
        "2210",
        "2311"
      ]
    }
  },
  {
    $unwind: "$hour"
  },
  {
    $project: {
      _id: 0,
      hour: 1,
      count: {
        $cond: [
          {
            $eq: [
              {
                $substr: [
                  "$hour",
                  0,
                  2
                ]
              },
              "$_id"
            ]
          },
          "$count",
          0
        ]
      }
    }
  },
  {
    $group: {
      _id: "$hour",
      count: {
        "$sum": "$count"
      }
    }
  },
  {
    $sort: {
      _id: 1
    }
  },
  {
    $project: {
      _id: 0,
      hour: {
        $concat: [
          {
            $substr: [
              "$_id",
              2,
              2
            ]
          },
          {
            $cond: [
              {
                $gte: [ // "12" (noon) must be labelled PM, so compare with >= rather than >
                  {
                    $substr: [
                      "$_id",
                      0,
                      2
                    ]
                  },
                  "12"
                ]
              },
              " PM",
              " AM"
            ]
          }
        ]
      },
      count: "$count"
    }
  }
])

MongoPlayground
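
The zero-filling idea in this answer can be easier to follow outside the aggregation framework. Below is a minimal plain JavaScript sketch of the same logic. It assumes the grouped result has the shape { _id: "02", count: 2 } produced by the first $group stage; the helper name fillHours is hypothetical and is not part of MongoDB or of the answer.

```javascript
// Hypothetical helper (not a MongoDB API): takes the output of the first
// $group stage, one { _id: "HH", count } document per hour that has data,
// and returns 24 entries, using 0 for hours that have no documents.
function fillHours(grouped) {
  const counts = new Map(grouped.map((g) => [g._id, g.count]));
  const result = [];
  for (let h = 0; h < 24; h++) {
    const key = String(h).padStart(2, "0"); // "00" .. "23"
    result.push({ hour: key, count: counts.get(key) || 0 });
  }
  return result;
}

// Sample input taken from the "Actual output" shown in the question.
const filled = fillHours([
  { _id: "02", count: 2 },
  { _id: "20", count: 1 },
]);
console.log(filled.length); // 24
console.log(filled[2]);     // { hour: '02', count: 2 }
console.log(filled[3]);     // { hour: '03', count: 0 }
```

Doing this on the client is an alternative to the pipeline above when the server version rules out newer operators; the pipeline approach keeps everything in one query.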

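【Comments】:

    【Solution 2】:

    If you want the output in Indian Standard Time (UTC+05:30), the code below works. Note that it assumes update_at is stored as a Date, not as a string.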

        const query = [
        {
            $match: {
                update_at: {
                    "$gte": ISODate("2019-05-03T00:00:00.0Z"),
                    "$lt": ISODate("2019-05-05T00:00:00.0Z")
                },
                id: {
                    "$in": [
                        166
                    ]
                }
            }
        },
        {
            $project: {
                "h": { "$hour": { date: "$update_at", timezone: "+0530" } },
            }
        },
        {
            $group:
            {
                _id: "$h", // "$h" is already the hour number; applying $hour to it again would fail
                count: { $sum: 1 }
            }
        }
    ];
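
    To illustrate what extracting the hour at a fixed UTC offset means, here is a small plain JavaScript sketch. It is only an illustration of the arithmetic for a fixed offset such as +05:30; the helper name hourAtOffset is hypothetical, and the timestamps are taken from the question's sample data.

```javascript
// Hypothetical helper (not a MongoDB API): hour of day (0-23) for an
// ISO-8601 UTC timestamp, shifted by a fixed offset given in minutes
// (+330 corresponds to +05:30).
function hourAtOffset(isoString, offsetMinutes) {
  const utcMillis = Date.parse(isoString);
  const shifted = new Date(utcMillis + offsetMinutes * 60 * 1000);
  // Read the shifted instant with the UTC getter so the machine's own
  // local time zone does not influence the result.
  return shifted.getUTCHours();
}

console.log(hourAtOffset("2019-05-03T02:45:36.208Z", 330)); // 8
console.log(hourAtOffset("2019-05-03T20:45:36.208Z", 330)); // 2 (already the next day locally)
```

    The second call shows why the time zone matters for grouping: an event at 20:45 UTC lands in the 02 bucket of the following day in +05:30.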
    

    【Comments】:

    • This query was tested on MongoDB 4.2.
    【Solution 3】:

    Here is a query you can test; it works on MongoDB 4.0+.

    I will improve the query and update this answer.

    const query = [{
        $match: {
            update_at: {
                "$gte": ISODate("2019-05-03T00:00:00.0Z"),
                "$lt": ISODate("2019-05-05T00:00:00.0Z")
            },
            id: {
                "$in": [
                    166
                ]
            }
        }
    },
    {
        $group: {
            _id: { $hour: "$update_at" },
            count: {
                "$sum": 1
            }
        },
    
    },
    
    {
        $addFields: {
            hourStr: { $toString: { $cond: { if: { $gte: ["$_id", 12] }, then: { $subtract: [12, { $mod: [24, '$_id'] }] }, else: "$_id" } } },
        }
    },
    {
        $project: {
            formated: { $concat: ["$hourStr", { $cond: { if: { $gte: ["$_id", 12] }, then: " PM", else: " AM" } }] },
            count: "$count",
            hour: 1,
        }
    }]
    

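    【Comments】:

    • Nooo, I can't run your query because my MongoDB server version is 3.* :(
    • "Unrecognized expression $toInt"
    【Solution 4】:

    You should store date values as Date objects rather than strings. I would format it like this: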

    db.collection.aggregate(
       [
          { $match: { ... } },
          {
             $group: {
                _id: { h: { $hour: "$update_at" } },
                count: { $sum: 1 }
             }
          },
          {
             $project: {
                _id: 0,
                hour: {
                   $switch: {
                      branches: [
                         { case: { $lt: ["$_id.h", 10] }, then: { $concat: ["0", { $toString: "$_id.h" }, " AM"] } },
                         { case: { $lt: ["$_id.h", 13] }, then: { $concat: [{ $toString: "$_id.h" }, " AM"] } },
                         { case: { $lt: ["$_id.h", 22] }, then: { $concat: ["0", { $toString: { $subtract: ["$_id.h", 12] } }, " PM"] } },
                         { case: { $lt: ["$_id.h", 24] }, then: { $concat: [{ $toString: { $subtract: ["$_id.h", 12] } }, " PM"] } }
                      ]
                   }
                },
                hour24: "$_id.h",
                count: 1
             }
          },
          { $sort: { hour24: 1 } }
       ])
    

    As a non-American I'm not familiar with the AM/PM rules, especially for midnight and noon, but I think you get the principle.
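
    For reference, on the conventional 12-hour clock midnight (hour 0) is written 12 AM and noon (hour 12) is written 12 PM. The plain JavaScript sketch below spells out that rule so the $switch branches can be checked against it. The function name to12HourLabel is hypothetical, and the zero-padded "01 PM" style follows the format requested in the question.

```javascript
// Hypothetical helper: converts an hour on the 24-hour clock (0-23)
// into a zero-padded 12-hour label such as "01 PM".
function to12HourLabel(hour24) {
  if (!Number.isInteger(hour24) || hour24 < 0 || hour24 > 23) {
    throw new RangeError("hour24 must be an integer from 0 to 23");
  }
  const suffix = hour24 < 12 ? "AM" : "PM";              // 0-11 are AM, 12-23 are PM
  const hour12 = hour24 % 12 === 0 ? 12 : hour24 % 12;   // 0 and 12 both display as 12
  return String(hour12).padStart(2, "0") + " " + suffix;
}

console.log(to12HourLabel(0));  // 12 AM
console.log(to12HourLabel(2));  // 02 AM
console.log(to12HourLabel(12)); // 12 PM
console.log(to12HourLabel(13)); // 01 PM
console.log(to12HourLabel(20)); // 08 PM
```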

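    【Comments】:

    • Hi Dom, my MongoDB server version is 3.*, so I can't use $toString: "Unrecognized expression '$toString'"
    • Perhaps you can simply skip it. Without it I get "$concat only supports strings, not int". Otherwise there seems to be a workaround: stackoverflow.com/questions/33891511/…
    【Solution 5】:

    There is no "magic" solution; you have to hard-code it into your aggregation: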

    Here is an example using Mongo v3.4+ syntax ($addFields and $switch require 3.4) and some $map / $filter magic:

    db.collection.aggregate([
        {
            $match: {
                update_at: {
                    "$gte": "2019-05-03T00:00:00.0Z",
                    "$lt": "2019-05-05T00:00:00.0Z"
                },
                id: {"$in": [166]}
            }
        },
        {
            $group: {
                _id: {$substr: ["$update_at", 11, 2]},
                count: {"$sum": 1}
            }
        },
        {
            $group: {
                _id: null,
                hours: {$push: {hour: "$_id", count: "$count"}}
            }
        },
        {
            $addFields: {
                hours: {
                    $map: {
                        input: {
                            $concatArrays: [
                                "$hours",
                                {
                                    $map: {
                                        input: {
                                            $filter: {
                                                input: ["00", "01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23"],
                                                as: "missingHour",
                                                cond: {
                                                    $not: {
                                                        $in: [
                                                            "$$missingHour",
                                                            {
                                                                $map: {
                                                                    input: "$hours",
                                                                    as: "hourObj",
                                                                    in: "$$hourObj.hour"
                                                                }
                                                            }
                                                        ]
                                                    }
                                                }
                                            }
                                        },
                                        as: "missingHour",
                                        in: {hour: "$$missingHour", count: 0}
                                    }
                                }
                            ]
                        },
                        as: "hourObject",
                        in: {
                            count: "$$hourObject.count",
                            hour: {
                                $cond: [
                                    {$lt: ["$$hourObject.hour", "12"]}, // "00".."11" are AM; zero-padded strings compare correctly
                                    {$concat: ["$$hourObject.hour", " AM"]},
                                    {
                                        $concat: [{
                                            $switch: {
                                                branches: [
                                                    {case: {$eq: ["$$hourObject.hour", "12"]}, then: "12"},
                                                    {case: {$eq: ["$$hourObject.hour", "13"]}, then: "1"},
                                                    {case: {$eq: ["$$hourObject.hour", "14"]}, then: "2"},
                                                    {case: {$eq: ["$$hourObject.hour", "15"]}, then: "3"},
                                                    {case: {$eq: ["$$hourObject.hour", "16"]}, then: "4"},
                                                    {case: {$eq: ["$$hourObject.hour", "17"]}, then: "5"},
                                                    {case: {$eq: ["$$hourObject.hour", "18"]}, then: "6"},
                                                    {case: {$eq: ["$$hourObject.hour", "19"]}, then: "7"},
                                                    {case: {$eq: ["$$hourObject.hour", "20"]}, then: "8"},
                                                    {case: {$eq: ["$$hourObject.hour", "21"]}, then: "9"},
                                                    {case: {$eq: ["$$hourObject.hour", "22"]}, then: "10"},
                                                    {case: {$eq: ["$$hourObject.hour", "23"]}, then: "11"},
                                                ],
                                                default: "None"
                                            }
                                        }, " PM"]
                                    }
                                ]
                            }
                        }
                    }
                }
            }
        },
        {
            $unwind: "$hours"
        },
        {
            $project: {
                _id: 0,
                hour: "$hours.hour",
                count: "$hours.count"
            }
        },
        {
            $sort: {
                hour: 1
            }
        }
    ]);
    

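    A short explanation of the $addFields stage: we first build the list of missing hours, then merge the two arrays (the hours that were found and the "new" missing hours), and finally convert each entry to the required output ("01" to "01 AM").

    If you are on Mongo v4+, I recommend changing the $group _id expression to use $dateFromString, which is more consistent.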

    _id: {$hour: {$dateFromString: {dateString: "$update_at"}}}
    

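    If you do that, you must update the $filter / $map parts to use numbers instead of strings, and finally use $toString to convert to the format you want, hence the v4+ requirement.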
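
    A rough sketch of what such a v4+ variant might look like is given below. It is an illustration under stated assumptions, not the answer author's code: the stage layout, the field names found and hour24, and the use of $range to generate the hours 0 to 23 are choices made for this example, and it has not been run against the asker's data. The labels are not zero-padded ("1 PM" rather than "01 PM").

```javascript
// Sketch for MongoDB 4.0+ (needs $toString). update_at is still a string,
// so the $match compares strings and $dateFromString parses it for $hour.

// 12-hour number for the current element: 0 and 12 display as 12,
// every other hour as hour mod 12.
const hour12 = {
  $cond: [
    { $eq: [{ $mod: ["$hours.hour24", 12] }, 0] },
    12,
    { $mod: ["$hours.hour24", 12] },
  ],
};

const pipeline = [
  { $match: {
      update_at: { $gte: "2019-05-03T00:00:00.0Z", $lt: "2019-05-05T00:00:00.0Z" },
      id: { $in: [166] },
  } },
  // Count documents per numeric hour (0-23).
  { $group: {
      _id: { $hour: { $dateFromString: { dateString: "$update_at" } } },
      count: { $sum: 1 },
  } },
  // Collect the hours that were found into a single document.
  { $group: { _id: null, found: { $push: { hour: "$_id", count: "$count" } } } },
  // For every hour 0..23, sum the matching counts (0 when nothing matches).
  { $project: {
      _id: 0,
      hours: { $map: {
        input: { $range: [0, 24] },
        as: "h",
        in: {
          hour24: "$$h",
          count: { $sum: { $map: {
            input: "$found",
            as: "f",
            in: { $cond: [{ $eq: ["$$f.hour", "$$h"] }, "$$f.count", 0] },
          } } },
        },
      } },
  } },
  { $unwind: "$hours" },
  { $project: {
      hour24: "$hours.hour24",
      count: "$hours.count",
      hour: { $concat: [
        { $toString: hour12 },
        { $cond: [{ $lt: ["$hours.hour24", 12] }, " AM", " PM"] },
      ] },
  } },
  // Sort on the numeric hour, not on the label, to keep chronological order.
  { $sort: { hour24: 1 } },
];

// Usage in the shell (assumed collection name): db.collection.aggregate(pipeline)
```

    One caveat of this layout: if $match finds no documents at all, the second $group emits nothing and the result is empty rather than 24 zero rows.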

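    【Comments】:

    • In this response, the hours after 12 come out as 13, 14, etc. I want 13 to be 1 PM, and so on. How can I do that?
    • Are you using Mongo 4.0+? If so, just change the final condition in the $addFields stage to work with integers and subtract 12.
    • No, my mongo version is 3.* :(
    • Then you will have to hard-code the conversion, similar to what I just did for the hours.
    • OK, I'll wait.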