【Question Title】: Filling Month and Year Gaps in Data
【Posted】: 2020-09-25 08:10:24
【Question】:

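I have a date-based table of items, and many of them have gaps between months and years. For instance, if one post is created in January and 5 are created in April, I'll have gaps in February, March, May and June. I've been searching around and found that one approach is to use a numbers table, or to create a temporary months table and join onto it, but I still can't get it to work. Here's what I have so far: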

CREATE OR REPLACE TABLE temp_months (id INT unsigned PRIMARY KEY);
INSERT INTO temp_months
VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12);

SELECT
    COUNT(p.ID) AS COUNT,
    YEAR(p.created_date) as YEAR,
    tm.id as MONTH
FROM
    temp_months tm
LEFT OUTER JOIN
    my_table p
        ON
            MONTH(p.created_date) = tm.id
WHERE
    p.company_id = 123456
GROUP BY
    MONTH, YEAR
ORDER BY
    p.created_date DESC

This gives me the following output, gaps included (almost as if I hadn't joined to the temp table at all):

+-------+------+-------+
| COUNT | YEAR | MONTH |
+-------+------+-------+
|     1 | 2020 |     5 |
|     3 | 2020 |     2 |
|     1 | 2020 |     1 |
|     9 | 2019 |    10 |
|     2 | 2019 |     8 |
+-------+------+-------+

What I'd like it to do is fill the gaps with an empty/NULL/0 COUNT, for example:

+-------+------+-------+
| COUNT | YEAR | MONTH |
+-------+------+-------+
|  NULL | 2020 |     6 |
|     1 | 2020 |     5 |
|  NULL | 2020 |     4 |
|  NULL | 2020 |     3 |
|     3 | 2020 |     2 |
|     1 | 2020 |     1 |
|  NULL | 2019 |    12 |
|  NULL | 2019 |    11 |
|     9 | 2019 |    10 |
|  NULL | 2019 |     9 |
|     2 | 2019 |     8 |
|  NULL | 2019 |     7 |
+-------+------+-------+

I'm just not sure where I went wrong.

【Comments】:

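  • @Nick I have two setups, one running 10.3.14 and one running 10.3.23
  • You need to move the p.company_id = 123456 condition into the JOIN condition, i.e. LEFT OUTER JOIN my_table p ON MONTH(p.created_date) = tm.id AND p.company_id = 123456
  • Fix that and you'll be close, although you may still not get the right years.
  • I moved it there and it still gives me the same result
  • You removed the WHERE clause, right? Can you post some sample my_table data?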
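
The suggestion in the comments can be sketched as follows. A condition on p in the WHERE clause discards the rows where p is NULL, which are exactly the gap rows the LEFT JOIN was meant to keep, so the filter has to live in the ON clause. This is a minimal sketch against the question's temp_months and my_table; it drops the YEAR column on purpose, because a months-only table cannot say which year an empty month belongs to.

```sql
-- Sketch only: filter moved from WHERE into the join condition.
-- Returns one row per month number (12 rows), with counts merged across years.
SELECT
    COUNT(p.ID) AS `COUNT`,
    tm.id       AS `MONTH`
FROM temp_months tm
LEFT OUTER JOIN my_table p
    ON  MONTH(p.created_date) = tm.id
    AND p.company_id = 123456      -- in ON, so unmatched months survive
GROUP BY tm.id
ORDER BY tm.id DESC;
```

Getting a separate row per year and month needs a calendar source that carries the year as well, which is what the solutions below provide.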

Tags: mysql sql mariadb


【Solution 1】:

Assuming you are using MariaDB...

Instead of messy UNIONs, use seq_0_to_100 and + INTERVAL seq MONTH

【Comments】:
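
One way this hint might be applied is sketched below. It assumes MariaDB's Sequence storage engine is available (it provides virtual tables such as seq_0_to_11 with a single column named seq) and reuses my_table, created_date and company_id from the question. The alias month_start and the choice of 12 months counted back from the current month are assumptions made for illustration.

```sql
-- Sketch only: last 12 months, newest first, with zero counts for empty months.
SELECT
    COUNT(p.id)          AS `count`,
    YEAR(m.month_start)  AS yr,
    MONTH(m.month_start) AS mth
FROM (
    -- first day of the current month, shifted back by 0..11 months
    SELECT DATE_SUB(CURDATE(), INTERVAL DAYOFMONTH(CURDATE()) - 1 DAY)
           - INTERVAL seq MONTH AS month_start
    FROM seq_0_to_11
) AS m
LEFT JOIN my_table p
       ON  p.created_date >= m.month_start
       AND p.created_date <  m.month_start + INTERVAL 1 MONTH
       AND p.company_id = 123456
GROUP BY m.month_start
ORDER BY m.month_start DESC;
```

The half-open range (>= start, < next month start) is used so that the join also behaves if created_date is a DATETIME rather than a DATE.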

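    【Solution 2】:

    Here is a query that gives you results for the last n months. It uses a recursive CTE to generate the year/month combinations for the last n months, then LEFT JOINs those values to my_table to get the count for each year/month combination. The query is set up for the last 12 months (the 11 in the recursive part of the CTE); to change it to 24 months, change that value to 23.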

    WITH RECURSIVE dates AS (
      SELECT MAX(created_date) AS mdate, CONCAT(LEFT(MAX(created_date), 8), '01') AS cdate
      FROM my_table
      UNION ALL
      SELECT mdate, cdate - INTERVAL 1 MONTH
      FROM dates
      WHERE cdate > mdate - INTERVAL 11 MONTH
    )
    SELECT COUNT(p.id) AS `count`, YEAR(cdate) AS yr, MONTH(cdate) AS mth
    FROM dates d
    LEFT JOIN my_table p ON p.created_date BETWEEN d.cdate AND LAST_DAY(d.cdate)
    GROUP BY cdate
    ORDER BY cdate DESC
    

    Output (for @zedfoxus's sample data):

    count   yr      mth
    1       2020    5
    0       2020    4
    0       2020    3
    3       2020    2
    1       2020    1
    0       2019    12
    0       2019    11
    9       2019    10
    0       2019    9
    2       2019    8
    0       2019    7
    0       2019    6
    

    Demo on dbfiddle

    This query runs back from the maximum date in the table. To run it back from the current date instead, change the recursive CTE as follows:

    WITH RECURSIVE dates AS (
      SELECT CONCAT(LEFT(CURDATE(), 8), '01') AS mdate, CONCAT(LEFT(CURDATE(), 8), '01') AS cdate
      UNION ALL
      SELECT mdate, cdate - INTERVAL 1 MONTH
      FROM dates
      WHERE cdate > mdate - INTERVAL 11 MONTH
    )
    SELECT COUNT(p.id) AS `count`, YEAR(cdate) AS yr, MONTH(cdate) AS mth
    FROM dates d
    LEFT JOIN my_table p ON p.created_date BETWEEN d.cdate AND LAST_DAY(d.cdate)
    GROUP BY cdate
    ORDER BY cdate DESC
    

    Demo on dbfiddle

    【Comments】:

      【Solution 3】:

      You could change your temp_months table to include the year, like this:

      create table temp_months (yr int, mth int, primary key (yr, mth));
      insert into temp_months values
      (2020, 1), (2020, 2), (2020, 3), (2020, 4), (2020, 5), (2020, 6),
      (2019, 7), (2019, 8), (2019, 9), (2019, 10), (2019, 11), (2019, 12);
      

      Assuming your my_table looks like this,

      create table my_table (created_date date, company_id int, id int);
      insert into my_table values
      ('2020-05-01', 123456, 1),
      ('2020-02-01', 123456, 1),('2020-02-01', 123456, 1),('2020-02-01', 123456, 1),
      ('2020-01-01', 123456, 1),
      ('2019-10-01', 123456, 1),('2019-10-01', 123456, 1),('2019-10-01', 123456, 1),('2019-10-01', 123456, 1),('2019-10-01', 123456, 1),('2019-10-01', 123456, 1),('2019-10-01', 123456, 1),('2019-10-01', 123456, 1),('2019-10-01', 123456, 1),
      ('2019-08-01', 123456, 1),('2019-08-01', 123456, 1);
      

      you can run a query like this:

      select count(p.id), yr as year, mth as month
      from temp_months tm
      left join my_table p
        on month(created_date)=tm.mth
        and year(created_date)=tm.yr
      group by yr, mth
      order by yr desc, mth desc
      

      The result will be

      count(p.id) | year | month
      ----------: | ---: | ----:
                0 | 2020 |     6
                1 | 2020 |     5
                0 | 2020 |     4
                0 | 2020 |     3
                3 | 2020 |     2
                1 | 2020 |     1
                0 | 2019 |    12
                0 | 2019 |    11
                9 | 2019 |    10
                0 | 2019 |     9
                2 | 2019 |     8
                0 | 2019 |     7

      If you want to display NULL instead of 0, you can use:

      with result as (
        select count(p.id) as counter, yr as year, mth as month
        from temp_months tm
        left join my_table p
          on month(created_date)=tm.mth
          and year(created_date)=tm.yr
        group by yr, mth
      )
      -- sort in the outer query; an ORDER BY inside the CTE is not guaranteed to carry through
      select
        case when counter = 0 then NULL else counter end as counter,
        year, month
      from result
      order by year desc, month desc;
      

      The result will be

      counter | year | month
      ------: | ---: | ----:
         NULL | 2020 |     6
            1 | 2020 |     5
         NULL | 2020 |     4
         NULL | 2020 |     3
            3 | 2020 |     2
            1 | 2020 |     1
         NULL | 2019 |    12
         NULL | 2019 |    11
            9 | 2019 |    10
         NULL | 2019 |     9
            2 | 2019 |     8
         NULL | 2019 |     7

      Example: https://dbfiddle.uk/?rdbms=mariadb_10.4&fiddle=2ee3594614494d3397a996d7ff815859

      To populate the temp_months table manually but quickly, I type in one year's values like this:

      insert into temp_months values
      (2019, 1), (2019, 2), (2019, 3), (2019, 4), (2019, 5), (2019, 6),
      (2019, 7), (2019, 8), (2019, 9), (2019, 10), (2019, 11), (2019, 12);
      

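      Then I copy that into a text editor, find/replace 2019 with 2020 and execute it again... and so on. Within seconds, I have several years of data in temp_months.

      Another option is to create a stored procedure that populates it on demand, based on the example here: How to populate a table with a range of dates?

      【Comments】: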

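      • Works great, thank you! I hadn't considered adding the years to the temp table. My only other question is: how would I fill the temp table dynamically, say for the last 12 or 24 months?
      • If I were you, and it's only a few years of data, I would populate it manually. You could also take stackoverflow.com/questions/10132024/… and modify it to populate the dates dynamically with a stored procedure.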
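
      As an alternative to a stored procedure, the dynamic fill asked about above might be sketched like this. It borrows the sequence-table idea from Solution 1, so it assumes MariaDB with the Sequence storage engine available, and it targets the temp_months (yr, mth) table from this answer. The 24-month window and the alias month_start are assumptions for illustration.

```sql
-- Sketch only: add the last 24 months (including the current one) to temp_months.
-- INSERT IGNORE skips (yr, mth) pairs that already exist under the primary key.
INSERT IGNORE INTO temp_months (yr, mth)
SELECT YEAR(d.month_start), MONTH(d.month_start)
FROM (
    -- first day of the current month, shifted back by 0..23 months
    SELECT DATE_SUB(CURDATE(), INTERVAL DAYOFMONTH(CURDATE()) - 1 DAY)
           - INTERVAL seq MONTH AS month_start
    FROM seq_0_to_23
) AS d;
```

      For a 12-month window, seq_0_to_11 would take the place of seq_0_to_23. Rows for older months are left in place, so the reporting query may need its own date filter if only the recent window should appear.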