【Question Title】:How do I split start and end dates with other start and end dates?
【Posted】:2022-01-23 23:55:08
【Question Description】:

I wrote the two queries below, and their results look like this:

Table: company_changes

user_id start_at end_at company_id
189 2020-12-12 2021-03-02 88
189 2021-03-02 2050-01-01 169

Table: enablement_changes

user_id start_at end_at enablement
189 2020-12-12 2021-10-15 disabled
189 2021-10-15 2050-01-01 enabled

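It is important that I know when a user was in a given company_id and whether they were enabled or disabled during that time.

The result I want is a table like this: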

user_id start_at end_at company_id status
189 2020-12-12 2021-03-02 88 disabled
189 2021-03-02 2021-10-15 169 disabled
189 2021-10-15 2050-01-01 169 enabled

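I essentially want to combine the results of these two queries. 2050-01-01 is an arbitrary date in the future. When a user_id has had no further change of status or company_id, the row ends at 2050-01-01 because it describes the user's current state.

Any idea how to solve this?

Here is the fiddle: http://sqlfiddle.com/#!9/5c42b6

First time asking on Stack Overflow... please let me know if my question is not formatted properly.

【Question Discussion】:

    Tags: mysql sql snowflake-cloud-data-platform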


    【Solution 1】:

    If in practice your data is more complex and the time intervals may overlap, for example:

    Table: enablement_changes

    user_id start_at end_at enablement
    189 2020-12-12 2021-10-15 disabled
    189 2020-12-20 2021-02-10 enabled
    189 2021-10-15 2050-01-01 enabled

    then I recommend a more involved solution:

    -- _k: two-row helper used to turn each interval into its two endpoints
    WITH _k AS (
        SELECT 1 AS n
        UNION ALL
        SELECT 2 AS n
    ), _points AS (
      -- every start_at (n = 1) and end_at (n = 2) from both tables becomes a date point
      SELECT user_id, CASE WHEN n = 1 THEN start_at ELSE end_at END AS date_point, n
        FROM company_changes
       CROSS JOIN _k
       UNION
      SELECT user_id, CASE WHEN n = 1 THEN start_at ELSE end_at END AS date_point, n
        FROM enablement_changes
       CROSS JOIN _k
    ), _drank AS (
      -- number the distinct date points of each user in chronological order
      SELECT p.user_id, p.date_point, DENSE_RANK() OVER(PARTITION BY p.user_id ORDER BY p.date_point) AS dr
        FROM _points AS p
       GROUP BY p.user_id, p.date_point
    )
    -- pair consecutive date points (d1, d2) and look up the company and status that overlap each slice
    SELECT d1.user_id, d1.date_point AS start_at, d2.date_point AS end_at, c.company_id, MAX(s.status) AS status -- use MIN instead if 'disabled' should win when both statuses apply at the same time
      FROM _drank AS d1
      JOIN _drank AS d2 ON d1.dr = d2.dr-1 AND d1.user_id = d2.user_id
      LEFT JOIN company_changes AS c    ON d1.user_id = c.user_id AND d1.date_point < c.end_at AND c.start_at < d2.date_point 
      LEFT JOIN enablement_changes AS s ON d1.user_id = s.user_id AND d1.date_point < s.end_at AND s.start_at < d2.date_point 
     GROUP BY d1.user_id, d1.date_point, d2.date_point, c.company_id
     ORDER BY 1,2,3;
    

    db<>fiddle demo
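To make the idea behind the query easier to follow, here is a minimal Python sketch of the same boundary-point technique for a single user. It is an illustration under stated assumptions, not a translation of the SQL: the function name `split_intervals` and the tuple layout `(start, end, value)` are hypothetical, dates are ISO strings (which sort chronologically), and intervals are treated as half-open, start included and end excluded.

```python
def split_intervals(company_rows, status_rows):
    """Split one user's timeline at every start/end date found in either input.

    Each row is (start, end, value) with ISO date strings; this layout is an
    assumption made for the sketch. Intervals are half-open: [start, end).
    """
    # Collect the distinct boundary dates, like _points + DENSE_RANK in the SQL.
    points = sorted({p for s, e, _ in company_rows + status_rows for p in (s, e)})
    result = []
    # Each pair of consecutive boundary dates is one output slice.
    for lo, hi in zip(points, points[1:]):
        companies = [v for s, e, v in company_rows if lo < e and s < hi]
        statuses = [v for s, e, v in status_rows if lo < e and s < hi]
        # max() mirrors MAX(s.status): 'enabled' wins over 'disabled'.
        status = max(statuses) if statuses else None
        # "or [None]" mirrors the LEFT JOIN: keep the slice even without a company.
        for company in companies or [None]:
            result.append((lo, hi, company, status))
    return result


company = [("2020-12-12", "2021-03-02", 88), ("2021-03-02", "2050-01-01", 169)]
status = [
    ("2020-12-12", "2021-10-15", "disabled"),
    ("2020-12-20", "2021-02-10", "enabled"),  # overlaps the row above
    ("2021-10-15", "2050-01-01", "enabled"),
]
for row in split_intervals(company, status):
    print(row)
```

With the overlapping sample rows, the timeline is cut into five slices, and the slice from 2020-12-20 to 2021-02-10 is reported as enabled because the maximum of the two overlapping statuses is taken.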


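      【Solution 2】:

      Lukasz's solution is good.

      But it will also match rows where the end of a c row equals the start of an e row. Usually you want a date/time range to include its start but exclude its end; otherwise you get two rows for the same boundary.

      It will also miss any join where the e row starts after the c row starts and ends before the c row ends, even though you would want that subset to match. Whether this second point matters depends on whether you are doing dense matching (there are always rows) or sparse matching (there are rows only some of the time).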
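The two boundary cases described above can be checked with a small Python sketch. The function names are hypothetical and the dates are made up for illustration; the first predicate mimics a join on `BETWEEN` (closed ranges), the second is the usual half-open overlap test.

```python
def between_join_matches(c_start, c_end, e_start, e_end):
    """BETWEEN-style test: an endpoint of c lies in the closed range [e_start, e_end]."""
    return (e_start <= c_start <= e_end) or (e_start <= c_end <= e_end)


def half_open_overlap(c_start, c_end, e_start, e_end):
    """Half-open test: [c_start, c_end) and [e_start, e_end) share at least one instant."""
    return c_start < e_end and e_start < c_end


# Case 1: c ends exactly where e starts (the ranges only touch).
touching = ("2020-12-12", "2021-10-15", "2021-10-15", "2050-01-01")
print(between_join_matches(*touching))  # True: an unwanted extra match
print(half_open_overlap(*touching))     # False

# Case 2: e lies strictly inside c.
contained = ("2020-12-12", "2050-01-01", "2021-03-02", "2021-10-15")
print(between_join_matches(*contained))  # False: the match is missed
print(half_open_overlap(*contained))     # True
```

ISO date strings are compared here as plain text, which works because that format sorts chronologically.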

      The first problem can be solved with an extra check:

      SELECT c.user_id 
             ,GREATEST(c.start_at, e.start_at) AS start_at
             ,LEAST(c.end_at, e.end_at) AS end_at
             ,c.company_id 
             ,e.status
      FROM company_changes c
      JOIN enablement_changes e
        ON (c.start_at BETWEEN e.start_at AND e.end_at AND c.start_at < e.end_at
          OR c.end_at BETWEEN e.start_at AND e.end_at AND c.end_at >  e.start_at )
          AND c.user_id = e.user_id
      ORDER BY 1,2;
      

      Whereas for sparse matching you need:

      SELECT c.user_id 
             ,GREATEST(c.start_at, e.start_at) AS start_at
             ,LEAST(c.end_at, e.end_at) AS end_at
             ,c.company_id 
             ,e.status
      FROM company_changes c
      JOIN enablement_changes e
          ON c.user_id = e.user_id
              AND (c.end_at > e.start_at AND c.start_at < e.end_at)
      ORDER BY 1,2;
      

      On very large tables with wide ranges, the latter query can be expensive.
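As a quick sanity check, the overlap join can be run against the question's sample rows in an in-memory SQLite database from Python. This is a sketch under assumptions: SQLite stands in for MySQL/Snowflake, its two-argument scalar `MAX`/`MIN` replace `GREATEST`/`LEAST`, dates are stored as ISO text, the enablement column is assumed to be named `status` (as the queries use it), and the function name `overlap_join_rows` is hypothetical.

```python
import sqlite3


def overlap_join_rows():
    """Run the half-open overlap join on the question's sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE company_changes (user_id INT, start_at TEXT, end_at TEXT, company_id INT);
        CREATE TABLE enablement_changes (user_id INT, start_at TEXT, end_at TEXT, status TEXT);
        INSERT INTO company_changes VALUES
            (189, '2020-12-12', '2021-03-02', 88),
            (189, '2021-03-02', '2050-01-01', 169);
        INSERT INTO enablement_changes VALUES
            (189, '2020-12-12', '2021-10-15', 'disabled'),
            (189, '2021-10-15', '2050-01-01', 'enabled');
    """)
    rows = conn.execute("""
        SELECT c.user_id,
               MAX(c.start_at, e.start_at) AS start_at,  -- GREATEST in MySQL/Snowflake
               MIN(c.end_at, e.end_at)     AS end_at,    -- LEAST in MySQL/Snowflake
               c.company_id,
               e.status
        FROM company_changes c
        JOIN enablement_changes e
          ON c.user_id = e.user_id
         AND c.end_at > e.start_at AND c.start_at < e.end_at
        ORDER BY 1, 2
    """).fetchall()
    conn.close()
    return rows


for row in overlap_join_rows():
    print(row)
```

On this data the join returns the three rows asked for in the question. The join condition compares every company row with every enablement row of the same user, which is where the cost on large tables comes from.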


        【Solution 3】:

        Use JOIN with BETWEEN:

        SELECT c.user_id 
               ,GREATEST(c.start_at, e.start_at) AS start_at
               ,LEAST(c.end_at, e.end_at) AS end_at
               ,c.company_id 
               ,e.status
        FROM company_changes c
        JOIN enablement_changes e
          ON (c.start_at BETWEEN e.start_at AND e.end_at
            OR c.end_at BETWEEN e.start_at AND e.end_at)
            AND c.user_id = e.user_id
        ORDER BY 1,2;
        

        db<>fiddle demo

