【Question title】: SQL - Calculate percentage by group, for multiple groups
【Posted】: 2022-02-23 05:41:28
【Question description】:

I have a table in GBQ (Google BigQuery) in the following format:

UserId  Orders  Month  
 XDT     23      1
 XDT     0       4     
 FKR     3       6
 GHR     23      4
 ...     ...    ...

It shows the number of orders per user per month.

I want to calculate the percentage of users who have orders, which I did as follows:

SELECT
  HasOrders,
  ROUND(COUNT(*) * 100 / CAST( SUM(COUNT(*)) OVER () AS float64), 2) Parts
FROM (
    SELECT
        *,
        CASE WHEN Orders = 0 THEN 0 ELSE 1 END AS HasOrders
    FROM `Table` ) 
GROUP BY
  HasOrders
ORDER BY
  Parts

It gives me the following result:

HasOrders   Parts
   0         35
   1         65

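I need to calculate the percentage of users who have orders for each month, with each month summing to 100%.

At the moment I run the query once per month, which is not practical: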

SELECT
  HasOrders,
  ROUND(COUNT(*) * 100 / CAST( SUM(COUNT(*)) OVER () AS float64), 2) Parts
FROM (
    SELECT
        *,
        CASE WHEN Orders = 0 THEN 0 ELSE 1 END AS HasOrders
    FROM `Table` ) 
WHERE Month = 1
GROUP BY
  HasOrders
ORDER BY
  Parts

Is there a way to run the query once and get this result?

HasOrders   Parts   Month
   0         25      1
   1         75      1
   0         45      2
   1         55      2
  ...       ...     ...

【Question comments】:

    Tags: sql group-by google-bigquery


    【Solution 1】:
    SELECT
        SIGN(Orders) AS HasOrders,
        ROUND(COUNT(*) * 100.000 / SUM(COUNT(*)) OVER (PARTITION BY Month), 2) AS Parts,
        Month
    FROM T
    GROUP BY Month, SIGN(Orders)
    ORDER BY Month, SIGN(Orders)
    

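    Postgres demo: https://dbfiddle.uk/?rdbms=postgres_10&fiddle=4cd2d1455673469c2dfc060eccea8020

    You said it is important for the total to add up to 100%. For cases where a percentage falls exactly on an odd multiple of 0.5%, you might therefore consider rounding down for the no-orders group and rounding up for the has-orders group. Alternatively, rounding toward even, or rounding the smaller value down, may be better options: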
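The per-month partitioning can also be checked locally. The following is a minimal sketch, not the answer's own demo: it runs the same idea in SQLite through Python's `sqlite3` module (SQLite 3.25 or later is needed for window functions). The sample rows are invented for illustration, and a `CASE` expression replaces `SIGN(Orders)` because `SIGN` is not available in every SQLite build.

```python
import sqlite3

# In-memory table with made-up rows in the question's (UserId, Orders, Month) shape.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (UserId TEXT, Orders INTEGER, Month INTEGER)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [
        ("XDT", 23, 1), ("FKR", 0, 1), ("GHR", 5, 1), ("ABC", 2, 1),
        ("XDT", 0, 4), ("GHR", 23, 4),
    ],
)

# COUNT(*) is the size of each (Month, HasOrders) group; the window SUM adds
# those group sizes within the same Month, so each month totals 100%.
query = """
SELECT
    CASE WHEN Orders = 0 THEN 0 ELSE 1 END AS HasOrders,
    ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (PARTITION BY Month), 2) AS Parts,
    Month
FROM T
GROUP BY Month, HasOrders
ORDER BY Month, HasOrders
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this sample data, month 1 has one user without orders out of four and month 4 has one out of two, so the percentages within each month add up to 100.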

    WITH DATA AS (
        SELECT SIGN(Orders) AS HasOrders, Month,
            COUNT(*) * 100.000 / SUM(COUNT(*)) OVER (PARTITION BY Month) AS PartsPercent
        FROM T
        GROUP BY Month, SIGN(Orders)
        ORDER BY Month, SIGN(Orders)
    )
    select HasOrders, Month, PartsPercent,
        PartsPercent - TRUNC(PartsPercent) AS Fraction,
        CASE WHEN HasOrders = 0
             THEN FLOOR(PartsPercent) ELSE CEILING(PartsPercent)
        END AS PartsRound0Down,
        CASE WHEN PartsPercent - TRUNC(PartsPercent) = 0.5
                  AND MOD(TRUNC(PartsPercent), 2) = 0
             THEN FLOOR(PartsPercent) ELSE ROUND(PartsPercent) -- halfway up
        END AS PartsRoundTowardEven,
        CASE WHEN PartsPercent - TRUNCATE(PartsPercent) = 0.5 AND PartsPercent < 50
             THEN FLOOR(PartsPercent) ELSE ROUND(PartsPercent) -- halfway up
        END AS PartsSmallestTowardZero
    from DATA
    

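    Testing floating-point values for equality is generally not advisable, and I don't know how BigQuery's float64 will behave in the comparison against 0.5. One half is nevertheless exactly representable in binary. Look at these in a case where the split is 101 vs. 99. I don't have immediate access to BigQuery, so be aware that Postgres's rounding behavior is different: https://dbfiddle.uk/?rdbms=postgres_10&fiddle=c8237e272427a0d1114c3d8056a01a09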
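To make the 101 vs. 99 case concrete, here is a small sketch in Python rather than SQL. It uses the `decimal` module so the halves are exact, and it assumes (purely for illustration) that 99 of 200 users have no orders and 101 have orders. It says nothing about how BigQuery itself rounds.

```python
from decimal import (
    Decimal, ROUND_CEILING, ROUND_FLOOR,
    ROUND_HALF_DOWN, ROUND_HALF_EVEN, ROUND_HALF_UP,
)

ONE = Decimal(1)

def percent(count, total):
    """Exact decimal percentage of count within total."""
    return Decimal(count) * 100 / Decimal(total)

# Assumed split for illustration: 99 users without orders, 101 with orders.
no_orders = percent(99, 200)    # 49.5
has_orders = percent(101, 200)  # 50.5

def rounded_pair(mode_no, mode_has):
    """Round both percentages to whole numbers with the given modes."""
    return (int(no_orders.quantize(ONE, rounding=mode_no)),
            int(has_orders.quantize(ONE, rounding=mode_has)))

half_up = rounded_pair(ROUND_HALF_UP, ROUND_HALF_UP)          # both halves go up
half_even = rounded_pair(ROUND_HALF_EVEN, ROUND_HALF_EVEN)    # halves go to the even neighbor
floor_ceil = rounded_pair(ROUND_FLOOR, ROUND_CEILING)         # no-orders down, has-orders up
smallest_down = rounded_pair(ROUND_HALF_DOWN, ROUND_HALF_UP)  # smaller share's half goes down

for name, pair in [("half up", half_up), ("half even", half_even),
                   ("floor/ceiling", floor_ceil), ("smallest down", smallest_down)]:
    print(name, pair, "total:", sum(pair))

# 49.5 and 0.5 are exactly representable in binary floating point,
# so this particular float equality test is exact.
float_fraction_is_half = (99 * 100 / 200) - 49 == 0.5
```

Plain half-up rounding gives a total of 101 here, while the other three strategies keep the total at 100, which is the motivation for the extra columns in the query above.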

    【Comments】:

      【Solution 2】:

      Consider the following approach:

      select hasOrders, round(100 * parts, 2) as parts, month from (
        select month, 
          countif(orders = 0) / count(*) `0`,
          countif(orders > 0) / count(*) `1`,
        from your_table
        group by month
      )
      unpivot (parts for hasOrders in (`0`, `1`))          
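For readers without BigQuery at hand, the same shape of result can be approximated in SQLite. This is a sketch under assumptions, not the answer's query: SQLite has neither `countif` nor `unpivot`, so `SUM(condition)` stands in for `countif` and a `UNION ALL` stands in for `unpivot`. The sample rows and the column names `share_0` and `share_1` are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (UserId TEXT, orders INTEGER, month INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [
        ("XDT", 23, 1), ("FKR", 0, 1), ("GHR", 5, 1), ("ABC", 2, 1),
        ("XDT", 0, 4), ("GHR", 23, 4),
    ],
)

# Step 1: one row per month with both shares as columns (conditional aggregation).
# Step 2: turn the two share columns into two rows per month (manual unpivot).
query = """
WITH per_month AS (
    SELECT month,
           SUM(orders = 0) * 1.0 / COUNT(*) AS share_0,
           SUM(orders > 0) * 1.0 / COUNT(*) AS share_1
    FROM your_table
    GROUP BY month
)
SELECT 0 AS hasOrders, ROUND(100 * share_0, 2) AS parts, month FROM per_month
UNION ALL
SELECT 1 AS hasOrders, ROUND(100 * share_1, 2) AS parts, month FROM per_month
ORDER BY month, hasOrders
"""
unpivoted = conn.execute(query).fetchall()
for row in unpivoted:
    print(row)
```

Because both shares are divided by the same `COUNT(*)` within a month, they add up to 100% per month before rounding.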
      

      The output looks like this (the output screenshot is not preserved in this copy).

      【Comments】:
