【Question title】: Running Count Distinct using Over Partition By
【Posted】: 2019-10-22 11:33:02
【Question】:

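I have a dataset of user IDs with purchases made over time. I want to show the year-to-date (YTD) distinct count of users who have made a purchase, broken down by state and country. The output would have 5 columns: Country, State, Year, Month, and YTD Count of Distinct Users with purchase activity.

Is there a way to do this? The following code works when I exclude the month from the view and do a distinct count: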

Select Year, Country, State,
   COUNT(DISTINCT (CASE WHEN ActiveUserFlag > 0 THEN MBR_ID END)) AS YTD_Active_Member_Count
From MemberActivity
Where Month <= 5
Group By 1,2,3;

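The problem arises when a user makes purchases in more than one month: I cannot aggregate by month and then sum, because that counts the same user more than once.

For trending purposes, I need to see the YTD count for each month of the year.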
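
To make the double counting concrete, here is a minimal sketch that runs the question's distinct count against a tiny in-memory SQLite database from Python. The sample rows and the exact column types are assumptions made for illustration; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical sample rows: (Year, Month, Country, State, MBR_ID, ActiveUserFlag)
rows = [
    (2019, 1, "US", "CA", 101, 1),
    (2019, 2, "US", "CA", 101, 1),  # same member buys again in February
    (2019, 2, "US", "CA", 102, 1),
    (2019, 3, "US", "CA", 103, 1),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MemberActivity "
    "(Year INT, Month INT, Country TEXT, State TEXT, MBR_ID INT, ActiveUserFlag INT)"
)
conn.executemany("INSERT INTO MemberActivity VALUES (?, ?, ?, ?, ?, ?)", rows)

# Distinct active members per month; adding these up counts member 101 twice.
monthly = conn.execute(
    """
    SELECT Month, COUNT(DISTINCT CASE WHEN ActiveUserFlag > 0 THEN MBR_ID END)
    FROM MemberActivity
    GROUP BY Month
    ORDER BY Month
    """
).fetchall()
sum_of_monthly = sum(cnt for _, cnt in monthly)

# Distinct active members over the whole period: the number actually wanted.
true_ytd = conn.execute(
    """
    SELECT COUNT(DISTINCT CASE WHEN ActiveUserFlag > 0 THEN MBR_ID END)
    FROM MemberActivity
    WHERE Month <= 3
    """
).fetchone()[0]

print(monthly)         # per-month distinct counts
print(sum_of_monthly)  # inflated by the repeat buyer
print(true_ytd)        # distinct members year to date
```

With these rows the monthly counts add up to 4, while only 3 distinct members purchased, which is the gap described above.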

【Question comments】:

    Tags: sql teradata


    【Solution 1】:

    Count each member only once, in the month of their first purchase, aggregate per month, then apply a cumulative sum:

    select Year, Country, State, month,
       sum(cnt)
       over (partition by Year, Country, State
             order by month
             rows unbounded preceding) AS YTD_Active_Member_Count
    from
      (
        Select Year, Country, State, month,
           COUNT(*) as cnt -- 1st purchases per month
        From 
         ( -- this assumes there's at least one new active member per year/month/country
           -- otherwise there would be missing rows 
           Select *
           from MemberActivity
           where ActiveUserFlag > 0 -- only active members
             and Month <= 5
             -- and year = 2019 -- seems to be for this year only
           qualify row_number() -- only first purchase per member/year
                   over (partition by MBR_ID, year
                         order by month) = 1 -- ? probably there's a purchase_date to order by
         ) as dt
        group by 1,2,3,4
     ) as dt
    ;
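
The approach above can be tried without a Teradata system. The following is a minimal sketch, assuming Python's bundled SQLite is version 3.25 or later (window function support). Because SQLite has no QUALIFY, the first-purchase filter is written as a derived table with `WHERE rn = 1`; the sample rows and the aliases `ranked`, `firsts`, and `rn` are hypothetical.

```python
import sqlite3

# Hypothetical sample rows: (Year, Month, Country, State, MBR_ID, ActiveUserFlag)
rows = [
    (2019, 1, "US", "CA", 101, 1),
    (2019, 2, "US", "CA", 101, 1),  # repeat purchase, must not be counted again
    (2019, 2, "US", "CA", 102, 1),
    (2019, 3, "US", "CA", 103, 1),
    (2019, 3, "US", "CA", 102, 1),
    (2019, 1, "US", "NY", 201, 1),
    (2019, 3, "US", "NY", 202, 0),  # inactive row, filtered out
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MemberActivity "
    "(Year INT, Month INT, Country TEXT, State TEXT, MBR_ID INT, ActiveUserFlag INT)"
)
conn.executemany("INSERT INTO MemberActivity VALUES (?, ?, ?, ?, ?, ?)", rows)

query = """
SELECT Year, Country, State, Month,
       SUM(cnt) OVER (PARTITION BY Year, Country, State
                      ORDER BY Month
                      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
           AS YTD_Active_Member_Count
FROM (
    -- number of members whose first purchase of the year falls in this month
    SELECT Year, Country, State, Month, COUNT(*) AS cnt
    FROM (
        -- rn = 1 marks each active member's first purchase row of the year
        SELECT Year, Country, State, Month,
               ROW_NUMBER() OVER (PARTITION BY MBR_ID, Year ORDER BY Month) AS rn
        FROM MemberActivity
        WHERE ActiveUserFlag > 0 AND Month <= 5
    ) AS ranked
    WHERE rn = 1
    GROUP BY Year, Country, State, Month
) AS firsts
ORDER BY Country, State, Month
"""

ytd = conn.execute(query).fetchall()
for row in ytd:
    print(row)
```

In this sample the CA counts grow 1, 2, 3 across January to March, while NY only produces a January row. That is the caveat noted in the query's comment: a month with no new active member yields no row for that state, so a report that needs every month would have to fill those gaps separately.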
    

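    【Comments】:

    • This works perfectly! Can't thank you enough. I've been struggling with this all week! I'm not familiar with the QUALIFY syntax, so I definitely need to read up on it to add to my knowledge base...
    • QUALIFY filters the result of an OLAP function, similar to WHERE and HAVING. It is proprietary Teradata syntax; in standard SQL you have to nest it in a derived table (like Gordon's query): select * from (select OLAP-function AS xx) where xx <= 5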
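
As a rough illustration of the derived-table rewrite mentioned in the comment above, the sketch below uses SQLite from Python as a stand-in for an engine without QUALIFY. The trimmed-down table, the sample rows, and the aliases `ranked` and `rn` are assumptions for the example, not part of the original thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MemberActivity (Year INT, Month INT, MBR_ID INT)")
conn.executemany(
    "INSERT INTO MemberActivity VALUES (?, ?, ?)",
    [(2019, 2, 101), (2019, 1, 101), (2019, 3, 102), (2019, 4, 102)],
)

# A window function cannot be filtered directly in WHERE;
# SQLite is expected to reject this statement with an OperationalError.
direct_filter_error = None
try:
    conn.execute(
        "SELECT * FROM MemberActivity "
        "WHERE ROW_NUMBER() OVER (PARTITION BY MBR_ID, Year ORDER BY Month) = 1"
    )
except sqlite3.OperationalError as exc:
    direct_filter_error = str(exc)

# Portable form: compute the window function in a derived table,
# then filter on its alias in the outer query.
first_rows = conn.execute(
    """
    SELECT Year, Month, MBR_ID
    FROM (
        SELECT ma.*,
               ROW_NUMBER() OVER (PARTITION BY MBR_ID, Year ORDER BY Month) AS rn
        FROM MemberActivity AS ma
    ) AS ranked
    WHERE rn = 1
    ORDER BY MBR_ID
    """
).fetchall()

print(direct_filter_error)
print(first_rows)
```

The outer `WHERE rn = 1` plays the role that `QUALIFY ... = 1` plays in the Teradata query: it keeps each member's earliest row per year.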
    【Solution 2】:

    Count users in the first month they appear:

    select Country, State, year, month,
           sum(case when ActiveUserFlag > 0 and seqnum = 1 then 1 else 0 end) as YTD_Active_Member_Count
    from (select ma.*,
                 -- partition by member as well, so seqnum = 1 marks each member's first row of the year
                 row_number() over (partition by MBR_ID, year order by month) as seqnum
          from MemberActivity ma
         ) ma
    where Month <= 5
    group by Country, State, year, month;
    

    【Comments】:
