【Question title】: User Life Cycle SQL Query Logic in Snowflake
【Posted】: 2021-08-08 05:17:42
【Question】:

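I am building a query to track a user's life cycle through the platform via events. The table EVENTS has 3 columns: USER_ID, DATE_TIME and EVENT_NAME. (The original post included a screenshot of the table here.)

My query should return, for each user, the first timestamp of the registered event, then the timestamp of the next log_in event after it, and finally the timestamp of the next landing_page event after that log_in.

Below is my query: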

WITH FIRST_STEP AS
(SELECT 
USER_ID,
MIN(CASE WHEN EVENT_NAME = 'registered' THEN DATE_TIME ELSE NULL END) AS REGISTERED_TIMESTAMP
FROM EVENTS
GROUP BY 1
),
SECOND_STEP AS
(SELECT * FROM EVENTS
WHERE EVENT_NAME = 'log_in'
ORDER BY DATE_TIME
),
THIRD_STEP AS
(SELECT * FROM EVENTS
WHERE EVENT_NAME = 'landing_page'
ORDER BY DATE_TIME
)
SELECT
a.USER_ID,
a.REGISTERED_TIMESTAMP,
(SELECT
CASE WHEN b.DATE_TIME >= a.REGISTRATIONS_TIMESTAMP THEN b.DATE_TIME END AS LOG_IN_TIMESTAMP
FROM SECOND_STEP
LIMIT 1
),
(SELECT
CASE WHEN c.DATE_TIME >= LOG_IN_TIMESTAMP THEN c.DATE_TIME END AS LANDING_PAGE_TIMESTAMP
FROM THIRD_STEP
LIMIT 1
)
FROM FIRST_STEP AS a
LEFT JOIN SECOND_STEP AS b ON a.USER_ID = b.USER_ID
LEFT JOIN THIRD_STEP AS c ON b.USER_ID = c.USER_ID;

Unfortunately, when I try to run the query I get the error "SQL compilation error: Unsupported subquery type cannot be evaluated".
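The error is likely triggered by the scalar subqueries in the SELECT list, which rely on LIMIT 1 and reference the outer aliases. One way to sidestep them is to chain plain LEFT JOINs with MIN, as in the sketch below. It is an illustration under stated assumptions, not a tested Snowflake query: it runs against SQLite (Python's built-in sqlite3) as a stand-in, the sample rows are hypothetical, and timestamps are stored as ISO-8601 text. Unlike the original FIRST_STEP, it drops users who have no registered event.

```python
import sqlite3

# Hypothetical sample rows (not from the original post):
# (USER_ID, DATE_TIME, EVENT_NAME)
ROWS = [
    (1, "2021-08-01 10:00:00", "registered"),
    (1, "2021-08-01 10:02:00", "landing_page"),  # before log_in, must be ignored
    (1, "2021-08-01 10:05:00", "log_in"),
    (1, "2021-08-01 10:06:00", "landing_page"),
    (1, "2021-08-02 09:00:00", "log_in"),
    (2, "2021-08-03 12:00:00", "registered"),    # user 2 never logs in
]

# Each step joins back to EVENTS and keeps the earliest matching
# timestamp at or after the previous step. No scalar subqueries.
QUERY = """
WITH FIRST_STEP AS (
    SELECT USER_ID, MIN(DATE_TIME) AS REGISTERED_TIMESTAMP
    FROM EVENTS
    WHERE EVENT_NAME = 'registered'
    GROUP BY USER_ID
),
SECOND_STEP AS (
    SELECT a.USER_ID, a.REGISTERED_TIMESTAMP,
           MIN(b.DATE_TIME) AS LOG_IN_TIMESTAMP
    FROM FIRST_STEP AS a
    LEFT JOIN EVENTS AS b
      ON b.USER_ID = a.USER_ID
     AND b.EVENT_NAME = 'log_in'
     AND b.DATE_TIME >= a.REGISTERED_TIMESTAMP
    GROUP BY a.USER_ID, a.REGISTERED_TIMESTAMP
)
SELECT s.USER_ID, s.REGISTERED_TIMESTAMP, s.LOG_IN_TIMESTAMP,
       MIN(c.DATE_TIME) AS LANDING_PAGE_TIMESTAMP
FROM SECOND_STEP AS s
LEFT JOIN EVENTS AS c
  ON c.USER_ID = s.USER_ID
 AND c.EVENT_NAME = 'landing_page'
 AND c.DATE_TIME >= s.LOG_IN_TIMESTAMP
GROUP BY s.USER_ID, s.REGISTERED_TIMESTAMP, s.LOG_IN_TIMESTAMP
ORDER BY s.USER_ID
"""


def run_funnel(rows):
    """Load rows into an in-memory EVENTS table and run the funnel query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE EVENTS (USER_ID INTEGER, DATE_TIME TEXT, EVENT_NAME TEXT)"
    )
    conn.executemany("INSERT INTO EVENTS VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


RESULT = run_funnel(ROWS)
for row in RESULT:
    print(row)
```

Because every step is a LEFT JOIN, a user who is missing a later event still appears in the result, with NULL in the corresponding columns. The SQL itself uses only CTEs, joins and MIN, so it should port to Snowflake with little change, but that has not been verified here.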

【Discussion】:

    Tags: sql subquery snowflake-cloud-data-platform common-table-expression window-functions


    【Solution 1】:

    This is a perfect use case for MATCH_RECOGNIZE.

    The pattern you are looking for is register anything* login anything* landing, and the measures are min(iff(event_name='x', date_time, null)).

    Set the output to one row per match.

    Untested sample query:

    select *
    from events
    match_recognize(
        partition by user_id
        order by date_time
        -- earliest timestamp of each event type within the matched rows
        measures min(iff(event_name='registered', date_time, null)) as t1
          , min(iff(event_name='log_in', date_time, null)) as t2
          , min(iff(event_name='landing_page', date_time, null)) as t3
        one row per match
        -- "anything" has no DEFINE entry, so it matches any row
        pattern(register anything* login anything* landing)
        define
            register as event_name = 'registered'
            , login as event_name = 'log_in'
            , landing as event_name = 'landing_page'
    );
    

    【Comments】:

    • SELECT * FROM EVENTS MATCH_RECOGNIZE( PARTITION BY USER_ID ORDER BY DATE_TIME MEASURES MIN(IFF(EVENT_NAME='registered', DATE_TIME, NULL)) AS REGISTER_TIMESTAMP, MIN(IFF(EVENT_NAME='log_in', DATE_TIME, NULL)) AS LOG_IN_TIMESTAMP, MIN(IFF(EVENT_NAME='landing_page', DATE_TIME, NULL)) AS LANDING_PAGE_TIMESTAMP ONE ROW PER MATCH PATTERN(STEP_1 STEP_2 STEP_3) DEFINE STEP_1 AS EVENT_NAME='registered', STEP_2 AS EVENT_NAME='log_in', STEP_3 AS EVENT_NAME='landing_page' );
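    • Is there a way to modify the PATTERN and DEFINE logic, or the MEASURES logic, in the query to get what I want from my table (return the first timestamp of the first event, then the next timestamp of the second event, then the next timestamp of the third event, and so on)?
    • The number of records in the result set keeps changing every time I run the query.
    • How do we handle events that are missing from the user flow? If any event is missing from a user's path through the funnel, the query returns nothing for that user, because the pattern search fails when a defined event is absent.
    • Post a new question with sample data and the desired results, and you will get an answer to what that question asks.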
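The behaviour asked about in the comments above (first event, then the next occurrence of each later event, with missing steps tolerated) can be sketched outside SQL as a single pass over each user's ordered events. This is only an illustration of the intended logic in Python, not a Snowflake solution. The function name `funnel_timestamps`, the sample rows and the step list are all hypothetical.

```python
from collections import defaultdict


def funnel_timestamps(rows, steps):
    """For each user, return one timestamp per step: the first occurrence
    of step N that comes at or after the recorded step N-1.
    Steps that never occur in order are left as None."""
    by_user = defaultdict(list)
    for user_id, date_time, event_name in rows:
        by_user[user_id].append((date_time, event_name))

    result = {}
    for user_id, events in by_user.items():
        # Stable sort on the timestamp only; ties keep their input order.
        events.sort(key=lambda event: event[0])
        stamps = [None] * len(steps)
        position = 0  # index of the step we are currently waiting for
        for date_time, event_name in events:
            if position < len(steps) and event_name == steps[position]:
                stamps[position] = date_time
                position += 1
        result[user_id] = stamps
    return result


# Hypothetical sample rows: (USER_ID, DATE_TIME, EVENT_NAME)
ROWS = [
    (1, "2021-08-01 10:00:00", "registered"),
    (1, "2021-08-01 10:02:00", "landing_page"),  # before log_in, ignored
    (1, "2021-08-01 10:05:00", "log_in"),
    (1, "2021-08-01 10:06:00", "landing_page"),
    (1, "2021-08-02 09:00:00", "log_in"),
    (2, "2021-08-03 12:00:00", "registered"),    # user 2 never logs in
]

FUNNEL = funnel_timestamps(ROWS, ["registered", "log_in", "landing_page"])
for user_id, stamps in sorted(FUNNEL.items()):
    print(user_id, stamps)
```

A user who drops out partway through the funnel still gets a row, with None for the steps that were never reached. The step list can hold any number of events.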