【Posted】: 2021-08-30 11:05:42
【Question】:
I have the following table in a Snowflake data warehouse:
| Client_ID | Appointment_Date | Store_ID |
|---|---|---|
| Client_1 | 1/1/2021 | Store_1 |
| Client_2 | 1/1/2021 | Store_1 |
| Client_1 | 2/1/2021 | Store_2 |
| Client_2 | 2/1/2021 | Store_1 |
| Client_1 | 3/1/2021 | Store_1 |
| Client_2 | 3/1/2021 | Store_1 |
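To make the sample easier to reproduce, the rows above could be loaded roughly as follows. This is only a sketch: the table name `appointments` and the column types are assumptions (the queries in this post use the placeholder `table`), and the dates are read as month/day/year, which the post does not confirm.

```sql
-- Hypothetical table name and column types; not taken from the original post.
CREATE TABLE appointments (
    Client_ID        VARCHAR,
    Appointment_Date DATE,
    Store_ID         VARCHAR
);

-- Sample rows from the table above.
-- Assumption: 1/1/2021, 2/1/2021, 3/1/2021 are M/D/YYYY (Jan 1, Feb 1, Mar 1).
-- Reading them as D/M/YYYY would change the dates but not their order.
INSERT INTO appointments (Client_ID, Appointment_Date, Store_ID) VALUES
    ('Client_1', '2021-01-01', 'Store_1'),
    ('Client_2', '2021-01-01', 'Store_1'),
    ('Client_1', '2021-02-01', 'Store_2'),
    ('Client_2', '2021-02-01', 'Store_1'),
    ('Client_1', '2021-03-01', 'Store_1'),
    ('Client_2', '2021-03-01', 'Store_1');
```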
I need to count the number of unique Store_ID values for each Client_ID, accumulated in Appointment_Date order. Here is my desired output:
| Client_ID | Appointment_Date | Store_ID | Count_Different_Stores |
|---|---|---|---|
| Client_1 | 1/1/2021 | Store_1 | 1 |
| Client_2 | 1/1/2021 | Store_1 | 1 |
| Client_1 | 2/1/2021 | Store_2 | 2 |
| Client_2 | 2/1/2021 | Store_1 | 1 |
| Client_1 | 3/1/2021 | Store_1 | 2 |
| Client_2 | 3/1/2021 | Store_1 | 1 |
In other words, I want a running count of the different stores each client has visited over time. I tried:
SELECT Client_ID, Appointment_Date, Store_ID,
DENSE_RANK() OVER (PARTITION BY CLIENT_ID, STORE_ID ORDER BY APPOINTMENT_DATE)
FROM table
This yields:
| Client_ID | Appointment_Date | Store_ID | Count_Different_Stores |
|---|---|---|---|
| Client_1 | 1/1/2021 | Store_1 | 1 |
| Client_2 | 1/1/2021 | Store_1 | 1 |
| Client_1 | 2/1/2021 | Store_2 | 2 |
| Client_2 | 2/1/2021 | Store_1 | 2 |
| Client_1 | 3/1/2021 | Store_1 | 3 |
| Client_2 | 3/1/2021 | Store_1 | 3 |
And also:
SELECT Client_ID, Store_ID,
DENSE_RANK() OVER (PARTITION BY CLIENT_ID, STORE_ID)
FROM table
--With a join back to the original table with all my needed data
This yields:
| Client_ID | Appointment_Date | Store_ID | Count_Different_Stores |
|---|---|---|---|
| Client_1 | 1/1/2021 | Store_1 | 2 |
| Client_2 | 1/1/2021 | Store_1 | 1 |
| Client_1 | 2/1/2021 | Store_2 | 1 |
| Client_2 | 2/1/2021 | Store_1 | 1 |
| Client_1 | 3/1/2021 | Store_1 | 1 |
| Client_2 | 3/1/2021 | Store_1 | 1 |
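The second is closer to what I need, but the ranking of the different stores does not necessarily follow Appointment_Date order, which is critical. Sometimes the order is correct and sometimes it is not.
Any insight would be helpful, and I am happy to provide more information.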
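For reference, the running count described above can be expressed in fairly portable SQL by flagging each client's first appointment at a store and then summing those flags in date order. This is a minimal sketch, not an answer confirmed in this thread: the table name `appointments` is hypothetical, and it assumes Appointment_Date is stored as a DATE and that a client has at most one appointment per date.

```sql
-- Hypothetical table: appointments (Client_ID, Appointment_Date, Store_ID).
-- Assumes Appointment_Date is a DATE so that ORDER BY sorts chronologically.
WITH flagged AS (
    SELECT
        Client_ID,
        Appointment_Date,
        Store_ID,
        -- 1 on the first appointment a client has at a given store, else 0
        CASE
            WHEN ROW_NUMBER() OVER (
                     PARTITION BY Client_ID, Store_ID
                     ORDER BY Appointment_Date
                 ) = 1 THEN 1
            ELSE 0
        END AS is_first_visit
    FROM appointments
)
SELECT
    Client_ID,
    Appointment_Date,
    Store_ID,
    -- running total of first visits = distinct stores seen so far
    SUM(is_first_visit) OVER (
        PARTITION BY Client_ID
        ORDER BY Appointment_Date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS Count_Different_Stores
FROM flagged
ORDER BY Appointment_Date, Client_ID;
```

On the six sample rows this should give 1, 2, 2 for Client_1 and 1, 1, 1 for Client_2, matching the desired output. If a client can have several appointments on the same date, both ORDER BY clauses inside the window definitions would need a tie-breaker column to keep the result deterministic.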
【Comments】:
- I'm lost. You start with four rows. Where do the extra rows and dates come from?
Tags: sql database database-design snowflake-cloud-data-platform