【Posted】: 2021-08-07 09:40:11
【Question】:
I have a table that looks like this:
base_data
| session_id | event_type | player_guess | correct_answer |
|---|---|---|---|
| 1 | guess | 'python' | NULL |
| 1 | guess | 'javascript' | NULL |
| 1 | guess | 'scala' | NULL |
| 1 | all_answered | NULL | ['python','javascript','hadoop'] |
| 2 | guess | 'triangle' | NULL |
| 2 | guess | 'square' | NULL |
| 2 | all_answered | NULL | ['triangle','square'] |
I am trying to derive a new column named was_guess_correct, defined as follows:
For each session_id, match the player_guess values against the data in correct_answer. The correct answer for a session_id is available on the row where event_type = 'all_answered'.
The result should look like this:
| session_id | event_type | player_guess | correct_answer | was_guess_correct |
|---|---|---|---|---|
| 1 | guess | 'python' | NULL | 1 |
| 1 | guess | 'javascript' | NULL | 1 |
| 1 | guess | 'scala' | NULL | 0 |
| 1 | all_answered | NULL | ['python','javascript','hadoop'] | 1 |
| 2 | guess | 'triangle' | NULL | 1 |
| 2 | guess | 'square' | NULL | 1 |
| 2 | all_answered | NULL | ['triangle','square'] | 1 |
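The values in the all_answered row are unique and sorted (so the check could rely on the order, or a plain IN-style membership test might work as well).
For rows whose event_type is all_answered, the was_guess_correct column does not matter. It can be 1 or 0, whichever makes the query easier.
How can I compute this column in SQL/Presto?
If possible, I would like to see how to do it both with a JOIN/UNNEST and inline (without a JOIN).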
【Discussion】:
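A minimal sketch of both variants asked for in the question, not a tested solution. It assumes that correct_answer is stored as array(varchar), that player_guess holds the bare string (the quotes in the table being display notation only), and that each session_id has exactly one all_answered row. The alias names (answers, t, answer) are made up for illustration.

```sql
-- Variant 1: inline, no JOIN.
-- arbitrary() returns some non-NULL value of its argument, so used as a window
-- aggregate it copies the session's answer array onto every row of that session.
-- contains() yields NULL when player_guess is NULL, which falls through to 0.
SELECT
    session_id,
    event_type,
    player_guess,
    correct_answer,
    CASE
        WHEN contains(
                 arbitrary(correct_answer) OVER (PARTITION BY session_id),
                 player_guess
             ) THEN 1
        ELSE 0
    END AS was_guess_correct
FROM base_data;

-- Variant 2: JOIN + UNNEST.
-- Explode each all_answered array into one row per answer, then LEFT JOIN the
-- guesses against it. Because the answers are unique within a session, the join
-- cannot duplicate rows of base_data.
WITH answers AS (
    SELECT b.session_id, t.answer
    FROM base_data AS b
    CROSS JOIN UNNEST(b.correct_answer) AS t (answer)
    WHERE b.event_type = 'all_answered'
)
SELECT
    b.session_id,
    b.event_type,
    b.player_guess,
    b.correct_answer,
    CASE WHEN a.answer IS NOT NULL THEN 1 ELSE 0 END AS was_guess_correct
FROM base_data AS b
LEFT JOIN answers AS a
    ON a.session_id = b.session_id
   AND a.answer = b.player_guess;
```

Both queries give 0 on the all_answered rows, which the question allows. The two statements are meant to be run separately. If a session could have more than one all_answered row, the first variant would pick one of the arrays arbitrarily, so that assumption is worth checking against the real data.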
Tags: sql amazon-athena presto