【Question title】: BigQuery Nested Challenge Involving Joins and Having (or Where) Clauses
【Posted】: 2015-01-09 08:19:15
【Question】:

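I've run into a challenge that's a bit over my head, so I'll jump right in.

I have a sample dataset in BigQuery that you can use for testing: https://bigquery.cloud.google.com/table/robotic-charmer-726:bl_test_data.complex_problem

I need to work out the SQL that queries my table and does the following:

Aggregate using the rules below (I'll start with one email address and add another at the end):

As a general note up front, everything should be lowercased so that Ben = ben when aggregating.

Email is the broadest level of aggregation, grouped by its lowercase form.

The amounts for each lowercase email are summed (highlighted in blue in the original screenshot).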
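
The lowercase-then-sum rule for emails can be sketched in Python on a few made-up rows. This is only an illustration of the rule: the sample rows and the `total_by_email` helper are hypothetical and are not taken from the linked table.

```python
from collections import defaultdict

# Hypothetical sample rows: (email, amount). Not the real table contents.
rows = [
    ("Ben@Example.com", 100),
    ("ben@example.com", 60),
    ("BEN@EXAMPLE.COM", 150),
]

def total_by_email(rows):
    """Sum amounts per email, treating emails case-insensitively."""
    totals = defaultdict(int)
    for email, amount in rows:
        totals[email.lower()] += amount
    return dict(totals)

print(total_by_email(rows))  # {'ben@example.com': 310}
```

All three spellings collapse into one key, so the email total covers every row for that address.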

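Next come the first name and last name, which are chosen by the highest summed amount for the lowercase first name and last name taken together.

Note that first name and last name are never considered on their own. In the example, Ben sums to 160 and Kathleen to only 150, yet Kathleen is still selected because her full name sums higher than any other full name.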
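
A minimal Python sketch of the full-name rule, assuming made-up rows for a single email. The totals of 160 for Ben and 150 for Kathleen come from the question; the surnames and the split across rows are invented for illustration, and `pick_full_name` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical rows for one email: (first_name, last_name, amount).
rows = [
    ("Ben", "Smith", 90),
    ("ben", "Jones", 70),      # "ben" alone totals 160 across two surnames
    ("Kathleen", "Doe", 100),
    ("kathleen", "DOE", 50),   # "kathleen doe" totals 150 as a full name
]

def pick_full_name(rows):
    """Return the lowercase (first, last) pair with the highest summed amount."""
    totals = defaultdict(int)
    for first, last, amount in rows:
        totals[(first.lower(), last.lower())] += amount
    # On a tie, max() keeps the pair that was seen first.
    return max(totals, key=totals.get)

print(pick_full_name(rows))  # ('kathleen', 'doe')
```

Grouping on the (first, last) pair rather than on the first name alone is what lets "kathleen doe" at 150 beat "ben smith" at 90, even though Ben's rows add up to more overall.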

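Next, the lowercase full address for the SELECTED NAME is chosen by the highest summed amount.

As with the name, the full address takes all of its columns together.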
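
The address step can be sketched the same way, restricted to rows that belong to the already selected name. The rows below and the `pick_address` helper are hypothetical. Leaving the zip as-is while lowercasing the other columns mirrors what the question describes but is otherwise an assumption.

```python
from collections import defaultdict

# Hypothetical rows for one email:
# (first_name, last_name, address, city, state, zip, amount)
rows = [
    ("Kathleen", "Doe", "1 Main St", "Austin", "TX", "78701", 40),
    ("kathleen", "doe", "1 MAIN ST", "austin", "tx", "78701", 30),
    ("Kathleen", "Doe", "9 Oak Ave", "Dallas", "TX", "75201", 60),
    ("Ben", "Smith", "5 Elm Rd", "Waco", "TX", "76701", 90),
]

def pick_address(rows, selected_name):
    """Return the full lowercase address with the highest summed amount
    among rows whose lowercase (first, last) equals selected_name."""
    totals = defaultdict(int)
    for first, last, address, city, state, zip_code, amount in rows:
        if (first.lower(), last.lower()) != selected_name:
            continue  # addresses of other names are ignored entirely
        key = (address.lower(), city.lower(), state.lower(), zip_code)
        totals[key] += amount
    return max(totals, key=totals.get)

print(pick_address(rows, ("kathleen", "doe")))
# ('1 main st', 'austin', 'tx', '78701')
```

Ben's row carries the largest single amount, but it never competes because the address is chosen only among the selected name's rows.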

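Now I'll add a second email address and we'll do the same thing.

Each lowercase email address is considered on its own. I now realize I should have made that clearer in my pictures, but I don't want to redo them... too much work. So I hope I've been clear enough.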
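
Putting the rules together for several emails, a rough Python sketch might look like the following. Everything here is hypothetical: the rows, the `top_key` and `summarize` helpers, and the choice to fold city, state, and zip into a single address string to keep the example short.

```python
from collections import defaultdict

# Hypothetical rows: (email, first_name, last_name, address, amount).
rows = [
    ("A@x.com", "Ben", "Smith", "5 Elm Rd", 90),
    ("a@x.com", "Ben", "Jones", "7 Fir Ln", 70),
    ("a@x.com", "Kathleen", "Doe", "1 Main St", 70),
    ("a@x.com", "kathleen", "doe", "9 Oak Ave", 80),
    ("b@y.com", "Ann", "Lee", "2 Pine Ct", 20),
    ("B@Y.com", "ann", "lee", "2 pine ct", 5),
]

def top_key(pairs):
    """Sum amounts per key and return the key with the largest total."""
    totals = defaultdict(int)
    for key, amount in pairs:
        totals[key] += amount
    return max(totals, key=totals.get)

def summarize(rows):
    """Map each lowercase email to (selected name, selected address, total)."""
    by_email = defaultdict(list)
    for email, first, last, address, amount in rows:
        by_email[email.lower()].append(
            (first.lower(), last.lower(), address.lower(), amount)
        )
    result = {}
    for email, group in by_email.items():
        # Step 1: best full name within this email only.
        name = top_key(((first, last), amt) for first, last, _, amt in group)
        # Step 2: best address among that name's rows only.
        address = top_key(
            (addr, amt) for first, last, addr, amt in group if (first, last) == name
        )
        # Step 3: the reported amount is the total for the whole email.
        total = sum(amt for *_, amt in group)
        result[email] = (name, address, total)
    return result

for email, picked in summarize(rows).items():
    print(email, picked)
```

Each email is processed in isolation, so a name or address that dominates one email has no influence on another.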

I hope you find this a very interesting challenge!

【Question comments】:

  • What defines which "Ben" gets selected?

Tags: sql join nested google-bigquery having


【Solution 1】:

There may be a cleaner way to do this, but this will get you the answer you're after:

select email, first_name, last_name, address, city, state, zip, total_amount amount
from (
    select d.email email, d.first_name first_name, d.last_name last_name, d.amount amount, d.total_amount total_amount, e.address address, e.city city, e.state state, e.zip zip, row_number() over (partition by e.email order by e.amount desc) ord
    from (
        select a.email email, a.first_name first_name, a.last_name last_name, b.amount amount, c.amount total_amount
        from (
          SELECT  
            lower(email) email, lower(first_name) first_name, lower(last_name) last_name, lower(concat(first_name, last_name)) as name_group, lower(address) address, lower(city) city, lower(state) state, lower(concat(address,city,state)) as location_group, zip, sum(amount) amount 
          FROM [robotic-charmer-726:bl_test_data.complex_problem]
          group by 1,2,3,4,5,6,7,8,9
        ) a
        inner join (
          select email, first_name, last_name, name_group, amount
          from (
            select email, first_name, last_name, name_group, amount, row_number() over (partition by email order by amount desc) as ord
            from (
              select lower(email) email, lower(first_name) first_name, lower(last_name) last_name, lower(concat(first_name,last_name)) as name_group, sum(amount) amount
              from [robotic-charmer-726:bl_test_data.complex_problem]
              group by 1, 2, 3, 4
            )
          )
          where ord = 1
        ) b
        on a.name_group = b.name_group and a.email = b.email
        inner join (
          select lower(email) email, sum(amount) amount
          from [robotic-charmer-726:bl_test_data.complex_problem]
          group by 1
        ) c
        on a.email = c.email
        group by 1,2,3,4,5
    ) d
    inner join (
        select lower(email) email, lower(first_name) first_name, lower(last_name) last_name, lower(address) address, lower(city) city, lower(state) state, zip,lower(concat(lower(address),lower(city), lower(state), zip)) as location_group, sum(amount) amount
        from [robotic-charmer-726:bl_test_data.complex_problem]
        group by 1,2,3,4,5,6,7,8
    ) e
    on d.email = e.email and d.first_name = e.first_name and d.last_name = e.last_name
)
where ord = 1
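
The query leans on one pattern twice: `row_number() over (partition by email order by amount desc)` followed by `where ord = 1`, which keeps the highest-amount row per email. As a rough illustration of what that pattern computes, here is a Python sketch on hypothetical pre-aggregated rows. The data and the `with_row_number` helper are made up, and the tie order within a partition is an artifact of this sketch, not something the SQL guarantees.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical pre-aggregated rows: (email, name_group, amount).
rows = [
    ("a@x.com", "bensmith", 90),
    ("a@x.com", "kathleendoe", 150),
    ("b@y.com", "annlee", 25),
    ("a@x.com", "benjones", 70),
]

def with_row_number(rows):
    """Attach ord = 1, 2, ... within each email, highest amount first."""
    # Sort by partition key, then by amount descending.
    ordered = sorted(rows, key=lambda r: (r[0], -r[2]))
    numbered = []
    for _, group in groupby(ordered, key=itemgetter(0)):
        for ord_, row in enumerate(group, start=1):
            numbered.append(row + (ord_,))
    return numbered

# Equivalent of "where ord = 1": keep the top row of each partition.
top = [r for r in with_row_number(rows) if r[3] == 1]
print(top)
# [('a@x.com', 'kathleendoe', 150, 1), ('b@y.com', 'annlee', 25, 1)]
```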

【Comments】:

  • Oh my gosh, Gil. Thank you so much!