【Posted】: 2022-01-01 04:33:10
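【Question】:
I wrote a query in Apache Spark that should take multiple rows of customer data and roll them up into a single row showing which types of products each customer has opened. So data that looks like this: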
Customer  Product
1         Savings
1         Checking
1         Auto
should end up looking like this:
Customer  Product
1         Savings/Checking/Auto
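The query currently still returns multiple rows. I tried GROUP BY, but instead of showing all the products a customer has, it shows only one product.
Is there a way to do this in Apache Spark or in SQL (the dialect I have is really similar to Apache Spark's)? Unfortunately, I don't have MySQL, and I don't think IT will install it for me.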
SELECT
    "ACCOUNT"."account_customerkey" AS "account_customerkey",
    max(
        concat(
            case when Savings=1 then ' Savings' end,
            case when Checking=1 then ' Checking ' end,
            case when CD=1 then ' CD /' end,
            case when IRA=1 then ' IRA /' end,
            case when StandardLoan=1 then ' SL /' end,
            case when Auto=1 then ' Auto /' end,
            case when Mortgage=1 then ' Mortgage /' end,
            case when CreditCard=1 then ' CreditCard ' end
        )
    ) AS Description
FROM "ACCOUNT" "ACCOUNT"
inner join (
    SELECT
        "ACCOUNT"."account_customerkey" AS "customerkey",
        CASE WHEN "ACCOUNT"."account_producttype" = 'Savings' THEN 1 ELSE NULL END AS Savings,
        CASE WHEN "ACCOUNT"."account_producttype" = 'Checking' THEN 1 ELSE NULL END AS Checking,
        CASE WHEN "ACCOUNT"."account_producttype" = 'CD' THEN 1 ELSE NULL END AS CD,
        CASE WHEN "ACCOUNT"."account_producttype" = 'IRA' THEN 1 ELSE NULL END AS IRA,
        CASE WHEN "ACCOUNT"."account_producttype" = 'Standard Loan' THEN 1 ELSE NULL END AS StandardLoan,
        CASE WHEN "ACCOUNT"."account_producttype" = 'Auto' THEN 1 ELSE NULL END AS Auto,
        CASE WHEN "ACCOUNT"."account_producttype" = 'Mortgage' THEN 1 ELSE NULL END AS Mortgage,
        CASE WHEN "ACCOUNT"."account_producttype" = 'Credit Card' THEN 1 ELSE NULL END AS CreditCard
    FROM "ACCOUNT" "ACCOUNT"
) a on "account_customerkey" = a."customerkey"
GROUP BY
    "ACCOUNT"."account_customerkey"
【Comments】:
- Could you share your expected result?
Tags: sql apache-spark apache-spark-sql