【Question title】: SQL - Selecting a column value based on max value in another column and combination of values in another column - Teradata
【Posted】: 2018-01-02 04:26:05
【Question description】:

My input Teradata table accnt_pln_info contains the following sample data.

Account_number   Plan_code   Plan_Date    Base_Amount     Biz_Date
ACCT1            R           2017-JAN-01         100      2017-MAY-31
ACCT1            R           2017-JAN-11          30      2017-MAY-31
ACCT1            K           2017-JAN-22          80      2017-MAY-31
ACCT1            B           2017-JAN-13          50      2017-MAY-31
ACCT1            C           2017-JAN-18         180      2017-MAY-31
ACCT2            R           2017-JAN-12          70      2017-MAY-31
ACCT2            C           2017-JAN-02          90      2017-MAY-31
ACCT2            R           2017-JAN-08          10      2017-MAY-31
ACCT2            D           2017-JAN-02          40      2017-MAY-31
ACCT2            B           2017-FEB-24          14      2017-MAY-31
ACCT2            K           2017-FEB-12          79      2017-MAY-31
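
To experiment with these sample rows, a minimal setup sketch might look like the following. The column types and the use of a volatile table are assumptions, since the post does not show the table definition.

```sql
-- Assumed column types; the real table definition is not shown in the post.
CREATE VOLATILE TABLE accnt_pln_info (
    Account_number VARCHAR(10),
    Plan_code      CHAR(1),
    Plan_Date      DATE,
    Base_Amount    INTEGER,
    Biz_Date       DATE
) ON COMMIT PRESERVE ROWS;

-- Sample rows copied from the listing above.
INSERT INTO accnt_pln_info VALUES ('ACCT1', 'R', DATE '2017-01-01', 100, DATE '2017-05-31');
INSERT INTO accnt_pln_info VALUES ('ACCT1', 'R', DATE '2017-01-11',  30, DATE '2017-05-31');
INSERT INTO accnt_pln_info VALUES ('ACCT1', 'K', DATE '2017-01-22',  80, DATE '2017-05-31');
INSERT INTO accnt_pln_info VALUES ('ACCT1', 'B', DATE '2017-01-13',  50, DATE '2017-05-31');
INSERT INTO accnt_pln_info VALUES ('ACCT1', 'C', DATE '2017-01-18', 180, DATE '2017-05-31');
INSERT INTO accnt_pln_info VALUES ('ACCT2', 'R', DATE '2017-01-12',  70, DATE '2017-05-31');
INSERT INTO accnt_pln_info VALUES ('ACCT2', 'C', DATE '2017-01-02',  90, DATE '2017-05-31');
INSERT INTO accnt_pln_info VALUES ('ACCT2', 'R', DATE '2017-01-08',  10, DATE '2017-05-31');
INSERT INTO accnt_pln_info VALUES ('ACCT2', 'D', DATE '2017-01-02',  40, DATE '2017-05-31');
INSERT INTO accnt_pln_info VALUES ('ACCT2', 'B', DATE '2017-02-24',  14, DATE '2017-05-31');
INSERT INTO accnt_pln_info VALUES ('ACCT2', 'K', DATE '2017-02-12',  79, DATE '2017-05-31');
```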

Expected output (for the filter condition Biz_Date = 2017-MAY-31):

Account_number   RK_Plan_Date    RK_Base_Amount   RC_Plan_Date   RC_Base_Amount
ACCT1            2017-JAN-22          80          2017-JAN-18         180
ACCT2            2017-FEB-12          79          2017-JAN-12          70    

Logic:

The filter condition Biz_Date = 2017-MAY-31 is applied because the table has multiple distinct biz_dates.
Group by Account_Number.
For Plan_Code in (R,K), find the max Plan_Date and then get that row's Base_Amount.
For Plan_Code in (R,C), find the max Plan_Date and then get that row's Base_Amount.

Example: for ACCT1 and plan_code in ('R','K'), the max plan_date is 2017-JAN-22, so the Base_Amount of that row, 80, must be returned.

Assumptions:

There can be duplicates on Account_number and Plan_Code.
There will not be duplicates on Account_number and Plan_Date within Plan_Code in (R,K).
There will not be duplicates on Account_number and Plan_Date within Plan_Code in (R,C).
The order of the rows in the table is not guaranteed.

What I tried, which failed:

SELECT ACCOUNT_NUMBER, 
MAX(CASE WHEN PLAN_CODE IN ('R','K') THEN PLAN_DATE END) MAX_RK_PLAN_DATE,
MAX(CASE WHEN PLAN_CODE IN ('R','K') AND MAX_PLAN_DATE=PLAN_DATE THEN BASE_AMOUNT END) REQUIRED_RK_AMOUNT,
MAX(CASE WHEN PLAN_CODE IN ('R','C') THEN PLAN_DATE END) MAX_RC_PLAN_DATE,
MAX(CASE WHEN PLAN_CODE IN ('R','C') AND MAX_PLAN_DATE=PLAN_DATE THEN BASE_AMOUNT END) REQUIRED_RC_AMOUNT 
FROM ACCNT_PLN_INFO;

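As expected, it failed, because I am nesting an aggregate function inside a regular CASE expression. I then considered splitting it into separate query blocks, like this: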

SELECT ....
(SELECT ACCOUNT_NUMBER, 'RK', 
MAX(PLAN_DATE) MAX_RK_PLAN_DATE FROM ACCNT_PLN_INFO WHERE 
PLAN_CODE IN ('R','K') 
GROUP BY ACCOUNT_NUMBER 
UNION 
SELECT ACCOUNT_NUMBER, 'RC', 
MAX(PLAN_DATE) MAX_RC_PLAN_DATE FROM ACCNT_PLN_INFO WHERE 
PLAN_CODE IN ('R','C') 
GROUP BY ACCOUNT_NUMBER )

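I then wanted to join the outer select against the same table again. But because of the different possible combinations, (R,K) and (R,C), I could not make it work. I know how to do this when no combinations are involved.

For simplicity I have specified only 2 combinations with 2 values each: PLAN_CODE IN ('R','K') and PLAN_CODE IN ('R','C'). In reality there are 6 combinations, each with 4 values.

I have tried everything I can to achieve this, but unfortunately could not. How can a column value be selected when it depends on a combination of several values in one column and on the max value of another column? Thank you for your valuable time.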

【Comments】:

  • How many distinct Account_number values exist, and how many rows match the Biz_Date condition?

Tags: sql select teradata


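【Solution 1】:

You can use an aggregate-based approach similar to your attempt by applying a dirty trick: piggybacking.

You combine both columns into a single string, apply MAX, and then split the string into its parts again. For example, for ACCT1, combining PLAN_DATE and BASE_AMOUNT into one string results in: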

'20170101        100'
'20170111         30'
'20170113         50'
'20170118        180'
'20170122         80' -- this will be returned by MAX

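After applying MAX, you extract the two columns again using SUBSTR:

   TO_DATE(SUBSTR('20170122         80', 1, 8), 'yyyymmdd')
   CAST(SUBSTR('20170122         80', 9) AS INT)

Of course, you must build a string that still sorts correctly, e.g. yyyymmdd for the date and a fixed width for the number, including leading spaces.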
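
A minimal sketch of building such a sortable key explicitly, rather than relying on the implicit number-to-string conversion. It assumes BASE_AMOUNT is a non-negative INTEGER and that TO_CHAR and LPAD are available (Teradata 14.0 or later); the alias rk_key is only illustrative.

```sql
SELECT account_number,
       -- 8-character date prefix, then the amount right-aligned in 11 characters,
       -- so every key has the same width and sorts by date first.
       MAX(CASE WHEN plan_code IN ('R','K')
                THEN TO_CHAR(plan_date, 'YYYYMMDD')
                     || LPAD(TRIM(CAST(base_amount AS VARCHAR(11))), 11, ' ')
           END) AS rk_key
FROM accnt_pln_info
WHERE biz_date = DATE '2017-05-31'
GROUP BY account_number;
```

Because the date occupies the first 8 characters of every key, the date is always at positions 1 to 8 and the amount always starts at position 9, whatever the size of the number.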

Now it's just cut & paste & modify:

SELECT ACCOUNT_NUMBER,
   To_Date(Substr(RK, 1,8), 'yyyymmdd') AS MAX_RK_PLAN_DATE,
   Cast(Substring(RK From 9) AS INT) AS REQUIRED_RK_AMOUNT,
   To_Date(Substr(RC, 1,8), 'yyyymmdd') AS MAX_RC_PLAN_DATE,
   Cast(Substring(RC From 9) AS INT) AS REQUIRED_RC_AMOUNT
FROM 
 ( 
   SELECT ACCOUNT_NUMBER, 
      Max(CASE WHEN PLAN_code IN ('R','K') THEN To_Char(PLAN_DATE, 'yyyymmdd') || BASE_AMOUNT END) AS RK,
      Max(CASE WHEN PLAN_code IN ('R','C') THEN To_Char(PLAN_DATE, 'yyyymmdd') || BASE_AMOUNT END) AS RC
   FROM ACCNT_PLN_INFO
   WHERE  biz_date = DATE '2017-05-31'
   GROUP BY 1
 ) AS dt

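【Comments】:

  • Thanks Andrew and Dnoeth for the valuable information. :)
【Solution 2】:

Edit: rewritten using QUALIFY.

You need to get the max plan date for each plan_code pairing. You can do this in two separate derived tables, using QUALIFY to keep only the row with the max plan date. Then you can join the two results together on account_number.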

select
    rk.account_number,
    rk.rk_plan_date,
    rk.base_amount as rk_base_amount,
    rc.rc_plan_date,
    rc.base_amount as rc_base_amount
from
(
    -- latest (R,K) row per account
    select
        account_number,
        plan_date as rk_plan_date,
        base_amount
    from ACCNT_PLN_INFO
    where plan_code in ('R','K')
      and biz_date = date '2017-05-31'
    qualify row_number() over (partition by account_number order by plan_date desc) = 1
) rk
inner join
(
    -- latest (R,C) row per account
    select
        account_number,
        plan_date as rc_plan_date,
        base_amount
    from ACCNT_PLN_INFO
    where plan_code in ('R','C')
      and biz_date = date '2017-05-31'
    qualify row_number() over (partition by account_number order by plan_date desc) = 1
) rc
    on rk.account_number = rc.account_number

Original (non-Teradata-specific syntax):

select
    rk.account_number,
    rk.rk_plan_date,
    rk.base_amount as rk_base_amount,
    rc.rc_plan_date,
    rc.base_amount as rc_base_amount
from (
    select
        ACCNT_PLN_INFO.account_number,
        ACCNT_PLN_INFO.plan_date as rk_plan_date,
        ACCNT_PLN_INFO.base_amount
    from ACCNT_PLN_INFO
    inner join (
        -- max plan_date per account among the (R,K) rows
        select
            account_number,
            max(plan_date) as plan_date
        from ACCNT_PLN_INFO
        where plan_code in ('R','K')
          and biz_date = date '2017-05-31'
        group by account_number
    ) max_rk
        on ACCNT_PLN_INFO.account_number = max_rk.account_number
        and ACCNT_PLN_INFO.plan_date = max_rk.plan_date
        and ACCNT_PLN_INFO.plan_code in ('R','K')
        and ACCNT_PLN_INFO.biz_date = date '2017-05-31'
) rk
inner join (
    select
        ACCNT_PLN_INFO.account_number,
        ACCNT_PLN_INFO.plan_date as rc_plan_date,
        ACCNT_PLN_INFO.base_amount
    from ACCNT_PLN_INFO
    inner join (
        -- max plan_date per account among the (R,C) rows
        select
            account_number,
            max(plan_date) as plan_date
        from ACCNT_PLN_INFO
        where plan_code in ('R','C')
          and biz_date = date '2017-05-31'
        group by account_number
    ) max_rc
        on ACCNT_PLN_INFO.account_number = max_rc.account_number
        and ACCNT_PLN_INFO.plan_date = max_rc.plan_date
        and ACCNT_PLN_INFO.plan_code in ('R','C')
        and ACCNT_PLN_INFO.biz_date = date '2017-05-31'
) rc
    on rk.account_number = rc.account_number
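
With six combinations of four plan codes each, repeating one derived table per combination gets verbose. One possible way to generalize the QUALIFY approach is sketched below. It assumes a small mapping table, here called plan_combo, which is hypothetical and not part of the original question, holding one row per (combination, plan_code) pair.

```sql
-- Hypothetical mapping table: one row per (combination, plan_code) pair.
CREATE VOLATILE TABLE plan_combo (
    combo     VARCHAR(10),
    plan_code CHAR(1)
) ON COMMIT PRESERVE ROWS;

INSERT INTO plan_combo VALUES ('RK', 'R');
INSERT INTO plan_combo VALUES ('RK', 'K');
INSERT INTO plan_combo VALUES ('RC', 'R');
INSERT INTO plan_combo VALUES ('RC', 'C');

-- One row per account and combination: the row with the latest plan_date.
SELECT a.account_number,
       c.combo,
       a.plan_date,
       a.base_amount
FROM accnt_pln_info a
INNER JOIN plan_combo c
    ON a.plan_code = c.plan_code
WHERE a.biz_date = DATE '2017-05-31'
QUALIFY ROW_NUMBER() OVER (PARTITION BY a.account_number, c.combo
                           ORDER BY a.plan_date DESC) = 1;
```

This returns one row per account and combination rather than one column pair per combination. If the wide layout is needed, the result could be pivoted with MAX(CASE WHEN combo = 'RK' THEN ... END) grouped by account_number.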

【Comments】:
