【Question Title】: How can I add a new calculated column using a window function to my SQL query?
【Posted】: 2020-12-16 19:35:54
【Question】:

My data looks like this:


Trader Name      | Currency_Code | Counterparty | Traded_Amount | Total_Traded_Volume | Baseline_Avg | Variance
Jules Winnfield  | GBP           | GOLD         | 10000         | 30000               | 10000        | 0
Jules Winnfield  | GBP           | BARC         | 8000          | 30000               | 11000        | -3000
Jules Winnfield  | GBP           | JPMORG       | 12000         | 30000               | 9000         | +3000
Jules Winnfield  | EUR           | GOLD         | 15000         | 27000               | 6000         | +9000
Jules Winnfield  | EUR           | BARC         | 2000          | 27000               | 12500        | -10500
Jules Winnfield  | EUR           | JPMORG       | 10000         | 27000               | 8500         | +1500

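Let me take a moment to briefly explain this dataset:

  1. The trader has executed trades worth a total of £30000 with three counterparties (in this example Goldman Sachs, Barclays, and JPMorgan).
  2. The individual amounts (£10000, £8000 and £12000) come from a simple SUM() aggregation over the individual trades, while the £30000 total is obtained with a second aggregation using OVER (PARTITION BY TRADER_NAME, CURRENCY_CODE).
  3. baseline_average is the average volume traded with all the other counterparties. For example, Jules traded £8000 with Barclays, and his average with the other counterparties (Goldman Sachs and JPMorgan) is £11000. Variance is the difference between traded_amount and baseline_average.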
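
To make the definitions in the list concrete, here is a minimal arithmetic sketch for the GBP rows of the table. It is only an illustration of the leave-one-out average described in point 3; the dictionary and the function name `baseline_and_variance` are hypothetical and not part of the original query.

```python
# GBP partition for one trader, taken from the sample table.
traded = {"GOLD": 10000, "BARC": 8000, "JPMORG": 12000}

total = sum(traded.values())  # partition total (Total_Traded_Volume)
n = len(traded)               # number of counterparties in the partition


def baseline_and_variance(counterparty):
    """Average traded with every *other* counterparty, and the gap to it."""
    own = traded[counterparty]
    # Mirrors NULLIF(n - 1, 0): no baseline exists for a single counterparty.
    baseline = (total - own) / (n - 1) if n > 1 else None
    variance = own - baseline if baseline is not None else None
    return baseline, variance


for cp in traded:
    print(cp, *baseline_and_variance(cp))
```

The same subtraction-then-division is what the SQL expresses as (partition total - own amount) / NULLIF(count - 1, 0).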

The code used to generate the output above is:

SELECT 

     OT.TRADER_NAME, 
     OT.CURRENCY_CODE, 
     OT.COUNTERPARTY, 
     SUM(OT.TRADED_AMOUNT) AS TRADED_AMOUNT,
     SUM(OT.TRADED_AMOUNT) OVER (PARTITION BY OT.TRADER_NAME, OT.CURRENCY_CODE) AS TOTAL_TRADED_VOL,
     (SUM(OT.TRADED_AMOUNT) OVER (PARTITION BY OT.TRADER_NAME, OT.CURRENCY_CODE)- 
     SUM(OT.TRADED_AMOUNT))/NULLIF(SUM(1) OVER (PARTITION BY OT.TRADER_NAME, OT.CURRENCY_CODE)-1),0) 
     AS BASELINE_AVG,
     SUM(OT.TRADED_AMOUNT) - (SUM(OT.TRADED_AMOUNT) OVER (PARTITION BY OT.TRADER_NAME, 
     OT.CURRENCY_CODE)-SUM(OT.TRADED_AMOUNT))/NULLIF(SUM(1) OVER (PARTITION BY OT.TRADER_NAME, 
     OT.CURRENCY_CODE)-1),0) AS VARIANCE

FROM ORDERS_TRADES_DATA OT
GROUP BY OT.TRADER_NAME, OT.CURRENCY_CODE, OT.COUNTERPARTY, FX.FX_RATE

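So far so good. This lets me slice the data as long as I specify the currency I am interested in. However, I now want to add columns that roll up a trader's entire traded volume into its USD equivalent: essentially one traded_volume per trader in USD, computed with a window function, that I can use for analysis. I store the FX rates in a separate table and can apply a join. I have tried running the following query: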

SELECT 

     OT.TRADER_NAME, 
     OT.CURRENCY_CODE, 
     OT.COUNTERPARTY, 
     SUM(OT.TRADED_AMOUNT) AS TRADED_AMOUNT,
     SUM(OT.TRADED_AMOUNT) OVER (PARTITION BY OT.TRADER_NAME, OT.CURRENCY_CODE) AS TOTAL_TRADED_VOL,
     (SUM(OT.TRADED_AMOUNT) OVER (PARTITION BY OT.TRADER_NAME, OT.CURRENCY_CODE)- 
     SUM(OT.TRADED_AMOUNT))/NULLIF(SUM(1) OVER (PARTITION BY OT.TRADER_NAME, OT.CURRENCY_CODE)-1),0) 
     AS BASELINE_AVG,
     SUM(OT.TRADED_AMOUNT) - (SUM(OT.TRADED_AMOUNT) OVER (PARTITION BY OT.TRADER_NAME, 
     OT.CURRENCY_CODE)-SUM(OT.TRADED_AMOUNT))/NULLIF(SUM(1) OVER (PARTITION BY OT.TRADER_NAME, 
     OT.CURRENCY_CODE)-1),0) AS VARIANCE,
     SUM(OT.TRADED_AMOUNT)/FX.FX_RATE AS TRADED_AMOUNT_USD,
     SUM((SUM(OT.TRADED_AMOUNT)/FX.FX_RATE) AS TOTAL_TRADED_VOL_USD,
     (SUM(OT.TRADED_AMOUNT)/FX.FX_RATE OVER (PARTITION BY OT.TRADER_NAME)- 
     SUM(OT.TRADED_AMOUNT)/FX.FX_RATE)/NULLIF(SUM(1) OVER (PARTITION BY OT.TRADER_NAME)-1),0) 
     AS BASELINE_AVG_USD,
     SUM((SUM(OT.TRADED_AMOUNT)/FX.FX_RATE) - (SUM(OT.TRADED_AMOUNT)/FX.FX_RATE OVER (PARTITION BY 
     OT.TRADER_NAME)-SUM(OT.TRADED_AMOUNT)/FX.FX_RATE)/NULLIF(SUM(1) OVER (PARTITION BY 
     OT.TRADER_NAME)-1),0) AS VARIANCE_USD

FROM ORDERS_TRADES_DATA OT
LEFT JOIN FX_RATES_TABLE FX ON OT.CURRENCY_CODE = FX.ASSET_CURRENCY_CODE
GROUP BY OT.TRADER_NAME, OT.CURRENCY_CODE, OT.COUNTERPARTY, FX.FX_RATE
     

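...but it does not work; I get this error:

Cannot perform an aggregate function on an expression containing an aggregate or a subquery.

How can I achieve my goal here?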

【Comments】:

    Tags: sql sql-server aggregate-functions window-functions


    【Solution 1】:

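    The immediate error is caused by the nested aggregate SUM call: SUM((SUM(OT.TRADED_AMOUNT)/FX.FX_RATE). In addition, because the SELECT contains non-aggregated columns that are not referenced in a GROUP BY, a missing GROUP BY clause in an aggregate query raises a separate error.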
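
    As a rough illustration of that point, the sketch below reproduces the nested-aggregate failure and one way around it: aggregate per counterparty first, then apply the window function to the already-aggregated rows. It runs against an in-memory SQLite database rather than SQL Server, so the error text differs; the sample rows and the FX rates are made up for illustration. In SQL Server the same effect can also be had in a single grouped query by writing the window as SUM(SUM(x)) OVER (...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders_trades_data (trader_name TEXT, currency_code TEXT,
                                     counterparty TEXT, traded_amount REAL);
    CREATE TABLE fx_rates_table (asset_currency_code TEXT, fx_rate REAL);
    INSERT INTO orders_trades_data VALUES
        ('Jules Winnfield', 'GBP', 'GOLD', 4000),
        ('Jules Winnfield', 'GBP', 'GOLD', 6000),
        ('Jules Winnfield', 'GBP', 'BARC', 8000),
        ('Jules Winnfield', 'EUR', 'GOLD', 15000);
    -- Hypothetical rates (units of currency per 1 USD), illustration only.
    INSERT INTO fx_rates_table VALUES ('GBP', 0.8), ('EUR', 0.75);
""")

# 1) An aggregate nested directly inside another aggregate is rejected.
nested = """
    SELECT ot.trader_name, SUM(SUM(ot.traded_amount) / fx.fx_rate) AS total_usd
    FROM orders_trades_data ot
    LEFT JOIN fx_rates_table fx ON ot.currency_code = fx.asset_currency_code
    GROUP BY ot.trader_name, ot.currency_code, ot.counterparty, fx.fx_rate
"""
try:
    conn.execute(nested)
    nested_failed = False
except sqlite3.OperationalError as exc:
    nested_failed = True
    print("nested aggregate rejected:", exc)

# 2) Aggregate first, then window over the aggregated rows.
two_step = """
    WITH per_counterparty AS (
        SELECT ot.trader_name, ot.currency_code, ot.counterparty,
               SUM(ot.traded_amount) / fx.fx_rate AS traded_amount_usd
        FROM orders_trades_data ot
        LEFT JOIN fx_rates_table fx ON ot.currency_code = fx.asset_currency_code
        GROUP BY ot.trader_name, ot.currency_code, ot.counterparty, fx.fx_rate
    )
    SELECT trader_name, currency_code, counterparty, traded_amount_usd,
           SUM(traded_amount_usd) OVER (PARTITION BY trader_name)
               AS total_traded_vol_usd
    FROM per_counterparty
    ORDER BY currency_code, counterparty
"""
rows = conn.execute(two_step).fetchall()
for row in rows:
    print(row)
```

    Partitioning by trader only (without the currency) is what makes the USD total span all currencies, which is why the conversion has to happen before the window sum.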

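    That said, consider avoiding the SUM() OVER(...) window functions altogether and instead joining multiple levels of aggregation (the trader/currency level and the trader/currency/counterparty level). Then run the required calculations in an outer query with no aggregation. Please note: division by zero is undefined, which is why the divisors are wrapped in NULLIF.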

    WITH trader_curr_agg AS (
         SELECT   OT.TRADER_NAME
                , OT.CURRENCY_CODE
                , SUM(OT.TRADED_AMOUNT) AS TOTAL_TRADED_VOL
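                , COUNT(DISTINCT OT.COUNTERPARTY) AS TRADE_COUNTS  -- counterparties per trader/currency, matching the question's SUM(1) OVER (...) over grouped rows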
         FROM ORDERS_TRADES_DATA OT
         GROUP BY   OT.TRADER_NAME
                  , OT.CURRENCY_CODE
    ),  
        trader_counterparty_agg AS (
         SELECT   OT.TRADER_NAME
                , OT.CURRENCY_CODE
                , OT.COUNTERPARTY
                , SUM(OT.TRADED_AMOUNT) AS TRADED_AMOUNT
         FROM ORDERS_TRADES_DATA OT
         GROUP BY   OT.TRADER_NAME
                  , OT.CURRENCY_CODE
                  , OT.COUNTERPARTY
    )
    
    SELECT
             tcntr.TRADER_NAME
           , tcntr.CURRENCY_CODE
           , tcntr.COUNTERPARTY
    
           , tcntr.TRADED_AMOUNT
           , tcurr.TOTAL_TRADED_VOL
           , (tcurr.TOTAL_TRADED_VOL - tcntr.TRADED_AMOUNT)
                      / NULLIF(tcurr.TRADE_COUNTS-1, 0) AS BASELINE_AVG
           , tcntr.TRADED_AMOUNT - (tcurr.TOTAL_TRADED_VOL - tcntr.TRADED_AMOUNT)
                      / NULLIF(tcurr.TRADE_COUNTS-1, 0) AS VARIANCE
    
           , tcntr.TRADED_AMOUNT / FX.FX_RATE AS TRADED_AMOUNT_USD
           , tcurr.TOTAL_TRADED_VOL / FX.FX_RATE AS TOTAL_TRADED_VOL_USD
           , ((tcurr.TOTAL_TRADED_VOL - tcntr.TRADED_AMOUNT) 
                      / NULLIF(tcurr.TRADE_COUNTS-1, 0)) / FX.FX_RATE AS BASELINE_AVG_USD
           , (tcntr.TRADED_AMOUNT - (tcurr.TOTAL_TRADED_VOL - tcntr.TRADED_AMOUNT)
                      / NULLIF(tcurr.TRADE_COUNTS-1, 0)) / FX.FX_RATE AS VARIANCE_USD
     
    FROM trader_counterparty_agg tcntr
    INNER JOIN trader_curr_agg tcurr
        ON tcntr.TRADER_NAME = tcurr.TRADER_NAME
        AND tcntr.CURRENCY_CODE = tcurr.CURRENCY_CODE
    LEFT JOIN FX_RATES_TABLE FX 
        ON tcntr.CURRENCY_CODE = FX.ASSET_CURRENCY_CODE
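
    Since the query above is untested, here is a small harness that may help check the joined-aggregates idea against the GBP figures from the question. It is a sketch under assumptions: it uses an in-memory SQLite database instead of SQL Server, the trade rows are invented so that they sum to the question's per-counterparty amounts, the divisor counts distinct counterparties (the column name `counterparty_count` is hypothetical), and the FX join is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders_trades_data (trader_name TEXT, currency_code TEXT,
                                     counterparty TEXT, traded_amount REAL);
    -- Invented trades; GOLD is split in two to show that the divisor
    -- must count counterparties, not trade rows.
    INSERT INTO orders_trades_data VALUES
        ('Jules Winnfield', 'GBP', 'GOLD', 4000),
        ('Jules Winnfield', 'GBP', 'GOLD', 6000),
        ('Jules Winnfield', 'GBP', 'BARC', 8000),
        ('Jules Winnfield', 'GBP', 'JPMORG', 12000);
""")

query = """
    WITH trader_curr_agg AS (
        SELECT trader_name, currency_code,
               SUM(traded_amount) AS total_traded_vol,
               COUNT(DISTINCT counterparty) AS counterparty_count
        FROM orders_trades_data
        GROUP BY trader_name, currency_code
    ),
    trader_counterparty_agg AS (
        SELECT trader_name, currency_code, counterparty,
               SUM(traded_amount) AS traded_amount
        FROM orders_trades_data
        GROUP BY trader_name, currency_code, counterparty
    )
    SELECT c.counterparty, c.traded_amount, t.total_traded_vol,
           (t.total_traded_vol - c.traded_amount)
               / NULLIF(t.counterparty_count - 1, 0) AS baseline_avg,
           c.traded_amount - (t.total_traded_vol - c.traded_amount)
               / NULLIF(t.counterparty_count - 1, 0) AS variance
    FROM trader_counterparty_agg c
    JOIN trader_curr_agg t
      ON c.trader_name = t.trader_name
     AND c.currency_code = t.currency_code
    ORDER BY c.counterparty
"""
# Map counterparty -> (traded_amount, total_traded_vol, baseline_avg, variance)
result = {row[0]: row[1:] for row in conn.execute(query)}
for counterparty, values in result.items():
    print(counterparty, values)
```

    Note that the variance subtracts the whole baseline quotient from the traded amount; dividing the difference as a whole by the count would give a different number.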
    

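    【Comments】:

    • Thank you very much, I will give it a try. Sorry that I did not post the complete code; I actually do use GROUP BY. The first query works as I described, but the second one (where I attempt the USD conversion) does not... let me try your solution.
    • Understood. Reading more carefully, your error is actually due to the nested SUM in one of your calculations: SUM((SUM(OT.TRADED_AMOUNT)/FX.FX_RATE). But the missing GROUP BY was also a problem. Yes, please consider this solution. You avoid the many SUM() OVER() calls, which helps readability, and each sum is computed only once, which helps efficiency. If my untested translation causes problems, adjust the formulas as needed. The only real difference is that the _USD columns are divided by FX_RATE.
    • Apologies for the delayed response, I spent a lot of time on this. I was able to adapt your solution successfully, and I also spent some time adding my own tweaks to the formulas. You are absolutely right about readability, but given the number of calculations and reports required, this may well be unavoidable!
    【Solution 2】:

    You can write the query like this: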

    SELECT
         A.TRADER_NAME, 
         A.CURRENCY_CODE, 
         A.COUNTERPARTY, 
         A.TRADED_AMOUNT,
         A.TOTAL_TRADED_VOL,
         A.BASELINE_AVG,
         A.VARIANCE,              
         A.TRADED_AMOUNT/FX.FX_RATE AS TRADED_AMOUNT_USD,
         A.TOTAL_TRADED_VOL/FX.FX_RATE AS TOTAL_TRADED_VOL_USD,
         A.BASELINE_AVG/FX.FX_RATE AS BASELINE_AVG_USD,
         A.VARIANCE/FX.FX_RATE AS VARIANCE_USD
         
    FROM   
        (SELECT 
             OT.TRADER_NAME, 
             OT.CURRENCY_CODE, 
             OT.COUNTERPARTY, 
             SUM(OT.TRADED_AMOUNT) AS TRADED_AMOUNT,
             -- A window function in a grouped query must wrap the group-level SUM
             SUM(SUM(OT.TRADED_AMOUNT)) OVER (PARTITION BY OT.TRADER_NAME, OT.CURRENCY_CODE) AS TOTAL_TRADED_VOL,
             (SUM(SUM(OT.TRADED_AMOUNT)) OVER (PARTITION BY OT.TRADER_NAME, OT.CURRENCY_CODE)
              - SUM(OT.TRADED_AMOUNT))
              / NULLIF(SUM(1) OVER (PARTITION BY OT.TRADER_NAME, OT.CURRENCY_CODE) - 1, 0)
             AS BASELINE_AVG,
             SUM(OT.TRADED_AMOUNT)
              - (SUM(SUM(OT.TRADED_AMOUNT)) OVER (PARTITION BY OT.TRADER_NAME, OT.CURRENCY_CODE)
                 - SUM(OT.TRADED_AMOUNT))
              / NULLIF(SUM(1) OVER (PARTITION BY OT.TRADER_NAME, OT.CURRENCY_CODE) - 1, 0) AS VARIANCE
    
        FROM ORDERS_TRADES_DATA OT
        GROUP BY OT.TRADER_NAME, OT.CURRENCY_CODE, OT.COUNTERPARTY) A
    LEFT JOIN FX_RATES_TABLE FX ON FX.ASSET_CURRENCY_CODE = A.CURRENCY_CODE
    

    【Comments】:

    • Thank you very much for the proposed solution; I have actually gone with the CTE-based solution now. Much appreciated.