【Question title】: SQL getting balance point of two cumulative sums
【Posted】: 2019-05-03 09:34:13
【Question】:

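Below I provide code that accomplishes this interesting task to some extent, but it is probably poorly designed because of my limited SQL knowledge. The main problem is that the query returns random (more precisely, 3 different) results from one execution to the next. My guess is that the rows in the nested query are ordered by a "randomly" chosen column, which is why the final result differs (the balance point depends on the order).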

The inner SELECT ... GROUP BY builds a list of r dates and two cumulative sums, for example:

rIndex  r             TotalPerDay   CumulativeSum1  CumulativeSum2
1       02.05.2019    92,81         92,81           0
2       03.05.2019    24,81         117,61          0
3       06.05.2019    43,79         161,40          60
4       07.05.2019    78,65         240,05          120
5       09.05.2019    33,99         274,04          180
6       10.05.2019    73,22         347,26          240
7       13.05.2019    19,24         366,50          300
8       14.05.2019    150,77        517,27          360
9       15.05.2019    22,69         539,95          420
10      16.05.2019    4,96          544,91          480
11      17.05.2019    17,45         562,36          540
12      20.05.2019    27,19         589,55          600
13      21.05.2019    12,45         602,00          660
14      22.05.2019    18,08         620,08          720
15      23.05.2019    3,49          623,57          780
16      24.05.2019    10,51         634,09          840
17      27.05.2019    6,19          640,28          900
18      28.05.2019    3,01          643,29          960
19      29.05.2019    2,68          645,97          1020
20      30.05.2019    184,51        830,48          1080

An attempt at sample data was provided as an attachment (removed because of the comments below).

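In the second nested SELECT I find the balance point, which is the (first) date where CumulativeSum2 > CumulativeSum1. Then I have to find the index of that day among the days that contain totals (because there are also days without data), and that is the final result; it is the topmost SELECT in the query below: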

DECLARE @eDate as Date
DECLARE @DayLimit INT
SET @DayLimit = 60  -- let's assume a constant here
SET @eDate = DATEFROMPARTS('2019','05','31')

-- get balance point INDEX over non-empty days
SELECT (SELECT COUNT(cDate) FROM Calendar WHERE KindOfDay = 'BANKDAY' AND cDate BETWEEN GETDATE() AND SRC3.BalanceDate) as rIndex
FROM
    (    
    SELECT TOP 1 SRC2.rDate as BalanceDate   -- get first balance point (date)
    FROM
        (
        SELECT 
             ROW_NUMBER() OVER (ORDER BY SRC.rDate) as RowNo
            ,SRC.rDate 
            ,SRC.TotalPerDay      -- not required for processing, included just for info and check
            ,(SELECT (SUM((eTime-ISNULL(rDura,0))/60)) FROM MyTable1 as MT WHERE MT.r <= SRC.rDate AND MT.r < @eDate)         as CumulativeSum1
            ,((SELECT COUNT(cDate) FROM Calendar WHERE KindOfDay = 'BANKDAY' AND cDate BETWEEN GETDATE() AND SRC.rDate) * @DayLimit) as CumulativeSum2
        FROM (
            SELECT   
                  CASE  
                      WHEN CAST(r as DATE) < CAST(GETDATE() as date)  
                      THEN DATEADD(dd,-1,CAST(GETDATE() as date))                
                      ELSE CAST(r as date)                           
                  END as rDate, 
                  SUM((eTime-ISNULL(rDura,0))/60) as TotalPerDay      
            FROM MyTable1 
            WHERE r < @eDate
            GROUP BY  -- group by non-empty dates, group all past dates to yesterday
                   CASE  
                       WHEN CAST(r as DATE) < CAST(GETDATE() as date)  
                      THEN DATEADD(dd,-1,CAST(GETDATE() as date))               
                      ELSE CAST(r as date)                             
                   END                                
        ) as SRC                          
        --ORDER BY rDate
        ) as SRC2  -- compiled list of sums per day
    WHERE SRC2.CumulativeSum2 > SRC2.CumulativeSum1    -- balance condition
) as SRC3

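I humbly ask for advice on the obvious questions:

  • How can I guarantee the row order in the nested query so that the result is reliable?
  • Are there obvious mistakes in my query design, and how can it be improved?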

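Also, I just realized there is a discrepancy in the topmost query: I get the index over bank days, but the index should be over non-empty bank days...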
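
One possible way to count only non-empty bank days is to add an EXISTS check against the data table. The following is a minimal sketch under assumptions: the table variables @Calendar and @Rows, the fixed @FromDate (standing in for today's date), and the fixed @bDate (standing in for the balance date) are hypothetical and only make the example self-contained.

```sql
-- Hypothetical, simplified inputs (not the real tables).
DECLARE @FromDate date = '2019-05-02';   -- stands in for CAST(GETDATE() AS date)
DECLARE @bDate    date = '2019-05-07';   -- stands in for the balance date
DECLARE @Calendar TABLE (cDate date PRIMARY KEY, KindOfDay varchar(10) NOT NULL);
DECLARE @Rows     TABLE (ID int NOT NULL, rDate date NOT NULL);

INSERT INTO @Calendar (cDate, KindOfDay) VALUES
    ('2019-05-02', 'BANKDAY'), ('2019-05-03', 'BANKDAY'), ('2019-05-04', 'SATURDAY'),
    ('2019-05-05', 'SUNDAY'),  ('2019-05-06', 'BANKDAY'), ('2019-05-07', 'BANKDAY');
INSERT INTO @Rows (ID, rDate) VALUES
    (1, '2019-05-02'), (2, '2019-05-02'), (3, '2019-05-04'), (4, '2019-05-07');

-- Count bank days in the range that have at least one data row.
SELECT COUNT(*) AS rIndex
FROM @Calendar AS c
WHERE c.KindOfDay = 'BANKDAY'
  AND c.cDate BETWEEN @FromDate AND @bDate
  AND EXISTS (SELECT 1 FROM @Rows AS r WHERE r.rDate = c.cDate);
```

With these rows, only 2019-05-02 and 2019-05-07 are bank days that have data, so rIndex should be 2.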

Some sample data:

-------  CALENDAR TABLE  --------------------------------------------------------------------

CREATE TABLE [dbo].[Calendar](
    [cDate] [datetime] NOT NULL,
    [KindOfDay] [varchar](10) NOT NULL,
PRIMARY KEY CLUSTERED 
(
    [cDate] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]


INSERT INTO [dbo].[Calendar] ([cDate],[KindOfDay])  VALUES 
    ('2019-04-20 00:00:00.000', 'SATURDAY'),
    ('2019-04-21 00:00:00.000', 'SUNDAY'),
    ('2019-04-22 00:00:00.000', 'HOLIDAY'),
    ('2019-04-23 00:00:00.000', 'BANKDAY'),
    ('2019-04-24 00:00:00.000', 'BANKDAY'),
    ('2019-04-25 00:00:00.000', 'BANKDAY'),
    ('2019-04-26 00:00:00.000', 'BANKDAY'),
    ('2019-04-27 00:00:00.000', 'SATURDAY'),
    ('2019-04-28 00:00:00.000', 'SUNDAY'),
    ('2019-04-29 00:00:00.000', 'BANKDAY'),
    ('2019-04-30 00:00:00.000', 'BANKDAY'),
    ('2019-05-01 00:00:00.000', 'HOLIDAY'),
    ('2019-05-02 00:00:00.000', 'BANKDAY'),
    ('2019-05-03 00:00:00.000', 'BANKDAY'),
    ('2019-05-04 00:00:00.000', 'SATURDAY'),
    ('2019-05-05 00:00:00.000', 'SUNDAY'),
    ('2019-05-06 00:00:00.000', 'BANKDAY'),
    ('2019-05-07 00:00:00.000', 'BANKDAY'),
    ('2019-05-08 00:00:00.000', 'HOLIDAY'),
    ('2019-05-09 00:00:00.000', 'BANKDAY'),
    ('2019-05-10 00:00:00.000', 'BANKDAY'),
    ('2019-05-11 00:00:00.000', 'SATURDAY'),
    ('2019-05-12 00:00:00.000', 'SUNDAY'),
    ('2019-05-13 00:00:00.000', 'BANKDAY'),
    ('2019-05-14 00:00:00.000', 'BANKDAY'),
    ('2019-05-15 00:00:00.000', 'BANKDAY'),
    ('2019-05-16 00:00:00.000', 'BANKDAY'),
    ('2019-05-17 00:00:00.000', 'BANKDAY'),
    ('2019-05-18 00:00:00.000', 'SATURDAY'),
    ('2019-05-19 00:00:00.000', 'SUNDAY'),
    ('2019-05-20 00:00:00.000', 'BANKDAY'),
    ('2019-05-21 00:00:00.000', 'BANKDAY'),
    ('2019-05-22 00:00:00.000', 'BANKDAY'),
    ('2019-05-23 00:00:00.000', 'BANKDAY'),
    ('2019-05-24 00:00:00.000', 'BANKDAY'),
    ('2019-05-25 00:00:00.000', 'SATURDAY'),
    ('2019-05-26 00:00:00.000', 'SUNDAY'),
    ('2019-05-27 00:00:00.000', 'BANKDAY'),
    ('2019-05-28 00:00:00.000', 'BANKDAY'),
    ('2019-05-29 00:00:00.000', 'BANKDAY'),
    ('2019-05-30 00:00:00.000', 'BANKDAY'),
    ('2019-05-31 00:00:00.000', 'BANKDAY'),
    ('2019-06-01 00:00:00.000', 'SATURDAY'),
    ('2019-06-02 00:00:00.000', 'SUNDAY'),
    ('2019-06-03 00:00:00.000', 'BANKDAY'),
    ('2019-06-04 00:00:00.000', 'BANKDAY'),
    ('2019-06-05 00:00:00.000', 'BANKDAY'),
    ('2019-06-06 00:00:00.000', 'BANKDAY'),
    ('2019-06-07 00:00:00.000', 'BANKDAY'),
    ('2019-06-08 00:00:00.000', 'SATURDAY'),
    ('2019-06-09 00:00:00.000', 'SUNDAY'),
    ('2019-06-10 00:00:00.000', 'BANKDAY'),
    ('2019-06-11 00:00:00.000', 'BANKDAY'),
    ('2019-06-12 00:00:00.000', 'BANKDAY'),
    ('2019-06-13 00:00:00.000', 'BANKDAY'),
    ('2019-06-14 00:00:00.000', 'BANKDAY'),
    ('2019-06-15 00:00:00.000', 'SATURDAY'),
    ('2019-06-16 00:00:00.000', 'SUNDAY'),
    ('2019-06-17 00:00:00.000', 'BANKDAY'),
    ('2019-06-18 00:00:00.000', 'BANKDAY'),
    ('2019-06-19 00:00:00.000', 'BANKDAY'),
    ('2019-06-20 00:00:00.000', 'BANKDAY')
GO



-------  MyTable1 TABLE  --------------------------------------------------------------------

CREATE TABLE [dbo].[MyTable1](
    [ID] [int] NOT NULL,
    [rDate] [date] NOT NULL,
    [eTime] [decimal](12,6) NOT NULL,
    [rDura] [decimal](12,6) NULL  -- assumed numeric: a [date] column cannot hold 105 or be used in eTime-ISNULL(rDura,0)
) 



INSERT INTO MyTable1 (ID, rDura, eTime, rDate) VALUES
    (17008431,NULL,0.1855,'2019-05-02'), 
    (17008477,NULL,0.059,'2019-05-02'), 
    (17008500,NULL,0.329667,'2019-05-02'), 
    (17090449,NULL,3.3195,'2019-05-02'), 
    (16888594,NULL,13.830667,'2019-04-26'), 
    (16888681,NULL,12.6635,'2019-04-26'), 
    (16888722,NULL,8.154667,'2019-05-07'), 
    (16888750,NULL,7.83,'2019-05-07'), 
    (16888766,NULL,5.22,'2019-05-07'), 
    (16955798,NULL,12.35,'2019-05-07'), 
    (17108201,NULL,1.669833,'2019-05-07'), 
    (17110834,NULL,2.596667,'2019-05-02'), 
    (17111001,NULL,0.814667,'2019-05-06'), 
    (16893842,NULL,1.053,'2019-05-07'), 
    (16951779,NULL,2.720833,'2019-05-03'), 
    (16951821,NULL,4.042333,'2019-05-06'), 
    (17017058,NULL,0.227333,'2019-05-02'), 
    (17017060,NULL,1.06,'2019-05-02'), 
    (17017066,NULL,1.869333,'2019-05-02'), 
    (17019289,NULL,0.835667,'2019-04-26'), 
    (17020295,NULL,3.983333,'2019-04-21'), 
    (17106404,105,3.3545,'2019-04-29'), 
    (17107843,NULL,2.815167,'2019-05-07'), 
    (16725584,NULL,0.693,'2019-04-25'), 
    (17101197,NULL,3.906667,'2019-04-30'), 
    (17101993,NULL,0.571667,'2019-05-06'), 
    (17102225,NULL,3.048833,'2019-04-30'), 
    (17102482,NULL,7.5945,'2019-05-10'), 
    (16974196,NULL,1.633333,'2019-05-06'), 
    (17113406,NULL,0.871833,'2019-05-02'), 
    (17113408,NULL,0.749833,'2019-05-02'), 
    (17113784,NULL,1.961333,'2019-05-03'), 
    (17120601,NULL,4.033333,'2019-05-06'), 
    (17120609,NULL,3.983333,'2019-05-06'), 
    (17120618,NULL,2.626667,'2019-05-06'), 
    (17120626,NULL,2.64,'2019-05-06'), 
    (17120628,NULL,3.684167,'2019-05-06'), 
    (17121720,NULL,2.235,'2019-04-30'), 
    (17058455,NULL,5.806667,'2019-04-29'), 
    (17059476,NULL,2.264833,'2019-05-22'), 
    (17059478,NULL,182.603667,'2019-05-30'), 
    (17065386,NULL,5.539667,'2019-05-10'), 
    (16927091,NULL,1.381,'2019-05-14'), 
    (16927093,NULL,112.304685,'2019-05-14'), 
    (16991456,NULL,0.931667,'2019-04-29'), 
    (17122394,NULL,1.560167,'2019-05-03'), 
    (17126711,NULL,4.046,'2019-05-03'), 
    (16935823,NULL,0.359,'2019-04-25'), 
    (17069727,NULL,1.952833,'2019-05-03'), 
    (17069870,NULL,1.742333,'2019-05-02'), 
    (17070555,NULL,5.416667,'2019-05-02'), 
    (17070557,NULL,3.894167,'2019-05-02'), 
    (17070851,NULL,2.64,'2019-04-23'), 
    (17073724,NULL,0.737667,'2019-05-03'), 
    (17074763,NULL,1.413833,'2019-05-02'), 
    (17131824,NULL,4.258,'2019-05-10'), 
    (17132133,NULL,0.257667,'2019-05-14'), 
    (17132865,NULL,2.769833,'2019-05-17'), 
    (17138082,NULL,7.866667,'2019-05-31'), 
    (17139196,NULL,5.860167,'2019-05-03'), 
    (17139200,NULL,1.479667,'2019-05-03'), 
    (16983337,NULL,2.951667,'2019-05-02'), 
    (17028542,NULL,0.680333,'2019-05-13'), 
    (16823160,NULL,5,'2019-05-06'), 
    (16823168,NULL,5,'2019-05-06'), 
    (16823182,NULL,5,'2019-05-06'), 
    (16823192,NULL,5,'2019-05-06'), 
    (16906776,NULL,0.8635,'2019-05-02'), 
    (17082286,NULL,3.333333,'2019-05-09'), 
    (17083776,NULL,2.317167,'2019-04-25'), 
    (17083778,NULL,1.447167,'2019-05-02'), 
    (17084568,NULL,0.2375,'2019-05-02'), 
    (17154415,NULL,2.64,'2019-05-14'), 
    (17154425,NULL,2.626667,'2019-05-14'), 
    (17154453,NULL,0.052,'2019-05-06'), 
    (17155029,NULL,3.256667,'2019-05-22'), 
    (17157159,NULL,1.333333,'2019-05-15'), 
    (16994233,NULL,0.252167,'2019-04-29'), 
    (17039767,NULL,1.401667,'2019-05-10'), 
    (17040346,NULL,4.021667,'2019-05-09'), 
    (17040815,NULL,1.2675,'2019-05-16'), 
    (17042063,NULL,0.213333,'2019-05-03'), 
    (17050144,NULL,0.976667,'2019-05-02'), 
    (17050150,NULL,0.837167,'2019-05-20'), 
    (17051422,NULL,1.826,'2019-05-07'), 
    (17142464,NULL,0.464333,'2019-05-06'), 
    (17145501,NULL,4.745333,'2019-06-06'), 
    (17145980,NULL,0.195167,'2019-05-07'), 
    (17145999,NULL,1.330833,'2019-05-07'), 
    (17146001,NULL,1.503833,'2019-05-06'), 
    (17146011,NULL,1.22,'2019-05-03'), 
    (17146017,NULL,0.373,'2019-05-07'), 
    (17146023,NULL,0.5745,'2019-05-03'), 
    (17146127,NULL,1.7835,'2019-05-15'), 
    (17146131,NULL,13.5595,'2019-05-07'), 
    (17152617,NULL,4.535667,'2019-05-10'), 
    (17154390,NULL,3.983333,'2019-05-14'), 
    (17154398,NULL,5.416667,'2019-05-14'), 
    (17154400,NULL,3.684167,'2019-05-14')

 GO

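【Question comments】:

  • "How can I guarantee the row order in the nested query so that the result is reliable?" You can't order data inside a subquery/CTE. If you need to retain knowledge of the order, use ROW_NUMBER inside the subquery/CTE to give the outer query a value to order by. Also, please don't provide links to things like Google Drive/OneDrive. Many volunteers won't click such links, because we don't trust files from anonymous strangers. If you need to include data/DDL/DML, please post it as formatted text in your question. Thanks.
  • This seems a bit convoluted.
  • @Larnu: The attachment is no problem, I can remove it. However, I followed the procedure recommended in an SO Meta post. I also assumed that this is what the link feature is for. And the SQL is not downloaded directly; GDrive first displays it on screen as a text file. About ROW_NUMBER(), you must be right; I thought it would have the same effect, but I didn't realize I could use ORDER BY inside ROW_NUMBER().
  • I added ROW_NUMBER() (as shown in the code above), but if ORDER BY can't be used in a nested query, I don't know how to make use of it. Of course, the added RowNo column doesn't help by itself.
  • @Oak_3260548 . . . I think you should ask another question with sample data, desired results, and an explanation of the logic.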
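
As a hedged illustration of the ordering point raised in these comments: in SQL Server, TOP without an ORDER BY in the same SELECT may return any qualifying row, while TOP combined with ORDER BY returns a defined one. The sketch below assumes a hypothetical table variable @Daily that stands in for the SRC2 derived table; its four rows are copied from the example table in the question.

```sql
-- Hypothetical stand-in for the SRC2 derived table (values taken from the example table).
DECLARE @Daily TABLE (rDate date PRIMARY KEY, CumulativeSum1 decimal(12,2), CumulativeSum2 int);
INSERT INTO @Daily (rDate, CumulativeSum1, CumulativeSum2) VALUES
    ('2019-05-17', 562.36, 540),
    ('2019-05-20', 589.55, 600),
    ('2019-05-21', 602.00, 660),
    ('2019-05-22', 620.08, 720);

-- ORDER BY sits at the same level as TOP, so the earliest qualifying date is returned.
SELECT TOP (1) d.rDate AS BalanceDate
FROM @Daily AS d
WHERE d.CumulativeSum2 > d.CumulativeSum1
ORDER BY d.rDate;
```

With these rows the query should return 2019-05-20, the first date on which CumulativeSum2 exceeds CumulativeSum1.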

Tags: sql sql-server cumulative-sum balance


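【Solution 1】:

The suggested ROW_NUMBER() did not help with this problem. I had to split the task into two steps: first, I set a variable @bDate to store the result of the two inner nested SELECTs, and then I find the index of that date in a separate SELECT step.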

DECLARE @eDate as Date
DECLARE @DayLimit INT
DECLARE @bDate as Date   -- receives the balance date found below
SET @DayLimit = 60  -- let's assume a constant here
SET @eDate = DATEFROMPARTS('2019','05','31')

-- get balance point INDEX over non-empty days    
SELECT TOP 1 @bDate = SRC2.rDate   -- get first balance point (date)
FROM
    (
    SELECT 
            ROW_NUMBER() OVER (ORDER BY SRC.rDate) as RowNo
        ,SRC.rDate 
        ,SRC.TotalPerDay      -- not required for processing, included just for info and check
        ,(SELECT (SUM((eTime-ISNULL(rDura,0))/60)) FROM MyTable1 as MT WHERE MT.r <= SRC.rDate AND MT.r < @eDate)         as CumulativeSum1
        ,((SELECT COUNT(cDate) FROM Calendar WHERE KindOfDay = 'BANKDAY' AND cDate BETWEEN GETDATE() AND SRC.rDate) * @DayLimit) as CumulativeSum2
    FROM (
        SELECT   
                CASE  
                    WHEN CAST(r as DATE) < CAST(GETDATE() as date)  
                    THEN DATEADD(dd,-1,CAST(GETDATE() as date))                
                    ELSE CAST(r as date)                           
                END as rDate, 
                SUM((eTime-ISNULL(rDura,0))/60) as TotalPerDay      
        FROM MyTable1 
        WHERE r < @eDate
        GROUP BY  -- group by non-empty dates, group all past dates to yesterday
                CASE  
                    WHEN CAST(r as DATE) < CAST(GETDATE() as date)  
                    THEN DATEADD(dd,-1,CAST(GETDATE() as date))               
                    ELSE CAST(r as date)                             
                END                                
    ) as SRC                          
    --ORDER BY rDate
    ) as SRC2  -- compiled list of sums per day
WHERE SRC2.CumulativeSum2 > SRC2.CumulativeSum1    -- balance condition
ORDER BY SRC2.rDate;    -- makes TOP 1 pick the earliest qualifying date

-- get the index of bank day from Today
SELECT (SELECT COUNT(cDate) FROM Calendar WHERE KindOfDay = 'BANKDAY' AND cDate BETWEEN GETDATE() AND @bDate); 
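
As an alternative sketch (not the method used above), the running totals can be computed with a window function, and the first balance point picked deterministically with TOP (1) ... ORDER BY. The assumptions here are: SQL Server 2012 or later, hypothetical table variables @Daily (one row per non-empty day) and @BankDays, and bank days counted from the start of the sample data instead of from GETDATE(), so the example does not depend on the current date.

```sql
-- Hypothetical, simplified inputs (not the real tables).
DECLARE @DayLimit int = 60;
DECLARE @Daily    TABLE (rDate date PRIMARY KEY, TotalPerDay decimal(12,2) NOT NULL);
DECLARE @BankDays TABLE (cDate date PRIMARY KEY);

INSERT INTO @Daily (rDate, TotalPerDay) VALUES
    ('2019-05-02', 100.00), ('2019-05-03', 30.00), ('2019-05-07', 20.00);
INSERT INTO @BankDays (cDate) VALUES
    ('2019-05-02'), ('2019-05-03'), ('2019-05-06'), ('2019-05-07');

WITH Running AS (
    SELECT
        d.rDate,
        ROW_NUMBER() OVER (ORDER BY d.rDate) AS RowNo,   -- index over non-empty days
        SUM(d.TotalPerDay) OVER (ORDER BY d.rDate
                                 ROWS UNBOUNDED PRECEDING) AS CumulativeSum1,
        (SELECT COUNT(*) FROM @BankDays AS b
         WHERE b.cDate <= d.rDate) * @DayLimit AS CumulativeSum2
    FROM @Daily AS d
)
SELECT TOP (1) r.rDate AS BalanceDate, r.RowNo
FROM Running AS r
WHERE r.CumulativeSum2 > r.CumulativeSum1   -- balance condition
ORDER BY r.rDate;                           -- earliest qualifying date
```

With these rows the cumulative pairs are (100, 60), (130, 120) and (150, 240), so the query should return 2019-05-07 with RowNo 3. RowNo is already an index over non-empty days, which may remove the need for a separate counting step.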

【Comments】:
