【Question title】: SQL speed: Return each date in date range and count() for each
【Posted】: 2017-04-03 23:08:51
【Question description】:

My goal is to return every date in a date range, along with a count of all records for each date.

MyTable
-------------------------------
| OrderId |   DateFinalized   |
-------------------------------
|   51    | 2016-1-3 12:50:34 |
|   55    | 2016-1-4 10:01:56 |
|   73    | 2016-1-4 11:52:02 |
|   93    | 2016-1-6 01:35:16 |
|   104   | 2016-1-6 02:40:47 |
-------------------------------

The challenge is to also include dates that have no orders. Using MyTable above, if the date range is 2016-1-1 through 2016-1-6, the desired output is:

---------------------
|  MyDate  | Orders |
---------------------
| 2016-1-1 |   0    |
| 2016-1-2 |   0    |
| 2016-1-3 |   1    |
| 2016-1-4 |   2    |
| 2016-1-5 |   0    |
| 2016-1-6 |   2    |
---------------------

To do this, I use the following query to select the dates only. It executes in 1 second:

declare @startdate datetime = '1/1/2016';
declare @enddate datetime = '1/1/2017';

with [dates] as (
    select convert(date, @startdate) as [date] 
    union all
    select dateadd(day, 1, [date])
    from [dates]
    where [date] < @enddate 
)
select 
[date]
from [dates] 
where [date] between @startdate and @enddate
order by [date] desc
option (maxrecursion 0)

When I select the order counts grouped by date, as shown below, that also takes only 1 second:

declare @startdate datetime = '2/1/2016';
declare @enddate datetime = '1/1/2017';
select 
convert(date,DATEADD(dd, DATEDIFF(dd, 0, datefinalized), 0))  as Dates,
count(OrderID) as OrderCount
from orders 
where datefinalized between @startdate and @enddate
GROUP BY DATEADD(dd, DATEDIFF(dd, 0, datefinalized), 0)
order by DATEADD(dd, DATEDIFF(dd, 0, datefinalized), 0) desc

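The problem appears when I combine these two queries into a single SQL statement. A LEFT JOIN takes 20 seconds (!!!) to execute. I tried a subquery just for giggles, but at 13 seconds it was not much better.

How can I join the generated data sets efficiently?

Thanks in advance for your time.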

【Question comments】:

Tags: sql sql-server


【Solution 1】:

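Using a recursive CTE is one of the worst ways to generate a date range. Generating the range on demand with a stacked CTE is much faster than using a recursive CTE.

If you are going to use this on many rows or over long periods of time, or you will run this kind of operation many times, you are better off just creating a Dates or Calendar table.

For only 152kb in memory, you can keep 30 years of dates in a table, and you can use it like this: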

/* dates table */ 
declare @fromdate date = '20000101';
declare @years    int  = 30;
/* 30 years, 19 used data pages ~152kb in memory, ~264kb on disk */
;with n as (select n from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t(n))
select top (datediff(day, @fromdate,dateadd(year,@years,@fromdate)))
    [Date]=convert(date,dateadd(day,row_number() over(order by (select 1))-1,@fromdate))
into dbo.Dates
from n as deka cross join n as hecto cross join n as kilo 
               cross join n as tenK cross join n as hundredK
order by [Date];

create unique clustered index ix_dbo_Dates_date 
  on dbo.Dates([Date]);

And query it like this:

select
    d.[Date]
  , OrderCount = count(o.OrderID)
from dates d
  left join orders o
    on convert(date,o.DateFinalized) = d.[Date] /* DateFinalized is the datetime column from the question */
group by d.[Date]
order by d.[Date] desc
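
As written, the query above returns a row for every date stored in the Dates table. A minimal sketch of limiting it to the range from the question follows. It assumes the dbo.Dates table created above and an orders table with the question's OrderID and DateFinalized columns; the @startdate and @enddate values are only illustrative. The join is written as a half-open range so that no function is applied to DateFinalized, which may help the optimizer use an index on that column if one exists.

```sql
/* Sketch only: assumes dbo.Dates from above and the question's
   orders(OrderID, DateFinalized) columns. */
declare @startdate date = '20160101';
declare @enddate   date = '20160106';

select
    d.[Date]
  , OrderCount = count(o.OrderID)  /* counts 0 for dates with no matching order */
from dbo.Dates d
  left join orders o
    on  o.DateFinalized >= d.[Date]
    and o.DateFinalized <  dateadd(day, 1, d.[Date])  /* half-open: the whole day */
where d.[Date] >= @startdate
  and d.[Date] <= @enddate
group by d.[Date]
order by d.[Date] desc;
```

With the sample rows from the question, this should give one row per day from 2016-1-6 down to 2016-1-1, with zero counts on the days without orders.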


If you really do not want a calendar table, you can use just the stacked CTE part:

declare @fromdate date = '20160101';
declare @years    int  = 1;
;with n as (select n from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t(n))
, dates as (
  select top (datediff(day, @fromdate,dateadd(year,@years,@fromdate)))
      [Date]=convert(date,dateadd(day,row_number() over(order by (select 1))-1,@fromdate))
  from n as deka cross join n as hecto cross join n as kilo 
                /* cross join n as tenK cross join n as hundredK */
   order by [Date]
)
select
    d.[Date]
  , OrderCount = count(o.OrderID)
from dates d
  left join orders o
    on convert(date,o.DateFinalized) = d.[Date] /* DateFinalized is the datetime column from the question */
group by d.[Date]
order by d.[Date] desc
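
Another option worth trying, since the question reports that the grouped count is fast on its own, is to aggregate the orders per day first and then left join that small result to the generated dates. This is a sketch under assumptions rather than a tested fix: the daily CTE and its OrderDay column are names made up for illustration, and the orders columns are taken from the question.

```sql
/* Sketch only: pre-aggregate orders per day, then join to the date list.
   "daily" and "OrderDay" are hypothetical names used for illustration. */
declare @startdate date = '20160101';
declare @enddate   date = '20160106';

;with n as (select n from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t(n))
, dates as (
  /* three cross joins give at most 1000 rows, so ranges up to 1000 days */
  select top (datediff(day, @startdate, @enddate) + 1)
      [Date] = dateadd(day, row_number() over(order by (select 1)) - 1, @startdate)
  from n as deka cross join n as hecto cross join n as kilo
  order by [Date]
)
, daily as (
  select
      OrderDay   = convert(date, o.DateFinalized)
    , OrderCount = count(*)
  from orders o
  where o.DateFinalized >= @startdate
    and o.DateFinalized <  dateadd(day, 1, @enddate)  /* include all of the end date */
  group by convert(date, o.DateFinalized)
)
select
    d.[Date]
  , OrderCount = coalesce(x.OrderCount, 0)  /* dates without orders show 0 */
from dates d
  left join daily x
    on x.OrderDay = d.[Date]
order by d.[Date] desc;
```

The join then matches at most one aggregated row per date instead of every order row, so it is worth comparing the execution plans and timings of both forms on the real data.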

【Comments】:
