【Question Title】: Finding groups of sequential, identical values
【Posted】: 2015-06-05 20:04:45
【Question】:

If I have a table that looks like this:

+-------------+-----------+------------+------------+
| SalesPerson | SalesYear | SalesMonth | TotalSales |
+-------------+-----------+------------+------------+
| Dave        | 2011      | 1          | 27         |
| Meg         | 2012      | 7          | 162        |
| Randy       | 2011      | 3          | 0          |
| Julio       | 2013      | 8          | 15         |
| Bob         | 2014      | 12         | 0          |
| Mary        | 2012      | 5          | 20         |
+-------------+-----------+------------+------------+

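I want to find periods of inactivity, say where a salesperson has had no sales for at least three consecutive months. How would I do that? I don't just want a list of all the months with no sales; I need to see prolonged inactivity, that is, multiple consecutive zeros. I can't figure it out.

【Comments】:

  • Any chance of providing better data? As it stands, each salesperson is listed only once, so technically they all qualify.

Tags: sql-server tsql sql-server-2008-r2 gaps-and-islands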


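【Solution 1】:

The first step is to convert the year and month to a datetime. Next, within each SalesPerson partition, we find the left and right boundaries of every period of inactivity. Then we number the left boundaries and the right boundaries in date order so we know how to pair them. Finally, we join the left and right boundaries on that number.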
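
As a side note on the first step: the year and month can also be turned into a datetime arithmetically instead of by building a zero-padded string. This is only a sketch of an alternative, not what the query below does; the variables `@SalesYear` and `@SalesMonth` are hypothetical stand-ins for the table columns.

```sql
-- Hypothetical values standing in for the SalesYear / SalesMonth columns.
declare @SalesYear int = 2012, @SalesMonth int = 7;

-- 0 is implicitly converted to the datetime 1900-01-01, so adding the number
-- of whole months elapsed since January 1900 yields the first day of the
-- requested month.
select dateadd(month, (@SalesYear - 1900) * 12 + @SalesMonth - 1, 0) as FirstOfMonth;
-- returns 2012-07-01 00:00:00.000
```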

--------------creating test data------------------------------------
    declare @t table(SalesPerson varchar(max), SalesYear int, SalesMonth int, TotalSales int)

     select replicate('0', 2 - len(cast(3 as varchar))) + '3'

    insert into @t(SalesPerson, SalesYear , SalesMonth , TotalSales)
    select 'Dave', 2011, 1, 27 union all
    select 'Meg', 2012, 7, 162 union all
    select 'Randy', 2011, 3, 0 union all
    select 'Julio', 2013, 8, 15 union all
    select 'Bob', 2014, 12, 0 union all
    select 'Mary', 2012, 5, 20 union all
    select 'Mary', 2012, 6, 0 union all
    select 'Mary', 2012, 7, 0 union all
    select 'Mary', 2012, 8, 20 union all
    select 'Mary', 2012, 9, 20 
-------------------------------------------

    ;with cte as
    (
        select SalesPerson
            , cast(cast(SalesYear as varchar) + replicate('0', 2 - len(cast(SalesMonth as varchar))) + cast(SalesMonth as varchar) + '01' as datetime) as dt
            , TotalSales
            , row_number() over (partition by SalesPerson order by  cast(cast(SalesYear as varchar) + replicate('0', 2 - len(cast(SalesMonth as varchar))) + cast(SalesMonth as varchar) + '01' as datetime)) as rn
            , row_number() over (partition by SalesPerson order by  cast(cast(SalesYear as varchar) + replicate('0', 2 - len(cast(SalesMonth as varchar))) + cast(SalesMonth as varchar) + '01' as datetime) desc)  as rnd
        from @t 
    ),
    l as
    (
    select *, row_number() over(partition by SalesPerson order by dt) n
    from cte t1
    where TotalSales = 0
        and (
            exists
            (
                select * 
                from cte 
                where t1.SalesPerson = SalesPerson 
                    and dateadd(mm,-1, t1.dt) = dt
                    and TotalSales > 0
            )
            or rn = 1
            )
    ),
    r as
    (
    select *,row_number() over(partition by SalesPerson order by dt) n
    from cte t1
    where TotalSales = 0
        and (
            exists
            (
                select * 
                from cte 
                where t1.SalesPerson = SalesPerson 
                    and dateadd(mm,1, t1.dt) = dt
                    and TotalSales > 0
            )
            or rnd = 1
            )
    )
    select l.SalesPerson, l.dt as dateStart, r.dt as dateEnd
    from l 
        join r on l.n = r.n
            and l.SalesPerson = r.SalesPerson
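
Note that the query above returns every run of zero-sales months, whatever its length. For comparison, here is a minimal sketch of a different technique, the row-number-difference form of gaps-and-islands, which also applies the "at least three consecutive months" rule. It assumes at most one row per salesperson per month; the table variable `@demo` and its rows are made up for illustration.

```sql
-- Hypothetical demo data; @demo and its rows are illustrative only.
declare @demo table (SalesPerson varchar(10), SalesYear int, SalesMonth int, TotalSales int);
insert into @demo (SalesPerson, SalesYear, SalesMonth, TotalSales) values
  ('Mary', 2012,  5, 20),
  ('Mary', 2012,  6,  0),
  ('Mary', 2012,  7,  0),
  ('Mary', 2012,  8,  0),
  ('Mary', 2012,  9, 20),
  ('Mary', 2012, 10,  0),
  ('Mary', 2012, 11,  0),
  ('Mary', 2012, 12,  0),
  ('Mary', 2013,  1,  0),
  ('Mary', 2013,  2,  5);

with zeros as (
  -- Keep only the zero-sales rows and number them per salesperson in date order.
  select SalesPerson,
         SalesYear * 12 + SalesMonth - 1 as MonthNo,   -- zero-based month counter
         row_number() over (partition by SalesPerson
                            order by SalesYear, SalesMonth) as rn
  from @demo
  where TotalSales = 0
)
select SalesPerson,
       min(MonthNo) / 12     as StartYear,
       min(MonthNo) % 12 + 1 as StartMonth,
       max(MonthNo) / 12     as EndYear,
       max(MonthNo) % 12 + 1 as EndMonth,
       count(*)              as ZeroMonths
from zeros
-- MonthNo - rn stays constant while the zero months are consecutive,
-- so each distinct value identifies one run.
group by SalesPerson, MonthNo - rn
having count(*) >= 3
order by SalesPerson, StartYear, StartMonth;
```

With the demo rows this should list two runs for Mary: 2012-06 through 2012-08 (3 months) and 2012-10 through 2013-01 (4 months). A month with no row at all breaks a run here, because only rows with TotalSales = 0 are counted; whether a missing row should count as inactivity depends on how the data is recorded.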

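【Comments】:

  • This doesn't seem to account for multiple dead periods. For example, if Bob had four down months from February 2011 to June 2011 and then again from April 2014 to July 2014, this query would only notice the second period.
  • What do you mean by a dead period? Is it possible for the table to have no records at all for the months between one person's other periods?
【Solution 2】:

Your definition of a period of inactivity is a little vague; for example, do you take hire/termination dates into account? The following code implements one interpretation, and it does not require recursion.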

-- Sample data.
declare @Sales as Table (
  SalesPerson VarChar(10), SalesYear Int, SalesMonth Int, TotalSales Int );
insert into @Sales ( SalesPerson, SalesYear, SalesMonth, TotalSales ) values
  ( 'Dave', 2011, 1, 27) ,
  ( 'Meg', 2012, 7, 162 ),
  ( 'Randy', 2011, 3, 0 ),
  ( 'Julio', 2013, 8, 15 ),
  ( 'Bob', 2014, 12, 0 ),
  ( 'Mary', 2012, 5, 20 ),
  ( 'William', 2014, 1, 30 ),
  ( 'William', 2014, 2, 0 ),
  ( 'William', 2014, 4, 10 ),
  ( 'William', 2014, 6, 3 ),
  ( 'William', 2014, 7, 90 ),
  ( 'William', 2014, 12, 5 );
select * from @Sales;

-- Analyze it.
with
  -- Get only the nonzero sales rows and combine the year/month into a single integer.
  NonZeroSales as (
    select SalesPerson, SalesYear * 12 + SalesMonth as CombinedMonth, TotalSales
      from @Sales
      where TotalSales <> 0 ),
  -- Add row numbers for each sales person.
  NonZeroSalesWithRN as (
    select SalesPerson, CombinedMonth, TotalSales,
      Row_Number() over ( partition by SalesPerson order by CombinedMonth ) as RN
      from NonZeroSales )
  -- Match adjacent rows for each sales person.
  --   If there is a gap of three or more months then indicate it in a status column.
  select L.SalesPerson,
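    -- Subtract 1 before splitting so that December maps to month 12 of the same year,
    --   not month 0 of the next year.
    ( L.CombinedMonth - 1 ) / 12 as SalesYear, ( L.CombinedMonth - 1 ) % 12 + 1 as SalesMonth,
    L.TotalSales,
    -- A difference greater than 3 means at least three months without nonzero sales in between.
    case when R.CombinedMonth - L.CombinedMonth > 3 then 'Gap >= 3 Months' else 'Okay' end as SalesStatus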
    from NonZeroSalesWithRN as L inner join
      NonZeroSalesWithRN as R on R.SalesPerson = L.SalesPerson and R.RN = L.RN + 1;
  -- Tip: To see what is going on, or to debug multiple CTEs, replace the last select with one of:
  --   select * from NonZeroSales
  --   select * from NonZeroSalesWithRN

【Comments】:

【Solution 3】:

HAVING (Transact-SQL) is your answer.

https://msdn.microsoft.com/en-us/library/ms180199.aspx

I misread the question at first.

【Comments】:
