【Question title】: Return duplicates by comparing each record of the same table in SQL Server
【Posted】: 2025-12-04 07:05:01
【Question description】:

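I have the table below and want to retrieve the duplicate records. The condition is:

Subscribers with status = 1 (that is, active) that have more than one record in the current year, determined by comparing start_date and end_date. I have more than 5,000 records in the DB. A few samples are shown here.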

id      pkg_id  start_date  end_date    status  subscriber_id
2857206 9128    8/31/2014   8/31/2015   2       3031103
2857207 9128    12/22/2015  12/22/2016  1       3031103
3066285 10308   8/5/2016    8/4/2018    1       3031103
2857206 9128    8/31/2013   8/31/2015   2       3031104
2857207 9128    10/20/2015  11/22/2016  1       3031104
3066285 10308   7/5/2016    7/4/2018    1       3031104
3066285 10308   8/5/2016    8/4/2018    2       3031105

I tried the query below, but it does not work for all records:

SELECT  *
FROM    dbo.consumer_subsc
WHERE   status = 1
        AND YEAR(GETDATE()) >= YEAR(start_date)
        AND YEAR(GETDATE()) <= YEAR(end_date)
        AND subscriber_id IN (
        SELECT  T.subscriber_id
        FROM    ( SELECT    subscriber_id ,
                            COUNT(subscriber_id) AS cnt
                  FROM      dbo.consumer_subsc
                  WHERE     status = 1
                  GROUP BY  subscriber_id
                  HAVING    COUNT(subscriber_id) > 1
                ) T )
ORDER BY subscriber_id DESC

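The problem is that I cannot find a way to compare each row against the date condition above. I should get the following duplicates as the result: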

id      pkg_id  start_date  end_date    status  subscriber_id
2857207 9128    12/22/2015  12/22/2016  1       3031103
3066285 10308   8/5/2016    8/4/2018    1       3031103
2857207 9128    10/20/2015  11/22/2016  1       3031104
3066285 10308   7/5/2016    7/4/2018    1       3031104

【Comments】:

    Tags: sql sql-server sql-server-2008 sql-server-2005


    【Solution 1】:

    Just remove the hard-coded user ID filter from the WHERE clause. The following query returns the expected output.

    SELECT *
    FROM dbo.consumer_subsc
    WHERE  STATUS = 1
        AND year(getdate()) >= year(start_date)
        AND year(getdate()) <= year(end_date)
        AND subscriber_id IN (
            SELECT T.subscriber_id
            FROM (
                SELECT subscriber_id
                    ,count(subscriber_id) AS cnt
                FROM dbo.consumer_subsc
                WHERE STATUS = 1
                GROUP BY subscriber_id
                HAVING count(subscriber_id) > 1
                ) T
            )
    ORDER BY subscriber_id ,start_date
    

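    【Comments】:

    • Please choose the answer that suits you best.
    • I know that, but I expect only the duplicate records, and it also gives me other, non-duplicate data.
    • That is the same as my query with only the filter removed. It returns both duplicate and non-duplicate records, so it doesn't help.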
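
One possible reason for the behavior described in these comments is that the inner subquery counts every active row of a subscriber, whatever its dates, while the outer query keeps only the rows that overlap the current year. A subscriber with two active rows, only one of which falls in the current year, would then still be returned. The following is a minimal sketch, assuming that "duplicate" means more than one active row overlapping the current year; it applies the same date predicates inside the subquery. The aliases `c` and `s` are introduced here for illustration only.

```sql
SELECT  c.*
FROM    dbo.consumer_subsc AS c
WHERE   c.status = 1
        AND YEAR(GETDATE()) >= YEAR(c.start_date)
        AND YEAR(GETDATE()) <= YEAR(c.end_date)
        AND c.subscriber_id IN (
            -- Count only active rows that overlap the current year,
            -- so the count covers exactly the rows the outer query can return.
            SELECT  s.subscriber_id
            FROM    dbo.consumer_subsc AS s
            WHERE   s.status = 1
                    AND YEAR(GETDATE()) >= YEAR(s.start_date)
                    AND YEAR(GETDATE()) <= YEAR(s.end_date)
            GROUP BY s.subscriber_id
            HAVING  COUNT(*) > 1
        )
ORDER BY c.subscriber_id, c.start_date;
```

Because the result depends on GETDATE(), the rows returned for the sample data will differ depending on the year in which the query is run.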
    【Solution 2】:

    You can use EXISTS:

     SELECT t.* FROM dbo.consumer_subsc t 
     WHERE EXISTS(SELECT subscriber_id 
            FROM dbo.consumer_subsc y 
            WHERE y.status=t.status
                AND y.subscriber_id = t.subscriber_id 
            GROUP BY subscriber_id HAVING COUNT(y.subscriber_id)>1) 
     AND STATUS = 1
     AND year(getdate()) >= year(start_date) 
     AND year(getdate()) <= year(end_date)
    

    【Comments】:

    • You forgot that the current year must fall between start_date and end_date.
    • It gives me all the records, so it doesn't help!
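
The "current year between start_date and end_date" condition raised in the first comment can also be written as a plain range comparison instead of wrapping the columns in YEAR(). The sketch below shows only that date filter, not the duplicate detection, and it assumes that start_date and end_date are stored as a date/time type rather than as strings. The variable names `@year_start` and `@next_year_start` are hypothetical.

```sql
-- Boundaries of the current calendar year, computed once.
-- DECLARE and SET are kept separate so the script also runs on SQL Server 2005.
DECLARE @year_start DATETIME, @next_year_start DATETIME;
SET @year_start      = DATEADD(YEAR, DATEDIFF(YEAR, 0, GETDATE()), 0);  -- Jan 1 of this year
SET @next_year_start = DATEADD(YEAR, 1, @year_start);                   -- Jan 1 of next year

SELECT  t.*
FROM    dbo.consumer_subsc AS t
WHERE   t.status = 1
        AND t.start_date <  @next_year_start   -- starts no later than the end of this year
        AND t.end_date   >= @year_start;       -- ends no earlier than the start of this year
```

For date/time columns this selects the same rows as the two YEAR() comparisons, and because the columns are compared directly, an index on them may become usable.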
    【Solution 3】:
    WITH CTE (Code, DuplicateCount)
    AS
    (
        SELECT subscriber_id,
        ROW_NUMBER() OVER(PARTITION BY  subscriber_id
        ORDER BY  subscriber_id) AS DuplicateCount
        FROM dbo.consumer_subsc 
        where  subscriber_id in (3031103) 
        and status=1 and year(getdate()) >= year(start_date) 
        and year(getdate()) <= year(end_date)  
    
    )
    Select * from CTE
    

    【Comments】:

      【Solution 4】:

      The query below gives output close to the expected result:

      SELECT A.*
      FROM (
          SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY t.subscriber_id
                                    ORDER BY t.subscriber_id, t.start_date) AS rnk
          FROM dbo.consumer_subsc t
          WHERE EXISTS (SELECT subscriber_id
                        FROM dbo.consumer_subsc y
                        WHERE y.status = t.status
                          AND y.subscriber_id = t.subscriber_id
                        GROUP BY subscriber_id
                        HAVING COUNT(y.subscriber_id) > 1)
            AND STATUS = 1
            AND year(getdate()) >= year(start_date)
            AND year(getdate()) <= year(end_date)
      ) A
      WHERE A.rnk > 1
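
Filtering on a row number greater than 1 drops the first row of every subscriber, whereas the expected result in the question keeps all rows of a duplicated subscriber. A hedged alternative is to count the qualifying rows per subscriber with a windowed COUNT and keep every row whose count exceeds 1. This is a sketch, assuming the same table and the same definition of "current year" used throughout the thread; the alias `cnt` is introduced here for illustration.

```sql
SELECT  A.id, A.pkg_id, A.start_date, A.end_date, A.status, A.subscriber_id
FROM    (
            SELECT  t.*,
                    -- Number of active, current-year rows for this subscriber.
                    -- The window is evaluated after the WHERE clause, so only
                    -- rows that pass the filters are counted.
                    COUNT(*) OVER (PARTITION BY t.subscriber_id) AS cnt
            FROM    dbo.consumer_subsc AS t
            WHERE   t.status = 1
                    AND YEAR(GETDATE()) >= YEAR(t.start_date)
                    AND YEAR(GETDATE()) <= YEAR(t.end_date)
        ) AS A
WHERE   A.cnt > 1
ORDER BY A.subscriber_id, A.start_date;
```

COUNT(*) OVER (PARTITION BY ...) without an ORDER BY is available from SQL Server 2005 onward, which matches the versions tagged on the question.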
      

      【Comments】:
