【Question title】: Only Select First Reverse Duplicate
【Posted】: 2026-02-06 08:20:07
【Question】:

I have the following table in a SQL Server 2014 database:

+----+-------+--------+---------+
| ID | CODE  | NUMBER | BALANCE |
+----+-------+--------+---------+
| 1  | B0001 | 122960 | 100.00  |
+----+-------+--------+---------+
| 2  | B0001 | 123168 | -100.00 |
+----+-------+--------+---------+
| 3  | B0001 | 121400 | 500.00  |
+----+-------+--------+---------+
| 4  | T0001 | 19755  | 50.00   |
+----+-------+--------+---------+
| 5  | T0001 | 19975  | -50.00  |
+----+-------+--------+---------+
| 6  | T0001 | 122202 | 50.00   |
+----+-------+--------+---------+
| 7  | T0001 | 122203 | 50.00   |
+----+-------+--------+---------+

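I am trying to select the rows where, for a given CODE, the balance can be offset against another row so that the two sum to 0. For example, the balances of rows 1 and 2 sum to 0, so both rows should be returned. I tried the following query: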

SELECT T1.NUMBER
FROM TABLE T1, TABLE T2
WHERE T1.CODE = T2.CODE
AND T1.BALANCE + T2.BALANCE = 0

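This works for code B0001: it returns rows 1 and 2, which cancel each other out, and ignores row 3. I run into a problem with code T0001, because the query matches each of the 3 positive values against the negative value and returns every row with that code. I only want it to return rows 4 and 5 for T0001.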
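The over-matching is easier to see when both sides of the self-join are exposed. The following is a minimal sketch, assuming the sample rows are loaded into a table variable named `@Data` (a hypothetical stand-in for the real table):

```sql
-- Hypothetical stand-in for the real table, loaded with the sample rows.
DECLARE @Data TABLE ( ID INT, CODE VARCHAR(10), NUMBER INT, BALANCE DECIMAL(18,2) );
INSERT INTO @Data ( ID, CODE, NUMBER, BALANCE ) VALUES
( 1, 'B0001', 122960,  100.00 ),
( 2, 'B0001', 123168, -100.00 ),
( 3, 'B0001', 121400,  500.00 ),
( 4, 'T0001',  19755,   50.00 ),
( 5, 'T0001',  19975,  -50.00 ),
( 6, 'T0001', 122202,   50.00 ),
( 7, 'T0001', 122203,   50.00 );

-- List every (T1, T2) pair that the self-join produces.
SELECT T1.ID AS ID_1, T2.ID AS ID_2, T1.CODE
FROM @Data AS T1
JOIN @Data AS T2
    ON  T1.CODE = T2.CODE
    AND T1.BALANCE + T2.BALANCE = 0
ORDER BY T1.ID, T2.ID;

-- Pairs returned as (ID_1, ID_2):
--   B0001: (1,2) (2,1)
--   T0001: (4,5) (5,4) (5,6) (5,7) (6,5) (7,5)
```

Row 5 pairs with rows 4, 6, and 7, so all four T0001 rows appear on the T1 side, and the original query returns a NUMBER for every one of them.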

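【Comments】:

  • So you want the net to be 0 only for consecutive IDs, right?
  • I don't care about the IDs. I just want the NUMBER values that cancel each other out. For T0001, it doesn't matter which number it returns to cancel the negative value, as long as it returns only one pair of values. For example, for T0001 it could return rows 4 and 5, rows 5 and 6, or rows 5 and 7. All of them are valid, but I only want one of them.
  • Please see my answer; it offers a simple solution using a GROUP BY clause. Let me know whether it solves your problem.

Tags: sql sql-server join sum subquery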


【Solution 1】:

Try this:

/* DATASET MOCK-UP */
DECLARE @Data TABLE ( ID INT, CODE VARCHAR(10), NUMBER INT, BALANCE DECIMAL(18,2) );
INSERT INTO @Data ( ID, CODE, NUMBER, BALANCE ) VALUES
( 1, 'B0001', 122960 , 100.00 ),
( 2, 'B0001', 123168 , -100.00 ),
( 3, 'B0001', 121400 , 500.00 ),
( 4, 'T0001', 19755  , 50.00 ),
( 5, 'T0001', 19975  , -50.00 ),
( 6, 'T0001', 122202 , 50.00 ),
( 7, 'T0001', 122203 , 50.00 );

/*
    Return records where combined balances equal 0 by adding the
    current record's BALANCE against its previous (lag) or following (lead) balances.
*/
SELECT
    ID, CODE, NUMBER, BALANCE, ( BALANCE + LAG_BALANCE ) AS LAG_BALANCE, ( BALANCE + LEAD_BALANCE ) AS LEAD_BALANCE
FROM (
    
    SELECT
        ID,
        CODE,
        NUMBER,
        BALANCE,
        LAG ( BALANCE, 1, 0 ) OVER ( PARTITION BY CODE ORDER BY CODE, ID ) AS LAG_BALANCE, 
        LEAD ( BALANCE, 1, 0 ) OVER ( PARTITION BY CODE ORDER BY CODE, ID ) AS LEAD_BALANCE 
    FROM @Data

) AS Results
WHERE
    BALANCE + LAG_BALANCE = 0
    OR
    BALANCE + LEAD_BALANCE = 0
ORDER BY
    ID;

Returns

+----+-------+--------+---------+-------------+--------------+
| ID | CODE  | NUMBER | BALANCE | LAG_BALANCE | LEAD_BALANCE |
+----+-------+--------+---------+-------------+--------------+
|  1 | B0001 | 122960 | 100.00  | 100.00      | 0.00         |
|  2 | B0001 | 123168 | -100.00 | 0.00        | 400.00       |
|  4 | T0001 |  19755 | 50.00   | 50.00       | 0.00         |
|  5 | T0001 |  19975 | -50.00  | 0.00        | 0.00         |
|  6 | T0001 | 122202 | 50.00   | 0.00        | 100.00       |
+----+-------+--------+---------+-------------+--------------+
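Note that the LAG_BALANCE and LEAD_BALANCE columns in this result are the sums computed by the outer query, not the raw neighbouring balances. As a sketch, the inner window values can be inspected on their own; the aliases `PREV_BALANCE` and `NEXT_BALANCE` are illustrative names that do not appear in the answer:

```sql
-- Same mock-up data as in the answer.
DECLARE @Data TABLE ( ID INT, CODE VARCHAR(10), NUMBER INT, BALANCE DECIMAL(18,2) );
INSERT INTO @Data ( ID, CODE, NUMBER, BALANCE ) VALUES
( 1, 'B0001', 122960,  100.00 ),
( 2, 'B0001', 123168, -100.00 ),
( 3, 'B0001', 121400,  500.00 ),
( 4, 'T0001',  19755,   50.00 ),
( 5, 'T0001',  19975,  -50.00 ),
( 6, 'T0001', 122202,   50.00 ),
( 7, 'T0001', 122203,   50.00 );

-- Raw previous/next balances within each CODE, ordered by ID.
-- The third argument (0) is the default used when no neighbour exists.
SELECT
    ID,
    CODE,
    BALANCE,
    LAG  ( BALANCE, 1, 0 ) OVER ( PARTITION BY CODE ORDER BY ID ) AS PREV_BALANCE,
    LEAD ( BALANCE, 1, 0 ) OVER ( PARTITION BY CODE ORDER BY ID ) AS NEXT_BALANCE
FROM @Data
ORDER BY ID;

-- ID | BALANCE | PREV_BALANCE | NEXT_BALANCE
--  1 |  100.00 |         0.00 |      -100.00
--  2 | -100.00 |       100.00 |       500.00
--  3 |  500.00 |      -100.00 |         0.00
--  4 |   50.00 |         0.00 |       -50.00
--  5 |  -50.00 |        50.00 |        50.00
--  6 |   50.00 |       -50.00 |        50.00
--  7 |   50.00 |        50.00 |         0.00
```

A row qualifies when its BALANCE plus either neighbour is 0, which is why rows 3 and 7 are excluded and row 6 is included.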

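Update, in response to the asker's comment: "I just want the NUMBER values that cancel each other out. For T0001, it doesn't matter which number it returns to cancel the negative value, as long as it returns only one pair of values. For example, for T0001 it could return rows 4 and 5, rows 5 and 6, or rows 5 and 7. All of them are valid, but I only want one of them."

This edit returns one NUMBER for each CODE that meets the "zero-out" criteria: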

SELECT
    CODE, MIN ( NUMBER ) AS MIN_NUMBER
FROM (
    
    SELECT
        ID,
        CODE,
        NUMBER,
        BALANCE,
        LAG ( BALANCE, 1, 0 ) OVER ( PARTITION BY CODE ORDER BY CODE, ID ) AS LAG_BALANCE, 
        LEAD ( BALANCE, 1, 0 ) OVER ( PARTITION BY CODE ORDER BY CODE, ID ) AS LEAD_BALANCE 
    FROM @Data

) AS Results
WHERE
    BALANCE + LAG_BALANCE = 0
    OR
    BALANCE + LEAD_BALANCE = 0
GROUP BY
    CODE
ORDER BY
    CODE;

Returns

+-------+------------+
| CODE  | MIN_NUMBER |
+-------+------------+
| B0001 |     122960 |
| T0001 |      19755 |
+-------+------------+

Update #2:

/*
    Return the first TWO rows for a CODE with BALANCEs that zero-out each other.
*/
SELECT
    ID, CODE, NUMBER, BALANCE, ( BALANCE + LAG_BALANCE ) AS LAG_BALANCE, ( BALANCE + LEAD_BALANCE ) AS LEAD_BALANCE
FROM (
    
    SELECT
        ID,
        CODE,
        NUMBER,
        BALANCE,
        LAG ( BALANCE, 1, 0 ) OVER ( PARTITION BY CODE ORDER BY CODE, ID ) AS LAG_BALANCE, 
        LEAD ( BALANCE, 1, 0 ) OVER ( PARTITION BY CODE ORDER BY CODE, ID ) AS LEAD_BALANCE,
        ROW_NUMBER() OVER ( PARTITION BY CODE ORDER BY CODE, ID ) AS CODE_ROW
    FROM @Data

) AS Results
WHERE
    CODE_ROW <= 2
    AND ( BALANCE + LAG_BALANCE = 0 OR BALANCE + LEAD_BALANCE = 0 )
ORDER BY
    ID;

Returns

+----+-------+--------+---------+-------------+--------------+
| ID | CODE  | NUMBER | BALANCE | LAG_BALANCE | LEAD_BALANCE |
+----+-------+--------+---------+-------------+--------------+
|  1 | B0001 | 122960 | 100.00  | 100.00      | 0.00         |
|  2 | B0001 | 123168 | -100.00 | 0.00        | 400.00       |
|  4 | T0001 |  19755 | 50.00   | 50.00       | 0.00         |
|  5 | T0001 |  19975 | -50.00  | 0.00        | 0.00         |
+----+-------+--------+---------+-------------+--------------+

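【Comments】:

  • I don't understand what you are doing here. I only want to return any two rows for T0001 that combine to a balance of 0. Your result gives three rows. Can you clarify?
  • My result set returns two rows in the (updated) example I provided, not three, so I'm not sure how you got three. Basically, the subquery pairs each record's BALANCE with the LAG and LEAD balances for its CODE, and the outer query then does the math, returning any row whose balance plus its lag or lead balance is zero. GROUP BY CODE and MIN return one NUMBER value per CODE, which, if I understand your question, is what you're looking for.
  • You said: "For example, for T0001 it could return rows 4 and 5, rows 5 and 6, or rows 5 and 7. All of them are valid, but I only want one of them." Do you actually want the first two matching rows? Your requirement is not very clear.
  • @Chesterfield See my update #2. I think I understand better what you want.
  • This works, but if I change row 5 to a positive value and row 6 to a negative one, it returns only one row for T0001. I'm using GMB's answer above. Thanks for your help.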
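The behaviour described in the last comment can be reproduced with a sketch like the following, which reruns the Update #2 query after swapping the signs of rows 5 and 6 (the modified data is hypothetical, taken from the comment's description):

```sql
-- Mock-up data with the signs of rows 5 and 6 swapped (hypothetical).
DECLARE @Data TABLE ( ID INT, CODE VARCHAR(10), NUMBER INT, BALANCE DECIMAL(18,2) );
INSERT INTO @Data ( ID, CODE, NUMBER, BALANCE ) VALUES
( 1, 'B0001', 122960,  100.00 ),
( 2, 'B0001', 123168, -100.00 ),
( 3, 'B0001', 121400,  500.00 ),
( 4, 'T0001',  19755,   50.00 ),
( 5, 'T0001',  19975,   50.00 ),  -- was -50.00
( 6, 'T0001', 122202,  -50.00 ),  -- was  50.00
( 7, 'T0001', 122203,   50.00 );

-- Update #2 query, trimmed to the columns needed to see the effect.
SELECT ID, CODE, NUMBER, BALANCE, CODE_ROW
FROM (
    SELECT
        ID, CODE, NUMBER, BALANCE,
        LAG  ( BALANCE, 1, 0 ) OVER ( PARTITION BY CODE ORDER BY ID ) AS LAG_BALANCE,
        LEAD ( BALANCE, 1, 0 ) OVER ( PARTITION BY CODE ORDER BY ID ) AS LEAD_BALANCE,
        ROW_NUMBER() OVER ( PARTITION BY CODE ORDER BY ID ) AS CODE_ROW
    FROM @Data
) AS Results
WHERE
    CODE_ROW <= 2
    AND ( BALANCE + LAG_BALANCE = 0 OR BALANCE + LEAD_BALANCE = 0 )
ORDER BY ID;

-- Returns IDs 1, 2 and 5 only.
```

Row 6 does cancel row 5, but it is the third T0001 row, so the `CODE_ROW <= 2` filter removes it and the pair is left incomplete.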
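【Solution 2】:

You want to match rows with opposite balances, but each row may be matched only once.

One option is to enumerate the rows with row_number() first. You can then use the self-join solution, adding the row number to the join condition. I prefer exists, but the logic is the same: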

with  cte as (
    select 
        t.*, 
        row_number() over(partition by code, balance order by id) rn
    from mytable t
)
select *
from cte c
where exists (
    select 1 
    from cte c1 
    where c1.code = c.code and c1.rn = c.rn and c1.balance + c.balance = 0
)
order by code, id

Demo on DB Fiddle

ID | CODE  | NUMBER | BALANCE | rn
-: | :---- | -----: | ------: | -:
 1 | B0001 | 122960 |  100.00 |  1
 2 | B0001 | 123168 | -100.00 |  1
 4 | T0001 |  19755 |   50.00 |  1
 5 | T0001 |  19975 |  -50.00 |  1
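The self-join form mentioned above is not shown in the answer. A minimal sketch of it, assuming the sample rows sit in a table variable named `@Data` instead of `mytable`, might look like this:

```sql
-- Mock-up of the question's table (hypothetical stand-in for mytable).
DECLARE @Data TABLE ( ID INT, CODE VARCHAR(10), NUMBER INT, BALANCE DECIMAL(18,2) );
INSERT INTO @Data ( ID, CODE, NUMBER, BALANCE ) VALUES
( 1, 'B0001', 122960,  100.00 ),
( 2, 'B0001', 123168, -100.00 ),
( 3, 'B0001', 121400,  500.00 ),
( 4, 'T0001',  19755,   50.00 ),
( 5, 'T0001',  19975,  -50.00 ),
( 6, 'T0001', 122202,   50.00 ),
( 7, 'T0001', 122203,   50.00 );

-- Number the rows within each (CODE, BALANCE) group, then join each row
-- to the row with the opposite balance and the same row number.
WITH cte AS (
    SELECT
        ID, CODE, NUMBER, BALANCE,
        ROW_NUMBER() OVER ( PARTITION BY CODE, BALANCE ORDER BY ID ) AS rn
    FROM @Data
)
SELECT c.ID, c.CODE, c.NUMBER, c.BALANCE, c.rn
FROM cte AS c
JOIN cte AS c1
    ON  c1.CODE = c.CODE
    AND c1.rn = c.rn
    AND c1.BALANCE + c.BALANCE = 0
ORDER BY c.CODE, c.ID;

-- Returns IDs 1, 2, 4 and 5, the same rows as the exists version.
```

Because rn restarts for every (CODE, BALANCE) group, the n-th 50.00 row can only pair with the n-th -50.00 row, so each row is matched at most once.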

【Comments】:

【Solution 3】:

Something like this:

    ;with 
    neg_cte as (select *, row_number() over(partition by code, balance order by id) rn 
                from @Data where BALANCE<0),
    pos_cte as (select *, row_number() over(partition by code, balance order by id) rn 
                from @Data where BALANCE>0)
    select * from neg_cte
    union all
    select pc.* from neg_cte nc join pos_cte pc on nc.CODE=pc.CODE
                                                and nc.BALANCE=pc.BALANCE*-1
                                                and nc.rn=pc.rn
    order by ID;
    

    Result

    ID  CODE    NUMBER  BALANCE rn
    1   B0001   122960  100.00  1
    2   B0001   123168  -100.00 1
    4   T0001   19755   50.00   1
    5   T0001   19975   -50.00  1
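As written, the first branch of the UNION ALL returns every negative row, whether or not a positive row cancels it. The sample data has no unmatched negatives, so the result above is unaffected. A sketch of a variant that returns only matched pairs, one pair per row, follows; row 8 is a hypothetical addition used to show the difference, and the `NEG_NUMBER`, `POS_NUMBER`, and `AMOUNT` column names are illustrative:

```sql
DECLARE @Data TABLE ( ID INT, CODE VARCHAR(10), NUMBER INT, BALANCE DECIMAL(18,2) );
INSERT INTO @Data ( ID, CODE, NUMBER, BALANCE ) VALUES
( 1, 'B0001', 122960,  100.00 ),
( 2, 'B0001', 123168, -100.00 ),
( 3, 'B0001', 121400,  500.00 ),
( 4, 'T0001',  19755,   50.00 ),
( 5, 'T0001',  19975,  -50.00 ),
( 6, 'T0001', 122202,   50.00 ),
( 7, 'T0001', 122203,   50.00 ),
( 8, 'T0001', 122300,  -30.00 );  -- hypothetical negative with no matching positive

WITH
neg_cte AS (
    SELECT ID, CODE, NUMBER, BALANCE,
           ROW_NUMBER() OVER ( PARTITION BY CODE, BALANCE ORDER BY ID ) AS rn
    FROM @Data
    WHERE BALANCE < 0
),
pos_cte AS (
    SELECT ID, CODE, NUMBER, BALANCE,
           ROW_NUMBER() OVER ( PARTITION BY CODE, BALANCE ORDER BY ID ) AS rn
    FROM @Data
    WHERE BALANCE > 0
)
-- Only negatives that find a positive partner survive the inner join.
SELECT
    nc.CODE,
    nc.NUMBER  AS NEG_NUMBER,
    pc.NUMBER  AS POS_NUMBER,
    pc.BALANCE AS AMOUNT
FROM neg_cte AS nc
JOIN pos_cte AS pc
    ON  nc.CODE = pc.CODE
    AND nc.BALANCE = -pc.BALANCE
    AND nc.rn = pc.rn
ORDER BY nc.CODE, nc.rn;

-- CODE  | NEG_NUMBER | POS_NUMBER | AMOUNT
-- B0001 |     123168 |     122960 | 100.00
-- T0001 |      19975 |      19755 |  50.00
```

Row 8 is dropped because no 30.00 row exists for T0001, and placing both NUMBER values on one row makes each cancelling pair explicit.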
    

【Comments】:
