If you only want a single round of replacement (i.e. aaabbbb becomes aabb), you can use:
CREATE OR ALTER FUNCTION dbo.RemoveDuplicates (@value varchar(200))
RETURNS VARCHAR(200)
WITH SCHEMABINDING
AS
BEGIN
DECLARE @result varchar(200) = @value;
DECLARE @i int = 65;
-- A-Z is ASCII 65-90; the bound is inclusive so that Z is processed too
WHILE @i <= 90
BEGIN
SET @result = REPLACE(@result, CHAR(@i) + CHAR(@i), CHAR(@i));
SET @i += 1
END;
RETURN @result;
END;
GO
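To see what a single round does to a run of identical characters, here is a minimal sketch that applies REPLACE directly to a literal. It only illustrates one halving step per letter and does not depend on the function above; the column aliases are made up for the example.

```sql
-- One REPLACE pass halves each run (rounding up); it does not collapse it to a single character
SELECT REPLACE('aaabbbb', 'aa', 'a') AS AfterA,                         -- aabbbb
       REPLACE(REPLACE('aaabbbb', 'aa', 'a'), 'bb', 'b') AS AfterAAndB; -- aabb
```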
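But it seems you need the replacement applied repeatedly, so that every character that is the same as the one before it is removed.
So we can use this version, which is similar to the other answer.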
CREATE OR ALTER FUNCTION dbo.RemoveDuplicates (@value varchar(200))
RETURNS varchar(200)
WITH SCHEMABINDING
AS
BEGIN
DECLARE @c char(1);
DECLARE @cLast char(1) = LEFT(@value, 1);
DECLARE @result varchar(200) = @cLast;
DECLARE @strlen int = LEN(@value);
DECLARE @i int = 2;
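-- Walk the string from the second character through the last one (inclusive)
WHILE (@i <= @strlen)
BEGIN
SET @c = SUBSTRING(@value, @i, 1);
-- Keep the character only if it differs from the previous one, and remember it
IF (@c <> @cLast)
BEGIN
SET @result += @c;
SET @cLast = @c;
END;
SET @i += 1;
END;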
RETURN @result;
END;
GO
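I rewrote it as an inline table-valued function and found it to be noticeably faster. Here are two versions, depending on whether you can use STRING_AGG (SQL Server 2017 and later):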
CREATE OR ALTER FUNCTION dbo.RemoveDuplicatesXML (@value varchar(200))
RETURNS TABLE
WITH SCHEMABINDING
AS RETURN
(
WITH L1 AS (SELECT n FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) v(n)),
L2 AS (SELECT 1 n FROM L1 A CROSS JOIN L1 B),
Nums AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) rn FROM L2),
Chars AS (SELECT TOP(LEN(@value)) rn FROM Nums)
SELECT (
SELECT SUBSTRING(@value, rn, 1)
FROM Chars
WHERE rn = 1 OR SUBSTRING(@value, rn - 1, 1) <> SUBSTRING(@value, rn, 1)
ORDER BY rn
FOR XML PATH(''), TYPE
).value('text()[1]','nvarchar(max)') Result
);
GO
CREATE OR ALTER FUNCTION dbo.RemoveDuplicatesAGG (@value varchar(200))
RETURNS TABLE
WITH SCHEMABINDING
AS RETURN
(
WITH L1 AS (SELECT n FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) v(n)),
L2 AS (SELECT 1 n FROM L1 A CROSS JOIN L1 B),
Nums AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) rn FROM L2),
Chars AS (SELECT TOP(LEN(@value)) rn FROM Nums)
SELECT STRING_AGG(SUBSTRING(@value, rn, 1), '') WITHIN GROUP (ORDER BY rn) Result
FROM Chars
WHERE rn = 1 OR SUBSTRING(@value, rn - 1, 1) <> SUBSTRING(@value, rn, 1)
);
GO
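This uses Itzik Ben-Gan's famous inline tally-table method to split the string into individual characters. L1 has 16 rows and L2 cross joins it with itself, giving 256 numbers. If the string can be longer than 256 characters, you need another CROSS JOIN or more (1) rows in the VALUES list.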
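As a rough sketch of what "another CROSS JOIN" could look like, the standalone query below adds one more level to the same CTE chain and counts the numbers it produces. The name L3 is chosen here for illustration; to use it inside the functions you would presumably also widen the varchar(200) parameter.

```sql
-- Standalone sketch: the same tally CTEs with one extra CROSS JOIN level (L3 is a hypothetical name)
WITH L1 AS (SELECT n FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) v(n)), -- 16 rows
L2 AS (SELECT 1 n FROM L1 A CROSS JOIN L1 B), -- 16 * 16 = 256 rows
L3 AS (SELECT 1 n FROM L2 A CROSS JOIN L2 B), -- 256 * 256 = 65,536 rows
Nums AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) rn FROM L3)
SELECT COUNT(*) AS AvailableNumbers, MAX(rn) AS HighestNumber
FROM Nums;
-- Both columns return 65536
```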
There are two ways to use it, and performance should be the same.
As a scalar subquery:
SELECT (SELECT * FROM RemoveDuplicatesAGG(t.MyString)) Result
FROM myTable t
Or with APPLY:
SELECT d.Result
FROM myTable t
CROSS APPLY RemoveDuplicatesAGG(t.MyString) d
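For a quick check of the APPLY form, here is a small sketch that assumes dbo.RemoveDuplicatesAGG has been created as above. The table variable and its sample strings are made up and simply stand in for myTable.

```sql
-- Hypothetical sample data standing in for myTable
DECLARE @myTable TABLE (MyString varchar(200));
INSERT INTO @myTable (MyString) VALUES ('aaabbbb'), ('aabbaa'), ('abc');

-- Each run of identical adjacent characters collapses to a single character
SELECT t.MyString, d.Result
FROM @myTable t
CROSS APPLY dbo.RemoveDuplicatesAGG(t.MyString) d;
-- aaabbbb -> ab, aabbaa -> aba, abc -> abc
```

Note that an empty string produces no rows for STRING_AGG to aggregate, so Result comes back as NULL in that case.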