【Question Title】: Replace every character of a string with another character in MS SQL Server 2014
【Posted】: 2020-03-30 13:40:36
【Question】:

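I am using Microsoft SQL Server 2014 and am trying to update some columns in a table.

I want to replace every character of a string with another character.

For example, take the word:

HELLO123

I want to replace H with T, E with Q, L with Y, O with I, 1 with 6, 2 with 7, 3 with 8, and so on.

I'm not sure whether Microsoft SQL Server 2014 supports regular expressions, and creating a function that loops through each character and replaces it would take a long time on a table with a few million rows.

Does anyone have a solution that works like a regular expression and is fast?

Thanks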

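【Question Comments】:

  • Why not take a look at EncryptByPassPhrase()?
  • I don't think there is any clever way to do this. Only brute force.
  • As John hinted, what is your actual goal? Simple character substitution is easily reversed. Perhaps a discussion of data obfuscation would be useful.
  • There's no particularly simple or elegant way to do this, which is why SQL Server 2017 added TRANSLATE. On 2014 you're looking at things like an inline table-valued function using recursion, and even that won't be very fast when applied to millions of rows. A CLR function is another possible alternative, though it comes with a learning and deployment curve.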

Tags: sql-server sql-server-2014 scramble


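【Solution 1】:

It is possible to do such a scramble on SQL Server 2014,
even without a UDF or CLR.

Here is a method that uses an OUTER APPLY on a FOR XML to unfold and replace the characters in the [0-9A-Za-z] range.

Sample data: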

create table test 
(
  id int identity(1,1) primary key,
  col nvarchar(42)
);

insert into test (col) values
(N'HELLO 0123'),
(N'01234π56789'),
(N'abcdefghijklm>nopqrstuvwxyz'),
(N'ABCDEFGHIJKLM✓NOPQRSTUVWXYZ');

Numbers:

--
-- Temporary tally table with numbers
-- Will be used to unfold the characters
--
if object_id('tempdb..#nums') is not null
  drop table #nums;

create table #nums (n int primary key);

with rcte as
(
 select 1 n, max(len(col)) max_n
 from test
 union all 
 select n+1, max_n
 from rcte 
 where n < max_n
)
insert #nums (n)
select n 
from rcte
option (maxrecursion 4000);

Query:

select t.*, a.scramble
from test t
outer apply
(
  select q.x.value('.','NVARCHAR(MAX)') as scramble
  from
  (
    select
     case 
     when substring(col,n,1) between N'0' and N'9'
     then substring(
      N'5678901234',charindex(substring(col,n,1),
      N'0123456789'),1)
     when unicode(substring(col,n,1)) between unicode(N'a') and unicode(N'z')
     then substring(
      N'nomrqputswvyzxiacbedghfjlk',charindex(substring(col,n,1),
      N'abcdefghijklmnopqrstuvwxyz'),1)
     when unicode(substring(col,n,1)) between unicode(N'A') and unicode(N'Z')
     then substring(
      N'NOMRQPUTSWVYZXIACBEDGHFJLK',charindex(substring(col,n,1),
      N'ABCDEFGHIJKLMNOPQRSTUVWXYZ'),1)
     else substring(col,n,1)
     end [text()]
    from #nums
    where n between 1 and len(col)
    order by n
    for xml path (''), type
  ) q(x)
  where q.x is not null
) a;

Result:

id | col | scramble
-: | :-------------------------- | :--------------------------
1 | HELLO 0123 | TQYYI 5678
2 | 01234π56789 | 56789π01234
3 | abcdefghijklm>nopqrstuvwxyz | nomrqputswvyz>xiacbedghfjlk
4 | ABCDEFGHIJKLM✓NOPQRSTUVWXYZ | NOMRQPUTSWVYZ✓XIACBEDGHFJLK

A test on db<>fiddle here

--

A solution more specific to VARCHAR:

select t.*, a.scramble
from test t
outer apply
(
  select q.x.value('.', 'VARCHAR(MAX)') as scramble
  from
  (
    select
     case 
     when substring(col,n,1) between '0' and '9'
     then substring(
      '5678901234',charindex(substring(col,n,1),
      '0123456789'),1)
     when ascii(substring(col,n,1)) between ascii('a') and ascii('z')
     then substring(
      'nomrqputswvyzxiacbedghfjlk',charindex(substring(col,n,1),
      'abcdefghijklmnopqrstuvwxyz'),1)
     when ascii(substring(col,n,1)) between ascii('A') and ascii('Z')
     then substring(
      'NOMRQPUTSWVYZXIACBEDGHFJLK',charindex(substring(col,n,1),
      'ABCDEFGHIJKLMNOPQRSTUVWXYZ'),1)
     else substring(col,n,1)
     end
    from #nums
    where n between 1 and len(col)
    order by n
    for xml path (''), type
  ) q(x)
  where q.x is not null
) a;

Or, a solution that does a rotation scramble (digits shifted by 5, letters by 13):

select t.*, a.scramble
from test t
outer apply
(
  select q.x.value('.', 'VARCHAR(MAX)') as scramble
  from
  (
    select 
    case
    when substring(col,n,1) between '0' and '9'
    then char(ascii('0')+(ascii(substring(col,n,1))-ascii('0')+5)%10)
    when ascii(substring(col,n,1)) between ascii('a') and ascii('z')
    then char(ascii('a')+(ascii(substring(col,n,1))-ascii('a')+13)%26)
    when ascii(substring(col,n,1)) between ascii('A') and ascii('Z')
    then char(ascii('A')+(ascii(substring(col,n,1))-ascii('A')+13)%26)
    else substring(col,n,1)
    end
    from #nums
    where n between 1 and len(col)
    order by n
    for xml path (''), type
  ) q(x)
) a
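
The queries above only select the scrambled value, while the question asks for an update. The following is a minimal sketch of how the rotation variant might feed an UPDATE ... FROM, assuming the test table and the #nums tally table from above already exist. Since col is NVARCHAR in the sample data, the sketch swaps ascii()/char() for unicode()/nchar(); the aliases c and u are introduced here purely for illustration.

```sql
-- Sketch only: assumes the test table and #nums tally table created above.
-- unicode()/nchar() are used because test.col is NVARCHAR.
update t
set    t.col = a.scramble
from   test as t
cross apply
(
  select q.x.value('.', 'NVARCHAR(MAX)') as scramble
  from
  (
    select
      case
        when c.u between unicode(N'0') and unicode(N'9')
          then nchar(unicode(N'0') + (c.u - unicode(N'0') + 5) % 10)
        when c.u between unicode(N'a') and unicode(N'z')
          then nchar(unicode(N'a') + (c.u - unicode(N'a') + 13) % 26)
        when c.u between unicode(N'A') and unicode(N'Z')
          then nchar(unicode(N'A') + (c.u - unicode(N'A') + 13) % 26)
        else substring(t.col, n.n, 1)   -- any other character is kept as-is
      end
    from #nums as n
    -- code point of the n-th character, computed once per position
    cross apply (select unicode(substring(t.col, n.n, 1)) as u) as c
    where n.n between 1 and len(t.col)  -- len() ignores trailing spaces
    order by n.n
    for xml path (''), type
  ) q(x)
  where q.x is not null                 -- rows with NULL or empty col are left untouched
) a;
```

Because the rotation is by half the alphabet (13 of 26) and half the digits (5 of 10), running the same statement a second time restores the original values. Note also that len() does not count trailing spaces, so any trailing spaces in col would not be carried over.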

A test on db<>fiddle here

Note that for a SQL Server 2017+ solution, STRING_AGG can replace the FOR XML. But then again, one could simply use TRANSLATE.

Example:

UPDATE test
SET col = TRANSLATE(col, 
           '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ' collate Latin1_General_CS_AS, 
           '5678901234nomrqputswvyzxiacbedghfjlkNOMRQPUTSWVYZXIACBEDGHFJLK');
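
For comparison, here is a rough sketch of what the tally-table approach could look like on SQL Server 2017+ with STRING_AGG doing the concatenation instead of FOR XML. It assumes the same test table and #nums table as above, and to stay short it only remaps digits (letters pass through unchanged); the aliases c and ch are illustrative.

```sql
-- Sketch only: requires SQL Server 2017 or later.
-- Assumes the test table and #nums tally table created above.
select t.id, t.col, a.scramble
from test as t
outer apply
(
  select string_agg(
           case
             -- code points 48..57 are the digits '0'..'9';
             -- position (code point - 47) picks the replacement digit
             when unicode(c.ch) between 48 and 57
               then substring(N'5678901234', unicode(c.ch) - 47, 1)
             else c.ch
           end,
           N''                                -- empty separator: plain concatenation
         ) within group (order by n.n) as scramble
  from #nums as n
  cross apply (select substring(t.col, n.n, 1) as ch) as c
  where n.n between 1 and len(t.col)
) as a;
```

On a 2017+ server TRANSLATE remains the simpler choice; the sketch is only meant to show which part of the 2014 query the aggregate replaces.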

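【Comments】:

  • Hi LukStorms, thanks for your help. I'm trying to understand the purpose of the #nums table. Can you explain it? If I have to apply the same logic to another column in my data table, do I need to create another table like #nums?
  • @yoohoo It's just a simple temporary tally table with numbers. It is used so that substring can get the Nth character. In fact, you could create a permanent table with just numbers, assuming you know the strings won't be longer than N characters. The insert I used limits it to the maximum length used in the column, but you could simply adjust it so that it loads, for example, 1000 numbers.
  • By the way, if you're replacing multiple columns on the same table, then maybe it's worth the effort to write a CLR function that does something like TRANSLATE. For example, this SO post.
  • @yoohoo FYI, the answer has been updated with a fix for the special XML characters.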
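
The permanent numbers table mentioned in the comments could be sketched roughly as follows. The table name dbo.Numbers and the size of 1000 are assumptions for illustration; the size should be at least the longest string to be scrambled.

```sql
-- Sketch only: dbo.Numbers and the 1000-row size are hypothetical choices.
create table dbo.Numbers (n int not null primary key);

with rcte as
(
  select 1 as n
  union all
  select n + 1
  from rcte
  where n < 1000
)
insert into dbo.Numbers (n)
select n
from rcte
option (maxrecursion 1000);  -- the default limit of 100 recursions would be too low
```

With such a table in place, dbo.Numbers can stand in for #nums in the queries above, and the same table serves every column whose values are at most 1000 characters long, so no extra table per column is needed.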
【Solution 2】:
--this could work if the strings are all uppercase --<-- not true
--you could use nchar & foreign characters and handle everything correctly...it is just a pain to type...
declare @s varchar(20) = 'HELLO123';

--lower case everything
select @s = lower(@s)

--handle numbers with non printable characters  --> number to char() --> char() to new number
select @s = replace(replace(replace(replace(replace(replace(@s, '1', char(1)), '2', char(2)), '3', char(3)), char(1), '6'), char(2), '7'), char(3), '8')


--handle letters with case sensitive replacement (using a CS collation)
--(all letters are lowercased)lowercase letter --> new uppercase letter
select replace(replace(replace(replace(@s collate SQL_Latin1_General_CP1_CS_AS, 'h' , 'T'), 'e', 'Q'), 'l', 'Y'),  'o', 'I'); 



SELECT UPPER(REPLACE(REPLACE(LOWER('HELLO') collate SQL_Latin1_General_CP1_CS_AS, 'h','E'),'e','Q'));


GO

/*
SELECT dbo.shiftchars('0123456789::ABCDEFGHIJKLMNOPQRSTUVWXYZ::abcdefghijklmnopqrstuvwxyz');

just for fun *using a single replace()*
not for millions of rows
suitable for standard latin alphanumeric
extended ascii & non printable chars are not handled correctly.
*/

CREATE FUNCTION dbo.shiftchars(@s VARCHAR(8000))
RETURNS VARCHAR(8000)
WITH SCHEMABINDING, RETURNS NULL ON NULL INPUT
AS
BEGIN
        SELECT @s = REPLACE(@s, f collate SQL_Latin1_General_CP1_CS_AS , t)
        FROM
        (
        VALUES
            (1, '1', CHAR(1)),
            (1, '2', CHAR(2)),
            (1, '3', CHAR(3)),
            (1, '4', CHAR(4)),
            (1, '5', CHAR(5)),
            (1, '6', CHAR(6)),
            (1, '7', CHAR(7)),
            (1, '8', CHAR(8)),
            (1, '9', CHAR(9)),
            (1, '0', CHAR(254)),
            (1, 'A', CHAR(128)),
            (1, 'B', CHAR(129)),
            (1, 'C', CHAR(130)),
            (1, 'D', CHAR(131)),
            (1, 'E', CHAR(132)),
            (1, 'F', CHAR(133)),
            (1, 'G', CHAR(134)),
            (1, 'H', CHAR(135)),
            (1, 'I', CHAR(136)),
            (1, 'J', CHAR(137)),
            (1, 'K', CHAR(138)),
            (1, 'L', CHAR(139)),        
            (1, 'M', CHAR(140)),
            (1, 'N', CHAR(141)),
            (1, 'O', CHAR(142)),        
            (1, 'P', CHAR(143)),
            (1, 'Q', CHAR(144)),
            (1, 'R', CHAR(145)),        
            (1, 'S', CHAR(146)),
            (1, 'T', CHAR(147)),    
            (1, 'U', CHAR(148)),        
            (1, 'V', CHAR(149)),
            (1, 'W', CHAR(150)),
            (1, 'X', CHAR(151)),        
            (1, 'Y', CHAR(152)),
            (1, 'Z', CHAR(153)),
            (1, 'a', CHAR(154)),
            (1, 'b', CHAR(155)),
            (1, 'c', CHAR(156)),
            (1, 'd', CHAR(157)),
            (1, 'e', CHAR(158)),
            (1, 'f', CHAR(159)),
            (1, 'g', CHAR(160)),
            (1, 'h', CHAR(161)),
            (1, 'i', CHAR(162)),
            (1, 'j', CHAR(163)),
            (1, 'k', CHAR(164)),
            (1, 'l', CHAR(165)),        
            (1, 'm', CHAR(166)),
            (1, 'n', CHAR(167)),
            (1, 'o', CHAR(168)),        
            (1, 'p', CHAR(169)),
            (1, 'q', CHAR(170)),
            (1, 'r', CHAR(171)),        
            (1, 's', CHAR(172)),
            (1, 't', CHAR(173)),    
            (1, 'u', CHAR(174)),        
            (1, 'v', CHAR(175)),
            (1, 'w', CHAR(176)),
            (1, 'x', CHAR(177)),        
            (1, 'y', CHAR(178)),
            (1, 'z', CHAR(179)),
            --------------------
            (2, CHAR(1), '6'),
            (2, CHAR(2), '7'),
            (2, CHAR(3), '8'),
            (2, CHAR(4), '9'),
            (2, CHAR(5), '0'),
            (2, CHAR(6), '1'),
            (2, CHAR(7), '2'),
            (2, CHAR(8), '3'),
            (2, CHAR(9), '4'),
            (2, CHAR(254), '5'),
            (2, CHAR(128), 'M'),
            (2, CHAR(129), 'N'),
            (2, CHAR(130), 'O'),
            (2, CHAR(131), 'P'),
            (2, CHAR(132), 'Q'),
            (2, CHAR(133), 'R'),
            (2, CHAR(134), 'S'),
            (2, CHAR(135), 'T'),
            (2, CHAR(136), 'U'),
            (2, CHAR(137), 'V'),
            (2, CHAR(138), 'W'),
            (2, CHAR(139), 'X'),            
            (2, CHAR(140), 'Y'),
            (2, CHAR(141), 'Z'),
            (2, CHAR(142), 'A'),            
            (2, CHAR(143), 'B'),
            (2, CHAR(144), 'C'),
            (2, CHAR(145), 'D'),            
            (2, CHAR(146), 'E'),
            (2, CHAR(147), 'F'),        
            (2, CHAR(148), 'G'),            
            (2, CHAR(149), 'H'),
            (2, CHAR(150), 'I'),
            (2, CHAR(151), 'J'),            
            (2, CHAR(152), 'K'),
            (2, CHAR(153), 'L'),
            (2, CHAR(154), 'm'),
            (2, CHAR(155), 'n'),
            (2, CHAR(156), 'o'),
            (2, CHAR(157), 'p'),
            (2, CHAR(158), 'q'),
            (2, CHAR(159), 'r'),
            (2, CHAR(160), 's'),
            (2, CHAR(161), 't'),
            (2, CHAR(162), 'u'),
            (2, CHAR(163), 'v'),
            (2, CHAR(164), 'w'),
            (2, CHAR(165), 'x'),            
            (2, CHAR(166), 'y'),
            (2, CHAR(167), 'z'),
            (2, CHAR(168), 'a'),            
            (2, CHAR(169), 'b'),
            (2, CHAR(170), 'c'),
            (2, CHAR(171), 'd'),            
            (2, CHAR(172), 'e'),
            (2, CHAR(173), 'f'),        
            (2, CHAR(174), 'g'),            
            (2, CHAR(175), 'h'),
            (2, CHAR(176), 'i'),
            (2, CHAR(177), 'j'),            
            (2, CHAR(178), 'k'),
            (2, CHAR(179), 'l')                 
        ) AS v(o, f, t)
        ORDER BY o;

    RETURN (@s);

END

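【Comments】:

    【Solution 3】:

    If you hadn't noticed, the problem with REPLACE is that you need to nest the calls; however, because they are nested, something like REPLACE(REPLACE('HELLO','H','E'),'E','Q') returns 'QQLLO' and not 'EQLLO'. As mentioned in the comments, SQL Server 2017 introduced TRANSLATE, which processes each character only once; however, as you're using 2014, you can't use it (TRANSLATE('HELLO','HE','EQ')).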
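
    The nesting problem described above can be seen step by step in a small illustrative query (the column aliases are made up for the example):

    ```sql
    -- The inner REPLACE turns 'H' into 'E'; the outer REPLACE then rewrites
    -- that new 'E' along with the original one.
    SELECT REPLACE('HELLO', 'H', 'E')                    AS InnerOnly, -- EELLO
           REPLACE(REPLACE('HELLO', 'H', 'E'), 'E', 'Q') AS Nested;    -- QQLLO
    ```

    Any substitution whose output characters overlap with its input characters runs into this, which is why a per-character approach is needed on 2014.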

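    What you could do is create a lookup table, and then split the data into its characters and rebuild it. This won't be fast with a large amount of data, and no, it isn't going to get any faster; but it will "do the job":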

    --Create a table for the Cipher characters
    CREATE TABLE dbo.CharCipher (InputChar char(1) NOT NULL,
                                 OutputChar char(1) NOT NULL);
    GO
    
    --Add a Clustered Primary Key
    ALTER TABLE dbo.CharCipher ADD CONSTRAINT PK_CharCipher PRIMARY KEY CLUSTERED (InputChar);
    GO
    
    --Ensure that the Output character is unique too
    CREATE UNIQUE NONCLUSTERED INDEX UX_CipherOutput ON dbo.CharCipher (OutputChar);
    GO
    
    --Add your Ciphers
    INSERT INTO  dbo.CharCipher (InputChar,
                                 OutputChar)
    VALUES ('H','T'),
           ('E','Q'),
           ('L','Y'),
           ('O','I'),
           ('1','6'),
           ('2','7'),
           ('3','8');
    GO
    
    --Create a Sample table
    CREATE TABLE dbo.YourTable (YourString varchar(15));
    INSERT INTO dbo.YourTable (YourString)
    VALUES('HELLO123');
    GO
    
    --And now the "Mess"... I mean solution
    WITH N AS(
        SELECT N
        FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
    Tally AS(
        SELECT TOP (8000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
        FROM N N1, N N2, N N3, N N4)
    SELECT YT.YourString,
           (SELECT ISNULL(CC.OutputChar,V.YourChar)
            FROM Tally T
                 CROSS APPLY (VALUES(CONVERT(char(1),SUBSTRING(YT.YourString,T.I,1))))V(YourChar)
                 LEFT JOIN dbo.CharCipher CC ON V.YourChar = CC.InputChar
            WHERE T.I <= LEN(YT.YourString)
            ORDER BY T.I
            FOR XML PATH(''),TYPE).value('.','varchar(8000)') AS NewString
    FROM dbo.YourTable YT;
    
    GO
    
    --Clean up
    DROP TABLE dbo.YourTable;
    DROP TABLE dbo.CharCipher;
    

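    【Comments】:

    • Hi Lamu, thanks for your help. Could you briefly explain this query? For example, what does the N query in the WITH clause do? You created 10 NULL values; why 10 and not more or fewer? What does the Tally query do? I see TOP 8000. Why 8000 and not more or fewer? And you joined N 4 times; why 4 times? Please provide some explanation. Thanks in advance.
    • There's no "m" in my alias, @yoohoo. See The "Numbers" or "Tally" Table: What it is and how it replaces a loop. 8,000, because that is the most a varchar can hold (before using MAX, which comes with a performance penalty).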
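
    To make the numbers in that exchange concrete: cross joining a 10-row set with itself 4 times yields 10 × 10 × 10 × 10 = 10,000 rows, which is the smallest power of 10 that covers 8,000, and TOP (8000) then trims it to the maximum varchar(8000) length. A small sketch that checks this (the names Ten, x, and the result column aliases are illustrative, not from the answer):

    ```sql
    -- Sketch only: same construction as the Tally CTE in the answer, with renamed aliases.
    WITH Ten AS (
        -- 10 placeholder rows; the values themselves are never used
        SELECT x
        FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),
                    (NULL),(NULL),(NULL),(NULL),(NULL)) AS V(x)),
    Tally AS (
        -- 10 * 10 * 10 * 10 = 10,000 candidate rows, numbered and capped at 8000
        SELECT TOP (8000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
        FROM Ten T1, Ten T2, Ten T3, Ten T4)
    SELECT COUNT(*) AS RowsGenerated, -- 8000
           MIN(I)   AS FirstI,        -- 1
           MAX(I)   AS LastI          -- 8000
    FROM Tally;
    ```

    In the answer's query each I is then used as a character position in SUBSTRING, with WHERE T.I <= LEN(YT.YourString) discarding the positions beyond the end of each string.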