【Question title】: Replace whitespace using the PHP preg_replace function, ignoring quoted strings
【Posted】: 2010-05-24 09:00:32
【Question description】:

Consider the following string:

SELECT
    column1 ,
    column2, column3
FROM
    table1
WHERE
    column1 = 'text, "FROM" \'from\\\' x' AND
    column2 = "sample text 'where' \"where\\\" " AND
    ( column3 = 5 )

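I need to strip unnecessary whitespace characters from this string, for example:

  • Remove whitespace before and after characters such as , ( ) etc.
  • Remove newlines (\r\n) and tabs (\t)

There is one catch, though: the removal must not touch whitespace inside quoted strings, such as: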

  • 'text, "FROM" \'from\\\' x'
  • "sample text 'where' \"where\\\" "

and so on.

I need to use the PHP function: preg_replace($pattern, $replacement, $string);

So what should the values of $pattern and $replacement be, where $string is the SQL given above?

【Question comments】:

    Tags: php string whitespace pattern-matching preg-replace


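    【Solution 1】:

    A single regex pattern and replacement string will not do the job. Your first step could be to tokenize the input string: first try to match comments and string literals, then whitespace characters, and finally non-space characters. A quick demo: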

    $text = <<<BLOCK
    SELECT
        column1 ,
        column2, column3
    FROM
        table1
    -- a comment line ' " ...
    WHERE
        column1 = 'text, "FROM" \\'from\\\\\\' x' AND
        column2 = "sample text 'where' \\"where\\\\\\" " AND
        ( column3 = 5 )
    BLOCK;
    
    echo $text . "\n\n";
    
    preg_match_all('/
        --[^\r\n]*                # a comment line
        |                         # OR
        \'(?:\\\\.|[^\'\\\\])*\'  # a single quoted string
        |                         # OR
        "(?:\\\\.|[^"\\\\])*"     # a double quoted string
        |                         # OR
        `[^`]*`                   # a string surrounded by backticks
        |                         # OR
        \s+                       # one or more space chars
        |                         # OR
        \S+                       # one or more non-space chars
    /x', $text, $matches);
    
    print_r($matches);
    

    which produces:

    SELECT
        column1 ,
        column2, column3
    FROM
        table1
    -- a comment line ' " ...
    WHERE
        column1 = 'text, "FROM" \'from\\\' x' AND
        column2 = "sample text 'where' \"where\\\" " AND
        ( column3 = 5 )
    
    Array
    (
        [0] => Array
            (
                [0] => SELECT
                [1] => 
    
                [2] => column1
                [3] =>  
                [4] => ,
                [5] => 
    
                [6] => column2,
                [7] =>  
                [8] => column3
                [9] => 
    
                [10] => FROM
                [11] => 
    
                [12] => table1
                [13] => 
    
                [14] => -- a comment line ' " ...
                [15] => 
    
                [16] => WHERE
                [17] => 
    
                [18] => column1
                [19] =>  
                [20] => =
                [21] =>  
                [22] => 'text, "FROM" \'from\\\' x'
                [23] =>  
                [24] => AND
                [25] => 
    
                [26] => column2
                [27] =>  
                [28] => =
                [29] =>  
                [30] => "sample text 'where' \"where\\\" "
                [31] =>  
                [32] => AND
                [33] => 
    
                [34] => (
                [35] =>  
                [36] => column3
                [37] =>  
                [38] => =
                [39] =>  
                [40] => 5
                [41] =>  
                [42] => )
            )
    
    )
    

    You can then iterate over the tokenized $matches array and replace the whitespace matches as you see fit.
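
    As a rough sketch of that second step (not part of the original answer): the hypothetical helper compact_sql() below reuses the demo's pattern, collapses each whitespace token to a single space, and drops that space next to , ( and ). It also assumes that -- comment lines can simply be discarded, since a comment would swallow the rest of the query once the newlines are gone.

```php
<?php
// Hypothetical helper, for illustration only: tokenize with the demo's
// pattern, then rebuild the SQL while touching only whitespace tokens.
function compact_sql(string $sql): string
{
    $pattern = '/
        --[^\r\n]*                # a comment line
        |
        \'(?:\\\\.|[^\'\\\\])*\'  # a single quoted string
        |
        "(?:\\\\.|[^"\\\\])*"     # a double quoted string
        |
        `[^`]*`                   # a string surrounded by backticks
        |
        \s+                       # one or more space chars
        |
        \S+                       # one or more non-space chars
    /x';
    preg_match_all($pattern, $sql, $matches);

    // Pass 1: drop comments (an assumption, see above) and turn every
    // whitespace run into a single " " token, merging adjacent runs.
    $tokens = [];
    foreach ($matches[0] as $t) {
        if (strncmp($t, '--', 2) === 0) {
            continue;
        }
        if (preg_match('/\A\s+\z/', $t) === 1) {
            if ($tokens && end($tokens) !== ' ') {
                $tokens[] = ' ';
            }
        } else {
            $tokens[] = $t;
        }
    }

    // Pass 2: emit tokens, skipping a space that trails the input,
    // follows "(" or ",", or precedes ")" or ",".
    $out = '';
    foreach ($tokens as $i => $t) {
        if ($t === ' ') {
            $prev = $tokens[$i - 1];
            $next = $tokens[$i + 1] ?? null;
            if ($next === null
                || in_array(substr($prev, -1), ['(', ','], true)
                || in_array($next[0], [')', ','], true)) {
                continue;
            }
        }
        $out .= $t;
    }
    return $out;
}
```

    Quoted strings pass through as single tokens, so the whitespace inside them is left alone. The helper inherits every weakness of the tokenizer it is built on.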

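    But as you may have seen in my now-deleted comment, a better option is to use a dedicated SQL parser for this tokenization: I am not fluent in SQL, but I am pretty sure my demo above is easy to break.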
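
    To illustrate one way it breaks (the input below is a made-up example, not taken from the question): when no whitespace precedes an opening quote, the \S+ alternative starts matching first and runs straight through the quote, so the string literal is never seen as one token.

```php
<?php
// The same alternation as in the demo, written on one line.
$pattern = '/--[^\r\n]*|\'(?:\\\\.|[^\'\\\\])*\'|"(?:\\\\.|[^"\\\\])*"|`[^`]*`|\s+|\S+/';

// Hypothetical input: no space between "=" and the opening quote.
preg_match_all($pattern, "c='x  y'", $matches);

// The literal is split into "c='x", "  " and "y'", so the two spaces
// inside the quotes now look like ordinary removable whitespace.
print_r($matches[0]);
```

    Excluding quote characters from the last alternative, for example [^\s'"`]+ instead of \S+, would probably handle this particular input, but that is only a guess at a mitigation and no substitute for a real parser.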

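    【Comments】:

    • @dear bart, if I want to include `...`-style quotes in my list, what extra pattern do I have to add to the list above? Note that inside `...` quotes there will be no escapes (\' \") as there are in the other quotes, i.e. 'sdfsa\'sadfsaf', "asfasf\"sdfsf"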