Regex - Match multiples of anything between [[[ and ]]]

【Question title】: Regex - Match multiples of anything between [[[ and ]]]
【Posted】: 2012-09-10 13:42:41
【Question description】:

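I need to use a regular expression to match anything between [[[ and ]]]. I then need to put all of the values found between the brackets into an array.

Example text: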

here is some 'test text [[[media-2 large right]]], [[[image-0 large left]]] the another token [[[image-1]]

From the text above I need to match the first two:

1, [[[media-2 large right]]]
2, [[[image-0 large left]]]

But not the last one, because it only has two ] at the end.

【Discussion】:

Tags: php regex

【Solution 1】:

This checks for:

[[[
followed by any number of:
1. anything other than ], or
2. one or two ] that are not followed by another ]
followed by ]]]

preg_match_all('/\[\[\[(?:(?:[^\]]*|]{1,2}(?!]))*)]]]/', $string, $matches);
print_r($matches[0]);

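The advantage of this regex is that it also matches a ] inside the triple-bracket wrapper (e.g. [[[foo]bar]]]).

Note: ] does not need to be escaped, except inside a character class.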
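
As a rough check of the behaviour described above, the following sketch wraps the pattern in a helper and runs it on the sample text from the question. The function name `extractTokens` is an assumption made for illustration; it is not part of the answer.

```php
<?php
// Hypothetical helper (the name is illustrative): returns every complete
// [[[...]]] token found in $text, using the pattern from this answer.
function extractTokens(string $text): array
{
    $pattern = '/\[\[\[(?:(?:[^\]]*|]{1,2}(?!]))*)]]]/';
    // preg_match_all returns false on error; treat that as "no matches".
    if (preg_match_all($pattern, $text, $matches) === false) {
        return [];
    }
    return $matches[0];
}

$sample = "here is some 'test text [[[media-2 large right]]], "
        . "[[[image-0 large left]]] the another token [[[image-1]]";

// The unterminated [[[image-1]] at the end is skipped.
print_r(extractTokens($sample));

// A single ] inside the wrapper is accepted.
print_r(extractTokens('[[[foo]bar]]]'));
```

The first call should return the two complete tokens, and the second the whole string `[[[foo]bar]]]`.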

【Discussion】:

【Solution 2】:

A general solution is this:

\[{3}(?=.*?\]{3}(?!\]))((?:(?!\]{3}(?!\])).)*)

It reads:

\[{3}         # 3 opening square brackets
(?=           # begin positive look-ahead ("followed by...")
  .*?\]{3}    #   ...3 closing brackets, anywhere ahead (*see explanation below)
  (?!\])      #   negative look-ahead: no more ] after the 3rd one
)             # end positive look-ahead
(             # begin group 1
  (?:         #   begin non-capturing group (for grouping only)
    (?!       #     begin negative look-ahead ("not followed by"):
      \]{3}   #       ...3 closing square brackets
      (?!\])  #       negative look-ahead: no more ] after the 3rd one
    )         #     end negative look-ahead
    .         #     the next character is valid, match it
  )*          #   end non-capturing group, repeated as often as possible
)             # end group 1 (will contain the wanted substring)

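The positive look-ahead is a safeguard clause: it lets the expression fail fast when a long input string contains no "]]]" at all.

Once it is certain that "]]]" follows at some point in the string, the negative look-ahead makes sure the expression correctly matches strings like this: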

[[[foo [some text] bar]]]
                 ^
                 +-------- most of the other solutions would stop at this point

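This expression checks at every character whether three ] follow, so in this example it includes " bar".

The "no more ] after the 3rd one" part of the expression makes sure the match does not end prematurely, so in this case: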

[[[foo [some text]]]]

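the match is still "foo [some text]".
Without it, the expression would stop one character too early ("foo [some text").

A side effect is that we do not need to actually match the "]]]", because the positive look-ahead has already established that they are there. We only need to match up to them, and the negative look-ahead does that nicely.

Note that if your input contains line breaks, you need to run the expression in "dotall" mode.

See also: http://rubular.com/r/QFo9jHEh9d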
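
To make the two bracket-heavy cases above concrete, here is a minimal PHP sketch that applies the expression with the `s` (dotall) modifier. The helper name `extractContents` is an assumption for illustration only.

```php
<?php
// Hypothetical helper (the name is illustrative): returns the text between
// [[[ and ]]] for every token, using the pattern from this answer.
function extractContents(string $text): array
{
    // The "s" (dotall) modifier lets "." match newlines inside a token.
    $pattern = '/\[{3}(?=.*?\]{3}(?!\]))((?:(?!\]{3}(?!\])).)*)/s';
    if (preg_match_all($pattern, $text, $matches) === false) {
        return [];
    }
    // Group 1 is the content. The overall match ($matches[0]) ends before
    // the closing ]]] because those brackets are only looked ahead at.
    return $matches[1];
}

print_r(extractContents('[[[foo [some text] bar]]]'));
print_r(extractContents('[[[foo [some text]]]]'));
print_r(extractContents("[[[multi\nline]]] and [[[image-1]]"));
```

Since the closing brackets are never consumed, use group 1 for the content and append "]]]" yourself if you need the full token.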

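【Discussion】:

It does break if there is no text between [some text] and ]]]: it strips the trailing ] from some text.
@sixeightzero Thanks. Though I must say bfrohs' solution is great as well.
Thanks, I really appreciate you taking the time to break the regex down for me. Both this and bfrohs' solution work for me.

【Solution 3】:

A safer solution: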

\[{3}[^\]]+?\]{3}

【Discussion】:

[[[foo]bar]]] will not match.

【Solution 4】:

I think this works:

\[\[\[(.*)\]\]\]

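But this may be the newbie way of doing it :)

【Discussion】:

This seems to match everything rather than separating/grouping the tokens. I.e. for some text [[[media-2 large right]]] and more plain text [[[image-0 large left]]] blah blah it matches: [[[media-2 large right]]] and more plain text [[[image-0 large left]]]
My test at gskinner.com/RegExr seemed to work, but I'm no regExpert...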
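
The behaviour reported in the comment comes from the greedy `(.*)`, which runs to the last ]]] in the string. The sketch below reproduces it and compares it with a lazy `(.*?)`. The lazy variant is a common adjustment shown here for illustration, not something proposed in this answer.

```php
<?php
$text = 'some text [[[media-2 large right]]] and more plain text '
      . '[[[image-0 large left]]] blah blah';

// Greedy (.*) extends to the LAST ]]] in the string: one oversized match.
preg_match_all('/\[\[\[(.*)\]\]\]/', $text, $greedy);

// Lazy (.*?) stops at the FIRST ]]] after each [[[: one match per token.
preg_match_all('/\[\[\[(.*?)\]\]\]/', $text, $lazy);

print_r($greedy[1]);
print_r($lazy[1]);
```

Note that the lazy form still ends at the first ]]] it meets, so content that itself ends in ] (such as [[[foo [x]]]]) would be cut one bracket short.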

【Solution 5】:

If your string always follows the format subject, size, position, you can use this:

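$string = "here is some 'test text [[[media-2 right]]], [[[image-0]]] the another [[[image-1 left large]]] and token [[[image-1]]";

// Group 1 is the subject; groups 2 and 3 (size, position) are optional.
// Assumes the parts are separated by single spaces and contain no ] or whitespace.
preg_match_all('/\[{3}([^\]\s]+)(?: ([^\]\s]+))?(?: ([^\]\s]+))?\]{3}/', $string, $matches);
print_r($matches);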

【Discussion】:

Sometimes size and position are not required/provided.