【Question title】: Can a regex search inside an already matched pattern?
【Posted】: 2016-06-19 13:35:58
【Question】:

Suppose we have the following WordPress shortcode content:

Some content here
[shortcode_1 attr1="val1" attr2="val2"]

    [shortcode_2 attr3="val3" attr4="val4"]

        Some text

    [/shortcode_2]

[/shortcode_1]
Some more content here

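My question is: suppose I match the shortcode pattern so that I get the output [shortcode_1]....[/shortcode_1]. Can I also get [shortcode_2]...[/shortcode_2] using the same regex pattern in the same run, or do I have to run it again on the output of the first run?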

【Comments】:

  • Depends on the regex engine, but I would look into groups and conditional/optional groups ...

Tags: php regex wordpress


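【Solution 1】:

Description

You can create a couple of capture groups: one for the entire match and a second for the subordinate match. This approach has its limitations, and it can get tripped up by some fairly complex edge cases.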

(\[shortcode_1\s[^\]]*].*?(\[shortcode_2\s.*?\[\/shortcode_2\]).*?\[\/shortcode_1\])
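To make the "edge cases" caveat concrete, here is a minimal Python sketch of one way the pattern above can over-match. The input string is hypothetical (it is not from the original answer), and the DOTALL flag is an assumption so that `.` can cross line breaks.

```python
import re

# Pattern copied from the answer. DOTALL is an assumption: without it,
# '.' would not match the newlines between the shortcodes.
PATTERN = re.compile(
    r'(\[shortcode_1\s[^\]]*].*?(\[shortcode_2\s.*?\[\/shortcode_2\]).*?\[\/shortcode_1\])',
    re.DOTALL,
)

# Hypothetical input: the first shortcode_1 has no child, the second one does.
text = (
    '[shortcode_1 a="1"]first[/shortcode_1]\n'
    '[shortcode_1 b="2"][shortcode_2 c="3"]inner[/shortcode_2][/shortcode_1]'
)

match = PATTERN.search(text)
outer = match.group(1)
inner = match.group(2)

# The lazy '.*?' keeps expanding past the first closing tag until it finds
# a shortcode_2, so the "outer" capture swallows both shortcode_1 blocks.
print(outer.count('[/shortcode_1]'))  # 2
print(inner)
```

Because the pattern requires a shortcode_2 somewhere after the opening shortcode_1, a childless shortcode_1 is merged with the next one instead of being skipped.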

Examples

Live demo

https://regex101.com/r/bQ0vV2/1

Sample text

[shortcode_1 attr1="val1" attr2="val2"]

    [shortcode_2 attr3="val3" attr4="val4"]

        Some text

    [/shortcode_2]

[/shortcode_1]

Sample matches

Capture group 1 gets shortcode_1; capture group 2 gets shortcode_2.

1.  [0-139] `[shortcode_1 attr1="val1" attr2="val2"]

    [shortcode_2 attr3="val3" attr4="val4"]

        Some text

    [/shortcode_2]

[/shortcode_1]`
2.  [45-123]    `[shortcode_2 attr3="val3" attr4="val4"]

        Some text

    [/shortcode_2]`
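
The offsets listed above can be reproduced with a short Python sketch. It assumes the sample text uses `\n` line endings, one blank line between the lines, and four-space indentation steps, and that the pattern runs with DOTALL (single-line) matching; none of these details are stated explicitly in the answer.

```python
import re

# Sample text from the answer, with the assumed whitespace layout.
SAMPLE = (
    '[shortcode_1 attr1="val1" attr2="val2"]\n\n'
    '    [shortcode_2 attr3="val3" attr4="val4"]\n\n'
    '        Some text\n\n'
    '    [/shortcode_2]\n\n'
    '[/shortcode_1]'
)

# Pattern copied verbatim from the answer.
PATTERN = re.compile(
    r'(\[shortcode_1\s[^\]]*].*?(\[shortcode_2\s.*?\[\/shortcode_2\]).*?\[\/shortcode_1\])',
    re.DOTALL,
)

m = PATTERN.search(SAMPLE)
for index in (1, 2):
    start, end = m.span(index)
    # Print the group number and its [start-end] character offsets.
    print(f'{index}. [{start}-{end}]')
```

Under those assumptions, group 1 spans 0 to 139 and group 2 spans 45 to 123, so a single run returns both the outer and the inner shortcode.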

Explanation

NODE                     EXPLANATION
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    \[                       '['
----------------------------------------------------------------------
    shortcode_1              'shortcode_1'
----------------------------------------------------------------------
    \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
    [^\]]*                   any character except: '\]' (0 or more
                             times (matching the most amount
                             possible))
----------------------------------------------------------------------
    ]                        ']'
----------------------------------------------------------------------
    .*?                      any character (0 or more times (matching
                             the least amount possible))
----------------------------------------------------------------------
    (                        group and capture to \2:
----------------------------------------------------------------------
      \[                       '['
----------------------------------------------------------------------
      shortcode_2              'shortcode_2'
----------------------------------------------------------------------
      \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
      .*?                      any character (0 or more times
                               (matching the least amount possible))
----------------------------------------------------------------------
      \[                       '['
----------------------------------------------------------------------
      \/                       '/'
----------------------------------------------------------------------
      shortcode_2              'shortcode_2'
----------------------------------------------------------------------
      \]                       ']'
----------------------------------------------------------------------
    )                        end of \2
----------------------------------------------------------------------
    .*?                      any character (0 or more times (matching
                             the least amount possible))
----------------------------------------------------------------------
    \[                       '['
----------------------------------------------------------------------
    \/                       '/'
----------------------------------------------------------------------
    shortcode_1              'shortcode_1'
----------------------------------------------------------------------
    \]                       ']'
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------

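【Discussion】:

  • Thanks for such a detailed description. However, it does not solve the question as asked. It assumes the shortcode names are shortcode_1 and shortcode_2, whereas I asked about any shortcode name, which a capture group could obviously pick up. It also considers only one level of child shortcode. I am trying to find every match for any number of child shortcodes.
  • Your original question was: suppose I match the shortcode pattern such that I get the output [shortcode_1]....[/shortcode_1]. But can I get the [shortcode_2]...[/shortcode_2] using the same regex pattern in the same run or do I have to run it again using the output from the first run. That sounds like you are asking how to get the outer shortcode_1 and the inner shortcode_2 in the same capture. Could you edit the question and the sample text to cover what you are looking for, along with the desired matches?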
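
For the generalisation raised in the comments (any shortcode name, any nesting depth), one possible approach is to use the regex only to find individual tags and to pair them with a stack. The sketch below is written in Python purely for illustration. The function name `find_shortcodes` and the tag pattern are hypothetical, and it assumes every shortcode has a closing tag: a self-closing shortcode would stay on the stack and hide its parents.

```python
import re

# Matches an opening tag such as [name attr="v"] or a closing tag such as [/name].
TAG = re.compile(r'\[(/?)(\w+)[^\]]*\]')


def find_shortcodes(text):
    """Return (name, depth, start, end) for every properly closed shortcode."""
    stack = []   # open tags still waiting for a closing tag: (name, start)
    found = []
    for tag in TAG.finditer(text):
        is_closing = tag.group(1) == '/'
        name = tag.group(2)
        if not is_closing:
            stack.append((name, tag.start()))
        elif stack and stack[-1][0] == name:
            _, start = stack.pop()
            # After popping, len(stack) is the nesting depth of this shortcode.
            found.append((name, len(stack), start, tag.end()))
        # Closing tags that do not match the top of the stack are ignored here.
    return sorted(found, key=lambda item: item[2])


sample = (
    '[shortcode_1 attr1="val1" attr2="val2"]\n\n'
    '    [shortcode_2 attr3="val3" attr4="val4"]\n\n'
    '        Some text\n\n'
    '    [/shortcode_2]\n\n'
    '[/shortcode_1]'
)

for name, depth, start, end in find_shortcodes(sample):
    print(name, depth, start, end)
```

On the sample text this reports shortcode_1 at depth 0 and shortcode_2 at depth 1 in one pass over the input, with no second run on the inner content. The stack is used because Python's built-in `re` module has no recursive patterns.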