Match at most n characters: answers

【Question title】: Match at most n characters
【Posted】: 2020-03-05 21:39:42
【Question】:

I am trying to create a regular expression that matches at most 7 groups in total.

((X:){1,6})((:Y){1,6})

X:X:X:X:X::Y:Y             This should match
X:X:X:X:X:X::Y:Y           This should not match.

https://regex101.com/r/zxfAB7/16

Is there a way to do this? I need capture groups $1 and $3.
I am using C++17 regex.

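【Question comments】:

Maybe regex is the wrong tool.
Did you get the regex to work (for example with regex101)? If you did, what is the problem with your C++ program? Please refresh how to ask good questions, as well as this question checklist. And please don't forget to show us a minimal reproducible example.

Tags: c++ regex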

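【Solution 1】:

If lookaheads are supported, you can use a negative lookahead to assert that X: or :Y is not repeated 8 times.

To prevent an empty match, you can use a positive lookahead to check that there is at least 1 match.

Then use 2 capturing groups: the first matches X: repeated 0 or more times, and the second matches :Y repeated 0 or more times.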

^(?=(?:X:|:Y))(?!(?:(?:X:|:Y)){8})((?:X:)*)((?::Y)*)$

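^ Start of string
(?= Positive lookahead, assert that what is on the right is
- (?:X:|:Y) Match X: or :Y
) Close positive lookahead
(?! Negative lookahead, assert that what is on the right is not 8 repetitions of X: or :Y
- (?:(?:X:|:Y)){8}
) Close negative lookahead
((?:X:)*) Capture group 1, match X: 0+ times
((?::Y)*) Capture group 2, match :Y 0+ times
$ End of string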
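
Since the question mentions C++17, here is a minimal sketch of how this pattern might be used with std::regex, whose default ECMAScript grammar supports both kinds of lookahead. The helper name split_groups and its return type are assumptions made for illustration and are not part of the original answer. With this pattern the X: part and the :Y part end up in groups 1 and 2.

```cpp
#include <optional>
#include <regex>
#include <string>
#include <utility>

// Hypothetical helper (not from the original answer): returns the X: part
// and the :Y part when the whole input consists of 1 to 7 groups in total,
// or std::nullopt otherwise.
std::optional<std::pair<std::string, std::string>>
split_groups(const std::string& input) {
    // Same pattern as above. std::regex uses the ECMAScript grammar by
    // default, which accepts (?=...) and (?!...) lookaheads.
    static const std::regex re{
        R"(^(?=(?:X:|:Y))(?!(?:(?:X:|:Y)){8})((?:X:)*)((?::Y)*)$)"};

    std::smatch m;
    if (!std::regex_match(input, m, re)) {
        return std::nullopt;
    }
    // m[1] holds the repeated X: part, m[2] the repeated :Y part.
    return std::make_pair(m[1].str(), m[2].str());
}
```

Called on X:X:X:X:X::Y:Y, this sketch should yield the pair X:X:X:X:X: and :Y:Y, while X:X:X:X:X:X::Y:Y (8 groups) should yield no value.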

Regex demo

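【Comments】:

"If lookaheads are supported, ..." - you know what is supported. The OP is using C++17 std::regex.
@JesperJuhl Maybe I missed it, but where do you see that?
Thanks. I really like your thorough explanation of the regex.

【Solution 2】:

As Ulrich said, regex alone might not be the solution. I would suggest the following: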

Replace all X (occurring 1 to 6 times) by an empty string
Replace all Y (occurring 1 to 6 times) by an empty string
Use regex for determining if any X is still present
Use regex for determining if any Y is still present

If X and Y each occur only 1 to 6 times, no X or Y will be found afterwards (return match); otherwise return no match.
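
The steps above could be sketched roughly as follows. This is one possible reading rather than the answerer's own code: it assumes that each "replace" removes only the first run of 1 to 6 groups (std::regex_constants::format_first_only), and the helper name each_side_within_limit is made up for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper following the four steps above. Assumption: each
// "replace" removes only the first run of 1 to 6 groups.
bool each_side_within_limit(const std::string& input) {
    static const std::regex x_run{R"((?:X:){1,6})"};
    static const std::regex y_run{R"((?::Y){1,6})"};
    static const std::regex leftover{"[XY]"};
    const auto first_only = std::regex_constants::format_first_only;

    // Steps 1 and 2: remove one run of up to six X: groups, then one run
    // of up to six :Y groups.
    std::string rest = std::regex_replace(input, x_run, "", first_only);
    rest = std::regex_replace(rest, y_run, "", first_only);

    // Steps 3 and 4: any X or Y still present means one side had more
    // than six groups.
    return !std::regex_search(rest, leftover);
}
```

Under this reading each side is checked on its own, so X:X:X:X:X:X::Y:Y (6 X: plus 2 :Y, 8 groups in total) is accepted. A limit on the combined count would need a separate check.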

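【Comments】:

Consider X:X:X:X:X:X:Y:Y:Y:Y:Y:Y: with 6 X and 6 Y. After removing them there are no X or Y left, but there are 12 groups, and the OP needs at most 7.

【Solution 3】:

Although there is already an accepted answer, I would like to show a very simple and straightforward solution, tested with C++17 and with complete runnable source code.

Since we are talking about at most 7 groups, we can simply list all combinations and "or" them. This may be a lot of text and result in a complex DFA, but it should work.

After finding a match, we define a vector, put all data/groups into it, and show the desired result. This is really simple:

Please see: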

#include <iostream>
#include <string>
#include <iterator>
#include <vector>
#include <regex>

std::vector<std::string> test{
    "X::Y",
    "X:X::Y",
    "X:X::Y:Y",
    "X:X:X::Y:Y",
    "X::Y:Y:Y:Y:Y",
    "X:X:X:X:X::Y:Y",
    "X:X:X:X:X:X::Y:Y"
};

const std::regex re1{ R"((((X:){1,1}(:Y){1,6})|((X:){1,2}(:Y){1,5})|((X:){1,3}(:Y){1,4})|((X:){1,4}(:Y){1,3})|((X:){1,5}(:Y){1,2})|((X:){1,6}(:Y){1,1})))" };
const std::regex re2{ R"(((X:)|(:Y)))" };

int main() {
    std::smatch sm;
    // Go through all test strings
    for (const std::string& s : test) {
        // Look for a match
        if (std::regex_match(s, sm, re1)) {
            // Show success message
            std::cout << "Match found for  -->  " << s << "\n";
            // Get all data (groups) into a vector
            std::vector<std::string> data{ std::sregex_token_iterator(s.begin(), s.end(),re2,1),  std::sregex_token_iterator() };
            // Show desired groups
            if (data.size() >= 6) {
                std::cout << "Group 1: '" << data[0] << "'   Group 6: '" << data[5] << "'\n";
            }
        }
        else {
            std::cout << "**** NO match found for  -->  " << s << "\n";
        }
    }
    return 0;
}
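
The hand-written alternation in re1 follows a regular scheme: up to x groups of X: leave room for at most 7 - x groups of :Y. As a hedged sketch, the same pattern could be generated for any limit. The names build_pattern and within_total are hypothetical, the groups are non-capturing here, and max_total is assumed to be at least 2.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: builds the alternation for at most `max_total`
// groups in total, with at least one X: and one :Y, mirroring re1 above.
// Assumes max_total >= 2.
std::string build_pattern(int max_total) {
    std::string pattern;
    for (int x = 1; x < max_total; ++x) {
        if (!pattern.empty()) {
            pattern += '|';
        }
        // Up to x X: groups leave room for at most max_total - x :Y groups.
        pattern += "(?:X:){1," + std::to_string(x) + "}(?::Y){1," +
                   std::to_string(max_total - x) + "}";
    }
    return pattern;
}

// Hypothetical wrapper: true when the whole input matches the generated
// pattern.
bool within_total(const std::string& input, int max_total) {
    const std::regex re{build_pattern(max_total)};
    return std::regex_match(input, re);
}
```

With max_total set to 7 this should accept and reject the same test strings as re1. In real code the std::regex object would likely be constructed once and reused rather than rebuilt on every call.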

【Comments】: