【Question title】: How to read RegEx Captures in C#
【Posted】: 2015-02-11 05:21:12
【Question】:

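I started working through a C# book and decided to throw RegEx into the mix to make the boring console exercises a little more interesting. What I want to do is ask the user for their phone number in the console, check it against a regular expression, then capture the digits so I can format them the way I want. I have all of that working except the RegEx capture part. How do I get the captured values into C# variables?

Also feel free to correct any code formatting or variable naming issues.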

static void askPhoneNumber()
{
    String pattern = @"[(]?(\d{3})[)]?[ -.]?(\d{3})[ -.]?(\d{4})";

    System.Console.WriteLine("What is your phone number?");
    String phoneNumber = Console.ReadLine();

    while (!Regex.IsMatch(phoneNumber, pattern))
    {
        Console.WriteLine("Bad Input");
        phoneNumber = Console.ReadLine();
    }

    Match match = Regex.Match(phoneNumber, pattern);
    Capture capture = match.Groups.Captures;

    System.Console.WriteLine(capture[1].Value + "-" + capture[2].Value + "-" + capture[3].Value);
}

【Question comments】:

    Tags: c# regex console


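    【Solution 1】:

    The C# regex API can be quite confusing. There are groups and captures:

    • A group represents a capturing group; it is used to extract a substring from the text
    • A group can have several captures if it appears inside a quantifier

    The hierarchy is:

    • Match
      • Group
        • Capture

    (one match can have several groups, and each group can have several captures)

    For example: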

    Subject: aabcabbc
    Pattern: ^(?:(a+b+)c)+$
    

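    In this example there is only one group: (a+b+). The group sits inside a quantifier and is matched twice, so it produces two captures: aab and abb.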

    aabcabbc
    ^^^ ^^^
    Cap1  Cap2
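
To make the group/capture distinction concrete, here is a minimal sketch that runs the same subject and pattern and lists the captures of group 1. The class and method names (CaptureDemo, CapturesOfGroup1) are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Hypothetical helper: returns every capture recorded for group 1
    // of the pattern ^(?:(a+b+)c)+$ against the given subject.
    public static string[] CapturesOfGroup1(string subject)
    {
        Match match = Regex.Match(subject, @"^(?:(a+b+)c)+$");
        if (!match.Success)
        {
            return new string[0];
        }

        // Group.Captures holds one entry per iteration of the quantifier.
        return match.Groups[1].Captures
                    .Cast<Capture>()
                    .Select(c => c.Value)
                    .ToArray();
    }

    public static void Main()
    {
        // Prints: aab, abb
        Console.WriteLine(string.Join(", ", CapturesOfGroup1("aabcabbc")));

        // Group.Value reports only the last capture of the group. Prints: abb
        Console.WriteLine(Regex.Match("aabcabbc", @"^(?:(a+b+)c)+$").Groups[1].Value);
    }
}
```

Note that Groups[1].Value returns only the last capture, which is why the Captures collection matters once a group is repeated.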
    

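    When a group is not inside a quantifier, it produces only one capture. In your case you have 3 groups, and each one captures once. You can use match.Groups[1].Value, match.Groups[2].Value and match.Groups[3].Value to extract the 3 substrings you are interested in, without touching the capture concept at all.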
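
Applied to the phone number from the question, that could look like the sketch below. The pattern is copied from the question; the PhoneFormatter class and the TryFormat method are hypothetical names chosen for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PhoneFormatter
{
    // Same pattern as in the question.
    private const string Pattern = @"[(]?(\d{3})[)]?[ -.]?(\d{3})[ -.]?(\d{4})";

    // Hypothetical helper: writes "ddd-ddd-dddd" to formatted and returns true
    // when the input matches; otherwise returns false.
    public static bool TryFormat(string input, out string formatted)
    {
        Match match = Regex.Match(input, Pattern);
        if (!match.Success)
        {
            formatted = string.Empty;
            return false;
        }

        // Groups[0] is the whole match; the capturing groups start at index 1.
        string areaCode = match.Groups[1].Value;
        string prefix = match.Groups[2].Value;
        string lineNumber = match.Groups[3].Value;
        formatted = areaCode + "-" + prefix + "-" + lineNumber;
        return true;
    }

    public static void Main()
    {
        string formatted;
        if (TryFormat("(555) 123-4567", out formatted))
        {
            Console.WriteLine(formatted); // 555-123-4567
        }
    }
}
```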

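    【Comments】:

    • Wouldn't it be match.Groups[0].Value, 1, 2 because of 0-based indexing?
    • @CausingUnderflowsEverywhere The group at index 0 represents the entire match. Capturing groups start at index 1.
    【Solution 2】:

    Match results can be hard to understand. I wrote this code to help me see what was found and where. The intent is that fragments of the output (from the lines marked with //**) can be copied into a program to use the values found in the match.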

    public static void DisplayMatchResults(Match match)
    {
        Console.WriteLine("Match has {0} captures", match.Captures.Count);
    
        int groupNo = 0;
        foreach (Group mm in match.Groups)
        {
            Console.WriteLine("  Group {0,2} has {1,2} captures '{2}'", groupNo, mm.Captures.Count, mm.Value);
    
            int captureNo = 0;
            foreach (Capture cc in mm.Captures)
            {
                Console.WriteLine("       Capture {0,2} '{1}'", captureNo, cc);
                captureNo++;
            }
            groupNo++;
        }
    
        groupNo = 0;
        foreach (Group mm in match.Groups)
        {
            Console.WriteLine("    match.Groups[{0}].Value == \"{1}\"", groupNo, match.Groups[groupNo].Value); //**
            groupNo++;
        }
    
        groupNo = 0;
        foreach (Group mm in match.Groups)
        {
            int captureNo = 0;
            foreach (Capture cc in mm.Captures)
            {
                Console.WriteLine("    match.Groups[{0}].Captures[{1}].Value == \"{2}\"", groupNo, captureNo, match.Groups[groupNo].Captures[captureNo].Value); //**
                captureNo++;
            }
            groupNo++;
        }
    }
    

    A simple example of using this method, given this input:

    // The pattern ends with "/" so that it matches inside the text, as the output below shows.
    Regex regex = new Regex("/([A-Za-z]+)/(\\d+)/");
    String text = "some/directory/Pictures/Houses/12/apple/banana/"
                + "cherry/345/damson/elderberry/fig/678/gooseberry";
    Match match = regex.Match(text);
    DisplayMatchResults(match);
    

    The output is:

    Match has 1 captures
      Group  0 has  1 captures '/Houses/12/'
           Capture  0 '/Houses/12/'
      Group  1 has  1 captures 'Houses'
           Capture  0 'Houses'
      Group  2 has  1 captures '12'
           Capture  0 '12'
        match.Groups[0].Value == "/Houses/12/"
        match.Groups[1].Value == "Houses"
        match.Groups[2].Value == "12"
        match.Groups[0].Captures[0].Value == "/Houses/12/"
        match.Groups[1].Captures[0].Value == "Houses"
        match.Groups[2].Captures[0].Value == "12"
    

    Suppose we want to find every match of the regex above in the text above. We can then use a MatchCollection in code such as:

    MatchCollection matches = regex.Matches(text);
    for (int ii = 0; ii < matches.Count; ii++)
    {
        Console.WriteLine("Match[{0}]  // of 0..{1}:", ii, matches.Count-1);
        RegexMatchDisplay.DisplayMatchResults(matches[ii]);
    }
    

    The output from this is:

    Match[0]  // of 0..2:
    Match has 1 captures
      Group  0 has  1 captures '/Houses/12/'
           Capture  0 '/Houses/12/'
      Group  1 has  1 captures 'Houses'
           Capture  0 'Houses'
      Group  2 has  1 captures '12'
           Capture  0 '12'
        match.Groups[0].Value == "/Houses/12/"
        match.Groups[1].Value == "Houses"
        match.Groups[2].Value == "12"
        match.Groups[0].Captures[0].Value == "/Houses/12/"
        match.Groups[1].Captures[0].Value == "Houses"
        match.Groups[2].Captures[0].Value == "12"
    Match[1]  // of 0..2:
    Match has 1 captures
      Group  0 has  1 captures '/cherry/345/'
           Capture  0 '/cherry/345/'
      Group  1 has  1 captures 'cherry'
           Capture  0 'cherry'
      Group  2 has  1 captures '345'
           Capture  0 '345'
        match.Groups[0].Value == "/cherry/345/"
        match.Groups[1].Value == "cherry"
        match.Groups[2].Value == "345"
        match.Groups[0].Captures[0].Value == "/cherry/345/"
        match.Groups[1].Captures[0].Value == "cherry"
        match.Groups[2].Captures[0].Value == "345"
    Match[2]  // of 0..2:
    Match has 1 captures
      Group  0 has  1 captures '/fig/678/'
           Capture  0 '/fig/678/'
      Group  1 has  1 captures 'fig'
           Capture  0 'fig'
      Group  2 has  1 captures '678'
           Capture  0 '678'
        match.Groups[0].Value == "/fig/678/"
        match.Groups[1].Value == "fig"
        match.Groups[2].Value == "678"
        match.Groups[0].Captures[0].Value == "/fig/678/"
        match.Groups[1].Captures[0].Value == "fig"
        match.Groups[2].Captures[0].Value == "678"
    

    Hence:

        matches[1].Groups[0].Value == "/cherry/345/"
        matches[1].Groups[1].Value == "cherry"
        matches[1].Groups[2].Value == "345"
        matches[1].Groups[0].Captures[0].Value == "/cherry/345/"
        matches[1].Groups[1].Captures[0].Value == "cherry"
        matches[1].Groups[2].Captures[0].Value == "345"
    

    The same applies to matches[0] and matches[2].
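
As a rough sketch of how those indexed values might be used in a program, the code below walks the MatchCollection and collects group 1 and group 2 of every match. It assumes the pattern ends in a trailing slash, as the match values in the output above imply; MatchReader and ReadPairs are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MatchReader
{
    // Hypothetical helper: collects (name, number) pairs from every match.
    // Assumption: the pattern ends with "/" rather than "$".
    public static List<KeyValuePair<string, int>> ReadPairs(string text)
    {
        Regex regex = new Regex("/([A-Za-z]+)/(\\d+)/");
        var pairs = new List<KeyValuePair<string, int>>();

        foreach (Match match in regex.Matches(text))
        {
            string name = match.Groups[1].Value;            // e.g. "cherry"
            int number = int.Parse(match.Groups[2].Value);  // e.g. 345
            pairs.Add(new KeyValuePair<string, int>(name, number));
        }
        return pairs;
    }

    public static void Main()
    {
        string text = "some/directory/Pictures/Houses/12/apple/banana/"
                    + "cherry/345/damson/elderberry/fig/678/gooseberry";
        foreach (var pair in ReadPairs(text))
        {
            Console.WriteLine("{0} -> {1}", pair.Key, pair.Value);
        }
    }
}
```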

    【Comments】:

      【Solution 3】:
      string pattern = @"[(]?(\d{3})[)]?[ -.]?(\d{3})[ -.]?(\d{4})";
      
      System.Console.WriteLine("What is your phone number?");
      string phoneNumber = Console.ReadLine();
      
      while (!Regex.IsMatch(phoneNumber, pattern))
      {
          Console.WriteLine("Bad Input");
          phoneNumber = Console.ReadLine();
      }
      
      var match = Regex.Match(phoneNumber, pattern);
      if (match.Groups.Count == 4)
      {
          System.Console.WriteLine("Number matched : "+match.Groups[0].Value);
          System.Console.WriteLine(match.Groups[1].Value + "-" + match.Groups[2].Value + "-" + match.Groups[3].Value);
      }
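
One possible variation, offered only as a sketch: because Regex.Match already reports whether the input matched through Match.Success, the separate Regex.IsMatch call can be dropped and the pattern evaluated once per input line. The PhonePrompt class, the AskPhoneNumber method and its reader/writer parameters are assumptions made for this example so that it can run without a live console.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

public static class PhonePrompt
{
    // Same pattern as in the answer above.
    private static readonly Regex PhoneRegex =
        new Regex(@"[(]?(\d{3})[)]?[ -.]?(\d{3})[ -.]?(\d{4})");

    // Hypothetical helper: keeps reading lines until one matches, then
    // returns the number as "ddd-ddd-dddd". Returns null if input ends first.
    public static string AskPhoneNumber(TextReader input, TextWriter output)
    {
        output.WriteLine("What is your phone number?");

        string line;
        while ((line = input.ReadLine()) != null)
        {
            Match match = PhoneRegex.Match(line);
            if (match.Success)
            {
                return match.Groups[1].Value + "-"
                     + match.Groups[2].Value + "-"
                     + match.Groups[3].Value;
            }
            output.WriteLine("Bad Input");
        }
        return null;
    }

    public static void Main()
    {
        Console.WriteLine(AskPhoneNumber(Console.In, Console.Out));
    }
}
```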
      

      【Comments】:
