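[Posted]: 2019-09-27 15:49:51
[Question]:
I am trying to write a regular expression that can be used to find a date in a string, where the date may be preceded (or followed) by spaces, digits, text, end of line, etc. The expression should handle US date formats that are either
1) month name day, year - i.e. January 10, 2019 or
2) mm/dd/yy - i.e. 11/30/19
I found this for month name day, year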
(Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+\d{1,2},\s+\d{4}
(thanks to Veverke, "Regex to match date like month name day comma and year")
and this for mm/dd/yy (and the various combinations of m/d/y)
(1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2}
(thanks to Steven Levithan and Jan Goyvaerts, https://www.oreilly.com/library/view/regular-expressions-cookbook/9781449327453/ch04s04.html)
I tried combining them like this
((Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+\d{1,2},\s+\d{4})|((1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2})
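When I search for "on [regex above]" in the input string "Paid on 1/1/2019", it does find the date, but not the word "on". The string is found if I use only
(1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2}
Can anyone see what I am doing wrong?
Edit
I am using the C# .NET code below: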
string stringToSearch = "Paid on 1/1/2019";
string searchPattern = @"on ((Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+\d{1,2},\s+\d{4})|((1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2})";
var match = Regex.Match(stringToSearch, searchPattern, RegexOptions.IgnoreCase);
string foundString;
if (match.Success)
    foundString = stringToSearch.Substring(match.Index, match.Length);
For example
string searchPattern = @"on ((Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+\d{1,2},\s+\d{4})|((1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2})";
stringToSearch = "Paid on Jan 1, 2019";
found = "on Jan 1, 2019" -- worked as expected, found the word "on" and the date
stringToSearch = "Paid on 1/1/2019";
found = "1/1/2019" -- did not work as expected, found the date but did not include the word "on"
If I reverse the pattern
string searchPattern = @"on ((1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2})|((Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+\d{1,2},\s+\d{4})";
stringToSearch = "Paid on Jan 1, 2019";
found = "Jan 1, 2019" -- did not work as expected, found the date but did not include the word "on"
stringToSearch = "Paid on 1/1/2019";
found = "on 1/1/2019" -- worked as expected, found the word "on" and the date
Thanks
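The results above are consistent with how alternation binds: `|` has the lowest precedence in a regex, so `on (A)|(B)` reads as "either `on A`, or `B` on its own", and the "on " prefix only ever applies to the first alternative. A minimal sketch of one possible fix follows, assuming the goal is for "on " to precede either date form. It wraps both alternatives in a single non-capturing group. The class name DateSearchSketch and the helpers BuildPattern and Find are hypothetical names used only for illustration; the two date sub-patterns are copied unchanged from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DateSearchSketch
{
    // The two date sub-patterns from the question, unchanged.
    private const string MonthNameDate =
        @"(Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+\d{1,2},\s+\d{4}";
    private const string NumericDate =
        @"(1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2}";

    // Hypothetical helper: puts both alternatives inside one non-capturing
    // group, so the "on " prefix is required before either date form.
    public static string BuildPattern()
    {
        return "on (?:" + MonthNameDate + "|" + NumericDate + ")";
    }

    // Hypothetical helper: returns the matched text, or an empty string
    // when nothing matches.
    public static string Find(string input)
    {
        Match match = Regex.Match(input, BuildPattern(), RegexOptions.IgnoreCase);
        return match.Success ? match.Value : string.Empty;
    }

    public static void Main()
    {
        Console.WriteLine(Find("Paid on Jan 1, 2019")); // prints "on Jan 1, 2019"
        Console.WriteLine(Find("Paid on 1/1/2019"));    // prints "on 1/1/2019"
    }
}
```

match.Value holds the same text as stringToSearch.Substring(match.Index, match.Length), so either form works for reading the result. If the date is needed without the "on " prefix, a capturing group around the alternation would be one option.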
[Comments]:
-
The regex is fine. If you are using Java, link me your code.
-
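Judging by the regex, you must double all backslashes: \ > \\ (in a string literal, \\ is used to denote one backslash). What is the programming language?
-
Sorry @sln, my question was imprecise. The regex does find the date, but the word "on" is not part of the result. I am using C# .NET. I will edit my question to clarify. Thanks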
-
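Thanks @Emma for the suggestion to include input and output. This is my first post (I have read many others), and I appreciate the suggestions for improving and clarifying my question.
Tags: c# regex regex-lookarounds regex-group regex-greedy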