【Posted】:2018-02-22 15:07:24
【Question】:
I have a text with this structure:
Text Starts
23/01/2018
Something here. It was a crazy day.
Believe me.
02/02/2018
Another thing happens.
Some Delimiter
20/02/2017
Text here
21/02/2017
Another text.
Here.
End Section
...text continues...
And a regular expression in Python that should match (date, text) groups up to Some Delimiter:
import re
result = re.findall(r"(\d{2}\/\d{2}\/\d{4}\n)(.*?)(?=\n\d{2}\/\d{2}\/\d{4}|\nSome Delimiter)", text, re.DOTALL)
Result:
>>> print(result)
[('23/01/2018\n', 'Something here. It was a crazy day. \nBelieve me.'),
('02/02/2018\n', 'Another thing happens.'),
('20/02/2017\n', 'Text here')]
It also picks up the next group after the delimiter.
How can I get only the groups that come before the delimiter?
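To make the goal concrete, here is a minimal sketch of one possible approach, not necessarily the intended one: cut the text at the first delimiter line, then run a similar pattern on what remains. The variable names `head` and `pattern` are hypothetical, and the sketch assumes the delimiter appears on its own line and that `text` holds the sample shown above.

```python
import re

# Sample input, copied from the structure shown in the question.
text = """Text Starts
23/01/2018
Something here. It was a crazy day.
Believe me.
02/02/2018
Another thing happens.
Some Delimiter
20/02/2017
Text here
21/02/2017
Another text.
Here.
End Section
...text continues...
"""

# Assumption: everything from the first "Some Delimiter" line onward is unwanted,
# so keep only the part of the text that precedes it.
head = text.split("\nSome Delimiter", 1)[0]

# Same idea as the original pattern, but a group now ends at the next date
# line or at the end of the truncated string (\Z) instead of at the delimiter.
pattern = re.compile(
    r"(\d{2}/\d{2}/\d{4}\n)(.*?)(?=\n\d{2}/\d{2}/\d{4}|\Z)",
    re.DOTALL,
)

result = pattern.findall(head)
print(result)
```

With the sample above, this yields only the 23/01/2018 and 02/02/2018 groups, because the dates after the delimiter are no longer part of the searched string.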
【Comments】:
-
Your regex returns multiple matches. Do you want only one match? The one before the delimiter?
-
@WiktorStribiżew It doesn't work.
Tags: python regex python-3.x regex-lookarounds