Splitting a string when a particular word is encountered: answers

【Question title】: Splitting a string when a particular word is encountered
【Posted】: 2014-02-14 08:03:38
【Question】:

I'm fairly new to Python and to programming as a whole; I'm just about learning my ABCs. Let's say I have a string like this.

s = "DEALER:'S up, Bubbless? BUBBLES: Hey. DEALER: Well, there you go. JUNKIE: Well, what you got?DEALER: I got some starters.";

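I want the string to end whenever it reaches a word that is in upper case and ends with a colon (:), and then a new string should be created to hold the next piece. For the string above, I would get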

s1 = "DEALER:'S up, Bubbless?"
s2 = "BUBBLES: Hey."
s3 = "DEALER: Well, there you go."

Here is my regex code

import re

mystring = """
DEALER: 'S up, Bubbless?
BUBBLES: Hey.
DEALER: Well, there you go.
JUNKIE: Well, what you got?
DEALER: I got some starters. """

#[A-Z]+:.*?(?=[A-Z]+:|$)

#p = re.compile('([A-Z]*):')
p = re.compile('[A-Z]+:.*?(?=[A-Z]+:|$)')
s = set(p.findall(mystring))

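How would I loop through this to get each string? It only gets the first string (i.e. DEALER: 'S up, Bubbless?) and stops. Sorry if I sound a bit clueless. I'm new to programming and learning as I go.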

【Comments】:

Tags: python regex string string-matching

【Solution 1】:

Since it is a multiline string, you need to use the re.DOTALL option, like this

p = re.compile('[A-Z]+:.*?(?=[A-Z]+:|$)', re.DOTALL)

Output

set(["DEALER: 'S up, Bubbless?\n",
     'JUNKIE: Well, what you got?\n',
     'DEALER: Well, there you go.\n',
     'DEALER: I got some starters. ',
     'BUBBLES: Hey.\n'])

Quoting the re.DOTALL docs,

Make the '.' special character match any character at all, including a newline; without this flag, '.' will match anything except a newline.
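To see the effect of the flag in isolation, here is a small sketch. The two-line `sample` string is a shortened, made-up version of the dialogue, and the variable names are only illustrative; the pattern is the one from the question.

```python
import re

# Hypothetical two-line sample, shortened from the question's dialogue.
sample = "DEALER: Hi.\nBUBBLES: Hey."

# Same pattern as in the question, written as a raw string.
pattern = r'[A-Z]+:.*?(?=[A-Z]+:|$)'

# Without re.DOTALL, '.' stops at the newline, so a match cannot
# run across a line break to reach the next speaker label.
without_flag = re.findall(pattern, sample)

# With re.DOTALL, '.' also matches '\n', so every speaker's piece is found.
with_flag = re.findall(pattern, sample, re.DOTALL)

print(without_flag)
print(with_flag)
```

Comparing the two printed lists shows how many pieces are lost when the dot is not allowed to cross a newline.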

So, without this option, .*? will not match \n. That is why the other strings were not matched.
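As for looping over the pieces: a set keeps neither the original order nor repeated entries, so a plain list is probably the more convenient container here. A minimal sketch, assuming the same `mystring` as in the question; the names `pattern` and `lines` are only illustrative, and the `strip()` call is an addition to drop the trailing newlines and spaces.

```python
import re

mystring = """
DEALER: 'S up, Bubbless?
BUBBLES: Hey.
DEALER: Well, there you go.
JUNKIE: Well, what you got?
DEALER: I got some starters. """

# re.DOTALL lets '.' cross newlines, so each match can run up to the next label.
pattern = re.compile(r'[A-Z]+:.*?(?=[A-Z]+:|$)', re.DOTALL)

# findall returns the matches in order of appearance; strip() removes
# the trailing newline or space left at the end of each piece.
lines = [piece.strip() for piece in pattern.findall(mystring)]

for line in lines:
    print(line)
```

Each element of `lines` is one speaker's piece, so `lines[0]`, `lines[1]` and so on play the role of the `s1`, `s2`, `s3` variables from the question.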

【Comments】: