Can I do a findall regular expression like this?

【Question title】: Can I do a findall regular expression like this?
【Posted】: 2014-10-22 03:38:28
【Question description】:

I need to grab the number from lines like this

<div class="gridbarvalue color_blue">79</div>

and

<div class="gridbarvalue color_red">79</div>

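Is there a way, with something like findAll('div', text=re.recompile('<>)), to find the tags whose class is gridbarvalue color_<red or blue>?

I am using BeautifulSoup.

Also, sorry if I haven't made my question clear; I'm very inexperienced with this.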

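【Question discussion】:

Do you mean you want to extract the number 79?
Yes. As I said, there are multiple lines like that, and I want to grab the numbers.
Check out this part of the documentation
You want it to get the number 79, right?
You should be able to do this with BeautifulSoup alone, without a regular expression. See the thousands of questions already posted for why parsing HTML with regular expressions is a bad idea.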

Tags: python regex beautifulsoup findall

【Solution 1】:

class is a reserved keyword in Python, so BeautifulSoup requires you to append an underscore (class_) when passing it as a keyword argument:

>>> soup.find_all('div', class_=re.compile(r'color_(?:red|blue)'))
[<div class="gridbarvalue color_blue">79</div>, <div class="gridbarvalue color_red">79</div>]

To match on the text as well, use

>>> soup.find_all('div', class_=re.compile(r'color_(?:red|blue)'), text='79')
[<div class="gridbarvalue color_blue">79</div>, <div class="gridbarvalue color_red">79</div>]
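
For readers who want to try the two calls above end to end, here is a minimal, self-contained sketch. The sample markup (including the color_green line and the value 42) is invented for illustration, and it assumes a bs4 release that accepts string= as the name of the text filter; older releases spell it text=, as in the answer above.

```python
import re
from bs4 import BeautifulSoup

# Sample markup assumed for illustration, modeled on the lines in the question.
html = """
<div class="gridbarvalue color_blue">79</div>
<div class="gridbarvalue color_red">42</div>
<div class="gridbarvalue color_green">79</div>
"""

soup = BeautifulSoup(html, "html.parser")

# A compiled pattern passed as class_ is tested against the tag's CSS classes.
color = re.compile(r"color_(?:red|blue)")

# Filter on class only: the color_green div is excluded.
by_class = soup.find_all("div", class_=color)

# Filter on class and on the tag's text: only the blue div holds "79".
by_class_and_text = soup.find_all("div", class_=color, string="79")

print([d.get_text() for d in by_class])
print([d.get_text() for d in by_class_and_text])
```

The non-capturing group (?:red|blue) is used only because the captured color is not needed; a plain (red|blue) would select the same tags.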

【Discussion】:

【Solution 2】:

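import re
# Select every element whose class contains color_blue or color_red
elems = soup.findAll(attrs={'class' : re.compile(r"color_(blue|red)")})
for e in elems:
    # Search the serialized tag for the digits between ">" and "<"
    m = re.search(r">(\d+)<", str(e))
    if m:
        print("The number is %s" % m.group(1))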

【Discussion】:

I think you should use e.strip() to extract the number rather than a regular expression.
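
The suggestion in this comment can be sketched as follows. A BeautifulSoup tag has no strip() method of its own, so the comment is read here as "strip the element's text". The helper name extract_numbers and the sample input in the usage note are hypothetical, introduced only for this example.

```python
import re
from bs4 import BeautifulSoup

def extract_numbers(html):
    """Return the integers held by color_blue / color_red elements.

    Hypothetical helper: it reads each tag's text directly instead of
    running a second regular expression over the serialized tag.
    """
    soup = BeautifulSoup(html, "html.parser")
    elems = soup.find_all(attrs={"class": re.compile(r"color_(?:blue|red)")})
    numbers = []
    for e in elems:
        # get_text() returns the tag's text; strip() trims surrounding whitespace.
        text = e.get_text().strip()
        # Skip anything that is not a plain run of digits.
        if text.isdigit():
            numbers.append(int(text))
    return numbers
```

Calling extract_numbers('<div class="gridbarvalue color_blue"> 79 </div>') would return [79]. Reading the text through the parser avoids depending on how the tag happens to be serialized, which is what the ">(\d+)<" pattern relies on.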