Python - Looping through HTML Tags and using IF

【Question title】: Python - Looping through HTML Tags and using IF
【Posted】: 2015-10-22 02:55:37
【Question】:

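I am using Python to extract data from a web page. The page has a recurring HTML div tag with class="result" that holds other data (location, organisation, and so on). I can loop through the HTML successfully with Beautiful Soup, but when I add a condition, such as whether a certain word (for example 'NHS') is present in the fragment, nothing is returned, even though I know some fragments contain it. Here is the code: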

soup = BeautifulSoup(content)
details = soup.findAll('div', {'class': 'result'})

for detail in details:
    if 'NHS' in detail:
        print detail

I hope my question makes sense...

【Comments】:

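Where does this NHS appear? Is it in the text part? Show a sample of the HTML you are talking about.
detail is an instance of a BS Tag object. To check whether something is present in its text, try if 'NHS' in detail.text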
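
To make the comment above concrete, here is a minimal sketch of the difference between testing membership on a Tag and on its text. It assumes Python 3 with Beautiful Soup 4 (the `bs4` package, `find_all`, `get_text`) rather than the older API used in the question, and the sample HTML is made up for illustration. As far as I know, `in` applied to a Tag compares the value against the tag's direct children, so a word buried inside a longer string or a nested tag is not found.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page from the question is not shown.
html = """
<div class="result"><span>Org: NHS Trust</span> <span>Leeds</span></div>
<div class="result"><span>Org: Acme Ltd</span> <span>York</span></div>
"""

soup = BeautifulSoup(html, "html.parser")
details = soup.find_all("div", {"class": "result"})

# `in` on a Tag looks for a direct child equal to "NHS"; none exists here.
direct_hits = [d for d in details if "NHS" in d]

# `in` on the extracted text is an ordinary substring test.
text_hits = [d for d in details if "NHS" in d.get_text()]

print(len(direct_hits), len(text_hits))
for d in text_hits:
    print(d.get_text(" ", strip=True))
```

With this sample, only the first div should pass the text-based test, while the Tag-based test matches nothing.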

Tags: python html string web beautifulsoup

【Solution 1】:

findAll returns a list of tags, not strings. Maybe convert them to strings?

s = "<p>golly</p><p>NHS</p><p>foo</p>"
soup = BeautifulSoup(s)
details = soup.findAll('p')
type(details[0])    # prints: <class 'BeautifulSoup.Tag'>

You are looking for a string in a tag. It is better to look for a string in a string...

for detail in details:
    if 'NHS' in str(detail):
        print detail
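
One caveat with `str(detail)` is that it serialises the whole tag, markup included, so the search also matches attribute values and tag names. The sketch below contrasts it with searching only the visible text. It assumes Python 3 with Beautiful Soup 4 (`bs4`), and the sample HTML is invented for illustration rather than taken from the asker's page.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment: "NHS" occurs only inside an href, not in the text.
html = '<div class="result"><a href="/NHS/about">Acme Ltd</a></div>'

soup = BeautifulSoup(html, "html.parser")
detail = soup.find("div", {"class": "result"})

# Searching the serialised markup also matches the attribute value.
in_markup = "NHS" in str(detail)

# Searching the extracted text considers only what a reader would see.
in_text = "NHS" in detail.get_text()

print(in_markup, in_text)
```

Which check is appropriate depends on whether matches in links and attributes should count. For filtering on displayed content, the text-based check is probably the safer choice.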

【Comments】: