Python beautifulsoup print by line (answers)

【Question title】: Python beautifulsoup print by line
【Posted】: 2018-01-19 13:21:00
【Question description】:

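OK, so I am currently using Python BeautifulSoup to output specific lines from an HTML file. Because the HTML contains several divs with the same class, it outputs every div that has that class, for example: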

Content:

<div class=border>aaaa</div>
<div class=border>example</div>
<div class=border>runrunrun</div>

Output:

<div class=border>aaaa</div>
<div class=border>example</div>
<div class=border>runrunrun</div>

Now I only want the #2 div with class border:

<div class=border>example</div>

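If I view the source in Chrome, it shows the content with line numbers, so line 1 would contain

<div class=border>aaaa</div>

and line 2 would contain

<div class=border>example</div>

Is it possible to use Beautiful Soup to output by line number?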

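【Question comments】:

Use soup.find_all('div', {'class':'border'}) and pick the item you need.
That would have to be done by hand. I want it done automatically, and there are another 100 identical elements with the same name, so that is 100 identical requests. I would have to do this 100 times lol.
That's not what I meant. For example, if you need the second "div", use: soup.find_all('div', {'class':'border'})[1]
Tried to implement this in my script but ran into a problem: stackoverflow.com/questions/45629540/…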

Tags: python beautifulsoup

【Solution 1】:

find_all returns a list, so you can index it with [1] to get the second element.

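from bs4 import BeautifulSoup

# Each div is closed with </div>; with a stray </a> html.parser would nest the divs
html_doc = """<div class=border>aaaa</div>
<div class=border>example</div>
<div class=border>runrunrun</div>"""

soup = BeautifulSoup(html_doc, 'html.parser')

# find_all returns a list-like ResultSet, so index 1 is the second match
print(soup.find_all(class_="border")[1])

Output:

<div class="border">example</div>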
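The question asks about selecting by source line rather than by position in the result list. A minimal sketch of that idea follows, assuming Beautiful Soup 4.8.1 or later with the html.parser builder, where each tag records the line its opening tag starts on in `sourceline`. The helper name `divs_on_line` and its parameters are hypothetical, chosen only for illustration.

```python
from bs4 import BeautifulSoup

html_doc = """<div class=border>aaaa</div>
<div class=border>example</div>
<div class=border>runrunrun</div>"""

soup = BeautifulSoup(html_doc, "html.parser")


def divs_on_line(soup, line_number, css_class="border"):
    """Return matching divs whose opening tag starts on the given 1-based line.

    Hypothetical helper: relies on Tag.sourceline, which is assumed to be
    populated (bs4 >= 4.8.1 with the html.parser builder).
    """
    return [
        tag
        for tag in soup.find_all("div", class_=css_class)
        if tag.sourceline == line_number
    ]


# Look up whatever matching div starts on line 2 of the source
second_line_divs = divs_on_line(soup, 2)
print(second_line_divs)
```

Line numbers here refer to the string handed to BeautifulSoup, so they only match what a browser's view-source shows if the same unmodified HTML is parsed. Other parsers may not fill in `sourceline`, in which case indexing the find_all result is the safer route.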

【Comments】:

I have tried to implement this in my script, but ran into a problem: stackoverflow.com/questions/45629540/…

【Solution 2】:

If the list produced by soup.find_all contains 200 elements and is called div_list, you can loop over just the indexes you need (1, 4, 7, and so on):

count = 1
while True:
    try:
        print(div_list[count])
        count += 3
    except IndexError:
        # count has run past the end of the list
        break

Or even shorter:

count = 1
# Use < rather than <=: index len(div_list) is already out of range
while count < len(div_list):
    print(div_list[count])
    count += 3
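The same 1, 4, 7, ... selection can also be written with an extended slice, which avoids manual index bookkeeping. This is a sketch only: `div_list` below is a stand-in list of placeholder strings, not real find_all output, though the list returned by find_all supports the same slicing.

```python
# Stand-in for the list returned by soup.find_all(...); the strings are placeholders
div_list = [f"<div>item {i}</div>" for i in range(10)]

# Start at index 1 and take every third element: indexes 1, 4, 7, ...
every_third = div_list[1::3]

for div in every_third:
    print(div)
```

Slicing never raises IndexError, so it also handles an empty or very short list without extra checks.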

【Comments】: