Extract specific text in tr with beautifulsoup (answers)

【Question title】: Extract specific text in tr beautifulsoup
【Posted】: 2021-02-18 11:33:35
【Question description】:

I'm stuck trying to get information out of some HTML with beautifulsoup. I extracted the HTML snippet below with the following steps:

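import requests
from bs4 import BeautifulSoup

# url and headers are defined elsewhere (not shown in the question)
result = requests.get(url, headers=headers)
soup = BeautifulSoup(result.text, 'lxml')
tably = soup.find("table", id="table4")
last_row = tably.find_all('tr')[-1]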

Now I would like to get the following output:

Classification: Mass murderer
Characteristics: Militant Al-Takfir wa al-Hijran (Renunciation and Exile) faction
Number of victims: 23

Sample HTML:

    <tr>
      <td style="font-size: 8pt; color: #000000" width="100%">
        <font color="#000000" face="Verdana">
        Classification: <b>Mass murderer</b></font></td>
    </tr>
    <tr>
      <td width="100%" style="font-size: 8pt; color: #000000">
        <font style="font-size: 8pt" color="#000000" face="Verdana">
        Characteristics:&nbsp;<b>Militant Al-Takfir wa
        al-Hijran </b>(Renunciation and Exile)<b> faction</b></font></td>
    </tr>
    <tr>
      <td width="100%" style="font-size: 8pt; color: #000000">
        <font style="font-size: 8pt" color="#000000" face="Verdana">
        Number of victims:&nbsp;<b>23</b></font></td>
    </tr>
    </font>

【Question comments】:

Tags: web-scraping beautifulsoup tags

【Solution 1】:

You might want to try this:

import requests
from bs4 import BeautifulSoup
from tabulate import tabulate


headers = {
    "user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.152 Safari/537.36"
}

page = requests.get("https://murderpedia.org/male.A/a/abbas.htm", headers=headers).text
table = BeautifulSoup(page, "html5lib").find("table", {"id": "table4"})

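# For each non-empty cell: collapse whitespace, split the text into
# [label, value] at the colon, and keep only the first nine rows.
output = [
    " ".join(i.getText(strip=True).split()).split(":")
    for i in table.find_all("td")
    if i.getText(strip=True)
][:9]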

print(tabulate(output))

Output:

-----------------  --------------------------------------------------------------
Classification     Mass murderer
Characteristics    Militant Al-Takfir wa al-Hijran(Renunciation and Exile)faction
Number of victims  23
Date of murders    December 8,2000
Date of birth      1967
Victims profile    Maleworshippers
Method of murder   Shooting(Kalashnikov assault rifle)
Location           Omdurman, Sudan
Status             Shot to death by police
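In the output above, words from neighbouring tags run together ("al-Hijran(Renunciation and Exile)faction") because getText(strip=True) joins the text nodes without a separator. The following is a minimal sketch of one way to get closer to the "Label: value" lines requested in the question. It assumes the cells look like the sample HTML (the markup below is adapted from it), and it uses the built-in "html.parser" so it runs without lxml or html5lib. The names SAMPLE_HTML and extract_fields are hypothetical, introduced only for this illustration.

```python
from bs4 import BeautifulSoup

# Markup adapted from the sample in the question (an assumption about the
# real page; the live table may contain more rows or nested cells).
SAMPLE_HTML = """
<table id="table4">
<tr><td width="100%"><font color="#000000" face="Verdana">
    Classification: <b>Mass murderer</b></font></td></tr>
<tr><td width="100%"><font color="#000000" face="Verdana">
    Characteristics:&nbsp;<b>Militant Al-Takfir wa
    al-Hijran </b>(Renunciation and Exile)<b> faction</b></font></td></tr>
<tr><td width="100%"><font color="#000000" face="Verdana">
    Number of victims:&nbsp;<b>23</b></font></td></tr>
</table>
"""


def extract_fields(html, table_id="table4"):
    """Return (label, value) pairs from the cells of the given table."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id=table_id)
    if table is None:
        # find() returns None when no matching table exists
        return []
    fields = []
    for cell in table.find_all("td"):
        # The " " separator keeps text from adjacent tags apart; split()
        # then collapses whitespace runs, including non-breaking spaces.
        text = " ".join(cell.get_text(" ", strip=True).split())
        if ":" not in text:
            continue
        # maxsplit=1 so a colon inside the value does not split it again
        label, value = text.split(":", 1)
        fields.append((label.strip(), value.strip()))
    return fields


for label, value in extract_fields(SAMPLE_HTML):
    print(f"{label}: {value}")
```

On the real page the table may hold more cells than the sample shows, so the result may still need slicing or filtering, as the [:9] in the answer does.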

【Comments】: