Scraping a table without an id with Python and BeautifulSoup

【Question title】: Scraping a table without id with python and beautifulsoup
【Posted】: 2025-12-20 11:35:16
【Question description】:

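I am currently building a Telegram bot. I want to add a /drop Sorties command, but for that I need bs4 to scrape the table from this page.

The bot should answer with something like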

Rifle Riven Mod   Rare (6.79%)
Ayatan Anasa Sculpture    Uncommon (28.00%)
4000 Endo Uncommon (12.10%)
etc etc etc..

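I need to define something in my code so that the bot looks up the user's input only on that page, and replies with the next table it finds after the match on that page.

Example HTML from the link above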

<h3 id="sortieRewards">Sorties:</h3>
<table><tbody><tr><th colspan="2">Sortie</th></tr><tr><td>Rifle Riven Mod</td><td>Rare (6.79%)</td></tr><tr><td>Ayatan Anasa Sculpture</td><td>Uncommon (28.00%)</td></tr><tr><td>4000 Endo</td><td>Uncommon (12.10%)</td>

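The bot should reply with the table's contents even if the user's input is Sortie rather than Sorties:

【Discussion】:

Welcome to SO. I did my best to work out the essence of your question. It is still a bit incoherent, but that is what I could do. Please consider rewriting your question based on my edit. Try to keep your question short while still including all the relevant information.
Thanks @LonelyNeuron, and sorry if I wrote too much! This is my first attempt, but the result is empty:

    from bs4 import BeautifulSoup
    import urllib2
    wiki = "https://www.warframe.com/repos/hnfvc0o3jnfvc873njb03enrf56.html"
    header = {'User-Agent': 'Mozilla/5.0'} #Needed to prevent 403 error on Wikipedia
    req = urllib2.Request(wiki,headers=header)
    page = urllib2.urlopen(req)
    soup = BeautifulSoup(page)
    table = soup.find_all('Sorties')
    print table

This is for local testing; I can adapt it to the Telegram bot later.
@SHADOWSLIFER The problem is not that you wrote too much, but that what you wrote was irrelevant or repetitive.
@SHADOWSLIFER Please do not post code in comments. If you want to add something to your question, use the edit button. If you have a new question, please ask a new question.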

Tags: python html python-2.7 beautifulsoup

【Solution 1】:

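from bs4 import BeautifulSoup

# `page` is the HTML fetched in the question (e.g. the urlopen response)
soup = BeautifulSoup(page, 'lxml')

# Locate the heading by its id, then take the first table that follows it
sorties_header = soup.find('h3', {'id': 'sortieRewards'})
sorties_table = sorties_header.find_next('table')

# The first row is the header, so skip it
for sortie in sorties_table.find_all('tr')[1:]:
    data = sortie.find_all('td')
    item = data[0].text
    drop_rate = data[1].text
    print(item, drop_rate)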

The output is

Rifle Riven Mod Rare (6.79%)
Ayatan Anasa Sculpture Uncommon (28.00%)
4000 Endo Uncommon (12.10%)
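
For anyone who wants to try the heading-then-next-table approach without fetching the page, here is a minimal self-contained sketch. It makes a few assumptions that are not in the original answer: the sample markup is the snippet from the question with the closing tags filled in, the helper name `rows_after_heading` is made up for illustration, and it uses the built-in `html.parser` so that lxml is not required. Rows are skipped when they have fewer than two `<td>` cells instead of slicing off the first row, which should also cope with tables that have more than one header row.

```python
from bs4 import BeautifulSoup

# Sample markup adapted from the question; the closing tags are filled in
# here only so that the snippet is self-contained.
SAMPLE_HTML = """
<h3 id="sortieRewards">Sorties:</h3>
<table><tbody>
<tr><th colspan="2">Sortie</th></tr>
<tr><td>Rifle Riven Mod</td><td>Rare (6.79%)</td></tr>
<tr><td>Ayatan Anasa Sculpture</td><td>Uncommon (28.00%)</td></tr>
<tr><td>4000 Endo</td><td>Uncommon (12.10%)</td></tr>
</tbody></table>
"""


def rows_after_heading(html, heading_id):
    """Return (item, drop_rate) pairs from the first table after the heading.

    `rows_after_heading` is a hypothetical helper, not part of bs4.
    """
    soup = BeautifulSoup(html, "html.parser")
    heading = soup.find("h3", id=heading_id)
    if heading is None:
        return []
    table = heading.find_next("table")
    if table is None:
        return []
    rows = []
    for tr in table.find_all("tr"):
        cells = tr.find_all("td")
        if len(cells) < 2:  # header rows use <th>, so they contain no <td>
            continue
        rows.append((cells[0].get_text(strip=True),
                     cells[1].get_text(strip=True)))
    return rows


sample_rows = rows_after_heading(SAMPLE_HTML, "sortieRewards")
for item, drop_rate in sample_rows:
    print(item, drop_rate)
```

Returning an empty list when the heading or table is missing avoids the `AttributeError` you would otherwise get from calling `find_next` on `None`.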

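【Discussion】:

Thank you!! It works without problems! Now I need to find a way to take the user's input in that code, so that it finds the table following whatever the user searched for. How can I do that?
Scraping data is one thing; reading user input and acting on it is another. Sorry, I don't know how to help you with that.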
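
One possible way to handle the follow-up, offered only as a rough sketch: compare the user's text against the text of each `<h3>` heading after normalizing both (trim, drop a trailing colon, lowercase), accept a prefix match in either direction so that Sortie matches Sorties:, and then take the next table. The names `normalize`, `find_table_for_query` and `format_reply` are made up for illustration, the "Missions:" section and its row are placeholder data invented to show that the right section is picked, and wiring this into an actual Telegram bot is left out.

```python
from bs4 import BeautifulSoup

# The "Missions:" section and its row are invented placeholder data; the
# "Sorties:" section is adapted from the question's sample HTML.
SAMPLE_HTML = """
<h3 id="missionRewards">Missions:</h3>
<table><tr><th colspan="2">Mission</th></tr>
<tr><td>Placeholder Item</td><td>Common (50.00%)</td></tr></table>
<h3 id="sortieRewards">Sorties:</h3>
<table><tr><th colspan="2">Sortie</th></tr>
<tr><td>Rifle Riven Mod</td><td>Rare (6.79%)</td></tr>
<tr><td>4000 Endo</td><td>Uncommon (12.10%)</td></tr></table>
"""


def normalize(text):
    """Trim whitespace, drop a trailing colon, and lowercase."""
    return text.strip().rstrip(":").strip().lower()


def find_table_for_query(soup, query):
    """Return the first table after the <h3> whose text matches the query."""
    wanted = normalize(query)
    if not wanted:
        return None
    for heading in soup.find_all("h3"):
        title = normalize(heading.get_text())
        # Prefix match in either direction: "sortie" matches "sorties".
        if title and (title.startswith(wanted) or wanted.startswith(title)):
            return heading.find_next("table")
    return None


def format_reply(table):
    """Join each two-cell data row into one 'item drop_rate' line."""
    lines = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if len(cells) >= 2:  # skips header rows, which only contain <th>
            lines.append("{} {}".format(cells[0], cells[1]))
    return "\n".join(lines)


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
reply = format_reply(find_table_for_query(soup, "Sortie"))
print(reply)
```

Prefix matching is a deliberately loose choice: a very short query such as a single letter would match the first heading that starts with it, so a real bot would probably want a minimum query length or an exact-match pass first. Note also that `find_next` searches the rest of the document, so a heading with no table of its own would pick up the table of a later section.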