Fetch Table from Website using BeautifulSoup

【Question title】: Fetch Table from Website using BeautifulSoup
【Posted】: 2020-05-09 22:19:43
【Question description】:

Using Python, I am trying to scrape a website and pull out some values. In this case, I want to grab a table. This is the specific page in question:

http://wotvffbe.gamea.co/c/5vdp3v91

When scraping it, I am trying to grab the values in the page's data table.

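I am using BeautifulSoup to sift through the values. I would like a way to grab them so that they are found by some form of reference. I was able to get the values before, but when I moved on to the next page they were no longer in the same position. So I want a way to find the table by reference rather than by position.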

Thanks for your input.

Also, if you want to test against other pages, these are the ones I am testing with:

http://wotvffbe.gamea.co/c/v89gxxuy

http://wotvffbe.gamea.co/c/yhb5ucqz

http://wotvffbe.gamea.co/c/yju5zfhe

【Question discussion】:

Tags: python-3.x web-scraping beautifulsoup python-requests

【Solution 1】:

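from pprint import pprint

import requests
from bs4 import BeautifulSoup

url = 'http://wotvffbe.gamea.co/c/5vdp3v91'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

# Locate the table through a header cell containing "Cost", not through its position.
# Note: newer soupsieve releases spell this pseudo-class :-soup-contains().
table = soup.select_one('th:contains("Cost")').find_parent('table')

# Pair each header cell with the data cell at the same index.
d = {th.text: td.text for th, td in zip(table.select('th'), table.select('td'))}

# pretty print it to screen:
pprint(d)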

Prints:

{'AP': '110',
 'Attack': '225',
 'Cost': '80',
 'Dexterity': '168',
 'HP': '2079',
 'Jump': '2',
 'Luck': '149',
 'Magic ': '64',
 'Move': '3',
 'Range': '1',
 'Speed': '62',
 'TP': '117'}
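
The same "find by reference" idea can also be sketched without a CSS pseudo-class, which may help when the selector is not supported by the installed version. The following is a minimal sketch, not the answerer's method: `SAMPLE_HTML` and `table_by_header` are hypothetical names, and the markup is made up for illustration, so the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page; the site's HTML may differ.
SAMPLE_HTML = """
<table><tr><th>Name</th><td>Other table</td></tr></table>
<table>
  <tr><th>Cost</th><td>80</td><th>HP</th><td>2079</td></tr>
  <tr><th>TP</th><td>117</td><th>AP</th><td>110</td></tr>
</table>
"""


def table_by_header(html, label):
    """Return {header: value} for the first table holding a <th> that contains `label`."""
    soup = BeautifulSoup(html, "html.parser")
    # Find the header cell by its text rather than by its position in the page.
    th = soup.find(lambda tag: tag.name == "th" and label in tag.get_text())
    if th is None:
        return {}
    table = th.find_parent("table")
    # Pair each header cell with the data cell at the same index.
    headers = [cell.get_text(strip=True) for cell in table.find_all("th")]
    values = [cell.get_text(strip=True) for cell in table.find_all("td")]
    return dict(zip(headers, values))


stats = table_by_header(SAMPLE_HTML, "Cost")
print(stats)
```

Because the lookup starts from the header text, the first table in the sample is skipped even though it comes earlier in the page. The pairing assumes every `th` has exactly one matching `td` in the same order; a table with a different layout would need a row-by-row walk instead.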

【Discussion】:

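Thank you so much!
Is an upgrade or install needed to run this code? I ran into this error before and did not know what to do about it: Only the following pseudo-classes are implemented: nth-of-type.
@ChaseSariaslani Yes, you are using an old version of BeautifulSoup. I am using beautifulsoup4==4.9.0
I was able to get it working. I do not understand the mechanics, but this gave me the latest BeautifulSoup install for use in VS Code: conda install -c anaconda beautifulsoup4. After that, your code ran successfully.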
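
To check which versions are actually installed in the active environment before or after upgrading, something like the sketch below may help. It assumes Python 3.8 or later (for `importlib.metadata`); `installed_version` is a hypothetical helper name. As far as I know, newer Beautiful Soup releases delegate CSS selectors to the separate `soupsieve` package, so both are listed.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as published on PyPI.
for name in ("beautifulsoup4", "soupsieve"):
    print(name, installed_version(name))
```

If `beautifulsoup4` prints `None` or an old version, the interpreter in use is probably not the environment that was upgraded.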
@AndrejKesely, how did you find the th:contains("Cost") value in the page? Perhaps a bit of explanation would greatly help others reading the answer.
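
On the selector itself: `th:contains("Cost")` matches a `th` element whose text contains "Cost", and `find_parent('table')` then walks up to the enclosing table. One plausible way to pick such an anchor text (not necessarily how the answerer did it) is to list the header cells of every table and choose a label that appears only in the table you want. A minimal sketch, with hypothetical names `SAMPLE_HTML` and `headers_per_table` and made-up markup:

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; the real page's structure may differ.
SAMPLE_HTML = """
<table><tr><th>Name</th><td>Unit A</td></tr></table>
<table><tr><th>Cost</th><td>80</td><th>HP</th><td>2079</td></tr></table>
"""


def headers_per_table(html):
    """Return one list of <th> texts for each <table> in the document."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        [th.get_text(strip=True) for th in table.find_all("th")]
        for table in soup.find_all("table")
    ]


# Print each table's index next to its header labels.
for index, headers in enumerate(headers_per_table(SAMPLE_HTML)):
    print(index, headers)
```

In this sample, "Cost" appears only in the second table, so it works as a reference. The browser's element inspector gives the same information interactively.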