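【Posted】: 2023-10-09 01:19:01
【Question】:
I am trying to parse a web page full of tables with Python 3.8.7 and BeautifulSoup 4.9.3 so that I can display it on a Telegram channel. I can get all the required tables from the page, but deep inside those tables there are td tags containing img tags whose src points to a star image, and I need to replace those img tags with p tags. This is the code so far: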
import pickle
import bs4 as bs

# Load the pickled response object and parse its HTML content.
with open('data/pickled_data/pickled_v', 'rb') as v_file:
    v_pickled = pickle.load(v_file)
v_soup = bs.BeautifulSoup(v_pickled.content, "html5lib")
all_tbls = v_soup.find_all('table')
I have tried replacing the image (that is, star_image) as follows, but it raises AttributeError: 'NoneType' object has no attribute 'replace_with':
url_2_check = "https://i.imgur.com/ffIvqVj.png"
for table in all_tbls:
    for tr in table.find_all('tr'):
        for td in table.find_all('td'):
            for star_image in td.find_all('img'):
                if star_image['src'] == url_2_check:
                    p_tag = v_soup.new_tag('p')
                    p_tag.string = ":star:"
                    td.star_image.replace_with(p_tag)
Then I tried the following, but it raises ValueError: Cannot replace one element with another when the element to be replaced is not part of a tree:
for table in all_tbls:
    for tr in table.find_all('tr'):
        for td in table.find_all('td'):
            for star_image in td.find_all('img'):
                if star_image['src'] == url_2_check:
                    p_tag = v_soup.new_tag('p')
                    p_tag.string = ":star:"
                    td.replace_with(p_tag)
I can't figure out what I am doing wrong. Can anyone help?
Thanks.
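A note on the two errors, with a minimal sketch of one possible fix. On a Tag, attribute access such as td.star_image is shorthand for td.find('star_image'), so it looks for a tag literally named star_image and returns None; that accounts for the AttributeError. In the second attempt the whole td is replaced rather than the img, so a cell holding more than one star image is probably replaced twice, and the second call finds the td already detached from the tree. Calling replace_with on the img itself avoids both problems. The inline HTML below is a hypothetical stand-in for the pickled page, and "html.parser" is used only so the sketch runs without html5lib.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup (not the real page): one cell with two star
# images and one cell with an unrelated image.
html = """
<table>
  <tr>
    <td><img src="https://i.imgur.com/ffIvqVj.png"><img src="https://i.imgur.com/ffIvqVj.png"></td>
    <td><img src="https://example.com/other.png"></td>
  </tr>
</table>
"""

url_2_check = "https://i.imgur.com/ffIvqVj.png"
soup = BeautifulSoup(html, "html.parser")

# find_all builds its result list up front, so replacing nodes inside the
# loop does not disturb the iteration. Filtering on src here also removes
# the need for the nested table/tr/td loops.
for star_image in soup.find_all("img", src=url_2_check):
    p_tag = soup.new_tag("p")       # a fresh tag for every replacement
    p_tag.string = ":star:"
    star_image.replace_with(p_tag)  # replace the <img>, not its parent <td>

print(soup.find("td"))
```

With the original variables, the same loop would run over v_soup instead of soup. Images with a different src are left untouched.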
【Comments】:
-
Can you share the URL?
-
@AndrejKesely xxviptips.blogspot.com
-
Do you want to replace the stars with numbers?
-
No, I want to replace them with :star:
Tags: python python-3.x beautifulsoup html-parsing