【Posted】: 2021-07-30 10:24:10
【Question】:
I have this:
<span class="ld-nowrap"> 20th century’s </span>
I want to get this:
<em> 20th century’s </em>
I am using Python 3 and BeautifulSoup.
Any ideas?
【Comments】:
Tags: python html css beautifulsoup
You can replace a tag inside the soup with .replace_with():
from bs4 import BeautifulSoup
html_doc = """
<span class="ld-nowrap"> 20th century’s </span>
"""
soup = BeautifulSoup(html_doc, "html.parser")
# 1. find the <span> tag to replace:
span = soup.find("span", class_="ld-nowrap")
# 2. create new <em> tag with the same contents as <span>
em = soup.new_tag("em")
em.contents = span.contents
# 3. replace the tag inside the tree
span.replace_with(em)
print(soup)
Prints:
<em> 20th century’s </em>
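As an alternative to building a new tag, Beautiful Soup also lets you rename a tag in place by assigning to its .name and deleting attributes with del. The following is a minimal sketch, assuming the only attribute you need to drop is class; the variable name result is just for illustration.

```python
from bs4 import BeautifulSoup

html_doc = '<span class="ld-nowrap"> 20th century’s </span>'
soup = BeautifulSoup(html_doc, "html.parser")

span = soup.find("span", class_="ld-nowrap")
span.name = "em"     # rename the tag in place; its children stay where they are
del span["class"]    # drop the class attribute so only <em> remains

result = str(soup)
print(result)        # <em> 20th century’s </em>
```

Because the children are never moved, this variant keeps nested markup intact without any copying. If the tag carries other attributes you also want removed, you would need to delete those as well.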
Edit: to replace multiple tags:
from bs4 import BeautifulSoup
html_doc = """
<span class="ld-nowrap"> 20th century’s </span>
<span class="ld-nowrap"> 21th century’s </span>
<span> No replace </span>
<span class="ld-nowrap"> 22th century’s </span>
"""
soup = BeautifulSoup(html_doc, "html.parser")
for span in soup.find_all("span", class_="ld-nowrap"):
    em = soup.new_tag("em")
    em.contents = span.contents
    span.replace_with(em)
print(soup)
Prints:
<em> 20th century’s </em>
<em> 21th century’s </em>
<span> No replace </span>
<em> 22th century’s </em>
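The loop above can be wrapped in a small reusable function. This is only a sketch: swap_tags and its parameters are hypothetical names, not part of Beautiful Soup. It moves each child with append() rather than assigning .contents directly, on the assumption that you may want the children's .parent to point at the new tag afterwards.

```python
from bs4 import BeautifulSoup


def swap_tags(html, old_name, css_class, new_name):
    """Hypothetical helper: replace every <old_name class=css_class> with <new_name>."""
    soup = BeautifulSoup(html, "html.parser")
    for old in soup.find_all(old_name, class_=css_class):
        new = soup.new_tag(new_name)
        # Copy the child list first, because append() removes each child from `old`.
        for child in list(old.contents):
            new.append(child)
        old.replace_with(new)
    return str(soup)


html_doc = '<p><span class="ld-nowrap">20th <b>century</b></span> and <span>plain</span></p>'
result = swap_tags(html_doc, "span", "ld-nowrap", "em")
print(result)
# <p><em>20th <b>century</b></em> and <span>plain</span></p>
```

Nested markup such as the <b> element is carried over unchanged, and spans without the ld-nowrap class are left alone.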
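【Comments】:
Is this what you mean?
from bs4 import BeautifulSoup
soup = BeautifulSoup('<span class="ld-nowrap"> 20th century’s </span>', 'html.parser')
# Print each matching span's text wrapped in <em>; the parsed tree itself is not modified
for x in soup.find_all('span', class_='ld-nowrap'):
    print('<em>' + x.text + '</em>')
【Comments】: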