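BeautifulSoup is a Python module that takes an HTML or XML string and parses it into a tree of objects. You can then use the methods it provides to locate specific elements quickly, which makes searching HTML or XML documents simple.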

from bs4 import BeautifulSoup

html_doc = """
<html><head><title>The Dormouse's story</title></head>
<body>
asdf
    <div class="title">
        <b>The Dormouse's story总共</b>
        <h1>f</h1>
    </div>
<div class="story">Once upon a time there were three little sisters; and their names were
    <a  class="sister0" >Els<span>f</span>ie</a>,
    <a href="http://example.com/lacie" class="sister" >Lacie</a> and
    <a href="http://example.com/tillie" class="sister" >Tillie</a>;
and they lived at the bottom of a well.</div>
ad<br/>sf
<p class="story">...</p>
</body>
</html>
"""

# Parse the document with the lxml parser (the lxml package must be installed)
soup = BeautifulSoup(html_doc, features="lxml")
# Find the first <a> tag
tag1 = soup.find(name='a')
# Find all <a> tags
tag2 = soup.find_all(name='a')
# Select the tags with id="link2" via a CSS selector; select() returns a list,
# which is empty here because no tag in html_doc carries that id
tag3 = soup.select('#link2')
Simple example
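
To make the return types of these three calls concrete, here is a minimal sketch. It assumes a small, hypothetical HTML fragment (`sample_html`, with `id` attributes added purely for illustration) and uses Python's built-in `html.parser` instead of lxml so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup for illustration; the id attributes are assumptions
# and do not appear in the document used in the example above.
sample_html = """
<div class="story">
    <a href="http://example.com/lacie" class="sister" id="link2">Lacie</a>
    <a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>
</div>
"""

# "html.parser" ships with Python, so this runs without lxml.
soup = BeautifulSoup(sample_html, "html.parser")

first_link = soup.find(name="a")       # a single Tag, or None when nothing matches
all_links = soup.find_all(name="a")    # a list-like collection of Tag objects
by_id = soup.select("#link2")          # a list of Tags matching the CSS selector
one_by_id = soup.select_one("#link2")  # the first match, or None

print(first_link.get_text())             # text inside the first <a>
print([a["href"] for a in all_links])    # href attribute of every <a>
print(by_id[0]["href"])                  # attributes are read like dict keys
print(soup.find(name="table"))           # no <table> in the fragment
```

Because `find()` returns `None` when nothing matches while `find_all()` and `select()` return an empty list, it is usually worth checking the result before reading attributes from it.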
