【Question title】:bs4: How can I extract the text within a <p> tag?
【Posted】:2021-05-20 07:07:42
【Question】:

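I'm practicing parsing on https://coinmarketcap.com/currencies/bitcoin/ and I'd really like to know how to extract the text inside this exact <p> tag. The page has many <p> tags, and I only want the information from this one. Thanks for your help.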

import requests as r
from bs4 import BeautifulSoup

def find_info(self):
    response = r.get(self.url)  # self.url is the URL given in the description above
    soup = BeautifulSoup(response.text, "html.parser")
    paragraphs = soup.find_all('p')  # returns every <p> on the page

    # This is where I'm stuck.
    # I need to get the text from the chunk of HTML below.

    <p>
     <strong>
      Bitcoin price today
     </strong>
     is ₽3.795.164 RUB with a 24-hour trading volume of ₽6.527.780.409.893 RUB. Bitcoin is down,12% in the last 24 hours. The current CoinMarketCap ranking is #1, with a market cap of ₽70.707.857.530.563 RUB. It has a circulating supply of 18.631.043 BTC coins and a max. supply of 21.000.000 BTC coins.
    </p>

I've tried several approaches, but with so many p tags on the page I don't know how to get this exact one.

【Comments】:

    Tags: python html beautifulsoup html-parsing


    【Solution】:

    Use a CSS selector to get the paragraph you want.

    Here's how:

    import requests
    from bs4 import BeautifulSoup
    
    page = requests.get("https://coinmarketcap.com/currencies/bitcoin/").content
    print(BeautifulSoup(page, "html.parser").select_one('.about___1OuKY p').getText())
    

    Output:

    Bitcoin price today is $51,393.64 USD with a 24-hour trading volume of $88,784,693,272 USD. Bitcoin is up 4.87% in the last 24 hours. The current CoinMarketCap ranking is #1, with a market cap of $957,517,202,639 USD. It has a circulating supply of 18,631,043 BTC coins and a max. supply of 21,000,000 BTC coins.
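
The class name `.about___1OuKY` looks auto-generated, so it may change whenever the site is rebuilt, and `select_one` returns `None` when nothing matches. As a hedged alternative, here is a minimal sketch that locates the paragraph by the text of its `<strong>` child instead. It runs against a static sample that mirrors the markup quoted in the question. `SAMPLE_HTML` and `find_paragraph_by_heading` are names made up for this illustration, and the live page's markup may differ.

```python
from bs4 import BeautifulSoup

# Static sample mirroring the paragraph quoted in the question (assumed markup).
SAMPLE_HTML = """
<div>
  <p>Some unrelated paragraph.</p>
  <p>
    <strong>Bitcoin price today</strong>
    is $51,393.64 USD with a 24-hour trading volume of $88,784,693,272 USD.
  </p>
  <p><strong>Another heading</strong> with other text.</p>
</div>
"""


def find_paragraph_by_heading(html, heading):
    """Return the text of the first <p> whose <strong> contains `heading`, or None."""
    soup = BeautifulSoup(html, "html.parser")
    for strong in soup.find_all("strong"):
        if heading in strong.get_text():
            paragraph = strong.find_parent("p")
            if paragraph is not None:
                # Collapse the runs of whitespace left over from the HTML layout.
                return " ".join(paragraph.get_text().split())
    return None  # nothing matched; the caller decides how to handle it


text = find_paragraph_by_heading(SAMPLE_HTML, "Bitcoin price today")
print(text)
```

To try it on the real page, pass `requests.get(url).text` as `html`. If the page builds that paragraph with JavaScript, it will not be in the downloaded HTML and the function will return `None`.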
    
