【Question title】: Extracting class inside a div returns None in Python BeautifulSoup
【Posted】: 2023-09-25 18:59:01
【Question description】:

The snippet below does not show the expected data: every lookup returns None. Any ideas or suggestions on how to do this correctly would be very helpful.

from bs4 import BeautifulSoup
from urllib import request
from urllib.request import Request, urlopen

url = "https://bscscan.com/block/9478762"
headers = {"User-Agent": "Mozilla/5.0"}

req = Request(url, headers=headers)
html = urlopen(req).read()
soup = BeautifulSoup(html, "html.parser")

blockheight = soup.find('div', attrs={'class': 'font-weight-sm-bold mr-2'})
print ("Block Height: ", blockheight)

blocktimestamp = soup.find('div', attrs={'class': 'far fa-clock small mr-1'})
print ("Timestamp ",blocktimestamp)

blocktransactions = soup.find('div', class_ = 'u-label u-label--value u-label--primary rounded my-1')
print ("Transactions ", blocktransactions)

Current output:

    Block Height:   None
    Timestamp:      None
    Transactions:   None

Desired output:

    Block Height:   9478762
    Timestamp:      Jul-25-2021 11:43:52 PM +UTC
    Transactions:   223 -> transactions https://bscscan.com/txs?block=9478762
                    37 -> contract internal transactions https://bscscan.com/txsInternal?block=9478762
    Validated by:   0xb218c5d6af1f979ac42bc68d98a5a0d796c6ab01

【Question comments】:

    Tags: python python-3.x beautifulsoup python-requests webrequest


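    【Solution 1】:

    I hope this helps. The classes you are searching for are not on div elements: the code below looks them up on span, i, and a tags instead, which is why find('div', ...) returned None.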

    from bs4 import BeautifulSoup
    from urllib.request import Request, urlopen
    
    url = "https://bscscan.com/block/9478762"
    headers = {"User-Agent": "Mozilla/5.0"}
    
    req = Request(url, headers=headers)
    html = urlopen(req).read()
    soup = BeautifulSoup(html, "html.parser")
    
    # The block height sits in a <span>, not a <div>; take its first child node.
    blockheight = soup.find('span', attrs={'class': 'font-weight-sm-bold mr-2'}).contents[0]
    print("Block Height: ", str(blockheight).replace("\n", ""))
    
    # The clock icon is an <i>; the timestamp is the text node right after it.
    blocktimestamp = soup.find('i', attrs={'class': 'far fa-clock small mr-1'}).next_sibling
    print("Timestamp: ", str(blocktimestamp).replace("\n", ""))
    
    # The transaction count is the text of an <a> link.
    blocktransactions = soup.find('a', class_='u-label u-label--value u-label--primary rounded my-1').contents[0]
    print("Transactions: ", blocktransactions)
    

    Output:

    Block Height:   9478762
    Timestamp:  2 hrs 35 mins ago (Jul-25-2021 11:43:52 PM +UTC) 
    Transactions:  223 transactions
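
The reason the original lookups returned None can be reproduced offline. The following is a minimal sketch, assuming a made-up HTML fragment (`SAMPLE_HTML`) that only imitates the class names used above and is not a copy of the real bscscan.com page. The helper `text_or_default` is a hypothetical name introduced here to avoid an AttributeError when `find()` returns None.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, loosely modelled on the selectors in the answer.
# It is an assumption for illustration, NOT the real bscscan.com page.
SAMPLE_HTML = """
<div class="row">
  <span class="font-weight-sm-bold mr-2">9478762</span>
  <div><i class="far fa-clock small mr-1"></i>Jul-25-2021 11:43:52 PM +UTC</div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
cls = "font-weight-sm-bold mr-2"

# The class exists, but on a <span>, so restricting the search to <div> finds nothing.
as_div = soup.find("div", attrs={"class": cls})
as_span = soup.find("span", attrs={"class": cls})


def text_or_default(tag, default="(not found)"):
    """Return the stripped text of a tag, or a default if find() gave None."""
    return tag.get_text(strip=True) if tag is not None else default


print("As div: ", text_or_default(as_div))
print("As span:", text_or_default(as_span))

# The timestamp is a bare text node that follows the <i> icon, hence next_sibling.
icon = soup.find("i", attrs={"class": "far fa-clock small mr-1"})
timestamp = str(icon.next_sibling).strip() if icon is not None else "(not found)"
print("Timestamp:", timestamp)
```

Checking for None before touching `.contents` or `.next_sibling` is a design choice rather than a requirement; it just turns a crash into a readable message if the site's markup changes.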
    

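    【Comments】:

    • I managed to make some edits, but I am still missing the data from the href part. I will wait for more ideas, though.
    • I managed to make the edits, and the results now display the way I wanted.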
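
For the missing href data mentioned in the comments, one possible approach is to read the `href` attribute of each link and resolve it against the page URL. This is only a sketch: `SAMPLE_HTML` is an invented fragment, and the assumption that the page uses relative hrefs such as `/txs?block=9478762` is inferred from the desired output in the question, not confirmed against the live site.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://bscscan.com/block/9478762"

# Hypothetical markup: the relative hrefs and link texts are assumptions
# based on the desired output in the question, not the real page source.
SAMPLE_HTML = """
<div>
  <a class="u-label u-label--value u-label--primary rounded my-1"
     href="/txs?block=9478762">223 transactions</a>
  and
  <a href="/txsInternal?block=9478762">37 contract internal transactions</a>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Collect (link text, absolute URL) pairs for every <a> that has an href.
links = []
for a in soup.find_all("a", href=True):
    links.append((a.get_text(strip=True), urljoin(BASE_URL, a["href"])))

for label, link in links:
    print(f"{label} -> {link}")
```

`urljoin` leaves already-absolute hrefs untouched, so the same loop should work whether the real page uses relative or absolute links.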