【Question title】: Unable to grab div class with BeautifulSoup
【Posted】: 2020-07-13 22:04:34
【Question】:

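I am trying to grab a div by its class, but for some reason I can't. Each div also has an id, but it is different for every product. How can I grab these divs successfully?

Here is my code: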

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = "https://meqasa.com/apartments-for-sale-in-Accra"

uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

page_soup = soup(page_html, "html.parser")

containers = page_soup.findAll("div", {"class": "row mqs-prop-inner-wrap with featt"})
print(len(containers))

【Comments】:

    Tags: python web-scraping beautifulsoup findall


    【Solution 1】:

    You can use the CSS selector div[id^="feli"]. This selects every <div> tag whose id starts with "feli".

    For example:

    import requests
    from bs4 import BeautifulSoup
    
    
    url = 'https://meqasa.com/apartments-for-sale-in-Accra'
    soup = BeautifulSoup(requests.get(url).content, 'html.parser')
    
    for item in soup.select('div[id^="feli"]'):
        print(item.h2.get_text(strip=True))
        print('https://meqasa.com' + item.a['href'])
        print('-' * 80)
    

    Prints:

    3 bedroom apartment for sale at Community 25, Tema, Tema, Greater Accra Region
    https://meqasa.com/3-bedroom-apartment-for-sale-in-Community 25, Tema, Tema, Greater Accra Region, Ghana-unit-1411
    --------------------------------------------------------------------------------
    2 bedroom apartment for sale at Sakumono
    https://meqasa.com/2-bedroom-apartment-for-sale-in-Sakumono-unit-1385
    --------------------------------------------------------------------------------
    1 bedroom apartment for sale at East Legon
    https://meqasa.com/1-bedroom-apartment-for-sale-in-East Legon-unit-1383
    --------------------------------------------------------------------------------
    2 bedroom apartment for sale at Accra
    https://meqasa.com/2-bedroom-apartment-for-sale-in-Accra-unit-1408
    --------------------------------------------------------------------------------
    1 bedroom apartment for sale at Sakumono
    https://meqasa.com/1-bedroom-apartment-for-sale-in-Sakumono-unit-1363
    --------------------------------------------------------------------------------
    
    ... and so on.
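
The selector behaviour can be checked offline. The sketch below uses a small piece of hypothetical markup (the ids, class names and headings are invented for illustration, and the real page may be structured differently). It shows the id-prefix selector next to one possible reason a class lookup can come back empty: when the class string contains spaces, BeautifulSoup compares it against the exact value of the class attribute, so a single extra or misspelled word means nothing matches.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for illustration; the real page may differ.
html = """
<div id="feli-1" class="row mqs-prop-inner-wrap featt"><h2>Listing A</h2></div>
<div id="feli-2" class="row mqs-prop-inner-wrap"><h2>Listing B</h2></div>
<div id="other" class="row"><h2>Not a listing</h2></div>
"""
soup = BeautifulSoup(html, "html.parser")

# Attribute-prefix selector: every <div> whose id starts with "feli".
by_id = [div["id"] for div in soup.select('div[id^="feli"]')]

# A class string containing spaces is compared against the exact attribute
# value, so the stray word "with" means no element matches here.
exact = soup.find_all("div", {"class": "row mqs-prop-inner-wrap with featt"})

# A CSS class selector only requires that the named class is present.
by_class = [div["id"] for div in soup.select("div.mqs-prop-inner-wrap")]

print(by_id)       # ids matched by the prefix selector
print(len(exact))  # number of exact-class-string matches
print(by_class)    # ids matched by the class selector
```

Whether this explains the empty result in the question depends on the class attribute the server actually returns, which is not shown in the thread.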
    

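    【Comments】:

    • Hi, I'm having trouble understanding why you can use h2. I tried p and got the price, but when I inspect the HTML I can't find an h2 for the name or a p for the price. I would like to scrape the date in the same format.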
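
On the h2 point raised in the comment: item.h2 is BeautifulSoup shorthand for item.find("h2"), which returns the first <h2> anywhere below item, however deeply nested. The sketch below runs on hypothetical markup (invented for illustration, not copied from the site). Note also that the HTML a script downloads can differ from what a browser inspector shows, so it may help to print the fetched markup before choosing tags.

```python
from bs4 import BeautifulSoup

# Hypothetical listing markup, invented for illustration only.
html = """
<div id="feli-1">
  <div class="inner">
    <a href="/listing-1"><h2> Listing A </h2></a>
    <p class="price">Price on request</p>
  </div>
</div>
"""
item = BeautifulSoup(html, "html.parser").select_one('div[id^="feli"]')

# item.h2 is shorthand for item.find("h2"): the first <h2> at any depth.
title = item.h2.get_text(strip=True)
same_tag = item.find("h2") is item.h2

# The same shorthand works for other tag names, such as <a> and <p>.
href = item.a["href"]
price = item.p.get_text(strip=True)

# A tag that is not present yields None instead of raising an error.
missing = item.h3

print(title, href, price, same_tag, missing)
```

Because a missing tag gives None, chaining a call such as item.h3.get_text() would raise AttributeError, so checking for None first is the safer pattern when a field is optional.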