【Posted】: 2021-02-05 08:02:07
【Question】:
I am trying to scrape data from the website daraz.pk. This is the code I have written so far in a Jupyter notebook:
import requests
from bs4 import BeautifulSoup as soup
from time import sleep
#url of the website we want to scrape which in this case is the url of daraz.pk for swimsuits
my_url = "https://www.daraz.pk/catalog/?spm=a2a0e.home.search.1.35e349376res9Z&q=swimsuits&_keyori=ss&from=search_history&sugg=swimsuits_0_1"
page = requests.get(my_url)
pagesrc = soup(page.text, 'html.parser')
#making a container to save all the data in
container = pagesrc.find('div', {'class':'c1_t2i'})
#our gallery is the product-item
gallery = container.find_all('div', {'class':'c2prKC'})
sleep(1)
This is the error I get:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-7-4fa8d4bc2410> in <module>
6 container = pagesrc.find('div', {'class':'c1_t2i'})
7 #our gallery is the product-item
----> 8 gallery = container.find_all('div', {'class':'c2prKC'})
9 sleep(1)
AttributeError: 'NoneType' object has no attribute 'find_all'
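I am very new to web scraping. I tried following a Stack Overflow answer to another question on the same topic, but it did not help. This is that question: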
Python error: 'NoneType' object has no attribute 'find_all'
Any help would be appreciated!
【Discussion】:
Tags: html python-3.x web-scraping beautifulsoup python-requests
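The traceback says that `container` is `None`: `find` returns `None` when no element matches, so the following `find_all` call fails. Below is a minimal sketch of a guard around that step. The helper name `find_items` and the sample markup are hypothetical, made up for illustration, and are not daraz.pk's real HTML.

```python
from bs4 import BeautifulSoup


def find_items(html, container_class, item_class):
    """Return the item <div>s inside the container, or [] if the container is absent."""
    page = BeautifulSoup(html, "html.parser")
    container = page.find("div", {"class": container_class})
    if container is None:
        # find() returns None when nothing matches; calling find_all on
        # that None is what raises the AttributeError in the traceback.
        return []
    return container.find_all("div", {"class": item_class})


# Hypothetical sample markup, for illustration only.
sample_with = '<div class="c1_t2i"><div class="c2prKC">A</div><div class="c2prKC">B</div></div>'
sample_without = '<div id="root"></div>'

items_found = find_items(sample_with, "c1_t2i", "c2prKC")
items_missing = find_items(sample_without, "c1_t2i", "c2prKC")
print(len(items_found), len(items_missing))
```

To try it on the real page, pass `page.text` from the question's `requests.get(my_url)` call as `html`. If the result is empty, the class `c1_t2i` is probably not in the HTML that `requests` received. One possible cause, assumed here and not verified, is that the product list is filled in by JavaScript after the page loads; printing `page.text` and searching it for the class name would confirm or rule that out.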