【Question title】:Python web scraping: soup.find returns empty
【Posted】:2021-07-29 05:53:43
【Question】:

Why are `a` and `prices` empty? The soup contains data, but I cannot extract any values from it.

'''

import requests
from bs4 import BeautifulSoup
import csv
import re

r = requests.get('https://www.daraz.pk/catalog/?q=iphone&_keyori=ss&from=input&spm=a2a0e.home.search.go.35e34937c3qzgp')
soup = BeautifulSoup(r.text, 'html.parser')
products=[] #List to store name of the product
prices=[] #List to store price of the product
ratings=[] #List to store rating of the product

for item in soup.find_all(class_='c3gUW0'):
    prices.append(item.text)

a=soup.find(class_='c3gUW0')
print(a)

'''

【Comments】:

    Tags: python web beautifulsoup screen-scraping


    【Solution 1】:

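    The site loads its product list dynamically with JavaScript. The requests module does not execute JavaScript, so the HTML it downloads does not contain the elements you are searching for. However, the data is embedded in the page as JSON: you can locate it with the built-in re (regular expression) module and convert it to a Python dictionary (dict) with the built-in json module.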
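
A quick way to confirm this diagnosis is to check whether the class from the question appears anywhere in the HTML that the server returns. The sketch below uses only the standard library's html.parser module; `ClassCollector`, `has_class`, and both sample pages are hypothetical stand-ins for illustration, not the site's real markup.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every CSS class name that appears on a start tag."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def has_class(html, class_name):
    """Return True if class_name occurs on any element in the raw HTML."""
    collector = ClassCollector()
    collector.feed(html)
    return class_name in collector.classes


# Hypothetical samples: one page rendered on the server, one filled in by JavaScript.
static_page = '<div class="c3gUW0">Rs. 215,000</div>'
dynamic_page = (
    '<div id="root"></div>'
    '<script>window.pageData={"mods": {"listItems": []}}</script>'
)

print(has_class(static_page, "c3gUW0"))   # True
print(has_class(dynamic_page, "c3gUW0"))  # False
```

Against a real response you would call `has_class(requests.get(URL).text, "c3gUW0")`. If that returns False, `soup.find` has nothing to match, which would explain the empty result.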

    For example, to print the names and prices from the embedded JSON:

    import re
    import json
    import requests
    from bs4 import BeautifulSoup

    URL = "https://www.daraz.pk/catalog/?q=iphone&_keyori=ss&from=input&spm=a2a0e.home.search.go.35e34937c3qzgp"
    soup = BeautifulSoup(requests.get(URL).content, "html.parser")

    # The page assigns its data to window.pageData as a JSON object;
    # capture that object and parse it into a dict.
    data = json.loads(re.search(r"window\.pageData=({.*})", str(soup)).group(1))

    # Each entry in listItems describes one product.
    for d in data["mods"]["listItems"]:
        print("Description:", d["name"])
        print("Price:", d["priceShow"])
        print('-' * 80)
    

    Output (truncated):

    Description: Apple iPhone 12 - 64GB - PTA Approved
    Price: Rs. 215,000
    --------------------------------------------------------------------------------
    Description: iPhone 12 - 128GB - PTA Approved
    Price: Rs. 230,000
    --------------------------------------------------------------------------------
    Description: iPhone 8 Plus - 5.5" Display - 3GB RAM + 64/256GB ROM - Phone Only
    Price: Rs. 66,999
    --------------------------------------------------------------------------------
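
The regular expression above is greedy, so it can capture too much if another `}` follows the JSON object on the same line, and `re.search` returns None if the marker is missing. As a hedged alternative, the sketch below lets `json.JSONDecoder.raw_decode` parse exactly one JSON value starting right after the marker. The helper name `extract_page_data` and the sample HTML are assumptions made for illustration; only the `window.pageData`, `mods`, `listItems`, `name`, and `priceShow` names come from the answer.

```python
import json
import re


def extract_page_data(html, marker=r"window\.pageData\s*=\s*"):
    """Return the JSON value assigned after the marker, or None if it is absent or invalid."""
    match = re.search(marker, html)
    if match is None:
        return None
    try:
        # raw_decode parses a single JSON value and ignores any text after it.
        data, _end = json.JSONDecoder().raw_decode(html, match.end())
    except json.JSONDecodeError:
        return None
    return data


# Hypothetical sample page; the real site's markup may differ.
sample_html = """
<html><body><div id="root"></div>
<script>window.pageData={"mods": {"listItems": [
  {"name": "Phone A", "priceShow": "Rs. 100"},
  {"name": "Phone B", "priceShow": "Rs. 200"}
]}};</script>
<script>var other = {"x": 1};</script>
</body></html>
"""

data = extract_page_data(sample_html)
items = data["mods"]["listItems"] if data else []

# Fill the lists that the question set out to build.
products = [item["name"] for item in items]
prices = [item["priceShow"] for item in items]

print(products)  # ['Phone A', 'Phone B']
print(prices)    # ['Rs. 100', 'Rs. 200']
```

With a live page you would pass `requests.get(URL).text` instead of `sample_html`. The None return lets the caller handle a changed page layout without an unhandled exception.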
    

    【Comments】:
