【发布时间】:2019-06-02 04:50:49
【问题描述】:
我正在尝试以 CSV 格式从 json 格式导出数据,但没有得到任何结果。 下面是代码
import requests
from bs4 import BeautifulSoup
import json
import re
url = "https://www.daraz.pk/catalog/?q=dell&_keyori=ss&from=input&spm=a2a0e.home.search.go.35e34937qjElRf"
page = requests.get(url)
print(page.status_code)
print(page.text)
soup = BeautifulSoup(page.text, 'html.parser')
print(soup.prettify())
alpha = soup.find_all('script',{'type':'application/ld+json'})
jsonObj =`json.loads(alpha[1].text)`
for item in jsonObj['itemListElement']:
name = item['name']
price = item['offers']['price']
currency = item['offers']['priceCurrency']
availability = item['offers']['availability'].split('/')[-1]
availability = [s for s in re.split("([A-Z][^A-Z]*)", availability) if s]
availability = ' '.join(availability)
print('Availability: %s Price: %0.2f %s Name: %s' %(availability,float(price), currency,name))
这是我试图以 CSV 格式导出数据但没有以 CSV 格式获得结果的代码
创建要写入的文件,添加标题行
outfile = open('products.csv','w', newline='')
writer = csv.writer(outfile)
writer.writerow(["name", "offers", "price", "priceCurrency", "availability" ])
outfile.close()
alpha = soup.find_all('script',{'type':'application/ld+json'})
jsonObj = json.loads(alpha[1].text)
for item in jsonObj['itemListElement']:
name = item['name']
price = item['offers']['price']
currency = item['offers']['priceCurrency']
availability = item['offers']['availability'].split('/')[-1]
availability = [s for s in re.split("([A-Z][^A-Z]*)", availability) if s]
availability = ' '.join(availability)
【问题讨论】:
-
本页没有script type='application/ld+json'标签
-
所有产品数据,即名称、价格、货币、可用性都在脚本中。
-
试试下面的网址:daraz.pk/catalog/…
标签: python-3.x web-scraping beautifulsoup export-to-csv