【Posted】:2020-09-06 16:36:19
【Question】:
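With help from Stack Overflow I was able to put together a scraper. The code returns a list of part numbers and their corresponding prices.
part 1 price1
part 2 price2
...
...
part n pricen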
However, the site seems to allow only 200 requests: when I raise the limit above 200, I get the error "raise JSONDecodeError("Expecting value", s, err.value) from None JSONDecodeError: Expecting value".
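The JSONDecodeError quoted above is what `r.json()` raises when the response body is not valid JSON (for example an empty body or an HTML error page). Below is a minimal sketch of a helper that reports what the server actually sent instead of failing inside the decoder. The name `parse_json_response` is hypothetical; it relies only on the `status_code` and `text` attributes that a `requests` response provides.

```python
import json


def parse_json_response(response):
    """Decode a response body as JSON, or raise an error that shows the body.

    `response` is any object with `status_code` and `text` attributes,
    such as the object returned by requests.post().
    """
    try:
        return json.loads(response.text)
    except json.JSONDecodeError as exc:
        # Show the status and the start of the body so the cause is visible.
        snippet = response.text[:200]
        raise RuntimeError(
            f"HTTP {response.status_code}: body is not JSON, "
            f"starts with {snippet!r}"
        ) from exc
```

Calling `parse_json_response(r)` in place of `r.json()` would show whether the server answers a limit above 200 with an error status, an empty body, or something else.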
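I just want to know whether there is a way to avoid this error. If not, I could raise start: 0 by 200 each time, but since I can easily have 100k+ items, that would not be very efficient. Is there a way to loop over the limit and start parameters?
Please see the code below. Any help is appreciated!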
import requests
# import pprint  # to format data on screen with `pprint.pprint()`
import pandas as pd

# --- functions ---

def get_data(query):
    """Get data from server"""
    payload = {
        # "facets":[{
        #     "name":"OEM",
        #     "value":"GE%20Healthcare"
        # }],
        "facets": [],
        "facilityId": 38451,
        "id_ins": "a2a3d332-73a7-4194-ad87-fe7412388916",
        "limit": 200,
        "query": query,
        "referer": "/catalog/Service",
        "start": 0,
        # "urlParams":[{
        #     "name": "OEM",
        #     "value": "GE Healthcare"
        # }],
        "urlParams": []
    }
    r = requests.post('https://prodasf-vip.partsfinder.com/Orion/CatalogService/api/v1/search', json=payload)
    data = r.json()
    return data

all_queries = ['GE Healthcare']
for query in all_queries:
    #print('\n--- QUERY:', query, '---\n')
    data = get_data(query)
    Part_Num = []
    Vendor_Item_Num = []
    price = []
    for item in data['products']:
        if not item['options']:
            Part_Num.append([])
            Vendor_Item_Num.append([])
            price.append([])
        else:
            all_prices = [option['price'] for option in item['options']]
            # NOTE: this reads 'price' again, as posted; the field that holds
            # the vendor item number is not shown in the question.
            all_vendor = [option['price'] for option in item['options']]
            all_part_num = item['partNumber']
            Part_Num.append(all_part_num)
            Vendor_Item_Num.append(all_vendor)
            price.append(all_prices)

# Vendor_Item_Num is collected above but not written to the CSV.
list_of_dataframes = [pd.DataFrame(Part_Num), pd.DataFrame(price)]
pd.concat(list_of_dataframes, axis=1).to_csv(r'C:\Users\212677036\Documents\output7.csv')
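The idea raised in the question, increasing start by 200 on every request, can be written as a loop. The sketch below is one possible shape for it, not a tested client for this site. It assumes that the server treats "start" as an offset and that a page with fewer than "limit" products means nothing is left; neither is confirmed in the post. The names `build_payload`, `iter_products` and `make_api_fetcher` are hypothetical.

```python
from typing import Callable, Dict, Iterator, List, Optional

URL = 'https://prodasf-vip.partsfinder.com/Orion/CatalogService/api/v1/search'
PAGE_SIZE = 200  # largest "limit" the question reports as working


def build_payload(query: str, start: int, limit: int = PAGE_SIZE) -> Dict:
    """Same payload as get_data(), with "start" and "limit" as parameters."""
    return {
        "facets": [],
        "facilityId": 38451,
        "id_ins": "a2a3d332-73a7-4194-ad87-fe7412388916",
        "limit": limit,
        "query": query,
        "referer": "/catalog/Service",
        "start": start,
        "urlParams": [],
    }


def iter_products(fetch_page: Callable[[int, int], List[dict]],
                  limit: int = PAGE_SIZE,
                  max_items: Optional[int] = None) -> Iterator[dict]:
    """Yield products page by page until a short or empty page arrives.

    fetch_page(start, limit) must return the list of products for one page.
    """
    start = 0
    yielded = 0
    while max_items is None or yielded < max_items:
        page = fetch_page(start, limit)
        for item in page:
            yield item
            yielded += 1
            if max_items is not None and yielded >= max_items:
                return
        # Assumption: a page shorter than `limit` is the last one.
        if len(page) < limit:
            return
        start += limit


def make_api_fetcher(query: str) -> Callable[[int, int], List[dict]]:
    """Build a fetch_page callable that posts to the search endpoint."""
    import requests  # imported here so the paging logic works without it

    def fetch_page(start: int, limit: int) -> List[dict]:
        r = requests.post(URL, json=build_payload(query, start, limit))
        r.raise_for_status()  # fail early on an HTTP error status
        return r.json()['products']

    return fetch_page
```

With these pieces, `for item in iter_products(make_api_fetcher('GE Healthcare')):` would replace the `for item in data['products']:` loop. For 100k+ items this still means 500+ requests at 200 items each, so the loop removes the manual work but not the number of round trips.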
【Comments】:
-
Can you make 200 requests per day, or at most 200 requests at the same time?
-
I believe it is in parallel
Tags: python json pandas web-scraping request