【Question title】: Python Scraping AJAX Post Request
【Posted】: 2021-03-24 16:54:53
【Question】:
import requests
import json
import csv
import pandas as pd 
import time
from bs4 import BeautifulSoup
from requests import Session

url = 'https://www.agathaparis.com/ajax.V1.php/en_US/Rbs/Storelocator/Store/'

payload={"websiteId":603593,"sectionId":603593,"pageId":868982,"data":{"currentStoreId":0,"distanceUnit":"kilometers","distance":"50kilometers","coordinates":{"latitude":48.856614,"longitude":2.3522219},"commercialSign":0},"dataSets":"coordinates,address,card,allow","URLFormats":"canonical,contextual","visualFormats":"original,listItem","pagination":"0,50","referer":"https://www.agathaparis.com/our-stores.html"}

s=requests.Session()

s.get('https://www.agathaparis.com/our-stores.html')

headers={
    'Content-Type': 'application/json',
    'Accept': 'application/json, text/plain, */*',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-gb',
    'Host': 'www.agathaparis.com',
    'Origin': 'https://www.agathaparis.com',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.1 Safari/605.1.15',
    'Connection': 'keep-alive',
    'Referer': 'https://www.agathaparis.com/our-stores.html',
    'Content-Length': '407',
    'Cookie': '_fbp=fb.1.1607609084947.2075070555; _ga=GA1.2.964068958.1607609084; _gid=GA1.2.1470390017.1607868080; _gat_UA-33249847-1=1; rbsWebsiteTrackerHasConsent=true; rbsWebsiteTrackerHasConsentGdpr=%7B%22technical%22%3Atrue%2C%22analytics%22%3Atrue%2C%22advertising%22%3Atrue%7D; PHPSESSID=n4uf5tfuf96k141vemo5s9g99g',

}

resp = s.post(url,data=payload,headers=headers)

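I am trying to extract the list of stores through this POST request. I don't understand what I am missing. Thanks in advance for your help.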

【Comments】:

    Tags: python ajax post web-scraping python-requests


    【Solution 1】:

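    Your main mistake is that you are posting with the wrong content type. You need to send the body as JSON rather than as application/x-www-form-urlencoded. Passing a dict via data= makes requests form-encode it, whereas json= serializes it to JSON: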

    headers={
        'Content-Type': 'application/json',
        'Accept': 'application/json, text/plain, */*',
        'Accept-Encoding': 'gzip, deflate, br',
        'Accept-Language': 'en-gb',
        'Host': 'www.agathaparis.com',
        'Origin': 'https://www.agathaparis.com',
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.1 Safari/605.1.15',
        # 'Connection': 'keep-alive',
        'Referer': 'https://www.agathaparis.com/our-stores.html',
        'x-http-method-override': 'GET',
        # 'Content-Length': '407',
        # 'Cookie': '_fbp=fb.1.1607609084947.2075070555; _ga=GA1.2.964068958.1607609084; _gid=GA1.2.1470390017.1607868080; _gat_UA-33249847-1=1; rbsWebsiteTrackerHasConsent=true; rbsWebsiteTrackerHasConsentGdpr=%7B%22technical%22%3Atrue%2C%22analytics%22%3Atrue%2C%22advertising%22%3Atrue%7D; PHPSESSID=n4uf5tfuf96k141vemo5s9g99g',
    
    }
    
    resp = s.post(url, json=payload, headers=headers)
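
As a rough illustration of why the body format matters, the sketch below encodes a trimmed-down payload both ways using only the standard library. It approximates, rather than reproduces, what requests does with data= and json=; the shortened payload and the variable names are assumptions made for the example.

```python
import json
from urllib.parse import urlencode

# Hypothetical, shortened payload in the same shape as the one in the question.
payload = {
    "websiteId": 603593,
    "data": {"distanceUnit": "kilometers", "commercialSign": 0},
    "pagination": "0,50",
}

# Form encoding: flat key=value pairs. The nested dict is reduced to a
# quoted string, so the server cannot recover its structure.
form_body = urlencode(payload)

# JSON encoding: nesting and value types are preserved.
json_body = json.dumps(payload)

print(form_body)
print(json_body)

# Only the JSON body parses back into the original structure.
decoded = json.loads(json_body)
```

An endpoint that parses the request body as JSON will generally fail on the form-encoded string even when the Content-Type header claims application/json, which matches the symptom in the question.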
    

    【Comments】:

    • Thank you so much! You rock :). How did you know that you needed to add 'x-http-method-override': 'GET'? When do we need to use json=payload instead of data=payload? Thanks a lot for your help.