Getting the required data:
Looking at the page source, the form that submits the data (the zip code) is:
<form action="/ecom/account/sign-in" method="post">
<input type="hidden" name="form" value="ZipCode" />
<div class="field id-ZipCode">
<input data-val="true" data-val-required="Zip Code is required." data-val-sdcexactlength="Zip Code must be 5 characters long." data-val-sdcexactlength-max="5" data-val-sdcexactlength-min="5" data-val-sdcnumeric="Zip Code must contain numeric characters only." data-val-sdcnumeric-pattern="^[0-9]*$" id="Register_ZipCode" maxlength="5" name="Register.ZipCode" placeholder="Enter Your Zip Code" type="text" value="" />
</div>
<div class="btn btn-getstarted submit btn-round " onclick="javascript:trackLinkZipGetStarted();">
<input class="submit btn-round " name="Browse" type="submit" value=" Get Started "></input>
</div>
</form>
I removed some <div> tags because they are irrelevant here.
From this form, the information we need is:
URL = 'https://shop.jewelosco.com/ecom/account/sign-in'
method="post" means we have to use requests.post() (or the post() method of a Session)
data = {'form': 'ZipCode', 'Register.ZipCode': '60637', 'Browse': ' Get Started '}
(Note: the form data must include every <input> tag in the form, using its name attribute as the key and its value attribute as the value.)
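Rather than copying the name/value pairs by hand, they can be collected from the HTML. The following is a minimal sketch using only the standard library; FormFieldCollector, extract_form_fields and FORM_HTML are hypothetical names introduced for illustration, and FORM_HTML is a trimmed copy of the form above with the validation attributes left out.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect name/value pairs from every <input> that has a name attribute."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr_map = dict(attrs)
        name = attr_map.get("name")
        if name is not None:
            # A missing value attribute is treated as an empty string.
            self.fields[name] = attr_map.get("value") or ""


def extract_form_fields(html):
    """Return a dict of input names to their default values."""
    parser = FormFieldCollector()
    parser.feed(html)
    return parser.fields


# Trimmed copy of the form shown above (assumption: only the inputs matter).
FORM_HTML = """
<form action="/ecom/account/sign-in" method="post">
<input type="hidden" name="form" value="ZipCode" />
<input id="Register_ZipCode" maxlength="5" name="Register.ZipCode" type="text" value="" />
<input class="submit btn-round " name="Browse" type="submit" value=" Get Started "></input>
</form>
"""

data = extract_form_fields(FORM_HTML)
data["Register.ZipCode"] = "60637"  # fill in the zip code by hand
print(data)
```

This sketch reads every <input> on the page, so on a page with several forms you would have to restrict it to the form you want.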
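Sending the data:
The code to send the zip code:
import requests

data = {'form': 'ZipCode', 'Register.ZipCode': '60637', 'Browse': ' Get Started '}
with requests.Session() as s:
    # The Session keeps the cookies set by this POST for later requests.
    r = s.post('https://shop.jewelosco.com/ecom/account/sign-in', data=data)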
If you check the response history and the current URL, you will see that the request was redirected to https://shop.jewelosco.com/ecom/home, which is the URL we want to get data from.
>>> r.status_code
200
>>> r.url
'https://shop.jewelosco.com/ecom/home'
>>> r.history
[<Response [302]>]
To check that the data was posted successfully, you can use:
>>> 'Top Offers & Shopping Tools' in r.text
True
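The checks above (status code, redirect history, final URL, marker text) can be bundled into one helper. This is only a sketch: zip_post_succeeded is a hypothetical name, and the fake_response object is a stand-in with the same attributes as a requests response so that the example runs without network access.

```python
from types import SimpleNamespace


def zip_post_succeeded(response,
                       marker="Top Offers & Shopping Tools",
                       expected_url="https://shop.jewelosco.com/ecom/home"):
    """Return True if the response looks like the post-redirect home page."""
    was_redirected = len(response.history) > 0
    return (
        response.status_code == 200
        and was_redirected
        and response.url == expected_url
        and marker in response.text
    )


# Stand-in for the real response (assumption: same attribute names as requests).
fake_response = SimpleNamespace(
    status_code=200,
    url="https://shop.jewelosco.com/ecom/home",
    history=["<Response [302]>"],
    text="<h2>Top Offers & Shopping Tools</h2>",
)
print(zip_post_succeeded(fake_response))
```

With a real session you would pass the r returned by s.post() instead of fake_response.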
Searching for items:
Now that the zip code has been posted successfully, you can use this Session object (s) to search for anything you want.
Full code:
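import requests

data = {'form': 'ZipCode', 'Register.ZipCode': '60637', 'Browse': ' Get Started '}
with requests.Session() as s:
    # Post the zip code first so the session carries the store cookies.
    s.post('https://shop.jewelosco.com/ecom/account/sign-in', data=data)
    # Reuse the same session for the search request.
    r = s.get('https://shop.jewelosco.com/ecom/search?source=searchBox&searchTerm=chicken')
    print('Perdue Chicken Ground Fresh - 16 Oz' in r.text)
    # prints True when that product is listed in the search results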
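For search terms other than a single word, the query string should be URL-encoded instead of pasted into the URL. A small sketch with the standard library follows; build_search_url and SEARCH_URL are hypothetical names, and whether the site accepts "+" for spaces in searchTerm is an assumption.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://shop.jewelosco.com/ecom/search"


def build_search_url(term):
    """Build the search URL with the same query parameters as above."""
    query = urlencode({"source": "searchBox", "searchTerm": term})
    return f"{SEARCH_URL}?{query}"


print(build_search_url("chicken"))
# https://shop.jewelosco.com/ecom/search?source=searchBox&searchTerm=chicken
print(build_search_url("chicken breast"))
# https://shop.jewelosco.com/ecom/search?source=searchBox&searchTerm=chicken+breast
```

Alternatively, s.get(SEARCH_URL, params={'source': 'searchBox', 'searchTerm': term}) lets requests do the encoding for you.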