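【Posted】: 2019-05-06 09:10:36
【Question】:
[enter image description here][1] As the title says, Python requests cannot get the website content, but Postman can.
I connected to the website with Postman and got the page content,
but when I run the Postman-generated code below I cannot reproduce that. Instead I get a 500 status code in both Python 2 and 3.
import requests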
url = "https://www.screener.reuters.wallst.com/Stock/US/ResultsTable"
querystring = {"returnChoice":"","page":"2","sortBy":"RCCMultexCompanyName","sortDir":"A","quickscreen":"gaarp","criteria":"%5B%22StockUniverse%22%2C%22PriceEarnings%22%2C%22EPSGrowthRate%22%2C%22Region%22%2C%22SectorAndIndustry%22%2C%22PEGRatio%22%5D","Currency":"USD","PEGRatio":"%7B%22view%22%3A%22button%22%2C%22button_inputs%22%3A%5B%5D%2C%22range_inputs%22%3A%22LSS%7C1%22%7D","SectorAndIndustry":"%7B%22industries%22%3A%5B%2257111%22%2C%2257112%22%2C%2257121%22%2C%2257131%22%2C%2257132%22%2C%2257211%22%2C%2257212%22%5D%7D","Region":"%7B%22countries%22%3A%5B%22TW%22%5D%7D","EPSGrowthRate":"%7B%22view%22%3A%22button%22%2C%22button_inputs%22%3A%5B%5D%2C%22range_inputs%22%3A%22GTR%7C15%22%7D","PriceEarnings":"%7B%22view%22%3A%22button%22%2C%22button_inputs%22%3A%5B%5D%2C%22range_inputs%22%3A%22GEQ%7C0%7CLEQ%7C15%22%7D","StockUniverse":"%7B%22button_inputs%22%3A%5B%22LIKE%7CUnited%2BStates%22%2C%22NOTLIKE%7CUnited%2BStates%22%5D%7D","OriginalCurrency":"USD%0A"}
headers = {
'cache-control': "no-cache",
}
response = requests.request("GET", url, headers=headers, params=querystring)
print(response.text)
I expected a 200 status code but actually got 500. This is strange, because Postman gets the correct result and Python does not, even after I filled in the headers:
headers = {
'Accept-Encoding': "gzip, deflate, br",
'Accept-Language': "zh-TW,zh;q=0.9,en-US;q=0.8,en;q=0.7",
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.90 Safari/537.36',
'Content-Type': "application/x-www-form-urlencoded",
'Accept': "application/json",
'X-Requested-With': "XMLHttpRequest",
'Connection': "keep-alive",
'cache-control': "no-cache",
}
The URL below returns content if you open it in a browser or Postman, but not if you request it with Python requests or urllib.
https://www.screener.reuters.wallst.com/Stock/US/ResultsTable?returnChoice=&page=2&sortBy=RCCMultexCompanyName&sortDir=A&quickscreen=gaarp&criteria=%5B%22StockUniverse%22%2C%22PriceEarnings%22%2C%22EPSGrowthRate%22%2C%22Region%22%2C%22SectorAndIndustry%22%2C%22PEGRatio%22%5D&Currency=USD&PEGRatio=%7B%22view%22%3A%22button%22%2C%22button_inputs%22%3A%5B%5D%2C%22range_inputs%22%3A%22LSS%7C1%22%7D&SectorAndIndustry=%7B%22industries%22%3A%5B%2257111%22%2C%2257112%22%2C%2257121%22%2C%2257131%22%2C%2257132%22%2C%2257211%22%2C%2257212%22%5D%7D&Region=%7B%22countries%22%3A%5B%22TW%22%5D%7D&EPSGrowthRate=%7B%22view%22%3A%22button%22%2C%22button_inputs%22%3A%5B%5D%2C%22range_inputs%22%3A%22GTR%7C15%22%7D&PriceEarnings=%7B%22view%22%3A%22button%22%2C%22button_inputs%22%3A%5B%5D%2C%22range_inputs%22%3A%22GEQ%7C0%7CLEQ%7C15%22%7D&StockUniverse=%7B%22button_inputs%22%3A%5B%22LIKE%7CUnited%2BStates%22%2C%22NOTLIKE%7CUnited%2BStates%22%5D%7D&OriginalCurrency=USD
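One thing that may be worth ruling out: the values in the `querystring` dict above are already percent-encoded, and requests encodes `params` again, so `%7B` would be sent as `%257B`. The dict also has `USD%0A` (an encoded newline) for `OriginalCurrency`, while the working URL ends in `OriginalCurrency=USD`. The sketch below uses only the standard library and two of the parameters to show the difference. Whether this is the actual cause of the 500 is an assumption, not something confirmed in this thread.

```python
from urllib.parse import unquote, urlencode

# Two values copied from the querystring dict above (already percent-encoded).
encoded = {
    "Region": "%7B%22countries%22%3A%5B%22TW%22%5D%7D",
    "Currency": "USD",
}

# Encoding already-encoded values escapes every "%" as "%25".
double_encoded = urlencode(encoded)

# Decoding first yields the raw values; encoding them once
# reproduces the form that appears in the working URL.
decoded = {key: unquote(value) for key, value in encoded.items()}
single_encoded = urlencode(decoded)

print(double_encoded)
print(decoded)
print(single_encoded)
```

If this turns out to be relevant, passing the decoded dict as `params`, or passing the full working URL with no `params` at all, would avoid the second round of encoding.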
Below is the Postman screenshot:
[1]: https://i.stack.imgur.com/DO8ev.png
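【Comments】:
-
There seems to be a problem with the website itself.
-
I'm not sure whether that URL is an internal one, but I get a site error when I visit the URL you specified. Where are you running the Python script from? Also try running nslookup for that URL from the machine where you run the script.
-
I tried running the script with python2 in PyCharm and online (repl.it/languages/python3). The odd thing is that I can get the content with Postman. What is the difference between Postman and PyCharm without headers?
-
I think the cause is cookies. I can't reproduce it on another computer with Postman. The question is why Postman cannot get the correct result; can Postman save cookies without headers and without an environment?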
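To test the cookie idea from Python, one option is a client that keeps cookies between requests, the role `requests.Session` plays in requests. Below is a minimal standard-library sketch. The helper names and the `first_url` argument are hypothetical, and which page (if any) sets the needed cookies is an assumption that this thread does not confirm.

```python
import http.cookiejar
import urllib.request


def make_cookie_opener():
    """Return (opener, jar); the opener stores cookies and resends them."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # Assumption: a browser-like User-Agent plus the site's own cookies is enough.
    opener.addheaders = [("User-Agent", "Mozilla/5.0")]
    return opener, jar


def fetch_with_cookies(first_url, target_url):
    """Visit first_url to collect cookies, then request target_url with them."""
    opener, jar = make_cookie_opener()
    with opener.open(first_url) as warmup:
        warmup.read()  # any Set-Cookie headers are now stored in jar
    # Note: urllib raises HTTPError for 4xx/5xx responses instead of returning them.
    with opener.open(target_url) as response:
        return response.status, response.read(), [cookie.name for cookie in jar]
```

Calling `fetch_with_cookies` with the site's landing page as `first_url` and the ResultsTable URL as `target_url` would show whether cookies set on the first request change the result of the second.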
Tags: python python-requests postman