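I found two problems below, but only one solution:
- The proxy server is being rejected.
- You need to authenticate with the server, because it responds with 403 Forbidden.
Using urllib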
from urllib import request as urlrequest
proxy_host = '23.107.176.36:32180'
url = "https://www.kbb.com/gmc/canyon-extended-cab/2018/"
req = urlrequest.Request(url)
# req.set_proxy(proxy_host, 'https')
page = urlrequest.urlopen(req)
print(page)
# >>> urllib.error.HTTPError: HTTP Error 403: Forbidden
Using requests
import requests
url = "https://www.kbb.com/gmc/canyon-extended-cab/2018/"
res = requests.get(url)
print(res)
# >>> <Response [403]>
Using Postman
Edit: solution
Set a longer timeout. I still had to retry a few times, because the proxy sometimes just does not respond.
import urllib.request
proxy_host = '23.107.176.36:32180'
url = "https://www.kbb.com/gmc/canyon-extended-cab/2018/"
proxy_support = urllib.request.ProxyHandler({'https' : proxy_host})
opener = urllib.request.build_opener(proxy_support)
urllib.request.install_opener(opener)
res = urllib.request.urlopen(url, timeout=1000)  # long timeout, in seconds
print(res.read())
Result
b'<!doctype html><html lang="en"><head><meta http-equiv="X-UA-Compatible" content="IE=edge"><meta charset="utf-8"><meta name="viewport" content="width=device-width,initial-scale=1,maximum-scale=5,minimum-scale=1"><meta http-equiv="x-dns-prefetch-control" content="on"><link rel="dns-prefetch preconnect" href="//securepubads.g.doubleclick.net" crossorigin><link rel="dns-prefetch preconnect" href="//c.amazon-adsystem.com" crossorigin><link .........
Using requests
import requests
proxy_host = '23.107.176.36:32180'
url = "https://www.kbb.com/gmc/canyon-extended-cab/2018/"
# NOTE: the proxy needs a longer timeout to respond; verify=False works around an SSL error
r = requests.get(url, proxies={"https": proxy_host}, timeout=90, verify=False)  # requests timeouts are in seconds, not milliseconds
print(r.text)
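Since the proxy sometimes does not respond at all, the manual retrying I mentioned can be wrapped in a small helper. This is only a rough sketch, not part of my original test: `fetch_with_retries`, `attempts`, `delay`, and `retry_on` are names I made up for illustration. It retries on `OSError` by default, which covers socket timeouts, `urllib.error.URLError`, and `requests.exceptions.RequestException`.

```python
import time


def fetch_with_retries(fetch, attempts=5, delay=2.0, retry_on=(OSError,)):
    """Call fetch() until it succeeds or the attempts run out.

    fetch    -- zero-argument callable that performs the request (hypothetical name)
    attempts -- maximum number of tries, must be at least 1
    delay    -- seconds to wait between tries
    retry_on -- exception types that trigger another try
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except retry_on as exc:
            last_error = exc
            print(f"attempt {attempt} failed: {exc!r}")
            if attempt < attempts:
                time.sleep(delay)
    # Every try failed: re-raise the last error so the caller can see it.
    raise last_error


# Example use with the requests call from above (left commented out,
# because it needs the network and a working proxy):
#
# import requests
# r = fetch_with_retries(
#     lambda: requests.get(url, proxies={"https": proxy_host},
#                          timeout=90, verify=False)
# )
# print(r.text)
```

Note that `urllib.error.HTTPError` is also an `OSError`, so with the urllib version a 403 would be retried as well; pass a narrower `retry_on` tuple if you only want to retry timeouts and connection failures.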