【Question title】: Not receiving response from HTTPS site using requests module
【Posted】: 2016-06-08 09:20:27
【Question】:

I am trying to access

https://www.exploit-db.com/remote

using Python's requests module, but I am not getting a response from the page. I want to visit all the links on that page.

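import bs4
import requests

def myfun():
    # Fetch the listing page; verify=False skips certificate verification
    response = requests.get('https://www.exploit-db.com/remote', verify=False)
    print(response.text)
    soup = bs4.BeautifulSoup(response.text, 'html.parser')
    # Collect every href that starts with /download/
    return [a.attrs.get('href') for a in soup.select('a[href^="/download/"]')]

def main():
    urls = myfun()
    for url in urls:
        response = requests.get(url)
        print(response.text)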

This is the output I get:

C:\Python27\requests\packages\urllib3\connectionpool.py:791: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
  InsecureRequestWarning)

【Comments】:

  • I get a 403 Forbidden response, and the text in the HTML page is Sucuri WebSite Firewall - CloudProxy - Access Denied

Tags: python python-2.7 https python-requests


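【Solution 1】:

The site uses a firewall that looks for "scripted" access. You can get around it simply by setting a User-Agent header; a value of Mozilla/5.0 appears to be enough: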

headers = {'User-Agent': 'Mozilla/5.0'}
response = requests.get('https://www.exploit-db.com/remote', headers=headers, verify=False)
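If the follow-up download requests should carry the same header, one option is to set it once on a requests.Session. The following is only a sketch: the helper name make_session is hypothetical, Mozilla/5.0 is simply the value reported to work above, and the firewall's rules may change. Certificate verification is left at its default here.

```python
import requests


def make_session(user_agent="Mozilla/5.0"):
    """Return a Session whose requests all send the given User-Agent.

    make_session is a hypothetical helper name, not part of requests.
    """
    session = requests.Session()
    # Headers set on the session are merged into every request it sends.
    session.headers.update({"User-Agent": user_agent})
    return session


# Usage (performs network access, so it is left commented out):
# session = make_session()
# response = session.get("https://www.exploit-db.com/remote")
# print(response.status_code)
```

Every call made through the returned session, including the later per-link requests, then sends the header without repeating headers=headers.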

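Note that the resulting page has no URLs that start with download; the links all begin with https://www.exploit-db.com/download. Adjust your ^= prefix match accordingly, or use *=download instead.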
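The difference between the two attribute selectors can be checked offline. This is a small sketch, assuming made-up sample HTML (the link targets and their numeric IDs are invented for illustration, not taken from the real page) and a hypothetical helper select_hrefs:

```python
import bs4

# Invented sample markup: one absolute download link, one relative, one unrelated.
SAMPLE_HTML = """
<a href="https://www.exploit-db.com/download/1">absolute</a>
<a href="/download/2">relative</a>
<a href="https://www.exploit-db.com/remote">other</a>
"""


def select_hrefs(html, selector):
    """Return the href of every <a> element matched by the CSS selector."""
    soup = bs4.BeautifulSoup(html, "html.parser")
    return [a["href"] for a in soup.select(selector)]


# ^= matches only when the attribute value starts with the given string.
relative_only = select_hrefs(SAMPLE_HTML, 'a[href^="/download/"]')
absolute_only = select_hrefs(
    SAMPLE_HTML, 'a[href^="https://www.exploit-db.com/download"]'
)
# *= matches when the given string appears anywhere in the attribute value.
any_download = select_hrefs(SAMPLE_HTML, 'a[href*="download"]')

print(relative_only)
print(absolute_only)
print(any_download)
```

With absolute links like those described above, the original ^=/download/ selector matches nothing, while the full-URL prefix or the *= substring form picks them up.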

