【Question title】: HTTP 403 error when trying to download a file from URL using Python
【Posted】: 2021-04-01 11:31:46
【Question】:

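I am trying to download a file from a URL -> https://www.microsoft.com/en-us/download/confirmation.aspx?id=56519

I can download the file manually by visiting the URL in a browser; the file is saved automatically to the Downloads folder on my local machine. (The file is in JSON format.)

However, I need to do this from a Python script. I tried urllib.request and wget, but in both cases I keep getting this error -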

    urllib.request.urlretrieve(url, path)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 247, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
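
To see what the server reports without the full traceback, the error can be caught and summarized. This is a minimal sketch, not part of the original question; `describe_http_error` and `try_download` are hypothetical helper names, and only the documented `code` and `reason` attributes of `urllib.error.HTTPError` are used.

```python
import urllib.error
import urllib.request


def describe_http_error(err: urllib.error.HTTPError) -> str:
    """Summarize an HTTPError as 'HTTP <status>: <reason>'."""
    return f"HTTP {err.code}: {err.reason}"


def try_download(url: str, path: str) -> bool:
    """Attempt the download (network call); print a short summary on HTTP errors."""
    try:
        urllib.request.urlretrieve(url, path)
        return True
    except urllib.error.HTTPError as err:
        print(describe_http_error(err))
        return False
```

Calling `try_download(url, path)` with the URL from the question would be expected to print the 403 summary instead of the traceback above.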

【Comments】:

    Tags: python-3.x wget


    【Solution 1】:

    Python 3

    import json
    import urllib.request
    # Direct link to the JSON file; the file name carries a date stamp.
    url = "https://download.microsoft.com/download/7/1/D/71D86715-5596-4529-9B13-DA13A5DE5B63/ServiceTags_Public_20210329.json"
    with urllib.request.urlopen(url) as response:
        data = json.loads(response.read().decode())
        print(data)
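
If a direct request is still rejected with 403, one thing worth trying is sending an explicit User-Agent header, since some servers refuse the default one that urllib sends. Whether that is the cause for this particular site is an assumption. The sketch below uses hypothetical names (`USER_AGENT`, `build_request`, `fetch_json`) and an illustrative header value.

```python
import json
import urllib.request

# Assumption: an illustrative browser-like value; the site does not document a required one.
USER_AGENT = "Mozilla/5.0 (compatible; example-script)"


def build_request(url: str) -> urllib.request.Request:
    """Create a GET request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch_json(url: str):
    """Download `url` and parse the body as JSON (performs a network call)."""
    with urllib.request.urlopen(build_request(url), timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

`fetch_json(url)` can then stand in for the `urlopen` call in the snippet above.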
    

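    【Discussion】:

    • Hey! The file name, "ServiceTags_Public_20210329.json", is dynamic. It changes every week, with the date appended to the end of the file name as shown above. Is there a workaround to handle the dynamic change? Thanks!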
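
To illustrate the naming scheme described in this comment, here is a small sketch that reads the date stamp back out of such a file name. It assumes the stamp is always eight digits in YYYYMMDD order directly before `.json`, as in the one example given; `date_from_filename` is a hypothetical helper.

```python
import re
from datetime import date, datetime


def date_from_filename(name: str) -> date:
    """Extract the trailing YYYYMMDD stamp from a name like ServiceTags_Public_20210329.json."""
    match = re.search(r"_(\d{8})\.json$", name)
    if match is None:
        raise ValueError(f"no date stamp found in {name!r}")
    return datetime.strptime(match.group(1), "%Y%m%d").date()
```

This can be handy for logging which weekly release a script actually fetched.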
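    【Solution 2】:

    Is there a workaround to handle the dynamic change?

    You can try the following script, which extracts the download URL from the confirmation page and then downloads the JSON file: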

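    import re
    import urllib.request

    import requests

    # Fetch the confirmation page; its HTML contains the current download link.
    rq = requests.get("https://www.microsoft.com/en-us/download/confirmation.aspx?id=56519")
    rq.raise_for_status()

    # Non-greedy match up to the first ".json"; the raw string keeps "\." a regex escape.
    t = re.search(r"https://download\.microsoft\.com/download/.*?\.json", rq.text)
    if t is None:
        raise SystemExit("No .json download link found in the page")

    a = t.group()
    print(a)

    # $(Build.sourcesdirectory) appears to be a build-pipeline variable kept from the
    # original answer; substitute a real directory when running the script directly.
    path = r"$(Build.sourcesdirectory)\agent.json"
    urllib.request.urlretrieve(a, path)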
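
The link-extraction step of this script can also be pulled out into a function so it can be checked against saved page text without any network access. This is a sketch under the same assumption the script makes, namely that the page contains a `https://download.microsoft.com/download/...json` link; `DOWNLOAD_URL_RE` and `find_download_url` are hypothetical names.

```python
import re

# Same pattern as in the script above, compiled once as a raw string.
DOWNLOAD_URL_RE = re.compile(r"https://download\.microsoft\.com/download/.*?\.json")


def find_download_url(html: str) -> str:
    """Return the first download.microsoft.com .json link found in the page text."""
    match = DOWNLOAD_URL_RE.search(html)
    if match is None:
        raise ValueError("no .json download link found in the page")
    return match.group()
```

With the script above, `find_download_url(rq.text)` would replace the `re.search(...)` and `t.group()` lines, and a missing link raises a clear error instead of failing on `None`.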
    

