Unable to read website using Python requests or Urllib module: answers

【Question title】: Unable to read website using Python requests or Urllib Module
【Posted】: 2020-04-13 11:17:05
【Question description】:

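I am trying to read this particular web page, NSE Option Chain.

No matter what I try, I get no response. If I change the link to point to Google's site instead, the same code works. Can anyone here help with this?

Here is my code: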

import urllib.request

link = "https://nseindia.com/live_market/dynaContent/live_watch/option_chain/optionKeys.jsp"
print("Going to read")
# In Python 3, urlopen lives in urllib.request (urllib.urlopen is Python 2 only)
f = urllib.request.urlopen(link)
print("Read")
myfile = f.read()
print(myfile)

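【Comments on the question】:

I think the server is dropping connections from non-browser sessions. The page opens in Chrome for me, but I cannot reach the site using requests or curl.
You may want to look at selenium, which lets you drive a browser session from Python.
What does "no response" mean? Does your request fail with a timeout, or does the server respond with some error code?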
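
The last comment's question (timeout versus error status) and the first comment's guess (the server dropping non-browser clients) can both be checked with a small probe. What follows is only a sketch under stated assumptions: the `probe` helper, the 10-second default timeout, and the `BROWSER_HEADERS` User-Agent string are illustrative names and values, not something from the thread, and nothing here confirms that the site actually inspects this header.

```python
import socket
import urllib.error
import urllib.request

# Assumption: a browser-like User-Agent string, chosen only for illustration.
BROWSER_HEADERS = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"}


def probe(url, timeout=10.0, headers=None):
    """Fetch url once and return an (outcome, detail) tuple.

    outcome is one of "ok", "http_error", "timeout", "connection_error".
    """
    request = urllib.request.Request(url, headers=headers or {})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read()
            return "ok", f"HTTP {response.status}, {len(body)} bytes"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 403).
        return "http_error", f"HTTP {exc.code} {exc.reason}"
    except urllib.error.URLError as exc:
        # Connection-level failure; a timeout while connecting is wrapped here.
        if isinstance(exc.reason, socket.timeout):
            return "timeout", str(exc.reason)
        return "connection_error", str(exc.reason)
    except socket.timeout as exc:
        # Timeout while waiting for, or reading, the response.
        return "timeout", str(exc)
    except OSError as exc:
        # Anything else at the socket level, such as a reset connection.
        return "connection_error", str(exc)
```

Calling `probe(link)` and then `probe(link, headers=BROWSER_HEADERS)` and comparing the two outcomes would show whether the request hangs until the timeout, is rejected with a status code, or starts working once a browser-like header is sent. Without a `timeout` argument, `urlopen` can block indefinitely, which may be what "no response" looked like in the question.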

Tags: python web-scraping python-requests urllib

【Solution 1】:

Hmm, it works for me both in a browser and with the Python requests library:

import requests

url = "https://nseindia.com/live_market/dynaContent/live_watch/option_chain/optionKeys.jsp"
response = requests.get(url)
print(response.text)

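response.text contains the HTML page with the table, so I can save it to a file and open it in a browser, where it looks similar to what I see at the actual URL (apart from the missing styles):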

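# Write as UTF-8 explicitly so the platform's default encoding cannot break the save
with open("page.html", "w", encoding="utf-8") as f:
    f.write(response.text)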

【Comments】: