Requests won't get the text from web page? (Answers)

【Question title】: Requests won't get the text from web page?
【Posted】: 2018-08-15 07:24:37
【Question】:

I am trying to get the value of the VIX from a web page.

The code I am using:

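import requests
from bs4 import BeautifulSoup
# Fetch the raw HTML and look up the span that should hold the VIX value.
raw_page = requests.get("https://www.nseindia.com/live_market/dynaContent/live_watch/vix_home_page.htm").text
soup = BeautifulSoup(raw_page, "lxml")
vix = soup.find("span", {"id": "vixIdxData"})
print(vix.text)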

This gives me:

''

If I inspect vix, I see:

<span id="vixIdxData" style=" font-size: 1.8em;font-weight: bold;line-height: 20px;"></span>

On the website, the element contains text:

<span id="vixIdxData" style=" font-size: 1.8em;font-weight: bold;line-height: 20px;">15.785</span>

The value 15.785 is what I want to get using requests.

【Comments】:

Tags: python python-3.x python-requests

【Solution 1】:

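When you open the page in a web browser, the text (e.g. 15.785) is inserted into the span element by the getIndiaVixData.js script.

When you fetch the page with requests in Python, only the HTML is retrieved; no JavaScript is executed. The span element therefore stays empty.

It is not possible to get that data by parsing the page's HTML with requests alone.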
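
To illustrate the point, here is a minimal sketch that parses a static HTML string the way any non-browser client would see it. The HTML snippet is a hypothetical stand-in modelled on the span shown in the question (the script path is an assumption), and the standard library's html.parser is used instead of BeautifulSoup only so the example runs without extra packages. Because nothing executes the script tag, the span has no text.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the HTML that requests would receive:
# the span exists, but only the (unexecuted) script would fill it in.
STATIC_HTML = """
<html><body>
<span id="vixIdxData" style="font-size: 1.8em;"></span>
<script src="getIndiaVixData.js"></script>
</body></html>
"""


class SpanText(HTMLParser):
    """Collects the text inside the span with the given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.inside = False
        self.text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "span" and dict(attrs).get("id") == self.target_id:
            self.inside = True

    def handle_endtag(self, tag):
        if tag == "span" and self.inside:
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.text += data


def span_text(html, target_id):
    """Return the text of the span with id target_id in a static HTML string."""
    parser = SpanText(target_id)
    parser.feed(html)
    return parser.text


print(repr(span_text(STATIC_HTML, "vixIdxData")))  # ''
```

The same function returns "15.785" when given the rendered markup from the browser, which is the difference the question ran into.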

【Comments】:

@Sid You may find these useful: stackoverflow.com/questions/43731197/… stackoverflow.com/questions/16375251/…

【Solution 2】:

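The data you are looking for is not in the page source. requests.get(...) only gives you the page source, without the elements that JavaScript adds dynamically. However, you can still get the value with the requests module.

In the Network tab of the browser's developer tools, you can see a file named VixDetails.json. The page sends a request to https://www.nseindia.com/live_market/dynaContent/live_watch/VixDetails.json, which returns the data as JSON.

You can read it with the built-in .json() method of the response object that requests returns.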

import requests

# Query the JSON endpoint seen in the Network tab instead of the HTML page.
r = requests.get('https://www.nseindia.com/live_market/dynaContent/live_watch/VixDetails.json')
data = r.json()
vix_price = data['currentVixSnapShot'][0]['CURRENT_PRICE']
print(vix_price)
# 15.7000
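
The lookup above raises an exception if the payload ever lacks one of those keys, so a slightly more defensive variant may be useful. The sketch below works on a hypothetical sample body shaped like the access path in the answer; treating CURRENT_PRICE as a string is an assumption based on the trailing zeros in the printed output, and extract_vix is an illustrative helper name, not part of requests.

```python
import json

# Hypothetical sample body, shaped like the access path used in the answer.
# Assumption: the price arrives as a string such as "15.7000".
SAMPLE_BODY = '{"currentVixSnapShot": [{"CURRENT_PRICE": "15.7000"}]}'


def extract_vix(payload):
    """Return the current VIX price as a float, or None if it cannot be read."""
    try:
        raw = payload["currentVixSnapShot"][0]["CURRENT_PRICE"]
    except (KeyError, IndexError, TypeError):
        # The payload does not have the expected shape.
        return None
    try:
        return float(raw)
    except (TypeError, ValueError):
        # The field is present but is not numeric.
        return None


payload = json.loads(SAMPLE_BODY)  # stands in for r.json()
print(extract_vix(payload))        # 15.7
print(extract_vix({}))             # None
```

In real use, the dictionary returned by r.json() would be passed to extract_vix in place of the sample payload.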

【Comments】: