How to extract the results of a webpage script using BeautifulSoup/Python: answer

【Question title】: How to extract results of webpage script using BeautifulSoup/Python
【Posted】: 2017-12-26 14:41:25
【Question description】:

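I am trying to write a Python program that uses BeautifulSoup to scrape the product name for a given product number. I am using this page as an example: http://www.adv-bio.com/search-results/?q=1081. Ideally, I would extract the string "DAIRY COMPLEX NATURAL" from that page, along with the URL of that link.

I have only just started using BeautifulSoup. So far, the closest I have come to the tag I am looking for is the script returned by soup.find('p'), and I don't know how to parse the result.

Any help would be greatly appreciated.

Edit: here is the code of the script that I think contains the information I want: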

<p><script>// <![CDATA[
    (function () {
        var frameBaseSRC = document.getElementById("results").src;
        var frameQueryString = document.location.href.split("?q=")[1];
        if (frameQueryString != undefined) {
            document.getElementById("results").src = frameBaseSRC + "?q=" + frameQueryString;
        }
    })();

// ]]>

My code so far is just:

from bs4 import BeautifulSoup
import requests
page = requests.get("http://www.adv-bio.com/search-results/?q=1081")
soup = BeautifulSoup(page.text, 'lxml')
soup.find('p')

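This just gives me the script text shown above.

Sorry if I'm unclear. I have spent hours reading, but all the links are purple and I feel like I'm missing something simple.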

【Comments】:

Please add the relevant code to your question.
@t.m.adam Is this clearer?

Tags: python-3.x web-scraping beautifulsoup

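【Solution 1】:

If you inspect the network traffic in your browser (Inspect > Network), you will notice that the search results are served by a request to http://prod.adv-bio.com/SearchResults.aspx?q=1081, so you can use that URL instead.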
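
That address is consistent with the script quoted in the question, which takes the src of the element with id "results" and appends "?q=" plus whatever follows "?q=" in the page URL. A rough Python sketch of the same step, assuming that element is an iframe whose src is the bare SearchResults.aspx address (the function name build_frame_url and the sample HTML are made up for illustration, not taken from the site):

```python
from bs4 import BeautifulSoup


def build_frame_url(page_html, page_url):
    """Mimic the page script: frame src + "?q=" + the query taken from page_url."""
    soup = BeautifulSoup(page_html, "html.parser")
    frame = soup.find(id="results")
    if frame is None or not frame.get("src"):
        return None
    parts = page_url.split("?q=")
    # Like the script, leave the src untouched when the URL has no "?q=" part.
    if len(parts) < 2:
        return frame["src"]
    return frame["src"] + "?q=" + parts[1]


# Hypothetical stand-in for the real page markup (an assumption, not copied from the site).
sample_html = '<iframe id="results" src="http://prod.adv-bio.com/SearchResults.aspx"></iframe>'
frame_url = build_frame_url(sample_html, "http://www.adv-bio.com/search-results/?q=1081")
print(frame_url)
```

Because requests does not run JavaScript, the frame is never filled in when you fetch the outer page, which is why requesting the frame's URL directly is the simpler route.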

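import requests
from bs4 import BeautifulSoup

# Request the URL that actually serves the search results
url = "http://prod.adv-bio.com/SearchResults.aspx?q=1081"
r = requests.get(url)
soup = BeautifulSoup(r.content, 'html.parser')
# Look up the product link by its id
a = soup.find('a', {'id':'SearchGridView_ctl02_hlProdDetails'})
text, link = a.text, a.get('href')  # product name and its URL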
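
If the search returns nothing, soup.find gives None and the last line raises an AttributeError. A slightly more defensive sketch, assuming the product link keeps the id SearchGridView_ctl02_hlProdDetails and that its href may be relative (the helper name parse_first_result and the sample HTML, including its href, are invented for illustration):

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

RESULTS_URL = "http://prod.adv-bio.com/SearchResults.aspx"


def parse_first_result(html, base_url=RESULTS_URL):
    """Return (name, absolute_link) for the product link, or None if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    a = soup.find("a", {"id": "SearchGridView_ctl02_hlProdDetails"})
    if a is None:
        return None
    # Resolve a relative href against the address of the results page.
    return a.get_text(strip=True), urljoin(base_url, a.get("href", ""))


# Made-up sample markup; the real page's href will differ.
sample_html = (
    '<a id="SearchGridView_ctl02_hlProdDetails" href="ProductDetail.aspx?ProdNo=1081">'
    " DAIRY COMPLEX NATURAL </a>"
)
result = parse_first_result(sample_html)
print(result)
```

With a live page, the HTML would come from something like requests.get(RESULTS_URL, params={"q": "1081"}).text, which lets requests build the query string for any product number.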

【Discussion】: