Extracting Results from "Result-Page" With Selenium in Python

【Question title】: Extracting Results from "Result-Page" With Selenium in Python
【Posted】: 2020-10-22 04:37:01
【Question】:

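Unfortunately, I am stuck with the implementation of my Python program and cannot get any further. The program should do the following:

Automatically search for a specific keyword on the search engine "www.startpage.com".
Then read the page that contains the results (this is where the problem is).
Finally, count how often a certain word occurs on the search results page.

The problem is that I cannot get the source code of the search results page; I only get the source code of the start page. Does anyone know a solution?

Thanks in advance.

So far, my program looks like this: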

import selenium.webdriver as webdriver

def get_results(search_term):

    #this is the site, where I want to do the search
    url="https://www.startpage.com"
    browser = webdriver.Firefox()
    browser.get(url)

    search_box = browser.find_element_by_id("q")
    #search in the search box after the search term
    search_box.send_keys(search_term)
    search_box.submit()

    #print(browser.page_source) would give the source of the start page (not the results page)

    sub="dog"
    print(source_code.count("dog"))
    #counts zero times because it searches for "dog" on the start page

get_results("dog")

【Comments】:

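You need to understand how REST web pages work. Submitting the search term loads a new page, but your code never accounts for that. (Also, source_code is not a defined variable. Please edit to post the code you are actually running, or just delete the question.)
As other contributors have mentioned, the page needs some time to load, so simply wait before you read browser.page_source. You can do that with time.sleep(5).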
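
The two comments above can be combined into a minimal sketch. It is not the asker's final code. Instead of a fixed time.sleep(5), it uses Selenium's WebDriverWait with the url_changes condition, which rests on the assumption that Startpage serves its results under a different URL than the start page. The element id "q" is taken from the question and may no longer match the live site. find_element(By.ID, ...) replaces find_element_by_id, which recent Selenium 4 releases no longer provide. The helper name count_term is hypothetical.

```python
def count_term(page_source, term):
    """Count case-insensitive occurrences of term in the page source."""
    return page_source.lower().count(term.lower())


def get_results(search_term, timeout=10):
    # Selenium is imported inside the function so that count_term above
    # stays usable without Selenium or a browser installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    browser = webdriver.Firefox()
    try:
        browser.get("https://www.startpage.com")
        # Remember the start page URL as the browser reports it.
        start_url = browser.current_url

        # "q" is the search box id used in the question; it may have changed.
        search_box = browser.find_element(By.ID, "q")
        search_box.send_keys(search_term)
        search_box.submit()

        # Assumption: the results page has a different URL than the start page.
        # Raises TimeoutException if the URL has not changed after `timeout` seconds.
        WebDriverWait(browser, timeout).until(EC.url_changes(start_url))

        # Read the source only after the wait, and store it in a variable.
        source_code = browser.page_source
    finally:
        # Close the browser even if something above failed.
        browser.quit()

    return count_term(source_code, search_term)
```

Calling print(get_results("dog")) would then print the count for the results page. The count covers the raw HTML, so matches inside tags, attributes, and scripts are included as well as visible text.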

Tags: python selenium webdriver search-engine search-engine-bots

【Solution 1】:

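You can do it like this: create a loop in which you add an element to a list (a number or a letter, for example) each time the term is found.

To do this, you have to save the source code in a variable and then search it for the term. Each time the term is found, you add an element to the list with .append(), and at the end you check the length of the list with len(list).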
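
The loop described in this answer might look like the following sketch. The function name count_with_list is hypothetical, and the source is assumed to be a string already read from browser.page_source. The list stores the position of each match rather than an arbitrary number, which is one possible reading of the answer.

```python
def count_with_list(source_code, term):
    """Count non-overlapping occurrences of term by collecting match positions."""
    if not term:
        # An empty search term would match at every position and never advance.
        return 0

    hits = []
    position = source_code.find(term)
    while position != -1:
        # Record this match, then continue searching after it.
        hits.append(position)
        position = source_code.find(term, position + len(term))

    return len(hits)
```

For a non-empty term this returns the same number as source_code.count(term), which the question already uses, so the explicit loop is mainly useful when the match positions themselves are needed.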

【Comments】: