Extract Google Search Result Redirects: Answers

【Question Title】: Extract Google Search Result Redirects
【Posted】: 2011-02-06 07:02:58
【Question Description】:

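I have read several Stack Overflow posts on how to extract the URLs of Google search results, and I have written a similar implementation using Python, curl, and BeautifulSoup.

My question is: how can I extract the Google redirect links (for example, the link you get when you right-click a result and choose "Copy Link Location")?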

【Question Comments】:

Tags: python curl web-scraping beautifulsoup screen-scraping

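【Solution 1】:

Google makes its results page interactive with Ajax, so BeautifulSoup cannot extract the links directly. I suggest first reading the page into a string so that it contains all of the HTML; you can then use BeautifulSoup to extract the links.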
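The suggestion above can be sketched roughly as follows. This is a minimal illustration, not a confirmed recipe: it assumes the redirect links appear in the HTML as anchors whose path is `/url` (for example `/url?q=<target>&sa=...`), and both `SAMPLE_HTML` and the helper name `extract_redirect_links` are hypothetical. Real Google markup differs and changes over time.

```python
from urllib.parse import urljoin, urlparse

from bs4 import BeautifulSoup

# Hypothetical stand-in for a results page that was read into a string.
SAMPLE_HTML = """
<div><a href="/url?q=https://example.com/page&amp;sa=U&amp;ved=abc">Example</a></div>
<div><a href="/search?q=more">More results</a></div>
<div><a href="https://www.google.com/url?q=https://example.org/&amp;sa=U">Other</a></div>
"""


def extract_redirect_links(html, base="https://www.google.com"):
    """Return absolute URLs of anchors whose path is '/url' (assumed redirect form)."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.find_all("a", href=True):
        # Resolve relative hrefs such as '/url?q=...' against the base URL.
        absolute = urljoin(base, anchor["href"])
        if urlparse(absolute).path == "/url":
            links.append(absolute)
    return links


for link in extract_redirect_links(SAMPLE_HTML):
    print(link)
```

Filtering on the parsed path rather than on a CSS class keeps the sketch independent of Google's class names, at the cost of relying on the assumed `/url` convention.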

【Comments】:

【Solution 2】:

This code is taken from another answer of mine; it extracts subdomain links.

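import requests
from bs4 import BeautifulSoup  # the 'lxml' parser used below must be installed

# Send a desktop browser User-Agent with the request.
headers = {
    "User-Agent":
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3538.102 Safari/537.36 Edge/18.19582"
}

params = {'q': 'site:minecraft.fandom.com'}

# Pass the query only through params; a hard-coded '?q=' in the URL would duplicate it.
html = requests.get('https://www.google.com/search',
                    headers=headers,
                    params=params).text
soup = BeautifulSoup(html, 'lxml')

# 'tF2Cxc' is the result container class this answer relies on; Google's class names can change.
for container in soup.find_all('div', class_='tF2Cxc'):
    link = container.find('a')['href']
    print(link)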

Output:

https://minecraft.fandom.com/wiki/Podzol
https://minecraft.fandom.com/wiki/Pumpkin
https://minecraft.fandom.com/wiki/Swimming
https://minecraft.fandom.com/wiki/Polished_Blackstone
https://minecraft.fandom.com/wiki/Nether_Quartz_Ore
https://minecraft.fandom.com/wiki/Blacksmith
https://minecraft.fandom.com/wiki/Grindstone
https://minecraft.fandom.com/wiki/Spider
https://minecraft.fandom.com/wiki/Crash
https://minecraft.fandom.com/wiki/Tuff
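The links printed above are direct targets. If an `href` instead comes back in Google's redirect form, the target can be recovered with the standard library. The sketch below assumes the redirect looks like `/url?q=<target>&...` (with `url` as an assumed alternative parameter name); `unwrap_redirect` is a hypothetical helper, not part of any library.

```python
from urllib.parse import parse_qs, urlparse


def unwrap_redirect(href):
    """Return the target of a '/url?q=...' style redirect, or href unchanged."""
    parts = urlparse(href)
    if parts.path != "/url":
        # Not in the assumed redirect form; treat it as a direct link.
        return href
    query = parse_qs(parts.query)  # also decodes percent-encoded values
    # 'q' and 'url' are assumed parameter names for the target.
    for key in ("q", "url"):
        if key in query:
            return query[key][0]
    return href


print(unwrap_redirect("/url?q=https://minecraft.fandom.com/wiki/Podzol&sa=U&ved=abc"))
print(unwrap_redirect("https://minecraft.fandom.com/wiki/Tuff"))
```

Going the other way, keeping the redirect link as the question asks, only requires skipping this step and storing the `href` as found.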

An alternative solution is to use the Google Search Engine Results API from SerpApi. It is a paid API with a free trial of 5,000 searches.

Code to integrate:

import os

from serpapi import GoogleSearch

params = {
    "engine": "google",
    "q": "site:minecraft.fandom.com",
    "api_key": os.getenv('API_KEY')  # read the key from the API_KEY environment variable
}

search = GoogleSearch(params)
results = search.get_dict()

# Each organic result carries its URL under the 'link' key.
for result in results['organic_results']:
    link = result['link']
    print(link)

Output:

https://minecraft.fandom.com/wiki/Podzol
https://minecraft.fandom.com/wiki/Pumpkin
https://minecraft.fandom.com/wiki/Swimming
https://minecraft.fandom.com/wiki/Polished_Blackstone
https://minecraft.fandom.com/wiki/Nether_Quartz_Ore
https://minecraft.fandom.com/wiki/Blacksmith
https://minecraft.fandom.com/wiki/Grindstone
https://minecraft.fandom.com/wiki/Spider
https://minecraft.fandom.com/wiki/Crash
https://minecraft.fandom.com/wiki/Tuff

Disclaimer: I work for SerpApi.

【Comments】: