【Question title】: Changing the header of urllib2 to get different search results on Google
【Posted】: 2021-04-02 03:54:37
【Question】:

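Can I change the headers of a Google query to get the results served to mobile devices?

I am trying to get different results from Google depending on the headers I send with the urllib2 / requests libraries. I use BeautifulSoup to parse the results.

For example, I use this User-Agent header to simulate desktop results: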

Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.81 Safari/537.36

My mobile User-Agent header looks like this:

Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B137 Safari/601.1

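Now the question is:

Is this possible, or will Google recognize that I am not using a phone to fetch the search results? I do not get different results. This is the code I used to try it: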

import requests
headers_mobile = { 'User-Agent' : 'Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B137 Safari/601.1'}
link = 'https://www.google.com/search?q=testseite&num=22&hl=de'
B_response = requests.get(link, headers=headers_mobile)
for i in B_response:
    print(i)

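【Comments】:

  • The first three paragraphs don't clearly state a question; you need to restate it and move the question to the top. I think you mean "Can I change the headers of a Google query to get the results for mobile devices?" I also suggest searching for "mobile and desktop SEO".
  • Thank you, I moved the question to the first paragraph!
  • Submitting programmatic search queries violates Google's Webmaster Guidelines and terms of service. Running this code against Google may cause Google to show CAPTCHAs for searches coming from your IP address.

Tags: python python-requests http-headers urllib2 google-search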


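【Answer 1】:

Yes, this is possible; see a list of mobile user agents.

However, instead of iterating over the result of requests.get(), you need to pass its text to a BeautifulSoup object, then select a particular element (the container holding the data) and iterate over the matches.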

# container with mobile layout data
for mobile_result in soup.select('.xNRXGe'):
  title = mobile_result.select_one('.q8U8x').text
  link = mobile_result.select_one('a.C8nzq')['href']
  snippet = mobile_result.select_one('.VwiC3b').text

Code and example in the online IDE:

from bs4 import BeautifulSoup
import requests, lxml

headers = {
    'User-agent':
    'Mozilla/5.0 (Linux; Android 7.0; LGMP260) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.93 Mobile Safari/537.36'
}

params = {
  'q': 'how to create minecraft server',
  'gl': 'us',                             # country to search from
  'hl': 'en',                             # language
}

html = requests.get('https://www.google.com/search', headers=headers, params=params)
soup = BeautifulSoup(html.text, 'lxml')

for mobile_result in soup.select('.xNRXGe'):
  title = mobile_result.select_one('.q8U8x').text
  link = mobile_result.select_one('a.C8nzq')['href']
  snippet = mobile_result.select_one('.VwiC3b').text
  print(title, link, snippet, sep='\n')

-----------
'''
How to Setup a Minecraft: Java Edition Server – Home
https://help.minecraft.net/hc/en-us/articles/360058525452-How-to-Setup-a-Minecraft-Java-Edition-Server
How to Setup a Minecraft: Java Edition Server · Go to this website and download the minecraft_server. · After you have downloaded it, make a folder on your ...
Minecraft Server Download
https://www.minecraft.net/en-us/download/server
Download the Minecraft: Java Edition server. Want to set up a multiplayer server? · Download minecraft_server.1.17.1.jar and run it with the following command:.
# other results
'''
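
The question title mentions urllib2; in Python 3 that functionality lives in urllib.request. Below is a minimal sketch, using only the standard library, of how the same query parameters and a mobile or desktop User-Agent could be attached to a request. The helper name build_search_request is hypothetical, the User-Agent strings are based on the ones in the question, and the request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# User-Agent strings based on the ones given in the question
DESKTOP_UA = ('Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 '
              '(KHTML, like Gecko) Chrome/58.0.3029.81 Safari/537.36')
MOBILE_UA = ('Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) '
             'AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 '
             'Mobile/13B137 Safari/601.1')


def build_search_request(query, user_agent, hl='en', gl='us'):
    """Hypothetical helper: build (but do not send) a search request."""
    params = urlencode({'q': query, 'hl': hl, 'gl': gl})
    url = 'https://www.google.com/search?' + params
    return Request(url, headers={'User-Agent': user_agent})


mobile_request = build_search_request('how to create minecraft server', MOBILE_UA)
desktop_request = build_search_request('how to create minecraft server', DESKTOP_UA)

# Request stores header names capitalized, hence the lookup key 'User-agent'
print(mobile_request.full_url)
print(mobile_request.get_header('User-agent'))
```

Passing either object to urllib.request.urlopen would send it; the caveat from the comments about automated queries applies to this variant just as it does to requests.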

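Alternatively, you can use the Google Organic Results API from SerpApi to achieve the same thing. It is a paid API with a free plan. Check out the playground.

The difference in your case is that you don't have to figure things out (well, you do, but it is a much simpler process) or maintain the parser over time; you just iterate over structured JSON.

Code to integrate: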

import os
from serpapi import GoogleSearch

params = {
  "api_key": os.getenv("API_KEY"),
  "engine": "google",
  "q": "how to create minecraft server",
  "hl": "en",
  "gl": "us",
  "device": "mobile"  # mobile results
}

search = GoogleSearch(params)
results = search.get_dict()

for result in results["organic_results"]:
  print(result['title'])
  print(result['link'])
  print(result.get('snippet', ''))  # snippet may be absent for some results

-----------
'''
How to Setup a Minecraft: Java Edition Server – Home
https://help.minecraft.net/hc/en-us/articles/360058525452-How-to-Setup-a-Minecraft-Java-Edition-Server
How to Setup a Minecraft: Java Edition Server · Go to this website and download the minecraft_server. · After you have downloaded it, make a folder on your ...
Minecraft Server Download
https://www.minecraft.net/en-us/download/server
Download the Minecraft: Java Edition server. Want to set up a multiplayer server? · Download minecraft_server.1.17.1.jar and run it with the following command:.
# other results
'''
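
Because the response is structured JSON, a missing field can be handled with dict.get instead of a CSS selector that may stop matching. Below is a minimal sketch, assuming a response shaped like the organic_results list used above. The helper extract_organic is hypothetical, sample_results is made-up data standing in for search.get_dict(), and treating snippet as an optional key is an assumption.

```python
def extract_organic(results):
    """Hypothetical helper: pull title/link/snippet from each organic result."""
    rows = []
    for result in results.get('organic_results', []):
        rows.append({
            'title': result.get('title', ''),
            'link': result.get('link', ''),
            # 'snippet' is assumed to be optional, so default to an empty string
            'snippet': result.get('snippet', ''),
        })
    return rows


# Made-up sample data standing in for search.get_dict()
sample_results = {
    'organic_results': [
        {'title': 'Minecraft Server Download',
         'link': 'https://www.minecraft.net/en-us/download/server',
         'snippet': 'Download the Minecraft: Java Edition server.'},
        {'title': 'Result without a snippet',
         'link': 'https://example.com/'},
    ]
}

for row in extract_organic(sample_results):
    print(row['title'], row['link'], row['snippet'], sep='\n')
```

Returning plain dicts keeps the rest of the script independent of the response layout, so only this one function needs to change if the field names differ.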

Disclaimer: I work for SerpApi.

【Comments】:
