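This happens because no user-agent is specified. The default user-agent sent by the requests library is python-requests; Google recognizes it and blocks the request, so you receive completely different HTML with different selectors. Learn more about request headers.
Pass a user-agent in the request headers: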
import requests

# Present the request as coming from a regular browser.
headers = {
    "User-Agent":
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
}
requests.get("YOUR_URL", headers=headers)
You can use an f-string instead of .format():
term = "minmax"
res = requests.get(f'https://www.google.com/search?q={term}')
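Note that an f-string inserts the term verbatim, so spaces and special characters in it are not URL-encoded by the string formatting itself. A minimal sketch of one way to encode the query explicitly, assuming the standard library's urllib.parse is acceptable here; build_search_url is a hypothetical helper name and not part of requests:

```python
from urllib.parse import urlencode


def build_search_url(term: str) -> str:
    """Hypothetical helper: return a Google search URL with the query URL-encoded."""
    # urlencode escapes spaces as "+" and reserved characters as %XX.
    return "https://www.google.com/search?" + urlencode({"q": term})


print(build_search_url("fus ro dah"))
# https://www.google.com/search?q=fus+ro+dah
```

Passing the query through the params argument of requests.get, as the full example does, should have the same effect without building the URL by hand.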
Code and full example in the online IDE:
from bs4 import BeautifulSoup
import requests
import lxml  # not called directly; imported so a missing parser fails early

headers = {
    "User-Agent":
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
}
# requests URL-encodes the query string built from params.
params = {
    "q": "fus ro dah"
}
html = requests.get('https://www.google.com/search', headers=headers, params=params)
soup = BeautifulSoup(html.text, 'lxml')

# Each organic result sits in a .tF2Cxc container.
for result in soup.select('.tF2Cxc'):
    title = result.select_one('.DKV0Md').text
    link = result.select_one('.yuRUbf a')['href']
    print(f"{title}\n{link}\n")
------
'''
Unrelenting Force (Skyrim) | Elder Scrolls | Fandom
https://elderscrolls.fandom.com/wiki/Unrelenting_Force_(Skyrim)
Fus Ro Dah | Know Your Meme
https://knowyourmeme.com/memes/fus-ro-dah
Skyrim:Unrelenting Force - The Unofficial Elder Scrolls Pages
https://en.uesp.net/wiki/Skyrim:Unrelenting_Force
Fus ro dah - Urban Dictionary
https://www.urbandictionary.com/define.php?term=Fus%20ro%20dah
Fus Ro Dah GIFs | Tenor
https://tenor.com/search/fus-ro-dah-gifs
Fus Ro Dah | Etsy
https://www.etsy.com/market/fus_ro_dah
Super Fus Ro Dah - Skyrim Special Edition - Nexus Mods
https://www.nexusmods.com/skyrimspecialedition/mods/4889/
'''
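Alternatively, you can do the same thing with the Google Results API from SerpApi. It's a paid API with a free plan.
The difference in your case is that you don't have to figure out why CSS selectors don't work or why you received completely different HTML, because extracting the data and bypassing blocks from the search engine is already done for the end user. All that really needs to be done is to iterate over the structured JSON and get the data you need.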
import os
from serpapi import GoogleSearch

params = {
    "engine": "google",
    "q": "fus ro dah",
    "hl": "en",
    "api_key": os.getenv("API_KEY"),  # API key read from the environment
}
search = GoogleSearch(params)
results = search.get_dict()

# Print the title and link of each organic result.
for result in results["organic_results"]:
    print(f"{result['title']}\n{result['link']}\n")
--------
'''
Unrelenting Force (Skyrim) | Elder Scrolls | Fandom
https://elderscrolls.fandom.com/wiki/Unrelenting_Force_(Skyrim)
Fus Ro Dah | Know Your Meme
https://knowyourmeme.com/memes/fus-ro-dah
Skyrim:Unrelenting Force - The Unofficial Elder Scrolls Pages
https://en.uesp.net/wiki/Skyrim:Unrelenting_Force
Fus ro dah - Urban Dictionary
https://www.urbandictionary.com/define.php?term=Fus%20ro%20dah
Fus Ro Dah GIFs | Tenor
https://tenor.com/search/fus-ro-dah-gifs
Fus Ro Dah | Etsy
https://www.etsy.com/market/fus_ro_dah
Super Fus Ro Dah - Skyrim Special Edition - Nexus Mods
https://www.nexusmods.com/skyrimspecialedition/mods/4889/
'''
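Indexing results["organic_results"] directly raises KeyError if that key is missing. Whether and when the API omits the key is an assumption here, not something stated above. A minimal sketch of a more defensive loop, using a hand-written sample dict in place of a real API response; extract_links is a hypothetical helper name:

```python
from typing import List, Tuple


def extract_links(results: dict) -> List[Tuple[str, str]]:
    """Hypothetical helper: collect (title, link) pairs, skipping incomplete entries."""
    pairs = []
    # Fall back to an empty list if "organic_results" is absent.
    for result in results.get("organic_results", []):
        title = result.get("title")
        link = result.get("link")
        if title and link:
            pairs.append((title, link))
    return pairs


# Hand-written sample shaped like the output shown above (not a real response).
sample = {
    "organic_results": [
        {"title": "Fus Ro Dah | Know Your Meme",
         "link": "https://knowyourmeme.com/memes/fus-ro-dah"},
        {"title": "Entry without a link"},
    ]
}

for title, link in extract_links(sample):
    print(f"{title}\n{link}\n")
```

With the real client, the same helper could be called as extract_links(search.get_dict()).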
Disclaimer: I work for SerpApi.