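【Posted】: 2021-12-10 04:15:24
【Problem description】:
I am scraping weather data from Google search results. Ultimately I want to scrape the data from the SVG graphs, and that is where I am running into all of my problems.
My code: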
from bs4 import BeautifulSoup as bs
import requests

def get_weather_data(region):
    # const values
    USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36"
    LANGUAGE = "en-US,en;q=0.5"  # US english
    URL = f"https://www.google.com/search?lr=lang_en&q=weather+in+{region.strip().lower().replace(' ', '+')}"
    # Send request and store response
    s = requests.Session()
    s.headers['User-Agent'] = USER_AGENT
    s.headers['Accept-Language'] = LANGUAGE
    s.headers['Content-Language'] = LANGUAGE
    html = s.get(URL)
    soup = bs(html.text, "html.parser")
    # Look up the <svg> element that holds the hourly graph
    hourly = soup.find("svg", attrs={'id':'wob_gsvg'})
    hourly2 = soup.find("svg", attrs={'id':'wob_gsvg'}).children
    print(hourly, hourly2)

get_weather_data("London")
Output:
<svg class="wob_gsvg" data-ved="2ahUKEwiToY6r0eLzAhWOpZUCHdMQC0kQnaQEegQIGRAG" id="wob_gsvg" style="height:80px"></svg> <list_iterator object at 0x00000275054D9E20>
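A note on reading this output: the second value is printed as an iterator object because `.children` is lazy, so it does not show what the tag contains. The sketch below parses a static copy of the printed element (with the `data-ved` attribute left out, which is an assumption made for brevity) and wraps `.children` in `list()` to make the contents visible.

```python
from bs4 import BeautifulSoup

# Static stand-in for the element printed above (data-ved attribute omitted).
html = '<svg class="wob_gsvg" id="wob_gsvg" style="height:80px"></svg>'
soup = BeautifulSoup(html, "html.parser")
svg = soup.find("svg", attrs={"id": "wob_gsvg"})

# .children is an iterator; list() materializes it so it can be inspected.
children = list(svg.children)
print(children)  # an empty list, because the tag has no child nodes
```

For the element shown in the output, the list is empty: the `<svg>` tag is present in the response but has nothing inside it.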
But in the Chrome browser console, I can see:
Main goals
- Scrape weather data from Google search results.
- Scrape the hourly forecast that the results page provides.
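Before working toward these goals, it may help to confirm whether the HTML returned to `requests` contains any tags inside the `<svg>` at all. If it does not, the graph is presumably filled in by JavaScript in the browser, which would be consistent with Chrome showing more than the script prints. The following is a minimal diagnostic sketch using only the standard library; `ChildCounter` and `count_child_tags` are hypothetical names, and both sample strings are invented for illustration rather than taken from a real Google response.

```python
from html.parser import HTMLParser


class ChildCounter(HTMLParser):
    """Count the tags nested inside the element with a given id.

    Assumes nested tags are properly closed (as in SVG markup);
    unclosed void tags such as <br> would throw off the depth tracking.
    """

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0        # greater than 0 while inside the target element
        self.child_tags = 0   # number of descendant tags seen so far

    def handle_starttag(self, tag, attrs):
        if self.depth > 0:
            self.child_tags += 1
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth > 0:
            self.depth -= 1


def count_child_tags(html, target_id):
    """Return how many tags appear inside the element with id=target_id."""
    parser = ChildCounter(target_id)
    parser.feed(html)
    parser.close()
    return parser.child_tags


# Invented samples: an empty graph container and a populated one.
raw = '<div><svg id="wob_gsvg" style="height:80px"></svg></div>'
rendered = '<svg id="wob_gsvg"><path d="M0 0"/><text>5</text></svg>'

empty_count = count_child_tags(raw, "wob_gsvg")
filled_count = count_child_tags(rendered, "wob_gsvg")
print(empty_count, filled_count)
```

Calling `count_child_tags(html.text, "wob_gsvg")` on the real response and getting zero would suggest that the hourly data has to come from somewhere other than the static `<svg>` markup.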
【Comments】:
Tags: python svg beautifulsoup python-requests