【Posted】: 2018-06-10 01:32:05
【Question】:
This is what I have so far:
import csv, re
from bs4 import BeautifulSoup as soup
import requests
flag = False
with open('filename.csv', 'w') as f:
    write = csv.writer(f)
    for i in range(38050, 38050): ##this is so I can test run with one page
        s = soup(requests.get('https://howlongtobeat.com/game.php?id={i}').text, 'html.parser')
        if not flag: #write header to file once
            write.writerow(['Name', 'Length']+[re.sub('[:\n]+', '', i.find('strong').text) for i in s.find_all('div', {'class':'profile_info'})])
            flag = True
        ## this is for if there is no page or an error
        content = s.find('div', {"class":'profile_header shadow_text'})
        if content:
            name = s.find('div', {"class":'profile_header shadow_text'}).text
            length = [[i.find('h5').text, i.find("div").text] for i in s.find_all('li', {'class':'time_100'})]
            stats = [re.sub(r'\n+[\w\s]+:\n+', '', i.text) for i in s.find_all('div', {'class':'profile_info'})]
This does not write anything to the CSV, and I don't know why (I'm just a beginner).
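Two things in the posted code would explain an empty file on their own, and the small check below illustrates them. `range(38050, 38050)` has equal start and stop values, so the loop body never runs, and the URL string has no `f` prefix, so `{i}` is sent literally instead of the id. Separately, the posted code only calls `writerow` for the header and never for `name`, `length`, or `stats`. The variable names here (`empty_ids`, `one_id`, `plain_url`, `formatted_url`) are made up for illustration.

```python
# range() excludes its stop value, so equal start and stop give no iterations.
empty_ids = list(range(38050, 38050))
one_id = list(range(38050, 38051))
print(empty_ids)  # []
print(one_id)     # [38050]

# Without the f prefix, the braces are ordinary characters in the string.
i = 38050
plain_url = 'https://howlongtobeat.com/game.php?id={i}'
formatted_url = f'https://howlongtobeat.com/game.php?id={i}'
print(plain_url)      # https://howlongtobeat.com/game.php?id={i}
print(formatted_url)  # https://howlongtobeat.com/game.php?id=38050
```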
I am trying to create a loop that checks whether these elements exist and, if they do, writes them to "hltb.csv".
How can I do that?
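One possible shape for such a loop is sketched below, not a verified scraper. It reuses the selectors from the question as-is (they are not checked against the live site, which may have changed), uses a one-page range whose stop value is one past the start, formats the id into the URL, and writes a row only when the header element is found. The function names `fetch_page`, `extract_row`, `write_rows`, and `main` are hypothetical, and the fixed `Name`/`Length` header is a simplification of the dynamic header in the question, so the stats columns are left unlabeled.

```python
import csv
import re

BASE_URL = 'https://howlongtobeat.com/game.php?id={}'

def fetch_page(game_id):
    """Download and parse one game page (requires requests and bs4)."""
    import requests
    from bs4 import BeautifulSoup
    response = requests.get(BASE_URL.format(game_id), timeout=10)
    return BeautifulSoup(response.text, 'html.parser')

def extract_row(page):
    """Return [name, lengths, *stats], or None if the header is missing."""
    header = page.find('div', {'class': 'profile_header shadow_text'})
    if header is None:  # no such game, or an error page
        return None
    lengths = []
    for item in page.find_all('li', {'class': 'time_100'}):
        label, value = item.find('h5'), item.find('div')
        if label and value:  # skip items missing either part
            lengths.append(f'{label.text.strip()}: {value.text.strip()}')
    stats = [re.sub(r'\n+[\w\s]+:\n+', '', block.text).strip()
             for block in page.find_all('div', {'class': 'profile_info'})]
    return [header.text.strip(), '; '.join(lengths)] + stats

def write_rows(rows, path='hltb.csv'):
    """Write the header once, then each non-None row; return rows written."""
    written = 0
    with open(path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(['Name', 'Length'])
        for row in rows:
            if row is not None:
                writer.writerow(row)
                written += 1
    return written

def main():
    # range(38050, 38051) yields exactly one id: 38050
    pages = (fetch_page(game_id) for game_id in range(38050, 38051))
    return write_rows(extract_row(page) for page in pages)
```

Calling `main()` would fetch the single test page and write "hltb.csv". Keeping the parsing in `extract_row` separate from the download means the existence check can be tried on saved HTML without touching the network.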
【Discussion】:
Tags: python loops csv web-scraping beautifulsoup