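【Posted】: 2021-05-23 14:51:04
【Problem description】:
I'm trying to export the results of a web scrape done with Selenium, but it only exports the first item in my list, and I need it to export all of the data.
In the CSV I would like it to come out like this: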
| Title | Sold_by | Sold_Perc | Time_Left |
| Title1 | Amazon | 10% Apartado | 06:05:12 |
| Title2 | Amazon | 18% Apartado | 08:55:11 |
from selenium import webdriver
from lxml import html
from time import sleep

driver = webdriver.Chrome('c:/bin/chromedriver')
lst=[]
for page_nb in range(1, 2):
    driver.get('https://www.amazon.com.mx/gp/goldbox/ref=gbps_ftr_s-5_2c3b_page_' + str(page_nb) + '?gb_f_c2xvdC01=dealStates:AVAILABLE%252CWAITLIST%252CWAITLISTFULL%252CEXPIRED%252CSOLDOUT,dealTypes:LIGHTNING_DEAL,page:' + str(page_nb) + ',sortOrder:BY_SCORE,dealsPerPage:48&pf_rd_p=d8b66f14-9e78-4a85-b04f-327a0b562c3b&pf_rd_s=slot-5&pf_rd_t=701&pf_rd_i=gb_main&pf_rd_m=AVDBXBAVVSXLQ&pf_rd_r=5YBFC04YTSW7FDETY9RQ&ie=UTF8')
    sleep(2)
    for product_tree in driver.find_elements_by_xpath('//div[contains(@id, "101_dealView_")]'):
        title = product_tree.find_element_by_xpath('.//a[@id="dealTitle"]/span').text
        vendido = product_tree.find_element_by_xpath('.//span[@id="shipSoldInfo"]').text
        apartado = product_tree.find_element_by_xpath('.//span[@class="a-size-mini a-color-secondary inlineBlock unitLineHeight"]').text
        tventa = product_tree.find_element_by_xpath('.//span[@role="timer"]').text
        lst.append([title, vendido, apartado, tventa])
        #print(title, vendido, apartado, tventa)
driver.close()

#exporting data into a csv file
import csv
header = ['Titulo', 'Sold_by', 'Sold_Perc', 'Time_left']
data = [title, vendido, apartado, tventa]
with open('Test.csv', 'w', encoding='UTF8', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerow(data)
print('Done...')
【Comments】:
-
A CSV is a file containing data separated by newlines and commas. You don't need to ask something this simple on Stack Overflow.
-
Pass your `lst` to your writer and you're good to go.
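One way to write every scraped row rather than a single one is to hand the whole list to `csv.writer.writerows`. The following is only a minimal sketch under a few assumptions: the `export_rows` helper is hypothetical, the two sample rows are made-up stand-ins for the scraped `lst` (so the example runs without a browser), and the file is written to the system temp directory purely for the demo.

```python
import csv
import os
import tempfile


def export_rows(path, header, rows):
    """Write the header, then every row in `rows`, to a CSV file at `path`."""
    with open(path, 'w', encoding='UTF8', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(header)   # one header line
        writer.writerows(rows)    # one line per inner list


# Hypothetical sample data standing in for the `lst` built by the scraping loop.
header = ['Titulo', 'Sold_by', 'Sold_Perc', 'Time_left']
lst = [
    ['Title1', 'Amazon', '10% Apartado', '06:05:12'],
    ['Title2', 'Amazon', '18% Apartado', '08:55:11'],
]

# Assumption: a temp-directory path is used here only to keep the demo tidy.
out_path = os.path.join(tempfile.gettempdir(), 'Test.csv')
export_rows(out_path, header, lst)
print('Done...')
```

In the question's script, the equivalent change would likely be replacing `writer.writerow(data)` with `writer.writerows(lst)`, since `data` is built only from the variables left over after the last loop iteration.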
Tags: python python-3.x selenium csv selenium-webdriver