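【Posted】: 2020-01-15 16:57:39
【Question】:
I wrote a Python script that scrapes the title, description, and images from a web page. The script fetches all of them correctly. The title and description are strings, but the images come back as a list. When I try to write them to a CSV file, all the image URLs end up stacked in a single cell.
How can I write the existing fields together with the images so that each image gets its own column?
Here is what I have tried so far: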
import csv
import requests
from bs4 import BeautifulSoup

url = "https://www.amazon.com/Sealect-Designs-Universal-Anchor-Trolly/dp/B01LYUYI8A?ref_=ast_bbp_dp"

def get_content(link):
    res = requests.get(link, headers={'User-Agent': 'Mozilla/5.0'})
    soup = BeautifulSoup(res.text, "lxml")
    title = soup.select_one("span#productTitle").get_text(strip=True)
    desc = soup.select_one("#productDescription > p").get_text(strip=True)
    images = [item.get("src") for item in soup.select("span.a-button-text > img[src$='jpg']")]
    writer.writerow([title, desc, images])
    print(title, desc, images)

if __name__ == '__main__':
    with open("outputfile.csv", "w", newline="") as infile:
        writer = csv.writer(infile)
        get_content(url)
Current output:
column1: title
column2: description
column3: [images]
Expected output:
column1: title
column2: description
column3: image1
column4: image2
column5: image3
and so on
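One possible way to get that layout, shown as a minimal sketch rather than a definitive fix: unpack the list when building the row, so each image URL becomes its own element instead of the whole list being written as one cell. The helper name `build_row`, the sample values, and the in-memory `io.StringIO` buffer are assumptions made for illustration; they stand in for the scraped data and the real output file.

```python
import csv
import io


def build_row(title, desc, images):
    # Hypothetical helper: unpack the list so every image URL
    # becomes a separate element (and therefore a separate column).
    return [title, desc, *images]


# Made-up sample data standing in for the scraped values.
title = "Sample title"
desc = "Sample description"
images = [
    "https://example.com/1.jpg",
    "https://example.com/2.jpg",
    "https://example.com/3.jpg",
]

buffer = io.StringIO()  # in-memory stand-in for outputfile.csv
writer = csv.writer(buffer)
writer.writerow(build_row(title, desc, images))
output = buffer.getvalue()
print(output)
```

In the script above, the equivalent change would presumably be `writer.writerow([title, desc, *images])`. Rows may then have different lengths when products have different numbers of images, which `csv.writer` accepts.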
【Discussion】:
Tags: python python-3.x csv web-scraping