Exporting scraped data to CSV with a specific column

【Question title】: Exporting scraped data to CSV with a specific column
【Posted】: 2019-08-26 10:26:34
【Question】:

My code currently prints its results to the console.

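Desired result (see attached screenshot): write the final output to column 'a2' of a CSV file and the SKU# to column 'a1'. The SKU# is always the text after the 5th "/" in the URL.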
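To illustrate the SKU rule stated above, here is a minimal sketch. The helper name `extract_sku` is hypothetical and not part of the original code; it assumes every URL has the same layout as the product URL used in this question.

```python
def extract_sku(url):
    """Return the text after the 5th '/' in the URL (hypothetical helper)."""
    # 'https://host/c/product/<sku>/<name>.html'.split('/') gives
    # ['https:', '', 'host', 'c', 'product', '<sku>', '<name>.html'],
    # so the SKU sits at index 5.
    return url.split('/')[5]


url = 'https://www.bhphotovideo.com/c/product/1225875-REG/canon_1263c004_eos_80d_dslr_camera.html'
print(extract_sku(url))  # 1225875-REG
```

A URL with fewer than six '/'-separated parts would raise an `IndexError` here, so this only holds while the URL pattern stays the same.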

Here is my current code:

from bs4 import BeautifulSoup
import urllib.request
import csv
def get_bullets(url):
    page = urllib.request.urlopen(url)
    soup = BeautifulSoup(page,'lxml')
    content = soup.find('div', class_='js-productHighlights product-highlights c28 fs14 js-close')
    bullets = content.find_all('li', class_='top-section-list-item')
    for bullet in bullets:
        print(bullet.string)

get_bullets('https://www.bhphotovideo.com/c/product/1225875-REG/canon_1263c004_eos_80d_dslr_camera.html')

Desired result: [screenshot]

Thanks!

【Comments on the question】:

Tags: python beautifulsoup export-to-csv

【Solution 1】:

from bs4 import BeautifulSoup
import urllib.request
import pandas as pd


def get_bullets(url):
    # The SKU is the text after the 5th '/' in the URL, i.e. index 5 after splitting
    sku = url.split('/')[5]
    page = urllib.request.urlopen(url)
    soup = BeautifulSoup(page, 'lxml')
    content = soup.find('div', class_='js-productHighlights product-highlights c28 fs14 js-close')
    bullets = content.find_all('li', class_='top-section-list-item')

    # Join all bullet texts into a single cell, one bullet per line
    bullets_text = '\n'.join(bullet.text for bullet in bullets)

    # One row with two columns: sku first, bullets second; replace the path with your own
    temp_df = pd.DataFrame([[sku, bullets_text]], columns=['sku', 'bullets'])
    temp_df.to_csv('path/filename.csv', index=False)


get_bullets('https://www.bhphotovideo.com/c/product/1225875-REG/canon_1263c004_eos_80d_dslr_camera.html')
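Note that `to_csv` opens the file in write mode by default, so calling `get_bullets` for several URLs would overwrite the file each time. If more than one product is needed, one option is to collect the rows first and write them once. Below is a rough sketch using only the standard `csv` module (which the question already imports). The function name `write_products`, the file name `products.csv`, and the sample rows are all made up for illustration; they stand in for real scraped results.

```python
import csv


def write_products(path, rows):
    """Write (sku, bullets) pairs to a CSV file, one product per row."""
    # newline='' lets the csv module control line endings, so a '\n'
    # inside a bullets cell is quoted instead of starting a new row.
    with open(path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(['sku', 'bullets'])  # header row; drop it if not wanted
        writer.writerows(rows)


# Hypothetical sample data standing in for scraped results.
rows = [
    ('1225875-REG', 'First bullet\nSecond bullet'),
    ('0000000-REG', 'Another bullet'),
]
write_products('products.csv', rows)
```

Opened in a spreadsheet, the SKU lands in the first column and the bullets in the second, with each product on its own row.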

【Comments】:

Works like a charm. Ty!