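【Posted】: 2019-04-28 05:19:44
【Question】:
The main goal of the script is to generate links for every product available on the website, with the products separated by category.
The problem I am running into is that I can only generate links for one category (infusion), namely the single URL I have saved in the script. The second category URL that I want to include is: https://www.vatainc.com/wound-care.html
Is there a way to loop through multiple category URLs and get the same result as the script I already have?
Here is my code: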
import time
import csv
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
import requests
from bs4 import BeautifulSoup

all_product = []
url = "https://www.vatainc.com/infusion.html?limit=all"

# Start chromedriver and attach a remote session to it
service = Service('/Users/Jon/Downloads/chromedriver.exe')
service.start()
capabilities = {'chrome.binary': '/Google/Chrome/Application/chrome.exe'}
driver = webdriver.Remote(service.service_url, capabilities)

driver.get(url)
time.sleep(2)  # give the category page time to load

# Collect the product links listed on the category page
links = [x.get_attribute('href') for x in driver.find_elements_by_xpath("//*[contains(@class, 'product-name')]/a")]

# Fetch each product page and keep its product-view blocks
for link in links:
    html = requests.get(link).text
    soup = BeautifulSoup(html, "html.parser")
    products = soup.findAll("div", {"class": "product-view"})
    all_product.extend(products)

print(links)
Here is part of the output; this URL yields about 52 links.
['https://www.vatainc.com/infusion/0705-vascular-access-ultrasound-phantom-1616.html', 'https://www.vatainc.com/infusion/0751-simulated-ultrasound-blood.html', 'https://www.vatainc.com/infusion/body-skin-shell-0242.html', 'https://www.vatainc.com/infusion/2366-advanced-four-vein-venipuncture-training-aidtm-dermalike-iitm-latex-free-1533.html',
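For illustration, one possible way to extend this to several categories is to put the category URLs in a list and run the same link-collecting step once per URL. The sketch below is not tested against the site. It assumes that `?limit=all` works on the wound-care page the same way it does on the infusion page, and the names `CATEGORY_URLS`, `links_on_page`, and `collect_links` are hypothetical helpers introduced here. It calls `driver.find_elements("xpath", ...)`, the generic form of the `find_elements_by_xpath` call used in the script.

```python
import time

# Category pages to visit. The first URL comes from the script above;
# adding "?limit=all" to the wound-care URL is an assumption.
CATEGORY_URLS = [
    "https://www.vatainc.com/infusion.html?limit=all",
    "https://www.vatainc.com/wound-care.html?limit=all",
]

# Same XPath as in the original script
PRODUCT_LINK_XPATH = "//*[contains(@class, 'product-name')]/a"


def links_on_page(driver, url, wait_seconds=2):
    """Load one category page and return the product hrefs found on it."""
    driver.get(url)
    time.sleep(wait_seconds)  # fixed pause, as in the original script
    elements = driver.find_elements("xpath", PRODUCT_LINK_XPATH)
    return [element.get_attribute("href") for element in elements]


def collect_links(driver, category_urls, wait_seconds=2):
    """Return a dict mapping each category URL to its list of product links."""
    links_by_category = {}
    for url in category_urls:
        links_by_category[url] = links_on_page(driver, url, wait_seconds)
    return links_by_category
```

With the `driver` created in the script, `collect_links(driver, CATEGORY_URLS)` would return one list of links per category, so the products stay separated by category and each list can then be passed through the existing `requests` / `BeautifulSoup` loop.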
【Discussion】:
Tags: python selenium selenium-webdriver beautifulsoup