【Posted】: 2019-07-27 17:05:04
【Question】:
I have the following code:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
prefs = {'profile.managed_default_content_settings.images':2}
chrome_options.add_experimental_option("prefs", prefs)
driver = webdriver.Chrome(chrome_options=chrome_options)
driver.get("http://biggestbook.com/ui/catalog.html#/search?cr=1&rs=12&st=BM&category=1")
wait = WebDriverWait(driver,20)
links = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".ess-product-brand + [href]")))
results = [link.get_attribute("href") for link in links]
#print(links)
print(results)
driver.quit()
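However, I only get the results/links for the featured products, not for all of the products. Occasionally (rarely, say if I run it 20 times) I do get all of the products, but I want to get all of them every time. I also tried the following different approach: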
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
driver = webdriver.Chrome(chrome_options=chrome_options)
driver.get("http://biggestbook.com/ui/catalog.html#/search?cr=1&rs=12&st=BM&category=1")
links = [elem.get_attribute("href") for elem in driver.find_elements_by_tag_name('a')]
print(links)
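Same problem. My question is: why am I unable to get all of the links? This has been driving me crazy for 2 weeks. I also tried adding a delay, thinking the page might not have finished loading, but it still did not work. Thanks.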
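One possible explanation, offered as a guess rather than a confirmed diagnosis: `presence_of_all_elements_located` returns as soon as at least one matching element exists, so if the featured products render first, the wait may end before the rest of the catalog arrives. A minimal sketch of an alternative is to poll until the number of matches stops changing. The helper name `wait_for_stable_count` and its `timeout`, `poll`, and `settle` parameters are hypothetical, not part of Selenium, and the values are untested against this site.

```python
import time


def wait_for_stable_count(get_items, timeout=20.0, poll=0.5, settle=2.0):
    """Poll get_items() until its length is unchanged for `settle` seconds.

    Hypothetical helper, not a Selenium API. Returns the most recent list,
    even if `timeout` expires before the count settles (best effort).
    """
    deadline = time.monotonic() + timeout
    items = get_items()
    last_change = time.monotonic()
    while time.monotonic() < deadline:
        # Done once we have something and the count has been steady long enough.
        if items and time.monotonic() - last_change >= settle:
            return items
        time.sleep(poll)
        current = get_items()
        if len(current) != len(items):
            last_change = time.monotonic()  # count changed, restart the clock
        items = current
    return items


# Assumed usage with the driver and selector from the question:
# links = wait_for_stable_count(
#     lambda: driver.find_elements(By.CSS_SELECTOR, ".ess-product-brand + [href]"))
# results = [link.get_attribute("href") for link in links]
```

The helper takes a callable instead of a driver so it can be tried without a browser. If the page only loads more products on scroll, a stable count alone would not be enough.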
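【Comments】:
- What do you mean by all of the products? Are you looking for Kitchen Roll Towels, Perforated, 2-Ply, 11 x 8, White, 85 Sheets/Roll, 30 Rls/Ct; Pathways Soak-Proof Shield Mediumweight Paper Plates, 8 1/2", Grn/Burg, 125/Pk; and so on?
- Yes, exactly. Those are the test cases. However, I mostly get only the featured ones.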
Tags: javascript python json selenium-webdriver web-scraping