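【Question title】: How to handle pagination and scrape using Selenium
【Posted】: 2021-10-06 23:30:35
【Question】:

I am trying to scrape Amazon reviews using Selenium, but I don't know how to handle the next-page URL. I want to paginate with a dynamic condition rather than counting the pages myself and hard-coding a fixed number.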

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time

#Using chrome browser

driver=webdriver.Chrome(executable_path='./chromedriver.exe')
driver.get('https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/dp/B08Z1HHHTD/ref=sr_1_2?dchild=1&keywords=skybags&qid=1627786382&sr=8-2')

title_of_product = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "productTitle"))
    )
print(title_of_product.text)

Reviews=WebDriverWait(driver, 10).until(
    EC.presence_of_all_elements_located((By.XPATH, "//span[@class='a-size-base review-text review-text-content']/span")))

next_button =WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CLASS_NAME,"a-last"))).click()


time.sleep(10)
driver.close()

【Comments】:

Tags: python selenium web-scraping pagination selenium-chromedriver


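【Answer 1】:

I don't know whether you will like this kind of pagination, but it is simple, neat, and clean: a few lines of code do everything in the easiest way. I scraped the site with a Scrapy CrawlSpider.

Code: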

import scrapy
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule


class AmazonReviewsSpider(CrawlSpider):

    name = 'reviews'

    allowed_domains = ['www.amazon.in']

    start_urls = ['https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews']

    rules = (
        Rule(LinkExtractor(restrict_xpaths='//a[@data-hook="review-title"]'), callback='parse_item', follow=False),
        Rule(LinkExtractor(restrict_xpaths='//*[@id="cm_cr-pagination_bar"]/ul/li/a'),follow=True),
        )
        
    def parse_item(self, response):
        yield{
            'Reviewer':response.xpath('//*[@class="a-profile-name"]/text()').get()
        }

Output:

{'Reviewer': 'Shaheen Khan'}
2021-08-01 12:33:18 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/gp/customer-reviews/R1PL6L4U9NYL58/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews)
2021-08-01 12:33:18 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_arp_d_paging_btm_2?ie=UTF8&pageNumber=2&reviewerType=all_reviews> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews)
2021-08-01 12:33:18 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/gp/customer-reviews/R2OBKWAHDBDKDA/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews)
2021-08-01 12:33:18 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.amazon.in/gp/customer-reviews/R1PL6L4U9NYL58/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD>
{'Reviewer': 'Sidhesh Mardolkar'}
2021-08-01 12:33:18 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/gp/customer-reviews/R21ZXYGSCCPE5V/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews)
2021-08-01 12:33:18 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.amazon.in/gp/customer-reviews/R2OBKWAHDBDKDA/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD>
{'Reviewer': 'Atowar Rahman'}
2021-08-01 12:33:18 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/gp/customer-reviews/R32Y55ISEX5B6P/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews)
2021-08-01 12:33:18 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/gp/customer-reviews/R19PEVFYAI50FE/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews)
2021-08-01 12:33:18 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.amazon.in/gp/customer-reviews/R21ZXYGSCCPE5V/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD>
{'Reviewer': 'leo'}
2021-08-01 12:33:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.amazon.in/gp/customer-reviews/R32Y55ISEX5B6P/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD>
{'Reviewer': 'Asim ahmed'}
2021-08-01 12:33:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.amazon.in/gp/customer-reviews/R19PEVFYAI50FE/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD>
{'Reviewer': 'Mahavir singh'}
2021-08-01 12:33:19 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/gp/customer-reviews/RYD458HW5E42N/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews)
2021-08-01 12:33:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.amazon.in/gp/customer-reviews/RYD458HW5E42N/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD>
{'Reviewer': 'Ashish Modanwal'}
2021-08-01 12:33:19 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/gp/customer-reviews/RUHTIOZJGQ7YX/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews)
2021-08-01 12:33:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.amazon.in/gp/customer-reviews/RUHTIOZJGQ7YX/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD>
{'Reviewer': 'muskan'}
2021-08-01 12:33:19 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/gp/customer-reviews/R2X2WTTAWX9V2J/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews)
2021-08-01 12:33:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.amazon.in/gp/customer-reviews/R2X2WTTAWX9V2J/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD>
{'Reviewer': 'Susheel Kumar'}
2021-08-01 12:33:19 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/gp/customer-reviews/R15DM4BMSG84D8/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_arp_d_paging_btm_2?ie=UTF8&pageNumber=2&reviewerType=all_reviews)      
2021-08-01 12:33:19 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/gp/customer-reviews/R2TU8L168L3NFO/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_arp_d_paging_btm_2?ie=UTF8&pageNumber=2&reviewerType=all_reviews)      
2021-08-01 12:33:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.amazon.in/gp/customer-reviews/R15DM4BMSG84D8/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD>
{'Reviewer': 'Shashi'}
2021-08-01 12:33:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.amazon.in/gp/customer-reviews/R2TU8L168L3NFO/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD>
{'Reviewer': 'Akhil'}
2021-08-01 12:33:19 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/gp/customer-reviews/R1QZDPG9S17TDN/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews)
2021-08-01 12:33:19 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/gp/customer-reviews/R1Z8DTMO2OF7PT/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_arp_d_paging_btm_2?ie=UTF8&pageNumber=2&reviewerType=all_reviews)      
2021-08-01 12:33:20 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.amazon.in/gp/customer-reviews/R1QZDPG9S17TDN/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD>
{'Reviewer': 'Siddhartha'}
2021-08-01 12:33:20 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_arp_d_paging_btm_1?ie=UTF8&reviewerType=all_reviews> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_arp_d_paging_btm_2?ie=UTF8&pageNumber=2&reviewerType=all_reviews)
2021-08-01 12:33:20 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.amazon.in/gp/customer-reviews/R1Z8DTMO2OF7PT/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD>
{'Reviewer': 'Shravya kale'}
2021-08-01 12:33:20 [scrapy.dupefilters] DEBUG: Filtered duplicate request: <GET https://www.amazon.in/gp/customer-reviews/R1XAMZ9LKHPV8A/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> - no more duplicates will be shown (see DUPEFILTER_DEBUG to show all duplicates)
2021-08-01 12:33:20 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/gp/customer-reviews/RKG3L6Y5ZDMGI/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_arp_d_paging_btm_2?ie=UTF8&pageNumber=2&reviewerType=all_reviews)       
2021-08-01 12:33:20 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/gp/customer-reviews/R396ADCZXCRGSB/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_arp_d_paging_btm_2?ie=UTF8&pageNumber=2&reviewerType=all_reviews)      
2021-08-01 12:33:20 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.amazon.in/gp/customer-reviews/RKG3L6Y5ZDMGI/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD>
{'Reviewer': 'S singh'}
2021-08-01 12:33:20 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.amazon.in/gp/customer-reviews/R396ADCZXCRGSB/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD>
{'Reviewer': 'Shourya satyam'}
2021-08-01 12:33:20 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/gp/customer-reviews/R2USNFWP35AWMO/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_arp_d_paging_btm_2?ie=UTF8&pageNumber=2&reviewerType=all_reviews)      
2021-08-01 12:33:20 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/gp/customer-reviews/R12T25TWUEVJ80/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_arp_d_paging_btm_2?ie=UTF8&pageNumber=2&reviewerType=all_reviews)      
2021-08-01 12:33:20 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_arp_d_paging_btm_3?ie=UTF8&pageNumber=3&reviewerType=all_reviews> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_arp_d_paging_btm_2?ie=UTF8&pageNumber=2&reviewerType=all_reviews)
2021-08-01 12:33:20 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/gp/customer-reviews/R3L7T6GAL2W2GC/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_arp_d_paging_btm_2?ie=UTF8&pageNumber=2&reviewerType=all_reviews)      
2021-08-01 12:33:20 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.amazon.in/gp/customer-reviews/R2USNFWP35AWMO/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD>
{'Reviewer': 'santhakumar'}
2021-08-01 12:33:20 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/gp/customer-reviews/RA9C9WOIMZOQR/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_arp_d_paging_btm_2?ie=UTF8&pageNumber=2&reviewerType=all_reviews)       
2021-08-01 12:33:20 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/gp/customer-reviews/R3NN436HHQPS5G/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_arp_d_paging_btm_2?ie=UTF8&pageNumber=2&reviewerType=all_reviews)      
2021-08-01 12:33:20 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.amazon.in/gp/customer-reviews/R12T25TWUEVJ80/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD>
{'Reviewer': 'Harsh Gupta'}
2021-08-01 12:33:20 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.amazon.in/gp/customer-reviews/R3L7T6GAL2W2GC/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD>
{'Reviewer': 'vishal thakare'}
2021-08-01 12:33:20 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.amazon.in/gp/customer-reviews/RA9C9WOIMZOQR/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD>
{'Reviewer': 'Ayushi'}
2021-08-01 12:33:20 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.amazon.in/gp/customer-reviews/R3NN436HHQPS5G/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD>
{'Reviewer': 'NIKET'}
2021-08-01 12:33:21 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/gp/customer-reviews/RFMWLES7SUYSR/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_arp_d_paging_btm_3?ie=UTF8&pageNumber=3&reviewerType=all_reviews)       
2021-08-01 12:33:21 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/gp/customer-reviews/R3LZD41TT5MPRN/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_arp_d_paging_btm_3?ie=UTF8&pageNumber=3&reviewerType=all_reviews)      
2021-08-01 12:33:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.amazon.in/gp/customer-reviews/RFMWLES7SUYSR/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD>
{'Reviewer': 'Best prodAmazon Customer'}
2021-08-01 12:33:21 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.in/gp/customer-reviews/R1RZQAQO5T2OAX/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD> (referer: https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/product-reviews/B08Z1HHHTD/ref=cm_cr_arp_d_paging_btm_3?ie=UTF8&pageNumber=3&reviewerType=all_reviews)      
2021-08-01 12:33:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.amazon.in/gp/customer-reviews/R3LZD41TT5MPRN/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD>
{'Reviewer': 'rama krishna . y'}
2021-08-01 12:33:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.amazon.in/gp/customer-reviews/R1RZQAQO5T2OAX/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B08Z1HHHTD>
{'Reviewer': 'Amit Biswas'}
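
In the log above, page 2 links back to page 1, and the dupefilter drops a request it has already seen. As a rough illustration of that follow-links-and-skip-duplicates idea, here is a plain-Python sketch. It is not Scrapy's implementation: the `LINKS` graph and the `crawl` function are made up for the example, and the breadth-first order is chosen for simplicity rather than to match Scrapy's scheduler.

```python
from collections import deque

# Hypothetical link graph standing in for the site: page -> links found on it.
LINKS = {
    "page1": ["review_a", "review_b", "page2"],
    "page2": ["review_c", "page1", "page3"],
    "page3": ["review_d", "page2"],
}


def crawl(start):
    """Visit every reachable URL exactly once, breadth-first."""
    seen = {start}
    queue = deque([start])
    visited = []
    while queue:
        url = queue.popleft()
        visited.append(url)
        for link in LINKS.get(url, []):
            if link not in seen:  # duplicate filter: skip already queued URLs
                seen.add(link)
                queue.append(link)
    return visited


order = crawl("page1")
print(order)
```

Because every URL is recorded in `seen` before it is queued, the back-links between pagination pages do not cause an endless loop.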

【Comments】:

  • Can you do this in Selenium?
  • Yes, Selenium with Scrapy. Thanks.
  • I need a solution in Selenium only, without using Scrapy or bs4. Thanks.
【Answer 2】:

If I understood your question correctly, this should do it:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get('https://www.amazon.in/Skybags-Brat-Black-Casual-Backpack/dp/B08Z1HHHTD/ref=sr_1_2?dchild=1&keywords=skybags&qid=1627786382&sr=8-2')

product_title = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.ID, "productTitle"))).text

print(product_title)

WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//a[@data-hook='see-all-reviews-link-foot']"))).click()

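while True:
    # Wait for the review blocks on the current page, then read each one.
    for item in WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "[data-hook='review']"))):
        # find_element(By...) replaces the find_element_by_* helpers removed in Selenium 4.
        reviewer = item.find_element(By.CSS_SELECTOR, "span.a-profile-name").text
        review = ' '.join([i.text.strip() for i in item.find_elements(By.XPATH, ".//span[@data-hook='review-body']")])
        print(reviewer, review)

    try:
        # Click "Next page", then wait until the last review of the old page goes stale.
        WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//*[@data-hook='pagination-bar']//a[contains(@href,'/product-reviews/') and contains(text(),'Next page')]"))).click()
        WebDriverWait(driver, 10).until(EC.staleness_of(item))
    except Exception:
        # No "Next page" link was found (or the page did not change): stop paginating.
        break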

driver.quit()
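
The loop above does not count pages; it stops when the "Next page" link can no longer be found. The same control flow can be tried without a browser. The sketch below is only an illustration under assumptions: `iter_pages` and the in-memory `SITE` dictionary are hypothetical stand-ins, and in the Selenium version the `get_next` step would correspond to clicking the link and waiting for the old page to go stale.

```python
def iter_pages(first_page, get_next):
    """Yield pages one by one until get_next() reports there is no next page."""
    page = first_page
    while page is not None:
        yield page
        page = get_next(page)


# Hypothetical in-memory "site": each page has reviews and an optional next page.
SITE = {
    1: {"reviews": ["good bag", "ok"], "next": 2},
    2: {"reviews": ["zip broke"], "next": 3},
    3: {"reviews": ["love it"], "next": None},
}


def next_page(number):
    """Return the number of the following page, or None on the last page."""
    return SITE[number]["next"]


reviews = []
for page_number in iter_pages(1, next_page):
    reviews.extend(SITE[page_number]["reviews"])

print(reviews)
```

The stop condition lives in the data (`next` is `None` on the last page), so the loop works unchanged however many pages there are.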

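【Comments】:

  • But it does not work properly when I print the reviews. Why is that?
  • I only defined selectors to scrape the reviewer name, not the review.
  • Yes, I know. It runs so fast that the tags are not even loaded, so I can't retrieve data such as the reviews. Please tell me where to add a wait, or edit the code so that it fetches the reviews.
  • My XPath expression is correct, yet the script only prints the first or last review on the page. Please help, I am confused about what is happening. {review = item.find_element_by_xpath("//span[@ class='a-size-base review-text review-text-content']/span")} This is what I tried.
  • Check out the edit, @krishan. If you have any other questions, please create a new post. By the way, your XPath is wrong. You need a dot at the beginning, e.g. .//span, to make it correct.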
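
To illustrate the last comment: an XPath that starts with `//` is evaluated against the whole document even when it is called on an element, so every item returns the same first match, while `.//` searches only inside the current item. The sketch below imitates this with the standard library's `xml.etree.ElementTree` instead of Selenium. The markup is made up, and because ElementTree rejects a leading `//` on an element, the document-wide search is simulated by searching from the root.

```python
import xml.etree.ElementTree as ET

# Made-up markup with two review blocks.
PAGE = """
<div>
  <div class="review"><span class="body">first review</span></div>
  <div class="review"><span class="body">second review</span></div>
</div>
"""

root = ET.fromstring(PAGE)
items = root.findall("./div[@class='review']")

# Document-wide search repeated per item: always the first match on the page.
wrong = [root.find(".//span[@class='body']").text for _ in items]

# Search relative to each item: one body per review.
right = [item.find(".//span[@class='body']").text for item in items]

print(wrong)
print(right)
```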