【Question title】: Python Selenium: accessing aria-label information
【Posted】: 2020-05-09 13:05:14
【Question】:

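I am trying to read the reviews of an app in the Google Play store, and I am using Selenium for this. Each review sits inside an element with jscontroller="H6e0Ge".

Inside the jscontroller="H6e0Ge" element, I am trying to retrieve the rating given by the user, which is exposed through the "aria-label" attribute.

To read the ratings of all reviewers, my code is: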

from selenium import webdriver

driver = webdriver.Chrome('/Users/yasirmuhammad/Downloads/chromedriver')
driver.get('https://play.google.com/store/apps/details?id=com.axis.drawingdesk.v3&hl=en&showAllReviews=true')
for a in driver.find_elements_by_xpath("//*[@class='d15Mdf bAhLNe']"):
    print(a.find_element_by_class_name('X43Kjb').text)
    print(a.find_element_by_class_name('p2TkOb').text)
    print(a.find_element_by_xpath('/html/body/div[1]/div[4]/c-wiz/div/div[2]/div/div[1]/div/div/div[1]/div[2]/div/div[2]/div/div[2]/div[1]/div[1]/div/span[1]/div/div').get_attribute('aria-label'))

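The third print statement reads the rating, but the value is the same for every user. The reason is that I copied the full XPath of the first user's rating, so it shows that same rating for all other users. So I replaced the third statement with the following one: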

print(a.find_element_by_class_name('pf5lIe').get_attribute('aria-label'))

However, this statement returns None. Can anyone guide me on how to read the "aria-label" information?

【Comments】:

    Tags: python selenium xpath webdriverwait xpath-1.0


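    【Solution 1】:

    You should not use locators such as H6e0Ge or /html/body/div[1]/div[4]/c-wiz/div/div[2]/div/div[1]/div/div/div[1]/div[2]/div/div[2]/div/div[2]/div[1]/div[1]/div/span[1]/div/div. These values change dynamically, so they will stop working before long. Locate the reviews relative to stable text and attributes instead: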

    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    
    # Wait until every review container under the "User reviews" heading is visible.
    reviews = WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH, "//h3[.='User reviews']/following-sibling::div[1]/div")))
    for review in reviews:
        # The leading "." keeps each lookup inside the current review element.
        print(review.find_element(By.XPATH, ".//span[1]").text)
        print(review.find_element(By.XPATH, ".//span[2]").text)
        # The rating is stored in the aria-label of the div with role="img".
        print(review.find_element(By.XPATH, ".//div[@role='img']").get_attribute('aria-label'))
        # The last div with a jscontroller attribute holds the review text.
        print(review.find_element(By.XPATH, "descendant::div[@jscontroller][last()]").text)
    

    Xpaths:

    //h3[.='User reviews']/following-sibling::div[1]/div//span[1]
    //h3[.='User reviews']/following-sibling::div[1]/div//span[2]
    //h3[.='User reviews']/following-sibling::div[1]//div[@role='img']
    //h3[.='User reviews']/following-sibling::div[1]/div/descendant::div[@jscontroller][last()]
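
Why the relative form matters can be illustrated offline. The sketch below does not use Selenium or the real Play Store page: it runs the standard library's ElementTree over a made-up, heavily simplified snippet (the markup, the "review" class, and the reviewer names are assumptions for illustration only). It shows the same effect the question describes: a lookup anchored at the page root returns the first rating on every iteration, while a lookup anchored at each review returns that review's own rating. In Selenium the same distinction applies: an XPath that starts with / or // is evaluated against the whole document even when it is called on an element, whereas one that starts with .// is evaluated relative to that element.

```python
import xml.etree.ElementTree as ET

# Made-up, simplified markup: two review containers, each with its own rating.
PAGE = """
<body>
  <div class="review">
    <span>Alice</span>
    <div role="img" aria-label="Rated 4 stars out of five stars"/>
  </div>
  <div class="review">
    <span>Bob</span>
    <div role="img" aria-label="Rated 1 stars out of five stars"/>
  </div>
</body>
""".strip()

root = ET.fromstring(PAGE)
reviews = root.findall("./div[@class='review']")

# Lookup anchored at the document root: the same first match on every iteration.
from_root = [root.find(".//div[@role='img']").get("aria-label") for _ in reviews]

# Lookup anchored at each review: one rating per review.
per_review = [r.find(".//div[@role='img']").get("aria-label") for r in reviews]

print(from_root)
print(per_review)
```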
    

    【Comments】:

    • Thanks for your answer. If we want to dig further, I mean if I want to extract the reviewer's actual comment (further down the page), should I use something like review.find_element_by_xpath(".//span[3]").text?
    • review.find_element_by_xpath("descendant::div[@jscontroller][last()]").text
    【Solution 2】:

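    You are trying to read the attribute from the parent <div> of the element that carries the label, but the attribute does not exist on that parent. Fix your code as follows: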

    print(a.find_element_by_xpath(".//div[@jscontroller and @jsmodel and @jsdata]//span[@class='nt2C1d']//div[@aria-label]").get_attribute('aria-label'))
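
The idea of targeting the descendant that actually carries the attribute can be wrapped in a small helper. This is only a sketch: the name read_aria_label and the default XPath are assumptions, not part of the answer above. It calls find_elements("xpath", ...), where "xpath" is the string value of Selenium's By.XPATH constant; this form also works on newer Selenium 4 releases, which no longer provide the find_element_by_* helpers. Using find_elements instead of find_element means a missing rating yields None rather than an exception.

```python
def read_aria_label(parent, xpath=".//div[@aria-label]"):
    """Return the aria-label of the first descendant matching xpath, or None.

    parent is expected to be a Selenium WebElement (or the driver itself).
    """
    # "xpath" is the string value of selenium.webdriver.common.by.By.XPATH.
    matches = parent.find_elements("xpath", xpath)
    if not matches:
        # No descendant carries the attribute.
        return None
    return matches[0].get_attribute("aria-label")
```

Inside the loop from the question, this would be called as print(read_aria_label(a)).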
    

    【Comments】:

      【Solution 3】:

      To read the ratings of all reviewers, induce WebDriverWait for visibility_of_all_elements_located(). You can use the following Locator Strategy:

      • Using XPATH:

        driver.get('https://play.google.com/store/apps/details?id=com.axis.drawingdesk.v3&hl=en&showAllReviews=true')
        print([my_elem.get_attribute("aria-label") for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.XPATH, "//h3[text()='User reviews']//following::div[1]//span[text()]//following::div[1]//div[@role='img']")))])
        
      • Console output:

        ['Rated 4 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 1 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 4 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 4 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 4 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars', 'Rated 5 stars out of five stars']
        
      • Note: you have to add the following imports:

        from selenium.webdriver.support.ui import WebDriverWait
        from selenium.webdriver.common.by import By
        from selenium.webdriver.support import expected_conditions as EC
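
Once the labels are collected, they usually need to be turned into numbers. The following sketch assumes the labels look exactly like the console output above ("Rated N stars out of five stars"); the wording depends on the page language, so the pattern would need adjusting for anything other than hl=en. The names RATING_RE and parse_rating are hypothetical.

```python
import re
from collections import Counter

# Pattern assumed from the English console output shown above.
RATING_RE = re.compile(r"Rated (\d+) stars? out of five stars")

def parse_rating(label):
    """Extract the star count from an aria-label string, or None if it does not match."""
    match = RATING_RE.search(label or "")
    return int(match.group(1)) if match else None

labels = [
    "Rated 4 stars out of five stars",
    "Rated 5 stars out of five stars",
    "Rated 1 stars out of five stars",
]
ratings = [parse_rating(label) for label in labels]
print(ratings)           # list of integer star counts
print(Counter(ratings))  # how often each star count occurs
```

Returning None for an unmatched or missing label keeps the helper usable even when get_attribute('aria-label') itself returns None.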
        

      【Comments】:
