【Question title】:Python selenium webdriver: retrieving google reviews based on the date
【Posted】:2021-01-28 08:21:05
【Question description】:

I am using Selenium webdriver to extract the Google reviews of an app from the Google Play store. Here is my code:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import time
import pandas as pd
import datetime as dt

driver = webdriver.Chrome('path')
baseurl = 'https://play.google.com/store/apps/details?id=com.mapmyrun.android2&showAllReviews=true'
driver.get(baseurl)

counter = 0
while True:
    driver.execute_script("window.scrollTo(0,document.body.scrollHeight)")
    time.sleep(2)
    counter = counter + 1
    if len(driver.find_elements_by_xpath("//span[text()='Show More']"))>0:
        driver.find_element_by_xpath("//span[contains(text(),'Show More')]").click()
        counter = 0
    if counter == 10:
        element = driver.find_elements_by_xpath("//div[@class='LXrl4c']")
        break
names = driver.find_elements_by_xpath("//div[@class='bAhLNe kx8XBd']//span[@class='X43Kjb']")
person_info = driver.find_elements_by_xpath("//div[@class='d15Mdf bAhLNe']")


for count, person in enumerate(person_info):
    review_response_person = ''
    response_date = ''
    response_text = ''
    full_text = ''
    
    name = person.find_element_by_xpath(".//span[@class='X43Kjb']").text
    review = person.find_element_by_xpath(".//div[@class='UD7Dzf']/span").text
    review_date = person.find_element_by_xpath(".//span[@class='p2TkOb']").text
    rating = person.find_element_by_xpath(".//div[@class='pf5lIe']/div").get_attribute('aria-label')
    useful = person.find_element_by_xpath(".//div[@class='XlMhZe']//div[@aria-label='Number of times this review was rated helpful']").text
    reviewText = person.find_element(By.CSS_SELECTOR, "span[jsname='fbQN7e']")
    full_text = reviewText.get_attribute("innerHTML")
    
    if len(full_text) > 1:
        review = full_text
    
    if person.find_elements_by_xpath(".//div[@class='LVQB0b']"):
        review_response_person = person.find_element_by_xpath(".//div[@class='LVQB0b']//div/span[@class='X43Kjb']").text
        response_date = person.find_element_by_xpath(".//div[@class='LVQB0b']//div/span[@class='p2TkOb']").text
        response_text = person.find_element_by_xpath(".//div[@class='LVQB0b']").text
        response_text = response_text.replace(review_response_person, '')
        response_text = response_text.replace(response_date, '')

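The problem is that I want to extract only the reviews that fall within a date range. For example, I want only the reviews that were posted today or tomorrow. I looked for a method in Selenium webdriver that does this but could not find one. Can anyone tell me whether reviews can be retrieved based on their date?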

【Comments】:

    Tags: python python-3.x selenium selenium-webdriver selenium-chromedriver


    【Solution 1】:

    You can filter the data with a condition while looping over the reviews, for example:

    today_data = []
    for person in person_info:
        # Read the date text and parse it; the format expects e.g. "28 January 2021"
        review_date = person.find_element_by_xpath(".//span[@class='p2TkOb']").text.lower()
        review_date = dt.datetime.strptime(review_date, "%d %B %Y")
        name = person.find_element_by_xpath(".//span[@class='X43Kjb']").text
        review = person.find_element_by_xpath(".//div[@class='UD7Dzf']/span").text
        # Keep only reviews dated after 23 January 2021
        if review_date > dt.datetime(year=2021, month=1, day=23):
            today_data.append(name)
            today_data.append(review)
    

    You will then have, for example, the names and reviews posted after 23 January 2021, up to today.
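The same comparison can be extended to a range with both a start and an end date. The following is only a minimal sketch: it assumes the date text has already been read from the page, that it uses the "%d %B %Y" format with English month names, and that both bounds should be inclusive. The helper `in_date_range` and the sample rows are hypothetical and are not part of Selenium.

```python
import datetime as dt

DATE_FORMAT = "%d %B %Y"  # assumed format, same as in the answer above


def in_date_range(date_text, start, end, fmt=DATE_FORMAT):
    """Return True if date_text parses to a date within [start, end]."""
    parsed = dt.datetime.strptime(date_text.strip(), fmt).date()
    return start <= parsed <= end


# Hypothetical (name, date text, review) rows standing in for scraped values.
rows = [
    ("Alice", "22 January 2021", "Good app"),
    ("Bob", "25 January 2021", "Crashes often"),
    ("Carol", "28 January 2021", "Love it"),
]

start = dt.date(2021, 1, 23)
end = dt.date(2021, 1, 27)

# Keep only the rows whose date falls inside the range.
selected = [(name, review) for name, date_text, review in rows
            if in_date_range(date_text, start, end)]
print(selected)
```

In the scraping loop, the date text would come from the `p2TkOb` span, and `start` and `end` could be derived from `dt.date.today()`.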

    If the dates are displayed in your country's language, set the locale to match that language so that "%B" is interpreted correctly:

    import locale
    locale.setlocale(locale.LC_TIME, 'fr_FR.UTF-8')
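Whether `fr_FR.UTF-8` is available depends on the system, and `locale.setlocale` raises `locale.Error` when the requested locale is not installed. A hedged sketch of a guard around the call follows; `try_set_time_locale` is a hypothetical helper name, not a standard-library function.

```python
import locale


def try_set_time_locale(name):
    """Try to switch LC_TIME to `name`; return True on success, False if unavailable."""
    try:
        locale.setlocale(locale.LC_TIME, name)
        return True
    except locale.Error:
        # The locale is not installed on this system; LC_TIME is left unchanged.
        return False


if try_set_time_locale("fr_FR.UTF-8"):
    print("LC_TIME set to fr_FR.UTF-8")
else:
    print("fr_FR.UTF-8 is not available; month names keep the current locale")
```

Checking the return value before parsing makes a missing locale visible early, rather than as a `ValueError` from `strptime` later on.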
    

    【Comments】:
